Date of Award

8-30-2022

Document Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Duc A. Tran

Second Advisor

Marc Pomplun

Third Advisor

Dan Simovici

Abstract

Federated Learning (FL) is a recent Machine Learning method for training on private data that remains stored on separate local machines, without gathering the data in one place for centralized learning. It was conceived to address two practical challenges of applying Machine Learning: (1) Communication cost: most real-world training data are collected locally, and bringing them all to one place for centralized learning can be expensive, especially in real-time applications where time is of the essence, for example, predicting the next word while texting on a smartphone; and (2) Privacy protection: many applications, such as those in healthcare, must protect data privacy; the private data can be seen only by its local owner, so learning may use only a content-hiding representation of the data, which is much less informative. To fulfill FL's promise, this dissertation addresses three important problems concerning the need for good training data, system scalability, and robustness to uncertainty:

1. The effectiveness of FL depends critically on the quality of the local training data. We should not only incentivize participants who have good training data but also minimize the effect of bad training data on the overall learning procedure. The first problem of my research is to determine a score that values a participant's contribution. My approach computes such a score based on the Shapley Value (SV), a concept from cooperative game theory for allocating profit in a coalition game. The main challenge in this direction is the exponential time complexity of computing the SV, further complicated by the iterative nature of the FL training algorithm. I propose a fast and effective valuation method that overcomes this challenge; for context, a generic Monte Carlo sketch of SV estimation appears as the first sketch after this list.

2. On scalability, FL depends on a central server for repeated aggregation of local training models, and this server is prone to becoming a performance bottleneck. A reasonable remedy is to combine FL with Edge Computing: introduce a layer of edge servers, each serving as a regional aggregator that offloads the main server. Scalability thus improves, but at a cost in learning accuracy. The second problem of my research is to optimize this tradeoff. This dissertation shows that the accuracy cost can be alleviated by a proper choice of edge server assignment: which edge servers should aggregate the training models from which local machines. Specifically, I propose an assignment solution that is especially useful in the case of non-IID training data, which is well known to hinder today's FL performance; a simple illustrative heuristic appears as the second sketch after this list.

3. FL participants may decide on their own which devices they run on, what computing capabilities those devices have, and how often they communicate the training model with the aggregation server. The workloads they incur are therefore time-varying and unpredictable; the server capacities are finite and can vary too. The third problem of my research is to compute an edge server assignment that is robust to such dynamics and uncertainties. I propose a stochastic approach to this problem, of which the third sketch after this list gives a generic flavor.
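
To give a flavor of the computational issue in problem 1, the first sketch below shows the standard Monte Carlo approximation of the Shapley Value: marginal contributions are averaged over random permutations of the participants, avoiding exponential enumeration of coalitions. The utility function and its per-participant quality scores are hypothetical stand-ins for coalition model accuracy; this is not the dissertation's valuation method.

    import random

    # Monte Carlo estimation of Shapley Values for FL participants.
    # The exact SV averages a player's marginal contribution over all
    # coalitions (exponentially many); sampling random permutations
    # gives an unbiased estimate in a fraction of the time.

    def utility(coalition):
        """Toy stand-in for the accuracy of a model trained on the
        coalition's pooled data: hypothetical per-participant
        'quality' scores with mild diminishing returns."""
        quality = {0: 0.5, 1: 0.3, 2: 0.1, 3: 0.4}  # assumed values
        total = sum(quality[p] for p in coalition)
        return total / (1 + 0.1 * len(coalition))

    def monte_carlo_shapley(players, utility, num_samples=10_000, seed=0):
        """Average each player's marginal contribution over random
        orderings of the players."""
        rng = random.Random(seed)
        shapley = {p: 0.0 for p in players}
        for _ in range(num_samples):
            order = players[:]
            rng.shuffle(order)
            coalition, prev_value = [], utility([])
            for p in order:
                coalition.append(p)
                value = utility(coalition)
                shapley[p] += value - prev_value
                prev_value = value
        return {p: v / num_samples for p, v in shapley.items()}

    if __name__ == "__main__":
        scores = monte_carlo_shapley(list(range(4)), utility)
        for p, s in sorted(scores.items()):
            print(f"participant {p}: estimated SV {s:.4f}")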
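
For problem 2, the second sketch is one simple illustrative heuristic for edge server assignment under non-IID data, not the dissertation's solution: each client is greedily placed on the edge server whose aggregated label histogram ends up closest to the global label distribution, subject to an assumed per-server capacity.

    import numpy as np

    # Greedy client-to-edge assignment: place each client on the edge
    # server whose aggregated label histogram, after adding the client,
    # is closest (L1 distance) to the global label distribution. This
    # counteracts non-IID skew at regional aggregators. The capacity
    # model and processing order are illustrative assumptions.

    def assign_clients(client_label_counts, num_edges, capacity):
        """client_label_counts: (num_clients, num_classes) histograms.
        Returns assignment[i] = edge server index for client i."""
        num_clients, _ = client_label_counts.shape
        assert capacity * num_edges >= num_clients, "not enough capacity"
        global_dist = client_label_counts.sum(axis=0).astype(float)
        global_dist /= global_dist.sum()

        edge_counts = np.zeros((num_edges, client_label_counts.shape[1]))
        edge_load = np.zeros(num_edges, dtype=int)
        assignment = [-1] * num_clients

        # Assumed heuristic: place data-rich clients first.
        for c in np.argsort(-client_label_counts.sum(axis=1)):
            best_edge, best_gap = None, np.inf
            for e in range(num_edges):
                if edge_load[e] >= capacity:
                    continue
                merged = edge_counts[e] + client_label_counts[c]
                gap = np.abs(merged / merged.sum() - global_dist).sum()
                if gap < best_gap:
                    best_edge, best_gap = e, gap
            assignment[c] = best_edge
            edge_counts[best_edge] += client_label_counts[c]
            edge_load[best_edge] += 1
        return assignment

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        counts = rng.integers(1, 50, size=(12, 5))  # 12 clients, 5 classes
        print(assign_clients(counts, num_edges=3, capacity=4))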
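
For problem 3, the third sketch illustrates a generic stochastic (sample-average) treatment of assignment under uncertain workloads: candidate assignments are scored by their expected capacity overload over sampled scenarios. The lognormal workload model, the overload objective, and the random-search optimizer are assumptions for illustration only.

    import numpy as np

    # Sample-average sketch of stochastic assignment: client workloads
    # are uncertain, so each candidate assignment is scored by its mean
    # capacity overload across sampled workload scenarios, and random
    # search keeps the best candidate found.

    def expected_overload(assignment, workload_samples, capacities):
        """Mean total capacity violation over sampled scenarios."""
        overloads = []
        for scenario in workload_samples:        # one row per scenario
            load = np.zeros(len(capacities))
            for client, edge in enumerate(assignment):
                load[edge] += scenario[client]
            overloads.append(np.maximum(load - capacities, 0.0).sum())
        return float(np.mean(overloads))

    def stochastic_assignment(num_clients, capacities, num_scenarios=200,
                              num_candidates=500, seed=0):
        rng = np.random.default_rng(seed)
        capacities = np.asarray(capacities, dtype=float)
        # Assumed workload model: heavy-tailed (lognormal) per client.
        samples = rng.lognormal(mean=0.0, sigma=0.5,
                                size=(num_scenarios, num_clients))
        best, best_cost = None, np.inf
        for _ in range(num_candidates):
            cand = rng.integers(0, len(capacities), size=num_clients)
            cost = expected_overload(cand, samples, capacities)
            if cost < best_cost:
                best, best_cost = cand, cost
        return best, best_cost

    if __name__ == "__main__":
        a, c = stochastic_assignment(10, capacities=[5.0, 5.0, 5.0])
        print("assignment:", a, "expected overload:", round(c, 3))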
