The plethora of data and the increasing computational complexity of deep neural networks have led to deep learning schemes that require multiple nodes in a cluster setup. Yet distributing a computation over multiple machines has two main drawbacks: (1) it induces a higher risk of failures, and (2) it incurs heavy communication costs, which can sometimes outweigh the computational gains of distributed learning. We study methods that tackle both problems by borrowing ideas from the sketching and Byzantine-worker literatures. We show that our algorithm, SketchedRobustAgg, achieves a runtime (measured in number of iterations) similar to that without sketching, even though it sends only s-dimensional vectors (where s << d) between the workers and the parameter server.
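The combination described above can be illustrated with a minimal sketch. The example below is not the SketchedRobustAgg algorithm itself, only a toy of the two ingredients it names: a Count Sketch that compresses a d-dimensional gradient to s dimensions before communication, and a robust aggregation rule (here, coordinate-wise median, one common choice from the Byzantine-worker literature) applied in sketch space; all names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, s = 1000, 64          # full gradient dimension and sketch size, s << d

# Count Sketch: coordinate i is hashed to bucket h[i] with random sign g[i].
h = rng.integers(0, s, size=d)            # bucket assignment
g = rng.choice([-1.0, 1.0], size=d)       # random signs

def sketch(v):
    """Compress a d-dim vector to s dims: S(v)[j] = sum over h[i]=j of g[i]*v[i]."""
    out = np.zeros(s)
    np.add.at(out, h, g * v)
    return out

def unsketch(sv):
    """Unbiased estimate of v from its sketch: v_hat[i] = g[i] * S(v)[h[i]]."""
    return g * sv[h]

# Each of n workers sends only its s-dim sketch; one Byzantine worker sends junk.
n = 5
grads = [rng.normal(size=d) for _ in range(n)]
sketches = [sketch(v) for v in grads]
sketches[0] = rng.normal(scale=100.0, size=s)   # simulated Byzantine worker

# Robust aggregation in sketch space: the coordinate-wise median limits the
# influence of the corrupted sketch; the server then unsketches the result.
agg = np.median(np.stack(sketches), axis=0)
recovered = unsketch(agg)
```

Because both sketching and the median operate on s-dimensional vectors, per-round communication and server-side aggregation cost scale with s rather than d.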

At the same time, the plethora of data raises another challenge: privacy. Motivated by recent attacks on federated learning schemes, which demonstrate that keeping training data on clients' devices does not by itself provide sufficient privacy, we introduce FastSecAgg, a secure aggregation protocol that is efficient in computation and communication and robust to client dropouts. FastSecAgg achieves significantly lower computation cost while asymptotically matching the communication cost of existing protocols. Finally, we show that FastSecAgg performs well on benchmark federated learning datasets, even with aggressive quantization and sketching, and we demonstrate empirically that the tradeoff between computation/communication complexity and test accuracy can be controlled.
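To make the secure-aggregation goal concrete, the toy below shows the classic pairwise additive-masking idea: the server learns only the sum of quantized client updates, never any individual update. This is a generic illustration of what a secure aggregation protocol guarantees, not FastSecAgg's actual construction; the dimensions, modulus, and helper names are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, s = 4, 8                      # number of clients and (sketched) update dimension
M = 2**16                        # modulus for fixed-point arithmetic

# Each client holds a quantized update (e.g., a sketched, quantized gradient).
updates = [rng.integers(0, 256, size=s) for _ in range(n)]

# Pairwise masks: clients i < j agree on a random vector r_ij; client i adds
# it and client j subtracts it, so every mask cancels in the sum.
masks = {(i, j): rng.integers(0, M, size=s)
         for i in range(n) for j in range(i + 1, n)}

def masked_update(i):
    """What client i sends: its update plus/minus all masks it shares."""
    m = updates[i].copy()
    for j in range(n):
        if i < j:
            m = (m + masks[(i, j)]) % M
        elif j < i:
            m = (m - masks[(j, i)]) % M
    return m

# The server sees only masked updates, yet their sum equals the true sum mod M.
server_sum = sum(masked_update(i) for i in range(n)) % M
true_sum = sum(updates) % M
```

Handling dropouts (recovering the masks of clients that disappear mid-round) is the expensive part of such protocols, and it is there that the abstract claims FastSecAgg's computational savings.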



