This thesis examines the design of systems and algorithms to support machine learning in the distributed setting. The distributed computing landscape today consists of many domain-specific tools. We argue that these tools underestimate the generality of many modern machine learning applications and hence struggle to support them. We examine the requirements of a system capable of supporting modern machine learning workloads and present a general-purpose distributed system architecture that meets them. In addition, we examine several specific distributed learning algorithms. We explore the theoretical properties of these algorithms and show how they can leverage such a system.