The last decade has seen two major trends in large-scale computing: on the one hand, the growth of cloud computing, where many big data applications are deployed on shared clusters of machines; on the other hand, a deluge of machine learning algorithms used for applications ranging from image classification and machine translation to graph processing and scientific analysis of large datasets. In light of these trends, a number of challenges arise in how we program, deploy, and achieve high performance for large-scale machine learning applications. In this dissertation we study the execution properties of machine learning applications and, based on these properties, we present the design and implementation of systems that address the above challenges. We first identify how choosing the appropriate hardware affects application performance, and describe Ernest, an efficient performance prediction scheme that uses experiment design to minimize the cost and time taken to build performance models. We then design scheduling mechanisms that improve performance through two approaches: first, by improving data access time using data-aware scheduling that accounts for locality; and second, by using scalable scheduling techniques that reduce coordination overheads.
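To make the performance-modeling idea concrete, the following is a minimal sketch, not Ernest's actual implementation: it assumes a simple scaling model of the form t(m) ≈ a + b/m (a fixed serial cost plus a perfectly parallelizable component), fits the two coefficients by ordinary least squares over a few small training runs, and then extrapolates the runtime at a larger cluster size. The function names and the choice of model terms here are illustrative assumptions; Ernest itself uses a richer set of features and selects which training runs to perform via optimal experiment design.

```python
# Hypothetical sketch of performance-model fitting, NOT Ernest's real code.
# Model assumption: runtime t(m) = a + b/m for m machines, fit by
# least squares (normal equations for the 2x2 system solved by hand).

def fit_scaling_model(samples):
    """samples: list of (machines, runtime) pairs from small training runs."""
    # Design-matrix columns are [1, 1/m]; accumulate normal-equation sums.
    s11 = float(len(samples))
    s12 = sum(1.0 / m for m, _ in samples)
    s22 = sum((1.0 / m) ** 2 for m, _ in samples)
    y1 = sum(t for _, t in samples)
    y2 = sum(t / m for m, t in samples)
    det = s11 * s22 - s12 * s12
    a = (y1 * s22 - y2 * s12) / det  # fixed (serial) cost
    b = (s11 * y2 - s12 * y1) / det  # parallelizable cost
    return a, b

def predict_runtime(a, b, machines):
    """Extrapolate the fitted model to a larger cluster size."""
    return a + b / machines

# Training runs at small scales (synthetic data following t = 2 + 100/m):
runs = [(4, 27.0), (8, 14.5), (16, 8.25)]
a, b = fit_scaling_model(runs)
print(predict_runtime(a, b, 32))  # predicted runtime on 32 machines
```

The point of the sketch is that a handful of cheap runs at small scale suffice to choose a cluster size; experiment design, as in Ernest, further reduces how many such runs are needed and which configurations to measure.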