We propose a method for describing the asymptotic behavior of programs in practice by measuring their empirical computational complexity. Our method involves running a program on workloads spanning several orders of magnitude in size, measuring its performance on each, and fitting these observations to a model that predicts performance as a function of workload size. Comparing these models to the programmer's expectations or to theoretical asymptotic bounds can reveal performance bugs or confirm that a program's performance scales as expected.
We develop our methodology for constructing these models of empirical complexity as we describe and evaluate two techniques. Our first technique, BB-TrendProf, constructs models that predict how many times each basic block runs as a linear (y = a + b*x) or a power-law (y = a*x^b) function of user-specified features of the program's workloads. To present output succinctly and focus attention on scalability-critical code, BB-TrendProf groups and ranks program locations based on these models. We demonstrate BB-TrendProf's power relative to existing tools by running it on several large programs and reporting cases where its models show (1) an implementation of a complex algorithm scaling as expected, (2) two complex algorithms beating their worst-case theoretical complexity bounds when run on realistic inputs, and (3) a performance bug.
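The model-fitting step described above can be sketched as follows. This is a minimal illustration, not BB-TrendProf's actual implementation: the function names and the synthetic n log n block counts are assumptions made for the example, and the power-law fit is done the standard way, as a linear least-squares fit in log-log space.

```python
# Hedged sketch: fit per-workload execution counts to linear and
# power-law models, in the spirit of BB-TrendProf. Names and data
# are illustrative, not the tool's actual API or output.
import math

def fit_linear(xs, ys):
    # Ordinary least squares for y = a + b*x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def fit_powerlaw(xs, ys):
    # y = a * x^b becomes log y = log a + b * log x,
    # so fit a line through the log-transformed points.
    la, b = fit_linear([math.log(x) for x in xs],
                       [math.log(y) for y in ys])
    return math.exp(la), b

# Workloads spanning several orders of magnitude in size, with a
# synthetic basic-block count that grows as n log n (e.g. a sort).
sizes = [10 ** k for k in range(1, 6)]
counts = [n * math.log2(n) for n in sizes]

a, b = fit_powerlaw(sizes, counts)
print(f"power-law fit: y = {a:.2f} * x^{b:.2f}")
```

On these inputs the fitted exponent comes out a little above 1, illustrating how a power-law model can summarize near-linearithmic growth and how the exponent can then be compared against the programmer's expectation.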
Our second technique, CF-TrendProf, models the performance of loops and functions both per-function-invocation and per-workload. It improves upon the precision of BB-TrendProf's models by using control flow to generate candidates from a richer family of models and a novel model selection criterion to choose among these candidates. We show that CF-TrendProf's improvements to model generation and selection allow it to correctly characterize or closely approximate the empirical scalability of several well-known algorithms and data structures, and to diagnose several synthetic, but realistic, scalability problems without observing an egregiously expensive workload. We also show that CF-TrendProf handles multiple workload features better than BB-TrendProf. We qualitatively compare the output of BB-TrendProf and CF-TrendProf and discuss their relative strengths and weaknesses.