Avoiding Communication in Computing Krylov Subspaces

Demmel, James; Yelick, Katherine A.; Hoemmen, Mark Frederick; Mohiyuddin, Marghoob; EECS Department, University of California

PDF

Description

Our goal is to minimize the communication costs of Krylov Subspace Methods (KSMs) to solve either Ax = b or Ax = lambda x, when A is a large sparse matrix. By communication costs we mean both bandwidth and latency costs, either between processors on a parallel computer, or between levels of a memory hierarchy on a sequential computer. As the cost of communication is growing exponentially relative to computation on modern computers, reducing communication is becoming ever more important. It is possible to reduce communication costs to near their theoretical minima: On a parallel computer this means latency costs are independent of the dimension k of the Krylov subspace, as opposed to growing proportionally to k for a conventional implementation. On a sequential computer, this mean latency and bandwidth costs are independent of k. Achieving this speedup requires a new algorithmic formulation of KSMs.

In this paper we present just part of this new formulation, namely computing a basis of the Krylov subspace spanned by [x,Ax, A^2x, ... A^kx]; other papers will present the rest of the KSM formulation. We present theory, performance models, and computational results: Our parallel performance models of 2D and 3D model problems predict speedups over a conventional algorithm of up to 15x on a model Petaflop machine, and up to 22x on a model of the Grid. Our sequential performance models of the same model problems predict speedups over a conventional algorithm of up to 10x on an out-of-core implementation, and up to 2.5x on Intel Clovertown, where we use our ideas to reduce off-chip latency and bandwidth to DRAM. Finally, we measured a speedup of over 3x on an actual out-of-core implementation.

We also consider the related kernel [Ax,MAx,AMAx,...,A(MA)^{k-1}x,(MA)^kx], which arises in preconditioned KSMs. Under certain mathematical conditions on A and the preconditioner M, we show how to avoid latency and bandwidth for this kernel as well.

Details

Title

Avoiding Communication in Computing Krylov Subspaces

Creator

Demmel, James, Author
Yelick, Katherine A., Author
Hoemmen, Mark Frederick, Author
Mohiyuddin, Marghoob, Author
EECS Department, University of California, Publisher

Published

2007-10-09

Full Collection Name

Electrical Engineering & Computer Sciences Technical Reports

Other Identifiers

EECS-2007-123

Type

Text

Format

technical reports

Extent

95 p

Archive

The Engineering Library

Usage Statement

Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).

Collection

EECS Technical Reports

Files

Statistics

Download Full History

Download

Formats

Format
BibTeX
MARCXML
TextMARC
MARC
DublinCore
EndNote
NLM
RefWorks
RIS

Add to Basket