Factoring Matrices into Linear Neural Networks

Bhattacharya, Sagnik; Shewchuk, Jonathan

Description

We characterize the topology and geometry of the set of all weight vectors for which a linear neural network computes the same linear transformation W. This set of weight assignments is called the fiber of W, and it is embedded in a Euclidean weight space of all possible weight vectors. The fiber is an algebraic variety with singular points, so it is not a manifold. We show a way to stratify the fiber, that is, to partition the algebraic variety into a finite set of manifolds of varying dimensions called strata. We derive the dimensions of these strata and the relationships by which they adjoin each other. (Although they are disjoint, some strata lie in the closures of other, higher-dimensional strata.) Each stratum is smoothly embedded in weight space, so it has a well-defined tangent space (which is a subspace of weight space) at every point. We show how to determine the subspace tangent to a specified stratum at a specified point on the stratum, and we construct an elegant basis for that subspace.
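As an illustration of the definitions above, the fiber can be written out explicitly. The notation here (L layers, layer widths d_j, the map \mu) is assumed for this sketch and is not taken from the report itself.

```latex
% Assumed notation: layer j has weight matrix W_j of size d_j x d_{j-1}.
\[
  \theta = (W_L, \dots, W_1), \qquad
  W_j \in \mathbb{R}^{d_j \times d_{j-1}}, \qquad
  \mu(\theta) = W_L W_{L-1} \cdots W_1 .
\]
% The fiber of W is the preimage of W under \mu.
\[
  \mu^{-1}(W) = \{\, \theta : \mu(\theta) = W \,\}.
\]
% Smallest singular example: two scalar layers and W = 0.
\[
  \mu^{-1}(0) = \{\, (w_2, w_1) \in \mathbb{R}^2 : w_2 w_1 = 0 \,\}.
\]
```

In the scalar example the fiber is the union of the two coordinate axes. It fails to be a manifold only at the origin, where the axes cross. One natural way to split it into manifolds is the origin (dimension 0) plus each axis with the origin removed (dimension 1); the origin lies in the closure of both one-dimensional pieces.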

To help achieve these goals, we first derive a Fundamental Theorem of Linear Neural Networks, analogous to Gilbert Strang's Fundamental Theorem of Linear Algebra. We show how to decompose each layer of a linear neural network into a set of subspaces that show how information flows through the neural network, in particular tracing which information is annihilated at which layers of the network and identifying subspaces that carry no information but might become available to carry information as training modifies the network weights. We summarize properties of these information flows in "basis flow diagrams" that reveal a rich and occasionally surprising structure. Each stratum of the fiber represents a different pattern by which information flows (or fails to flow) through the neural network.
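One simple way to see where information is annihilated is to compute the rank of every partial product W_j ... W_i of consecutive layers. The sketch below does only that; it is not the report's subspace decomposition, and the function name `interval_ranks` and the example matrices are hypothetical.

```python
import numpy as np


def interval_ranks(weights):
    """Return {(i, j): rank(W_j ... W_i)} for 1 <= i <= j <= L.

    weights[0] is W_1, the layer applied first (an assumed convention).
    """
    num_layers = len(weights)
    ranks = {}
    for i in range(num_layers):
        prod = weights[i]
        ranks[(i + 1, i + 1)] = int(np.linalg.matrix_rank(prod))
        for j in range(i + 1, num_layers):
            prod = weights[j] @ prod  # extend the product by one more layer
            ranks[(i + 1, j + 1)] = int(np.linalg.matrix_rank(prod))
    return ranks


# Hypothetical three-layer network with two units per layer.
W1 = np.array([[1.0, 0.0], [0.0, 1.0]])  # identity: passes everything on
W2 = np.array([[1.0, 0.0], [0.0, 0.0]])  # annihilates the second coordinate
W3 = np.array([[0.0, 1.0], [0.0, 0.0]])  # reads only the second coordinate

ranks = interval_ranks([W1, W2, W3])
for (i, j), r in sorted(ranks.items()):
    print(f"rank(W_{j} ... W_{i}) = {r}")
```

Here W3 has rank 1 on its own, yet rank(W_3 W_2) is 0: the one direction W3 can transmit is exactly the direction W2 has already annihilated, so that capacity currently carries no information.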

We use this knowledge to find transformations in weight space called moves that allow us to modify the neural network's weights without changing the linear transformation that the network computes. Some moves stay on the same stratum, and some move from one stratum to another stratum of the fiber. In this way, we can visit different weight assignments for which the neural network computes the same transformation. These moves help us to construct a useful basis for the weight space and a useful basis for each space tangent to a stratum.
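A minimal sketch of one familiar weight change that leaves the computed transformation unchanged: pick an invertible matrix M and replace a consecutive pair (W_j, W_{j+1}) with (M W_j, W_{j+1} M^{-1}). This is offered only as an illustration of the idea; the report's moves may be defined differently and include moves between strata, which this sketch does not attempt. The helper names `network_product` and `apply_move` are hypothetical.

```python
import numpy as np


def network_product(weights):
    """Return W_L ... W_1, where weights[0] is W_1 (applied first)."""
    prod = weights[0]
    for W in weights[1:]:
        prod = W @ prod
    return prod


def apply_move(weights, j, M):
    """Replace (W_j, W_{j+1}) with (M W_j, W_{j+1} M^{-1}); j is 1-based.

    M must be square, invertible, and sized to the width of hidden layer j.
    """
    moved = [W.copy() for W in weights]
    moved[j - 1] = M @ weights[j - 1]
    moved[j] = weights[j] @ np.linalg.inv(M)
    return moved


# Hypothetical two-layer network: 4 inputs, 3 hidden units, 2 outputs.
W1 = np.arange(12.0).reshape(3, 4)
W2 = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]])
weights = [W1, W2]

# An invertible change of basis on the hidden layer (upper triangular).
M = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]])

moved = apply_move(weights, 1, M)
before = network_product(weights)
after = network_product(moved)
print(np.allclose(before, after))
```

Because M is invertible, the ranks of all partial products are unchanged, so a move of this kind does not alter which information reaches which layer. Moving to a different pattern of information flow needs a different kind of weight change.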

Details

Title

Factoring Matrices into Linear Neural Networks

Creator

Bhattacharya, Sagnik, Author
Shewchuk, Jonathan, Author

Published

EECS Department, University of California at Berkeley, Berkeley, California, 5/13/2022

Full Collection Name

Electrical Engineering & Computer Sciences Technical Reports

Type

Text

Format

technical reports

Extent

57 p

Language

eng

Usage Statement

Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).

Collection

EECS Technical Reports
