Description
To help achieve these goals, we first derive a Fundamental Theorem of Linear Neural Networks, analogous to Gilbert Strang's Fundamental Theorem of Linear Algebra. We show how to decompose each layer of a linear neural network into a set of subspaces that reveal how information flows through the network---in particular, which information is annihilated at which layer, and which subspaces currently carry no information but might become available to carry information as training modifies the network weights. We summarize the properties of these information flows in "basis flow diagrams," which exhibit a rich and occasionally surprising structure. Each stratum of the fiber (the set of weight assignments for which the network computes the same end-to-end linear transformation) represents a different pattern by which information flows, or fails to flow, through the network.
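To make the per-layer decomposition concrete, here is a minimal sketch (not the construction in the theorem itself) that uses the singular value decomposition of each layer's weight matrix to extract its four fundamental subspaces, in Strang's sense, and to trace how much of a signal is annihilated at each layer. The three-layer network and its widths are hypothetical, chosen only so that the narrow middle layer must discard information.

```python
# Sketch: fundamental subspaces of each layer of a linear network via the SVD.
# The layer shapes below are hypothetical; this is illustrative, not the
# paper's specific decomposition.
import numpy as np

rng = np.random.default_rng(0)

def fundamental_subspaces(W, tol=1e-10):
    """Return orthonormal bases for the row space, null space,
    column space, and left null space of W."""
    U, s, Vt = np.linalg.svd(W)
    r = int(np.sum(s > tol))           # numerical rank
    return {
        "row_space":       Vt[:r].T,   # input directions W preserves
        "null_space":      Vt[r:].T,   # input directions W annihilates
        "column_space":    U[:, :r],   # output directions W can reach
        "left_null_space": U[:, r:],   # output directions W never reaches
    }

# Hypothetical widths 6 -> 4 -> 2 -> 5: the narrow middle layer must lose information.
layers = [rng.standard_normal((4, 6)),
          rng.standard_normal((2, 4)),
          rng.standard_normal((5, 2))]

x = rng.standard_normal(6)
for i, W in enumerate(layers, start=1):
    sub = fundamental_subspaces(W)
    lost = sub["null_space"].T @ x     # component of x annihilated by this layer
    print(f"layer {i}: rank {sub['row_space'].shape[1]}, "
          f"norm of annihilated component = {np.linalg.norm(lost):.3f}")
    x = W @ x
```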
We use this knowledge to find transformations in weight space, called moves, that modify the neural network's weights without changing the linear transformation that the network computes. Some moves stay on the same stratum of the fiber; others jump from one stratum to another. In this way, we can visit different weight assignments for which the neural network computes the same transformation. These moves help us construct a useful basis for the weight space and a useful basis for each space tangent to a stratum.
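As an illustration of the idea (not necessarily one of the moves defined in this work), one well-known family of weight-space transformations that preserves the end-to-end map of a linear network inserts an invertible matrix and its inverse between two consecutive layers, as sketched below.

```python
# Sketch of a fiber-preserving change of weights: insert M and M^{-1} between
# consecutive layers so the end-to-end map W2 @ W1 is unchanged.
# W1, W2, and M are hypothetical; this illustrates the idea of a "move",
# not the specific moves constructed in this work.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((3, 5))   # hypothetical first-layer weights
W2 = rng.standard_normal((4, 3))   # hypothetical second-layer weights

M = rng.standard_normal((3, 3))    # invertible (with probability 1) on the hidden layer
W1_new = M @ W1
W2_new = W2 @ np.linalg.inv(M)

# The weights changed, but the network still computes the same linear map.
assert np.allclose(W2_new @ W1_new, W2 @ W1)
```

Because such a change of basis on a hidden layer is invertible, it leaves each layer's rank (and hence the pattern of information flow) intact, so one would expect it to stay within a single stratum; the moves that jump between strata are presumably those that alter which subspaces carry information.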