Many questions in artificial intelligence have been answered with data-dependent solutions built on deep neural networks (DNNs) as data processors. Given a suitable combination of settings, dataset, and architecture, deep networks loosely mimic the human brain through layers of artificial neurons, performing a variety of tasks in computer vision (CV) and natural language processing (NLP). They serve as tools in data analytics applications including self-driving vehicles, language translation services, medical diagnosis, stock market trading signals, and more. It is natural to assume that a network’s representational power must scale in complexity with the tasks or datasets it processes. In practice, however, increasing the amount of data or the number of layers and parameters is not always the answer. In resource-constrained settings, training deep networks for an extended period of time is not only intractable but also undesirable, and redundancies in the network architecture can degrade test-time performance.
This motivates a more comprehensive view of the inner workings of a deep neural network, taking a deep dive into each of its components. A common approach is to examine its weights directly, but this risks missing information carried by the network’s structure. As a middle ground, we analyze structural characteristics arising from layerwise spectral distributions in order to explain network performance and inform training procedures. We find that (1) allocating the learning rate across layers based on measurements of their spectral distributions yields larger improvements on “vanilla” architectures such as VGG19, i.e. networks without built-in interactions among layers; and (2) using the same measurements to inform channel pruning on DenseNet40 lets the model implicitly identify its “bottleneck” layers and maintain higher accuracy.
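The idea of measuring each layer’s spectral distribution and using it to allocate per-layer learning rates can be sketched as follows. The specific metric and allocation rule below (stable rank of the weight matrix, with learning rates scaled around a base rate) are illustrative assumptions for the sketch, not the paper’s exact method:

```python
import numpy as np

def layer_spectral_metric(weight):
    """One simple summary of a layer's spectral distribution: the
    'stable rank' ||W||_F^2 / ||W||_2^2 of its weight matrix.
    A spectrum dominated by a few large singular values gives a
    small stable rank; a flat spectrum gives a large one."""
    svals = np.linalg.svd(weight, compute_uv=False)  # sorted descending
    return (svals ** 2).sum() / (svals[0] ** 2)

def allocate_learning_rates(weights, base_lr=0.1):
    """Hypothetical allocation rule: scale each layer's learning rate by
    its spectral metric relative to the mean across layers, so layers
    with more concentrated spectra receive smaller steps."""
    metrics = np.array([layer_spectral_metric(w) for w in weights])
    return base_lr * metrics / metrics.mean()

rng = np.random.default_rng(0)
# Two toy "layers": one near-isotropic, one dominated by a rank-1 direction.
w1 = rng.standard_normal((64, 64))
w2 = rng.standard_normal((64, 64)) + 10.0 * np.outer(
    rng.standard_normal(64), rng.standard_normal(64)
)
lrs = allocate_learning_rates([w1, w2])
# The rank-1-dominated layer has a small stable rank, so it gets a
# smaller learning rate than the isotropic layer under this rule.
```

In a training loop these per-layer rates would be passed to the optimizer (e.g. as per-parameter-group learning rates), and the same spectral metrics could rank channels or layers as pruning candidates.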