Exploring the Limits of Small Language Models

Lee, Nicholas; Keutzer, Kurt; Anumanchipalli, Gopala Krishna

Description

With the proliferation of Large Language Models (LLMs), running LLMs locally at the edge is becoming increasingly feasible. However, there has been comparatively little work on smaller language models that could solve tasks where running a full LLM at scale would be inefficient. In this paper, we explore Small Language Models (SLMs) and how to make them more efficient at the edge without sacrificing performance. Pruning or simplifying SLMs can significantly degrade downstream performance. To mitigate this, we investigate two avenues: weight reparameterization and knowledge distillation. We study the structure of the feed-forward network (FFN) module in the transformer architecture in order to improve inference speed on tasks with short sequence lengths. We also investigate whether LLMs can be distilled into significantly smaller SLMs, in order to take advantage of the many pretrained models available to the public. We find that when simplifying the FFN module, weight reparameterization at training time helps the model converge and improves downstream accuracy. We also find that knowledge distillation may not be a surefire way to improve downstream performance, as the gap in model capacity between LLMs and SLMs may be difficult to overcome.

Details

Title

Exploring the Limits of Small Language Models

Creator

Lee, Nicholas, Author
Keutzer, Kurt, Author
Anumanchipalli, Gopala Krishna, Author

Published

EECS Department, University of California at Berkeley, Berkeley, California, 05/12/23

Full Collection Name

Electrical Engineering & Computer Sciences Technical Reports

Other Identifiers

EECS-2023-141

Type

Text

Format

technical reports

Extent

25 p

Language

eng

Usage Statement

Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).

Collection

EECS Technical Reports
