
Description

We studied the training dynamics and internal interpretability of transformer models by formulating an algorithmically generated in-context learning task and training small models that generalize to it with 100% test accuracy. We found clear phase-change behavior indicative of emergent abilities, attention patterns that are invariant across different one-attention-head models, and evidence that the probability of convergence is determined early in training. We also identify promising directions for future work on transformer models, both for small models and for generalizing these findings to larger models.
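As a rough illustration of the kind of algorithmically generated in-context learning task described above, the sketch below builds a synthetic key-value retrieval example: the model sees interleaved key-value pairs followed by a query key and must emit the matching value. The specific task, the function make_icl_example, and its parameters (num_pairs, vocab_size) are illustrative assumptions, not taken from the report.

import random

def make_icl_example(num_pairs=4, vocab_size=16, seed=None):
    """One hypothetical in-context key-value retrieval example.

    Sequence layout: k1 v1 k2 v2 ... kn vn q, where q repeats one of the
    keys; the target is the value that was paired with q in the context.
    """
    rng = random.Random(seed)
    # Keys and values come from disjoint halves of the vocabulary so the
    # query token is unambiguous.
    keys = rng.sample(range(vocab_size // 2), num_pairs)
    values = [rng.randrange(vocab_size // 2, vocab_size) for _ in keys]
    query_idx = rng.randrange(num_pairs)
    tokens = [t for pair in zip(keys, values) for t in pair]  # interleave k, v
    tokens.append(keys[query_idx])                            # append query key
    target = values[query_idx]
    return tokens, target

if __name__ == "__main__":
    tokens, target = make_icl_example(seed=0)
    print("context + query:", tokens)
    print("expected output:", target)

A dataset of such sequences can be generated on the fly, which is one way a small transformer can be trained and then evaluated on held-out examples of the same task.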
