Sunday 02 February 2025
A new approach to training language models has been proposed, one that incorporates elements of human learning and cognition into the process. The researchers behind this effort, led by Xudong Hong, Sharid Loáiciga, and Asad Sayeed, have developed a system called Active Curriculum Language Modeling (ACLM), which uses a dynamic curriculum to guide the training process.
The goal of ACLM is to simulate how humans learn language by selecting the most informative examples for the model to focus on at each stage. This is achieved through a surprisal-based criterion: surprisal measures how unexpected a piece of text is to the current model (its negative log-probability), and training concentrates on the examples the model finds most surprising. The researchers found that this approach led to significant improvements in performance on certain tasks, particularly those requiring common sense and world knowledge.
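The surprisal criterion can be sketched in a few lines. In this illustrative example, the per-token probabilities are hand-picked stand-ins for a real model's outputs, and the choice to average surprisal over tokens is an assumption, not necessarily the paper's exact aggregation:

```python
import math

def surprisal(prob):
    """Surprisal of a token: negative log-probability under the model."""
    return -math.log(prob)

def score_example(token_probs):
    """Mean per-token surprisal of one training example, given the
    model-assigned probability of each token. Higher = more surprising."""
    return sum(surprisal(p) for p in token_probs) / len(token_probs)

# Toy stand-in for model scores: each example maps to its token probabilities.
examples = {
    "the cat sat on the mat": [0.9, 0.8, 0.7, 0.9, 0.95, 0.6],
    "colorless green ideas sleep": [0.2, 0.05, 0.1, 0.02],
}

# Rank examples by how surprising the model finds them.
ranked = sorted(examples, key=lambda s: score_example(examples[s]), reverse=True)
print(ranked[0])  # prints the example with the highest mean surprisal
```

Here the low-probability sentence ranks first, so it would be the one selected for the next round of training.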
The ACLM system consists of two main phases: initialization and iteration. During the initialization phase, a small subset of the training data is used to train an initial model, which is then fine-tuned on a larger dataset. In the iteration phase, the system repeatedly selects the most surprising examples from the remaining training data and adds them to the active training set.
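The two-phase structure described above can be sketched as a simple loop. Everything here is a hypothetical skeleton, assuming placeholder `train` and `surprisal_of` functions and illustrative batch sizes; a real system would score surprisal with the model's negative log-likelihood:

```python
import random

def train(model, batch):
    """Placeholder for one training update on a batch (hypothetical)."""
    model["seen"].extend(batch)
    return model

def surprisal_of(model, example):
    """Placeholder surprisal score: unseen examples count as 'surprising'.
    A real system would use the model's negative log-probability."""
    return 0.0 if example in model["seen"] else 1.0 + random.random()

def aclm_loop(pool, init_size=2, steps=3, k=2):
    random.seed(0)
    # Initialization phase: train an initial model on a small random subset.
    model = {"seen": []}
    active = random.sample(pool, init_size)
    model = train(model, active)
    # Iteration phase: repeatedly add the k most surprising remaining examples.
    for _ in range(steps):
        remaining = [x for x in pool if x not in active]
        if not remaining:
            break
        remaining.sort(key=lambda x: surprisal_of(model, x), reverse=True)
        new = remaining[:k]    # most surprising examples this round
        active.extend(new)     # grow the active training set
        model = train(model, new)
    return active

pool = [f"sentence-{i}" for i in range(10)]
print(len(aclm_loop(pool)))  # 2 initial + 3 rounds of 2 = 8
```

Because surprisal is recomputed against the current model each round, the curriculum the loop produces depends on what the model has already absorbed.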
The researchers tested ACLM on several benchmarks, including the BabyLM Challenge, a competition designed to evaluate language models’ ability to learn from small amounts of data. They found that their system outperformed the baseline models on certain tasks, such as world knowledge-based inferences, while underperforming on fine-grained grammatical inference tasks.
One of the key innovations of ACLM is that its curriculum adapts dynamically to the model’s uncertainty about what it has learned so far. Because surprisal is recomputed as the model evolves, the selected examples are always those most likely to challenge its existing knowledge and push it toward new concepts.
The ACLM system also has implications for our understanding of human language learning. By simulating how humans learn language through interaction with their environment, the system provides insights into the cognitive processes underlying language acquisition. For example, the researchers found that the system’s ability to adapt to uncertainty mirrors the way humans learn from their mistakes and adjust their expectations about what they will learn next.
Overall, the ACLM system represents a significant step forward in the development of language models that can learn from small amounts of data.
Cite this article: “Active Curriculum Language Modeling: A Dynamic Approach to Training Language Models”, The Science Archive, 2025.
Language Models, Active Curriculum Language Modeling, ACLM, Human Learning, Cognitive Processes, Surprisal-Based Criterion, Uncertainty, World Knowledge, BabyLM Challenge, Fine-Grained Grammatical Inference Tasks
Reference: Xudong Hong, Sharid Loáiciga, Asad Sayeed, “A surprisal oracle for when every layer counts” (2024).







