Thursday 27 March 2025
Scientists have long struggled to accurately predict the intent behind citations in academic papers. Citations are essential for providing context and supporting claims, but they can also be used for other purposes, such as criticizing or acknowledging prior work. Understanding citation intent is crucial for various applications, including measuring scientific impact.
Recently, researchers have explored the potential of large language models (LLMs) to predict citation intent. These models are trained on vast amounts of text data and are designed to understand human language. In a new study, scientists tested the ability of LLMs to identify citation intent using in-context learning and fine-tuning.
The team used 12 prominent open-source LLMs, including LLaMA, Mistral Nemo, and Gemma, to evaluate their performance on two benchmark datasets: SciCite and ACL-ARC. The models were tested with different prompting methods, such as single-sentence prompts, multiple-choice questions, and inline examples.
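To make the idea of in-context learning concrete, here is a minimal sketch of how a few-shot prompt for citation intent might be assembled. It is an illustration only, not the study's actual prompt: the function names, the prompt wording, and the example sentences are hypothetical, and the three labels are assumed to follow the SciCite label set (background, method, result).

```python
# Assumed label set, based on the SciCite dataset; not taken from the article.
SCICITE_LABELS = ["background", "method", "result"]


def build_few_shot_prompt(examples, query, labels=SCICITE_LABELS):
    """Assemble a plain-text few-shot prompt from (sentence, label) pairs."""
    lines = [
        "Classify the intent of the citation in each sentence.",
        "Answer with one of: " + ", ".join(labels) + ".",
        "",
    ]
    for sentence, label in examples:
        if label not in labels:
            raise ValueError(f"unknown label: {label}")
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Intent: {label}")
        lines.append("")
    # The query is left unlabeled so the model completes the final line.
    lines.append(f"Sentence: {query}")
    lines.append("Intent:")
    return "\n".join(lines)


def parse_label(model_output, labels=SCICITE_LABELS):
    """Map free-form model text to a known label, or None if none matches."""
    text = model_output.strip().lower()
    for label in labels:
        if text.startswith(label):
            return label
    return None


# Hypothetical labeled examples and query, invented for illustration.
examples = [
    ("Prior work has shown that citations serve many roles [3].", "background"),
    ("We follow the training procedure described in [7].", "method"),
]
prompt = build_few_shot_prompt(examples, "Our scores are higher than those in [9].")
print(prompt)
```

The resulting string would be sent to whichever model is under test, and the model's reply passed through parse_label. A zero-shot prompt is the same construction with an empty list of examples.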
The results showed that the top-performing model, Qwen 2.5, achieved an F1-score of 78.33% on the SciCite dataset using a few-shot learning approach. The F1-score is the harmonic mean of precision and recall, so it reflects both how often the model's predicted intents were correct and how many of the true intents it recovered; it is related to, but not the same as, the share of citations labeled correctly. The team also found significant correlations between specific parameter settings and F1-scores, indicating that certain model configurations are more effective than others.
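The difference between F1 and plain accuracy is easiest to see on a toy example. The sketch below computes both for an invented set of ten citations in which a model always answers "background". It assumes the macro-averaged form of F1 (the unweighted mean of per-class F1), which is common for these benchmarks; the article does not state which averaging the study used, and the data here are made up.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, l) for l in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Share of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented data: 8 background, 1 method, 1 result; the model always says "background".
y_true = ["background"] * 8 + ["method", "result"]
y_pred = ["background"] * 10

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")  # accuracy: 0.800
print(f"macro-F1: {macro_f1(y_true, y_pred):.3f}")  # macro-F1: 0.296
```

Here the model is right on 80% of citations yet scores below 0.30 on macro-F1, because it never identifies the two rarer intents. This is why an F1-score should not be read as "correct X% of the time".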
The study’s findings have important implications for researchers seeking to understand citation intent. LLMs can be fine-tuned on minimal task-specific data to achieve high accuracy, making them a viable option for this challenging task. The results also highlight the potential of in-context learning and fine-tuning for adapting general-purpose language models to specific applications.
The team’s approach has several advantages over traditional methods, which rely on domain-specific pre-trained models or specialized architectures. General-purpose LLMs can be adapted to new tasks and datasets with little extra training, making them a flexible tool for researchers. Additionally, their ability to learn from context provides a more nuanced understanding of citation intent than simple keyword-based approaches.
While the study’s results are promising, there is still much work to be done. The team plans to further investigate the relationships between parameter settings and performance to better understand what drives the models’ accuracy. They also aim to explore whether their approach applies to other natural language processing tasks.
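Potential applications include more reliable measures of scientific impact, since a citation that criticizes prior work should not count the same way as one that builds on it.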
Cite this article: “Predicting Citation Intent with Large Language Models”, The Science Archive, 2025.
Language Models, Citation Intent, Natural Language Processing, Academic Papers, Scientific Impact, Fine-Tuning, In-Context Learning, Few-Shot Learning, F1-Score, Benchmark Datasets







