Monday 31 March 2025
The quest to predict how difficult a reading comprehension test question will be has long been a challenge for educators and researchers. Now, a team of scientists has developed a new method that uses a combination of linguistic features and artificial intelligence language models to accurately forecast item difficulty.
For years, teachers have relied on their own judgment to determine which questions are most suitable for their students. But this approach is time-consuming and prone to bias. A more efficient solution is needed, especially in today’s era of standardized testing and data-driven instruction.
The researchers used a dataset of reading passages and student responses from US standardized tests in grades 3-8 to develop their model. They extracted over 30 linguistic features from the text, including measures of sentence complexity, word frequency, and cohesion. These features were then combined with embeddings generated by three different language models: BERT, Llama, and ModernBERT.
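The general recipe is to turn each passage into a single numeric vector: hand-crafted linguistic features concatenated with a language-model embedding, which is then fed to a difficulty regressor. A minimal sketch of that idea is below; the specific features (mean sentence length, type-token ratio, adjacent-sentence word overlap) and the zero-vector stand-in for a BERT-style embedding are illustrative assumptions, not the authors' actual feature set or pipeline.

```python
import numpy as np

def linguistic_features(passage: str) -> np.ndarray:
    """Toy stand-ins for the kinds of features the study describes:
    sentence complexity, word frequency, and cohesion."""
    sentences = [s for s in passage.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    words = passage.lower().split()
    mean_sentence_len = len(words) / max(len(sentences), 1)  # crude sentence complexity
    type_token_ratio = len(set(words)) / max(len(words), 1)  # crude word-frequency proxy
    # Crude cohesion proxy: word overlap between adjacent sentences.
    overlaps = []
    for a, b in zip(sentences, sentences[1:]):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        overlaps.append(len(wa & wb) / max(len(wa | wb), 1))
    cohesion = sum(overlaps) / max(len(overlaps), 1)
    return np.array([mean_sentence_len, type_token_ratio, cohesion])

def combine(features: np.ndarray, embedding: np.ndarray) -> np.ndarray:
    """Concatenate hand-crafted features with a model embedding,
    producing one input vector for a downstream difficulty regressor."""
    return np.concatenate([features, embedding])

passage = "The fox ran. The fox hid. A dog barked."
fake_embedding = np.zeros(8)  # stand-in for a real sentence embedding
x = combine(linguistic_features(passage), fake_embedding)
print(x.shape)  # (11,)
```

In practice the embedding would come from one of the three language models and the combined vector would train a regression model against observed item difficulties.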
The team found that using these linguistic features and AI embeddings together substantially improved the accuracy of their model. They were able to predict item difficulty with a root mean square error (RMSE) of 0.52, compared with 0.92 for a baseline model that used only item-level metadata; a lower RMSE means smaller prediction errors.
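RMSE itself is simple to compute: square the gap between each predicted and observed difficulty, average, and take the square root so the score lands back on the difficulty scale. A small sketch, using made-up difficulty values rather than anything from the study:

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical difficulty values, purely to illustrate the metric.
actual = [0.2, 0.5, 0.8, 0.4]
predicted = [0.3, 0.4, 0.9, 0.4]
print(round(rmse(predicted, actual), 3))  # 0.087
```

On this scale, dropping from an RMSE of 0.92 to 0.52 means the typical prediction error shrinks by over 40 percent.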
But what does this mean in practical terms? For educators, it means being able to identify which questions are most likely to challenge their students and adjust their instruction accordingly. It could also help teachers develop more targeted interventions for struggling students.
The researchers also explored the robustness of their model by testing its performance with different text input formats and vertical scales. They found that while the model was sensitive to some changes, it remained relatively stable overall.
One potential limitation of the study is its reliance on a single dataset. To generalize these findings more widely, future research will need to validate the model using additional datasets from different contexts.
Despite this limitation, the results are promising and suggest that machine learning can be a powerful tool in the quest for more effective teaching and assessment. As education continues to evolve in response to advances in technology and our understanding of how students learn, it’s likely that we’ll see even more innovative applications of AI in the classroom.
Cite this article: “AI-Powered Model Accurately Predicts Reading Comprehension Test Difficulty”, The Science Archive, 2025.
Reading Comprehension, Test Question Difficulty, Linguistic Features, Artificial Intelligence, Language Models, Standardized Testing, Data-Driven Instruction, Sentence Complexity, Word Frequency, Cohesion, Machine Learning, Education, Teaching, Assessment