Wednesday 19 March 2025
A new study has shed light on the often-toxic world of peer review, where scientists scrutinize each other’s work in a quest for excellence. The research reveals that detecting toxicity in these reviews is a challenging task, even for sophisticated language models.
Peer review is an essential part of the scientific process, allowing experts to provide feedback and improve the quality of published research. However, toxic behavior – such as personal attacks, unconstructive criticism, or excessive negativity – can undermine this process, causing emotional distress and stifling innovation.
To tackle this issue, researchers have developed a dataset of peer reviews from six conferences, annotated with labels indicating whether each sentence is toxic or not. The dataset is the first of its kind, providing a valuable resource for developing algorithms that can detect toxicity in these reviews.
The study found that many of the language models tested, including the kind of large language models that power chatbots and virtual assistants, struggled to align with human judgments about which sentences were toxic. This suggests that what counts as toxic in peer review differs from what counts as toxic in the other domains on which these models were trained.
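In contrast, closed-source models like GPT-3.5 and GPT-4 showed better alignment with human judgment, reaching a Cohen’s Kappa score of 0.56. Cohen’s Kappa measures agreement between the model’s predictions and the human annotations after correcting for the agreement expected by chance: a score of 1 means perfect agreement, and a score of 0 means agreement no better than chance. The result suggests that these models may be more effective at detecting toxicity in peer review, although a score of 0.56 still leaves a clear gap with human annotators.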
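To make the Cohen’s Kappa figure more concrete, here is a minimal Python sketch of how such a score can be computed for sentence-level toxic / not-toxic labels. It is an illustration only, not the researchers’ evaluation code: the function name cohens_kappa and the two label lists are hypothetical, and the example labels are made up for the demonstration rather than taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("need at least one labeled item")

    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each labeled at random according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for ten review sentences (1 = toxic, 0 = not toxic).
human = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
model = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

kappa = cohens_kappa(human, model)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.52"
```

In this made-up example the two raters agree on 8 of 10 sentences, but because most sentences are non-toxic, much of that agreement would be expected by chance (0.58), so the score drops to about 0.52. This is why a raw agreement rate can look far better than the corresponding Kappa score.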
The researchers also experimented with rewriting toxic sentences to make them more constructive and friendly. They found that GPT-3.5 was able to revise some sentences, but struggled with others that required a deeper understanding of the research being reviewed.
The study highlights the importance of developing algorithms that can detect toxicity in peer review, and of supporting reviewers who struggle to give constructive feedback. By improving the quality and tone of peer reviews, scientists can foster a more positive and productive environment for collaboration and innovation.
The researchers’ dataset could also inform future studies and the development of new tools for detecting toxicity in peer review. As the scientific community continues to grapple with issues of diversity, equity, and inclusion, understanding and addressing toxic behavior in peer review will be crucial for promoting a culture of respect and excellence.
Cite this article: “Detecting Toxicity in Peer Review: A Challenging Task for AI Models”, The Science Archive, 2025.
Peer Review, Toxicity, Language Models, Scientific Research, Dataset, Annotation, GPT-3.5, GPT-4, Cohen’s Kappa Score, Constructive Feedback







