Tracking Open Science: A New Monitoring System Harnesses AI and Machine Learning

Sunday 02 March 2025


Researchers have built a monitoring system that tracks how open science policies are progressing across institutions. The tool combines artificial intelligence, data analysis, and machine learning to measure how openly research outputs are shared.


At its core, the system analyzes a large corpus of publications to identify three kinds of research output: data, software, and code. It then assesses how open and accessible those outputs are, giving a picture of an institution’s commitment to open science.


One of the most impressive aspects of this system is its ability to detect mentions of datasets, software, and code within publications. This is achieved through advanced natural language processing techniques that can accurately identify and categorize these mentions. The results are then used to calculate document-level indicators, such as the percentage of publications that mention sharing data or software.
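To make the idea of a document-level indicator concrete, here is a minimal Python sketch. It assumes the mention-detection step has already run and produced one record per mention. The `Mention` record, its fields, and the `share_rate` function are hypothetical names chosen for illustration; they are not taken from the French monitor's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One detected mention (hypothetical record layout, for illustration only)."""
    doc_id: str   # identifier of the publication containing the mention
    kind: str     # "dataset", "software", or "code"
    shared: bool  # True if the text indicates the output was shared


def share_rate(doc_ids, mentions, kind):
    """Percentage of publications with at least one shared mention of `kind`."""
    doc_ids = set(doc_ids)
    if not doc_ids:
        return 0.0
    # Count each publication once, however many mentions it contains.
    sharing_docs = {
        m.doc_id
        for m in mentions
        if m.kind == kind and m.shared and m.doc_id in doc_ids
    }
    return 100.0 * len(sharing_docs) / len(doc_ids)


# Toy data: four publications and the mentions detected in them.
docs = ["p1", "p2", "p3", "p4"]
mentions = [
    Mention("p1", "dataset", shared=True),
    Mention("p1", "dataset", shared=True),   # second mention in the same paper
    Mention("p2", "dataset", shared=False),  # dataset used but not shared
    Mention("p3", "software", shared=True),
]

print(share_rate(docs, mentions, "dataset"))   # 25.0
print(share_rate(docs, mentions, "software"))  # 25.0
```

The indicator is computed per publication rather than per mention, so a paper that mentions the same shared dataset several times still counts only once.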


The implications of this technology are significant. For instance, it could help funders and institutions make more informed decisions about how to allocate resources for open science initiatives. By providing a quantitative and objective measure of an institution’s progress towards open science goals, the system can facilitate better communication and collaboration between stakeholders.


Another benefit is that the system can help researchers see where they fall short on openness and accessibility. By analyzing their own publication record, scientists can pinpoint specific gaps or bottlenecks in the production and sharing of research outputs.


The system has already been deployed at the national level in France, with plans to extend it to other countries and institutions. It could help build a more transparent and accountable scientific community that prioritizes openness and collaboration.


One of the key challenges facing the development of this technology is the complexity of the data itself. With millions of publications to analyze, the system must be able to accurately identify and categorize mentions of datasets, software, and code within these documents. This requires advanced machine learning algorithms and natural language processing techniques that can handle the sheer scale and diversity of the data.


Despite these challenges, the researchers behind this technology are optimistic about its potential for impact. As they continue to refine and improve the system, it’s likely that we’ll see even more innovative applications of AI and machine learning in the pursuit of open science.


Cite this article: “Tracking Open Science: A New Monitoring System Harnesses AI and Machine Learning”, The Science Archive, 2025.


Artificial Intelligence, Data Analysis, Machine Learning, Open Science, Scientific Community, Research Output, Natural Language Processing, Publication Record, Accountability, Transparency.


Reference: Laetitia Bracco, Eric Jeangirard, Anne L’Hôte, Laurent Romary, “How to build an Open Science Monitor based on publications? A French perspective” (2025).

