Revealing the Software Landscape: A Study on Reproducibility in Empirical Research

Tuesday 22 April 2025


The paper explores how researchers across disciplines share replication materials for empirical statistical analyses, and which software they use. The authors collected metadata from online sources, including journals and repositories, to see which types of files appeared most often in those materials.
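To make the idea concrete, here is a minimal sketch of how file metadata could be turned into software counts. It is not the authors’ pipeline: the extension-to-software mapping, the function names and the sample records are all assumptions made up for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to software (not taken from the paper).
# Some extensions are ambiguous in practice; ".m", for example, is not unique to MATLAB.
EXTENSION_TO_SOFTWARE = {
    ".r": "R",
    ".rmd": "R",
    ".do": "Stata",
    ".dta": "Stata",
    ".py": "Python",
    ".ipynb": "Python",
    ".m": "MATLAB",
    ".sas": "SAS",
}


def software_for_paper(filenames):
    """Return the set of software packages implied by one paper's file list."""
    found = set()
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in EXTENSION_TO_SOFTWARE:
            found.add(EXTENSION_TO_SOFTWARE[suffix])
    return found


def tally_software(papers):
    """Count papers per package; a paper counts at most once for each package."""
    counts = Counter()
    for filenames in papers.values():
        counts.update(software_for_paper(filenames))
    return counts


# Made-up records standing in for metadata gathered from journals and repositories.
papers = {
    "paper_a": ["analysis.R", "data.csv"],
    "paper_b": ["main.do", "panel.dta", "figures.R"],
    "paper_c": ["README.txt"],
}

counts = tally_software(papers)
for software, n in counts.most_common():
    print(f"{software}: {n} of {len(papers)} papers")
```

Counting each package once per paper means a single replication archive with many scripts does not inflate the totals, and a paper that mixes tools is counted under each of them, so the shares across tools can add up to more than 100%.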


The results are telling. R is the clear leader for statistical work in fields like economics and political science: about 60% of the papers reviewed used R for their analyses. Stata, by contrast, was more popular among economists and appeared in around 30% of the papers.


But what about Python? It is less widely used in these fields than you might expect: only around 10% of the papers reviewed used Python for their analyses. MATLAB and SAS are rarer still, appearing in only a few percent of papers.


So why is R so dominant? One possible reason is its versatility: it can be used for everything from data visualization to machine learning. It’s also free and open source, which makes it more accessible to researchers who might not have the budget for commercial software like Stata or MATLAB.


Another interesting finding is that many papers don’t provide replication materials at all. Around 40% of the papers reviewed didn’t include any code or data files that would allow other researchers to reproduce their results. That makes it difficult for others to verify the findings and build on them, which is a major problem in science.


The study also highlights the importance of reproducibility in research. When researchers share their code and data, they make it easier for others to verify their results, build on their work and make new discoveries. And that’s what science is all about.


Overall, this paper provides a useful glimpse into the world of statistical analysis and the software tools researchers rely on. Whether you’re an economist or a political scientist, R is likely to be a familiar sight, but it’s worth remembering that there are plenty of other options out there too.


Cite this article: “Revealing the Software Landscape: A Study on Reproducibility in Empirical Research”, The Science Archive, 2025.


Statistics, Research, R, Stata, Python, MATLAB, SAS, Reproducibility, Data Analysis, Empirical Methods


Reference: Elizabeth Upton, Xizhen Cai, Pamela Jakiela, Owen Ozier, Shyam Raman, “The Software Behind the Stats: A Student Exploration of Software Trends Across Disciplines” (2025).

