Tuesday 11 March 2025
A treasure trove of mathematical knowledge has been unlocked, allowing researchers to easily access and reuse data on small phylogenetic trees. This breakthrough is the result of a collaborative effort between mathematicians, computer scientists, and data stewards who have transformed an early-2000s database into a modern, sustainable repository.
The Small Phylogenetic Trees database was originally compiled by a team of researchers in algebraic statistics to provide tables of algebraic invariants that characterize geometric properties of phylogenetic trees. These tables were supplemented with code in Maple and Singular, along with text files of computational output and explanations of terminology and notation.
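To make the idea of an algebraic invariant concrete, here is a minimal sketch in Python. It is not taken from the database or its Maple and Singular code; it assumes a two-state general Markov model on a four-leaf tree that separates leaves 1 and 2 from leaves 3 and 4, and all function names are hypothetical. It illustrates one well-known family of invariants: when the joint distribution of the leaves is arranged into a 4x4 "flattening" matrix along the tree's true split, every 3x3 minor vanishes, while the same is not true for a split the tree does not have.

```python
import itertools
import random


def transition(a, b):
    """2x2 row-stochastic matrix with off-diagonal entries a and b."""
    return [[1 - a, a], [b, 1 - b]]


def quartet_distribution(pi, m1, m2, me, m3, m4):
    """Joint leaf distribution for the quartet tree 12|34.

    pi is the distribution at the inner node next to leaves 1 and 2,
    me is the transition matrix along the inner edge, and m1..m4 are
    the transition matrices along the four leaf edges.
    """
    p = {}
    for i, j, k, l in itertools.product(range(2), repeat=4):
        p[(i, j, k, l)] = sum(
            pi[x] * m1[x][i] * m2[x][j] * me[x][y] * m3[y][k] * m4[y][l]
            for x in range(2)
            for y in range(2)
        )
    return p


def flattening(p, split):
    """4x4 matrix: rows indexed by the states of the leaves in split[0],
    columns by the states of the leaves in split[1] (leaves are 0..3)."""
    row_leaves, col_leaves = split
    states = list(itertools.product(range(2), repeat=2))
    matrix = []
    for r in states:
        row = []
        for c in states:
            index = [0] * 4
            index[row_leaves[0]], index[row_leaves[1]] = r
            index[col_leaves[0]], index[col_leaves[1]] = c
            row.append(p[tuple(index)])
        matrix.append(row)
    return matrix


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def max_abs_3x3_minor(matrix):
    """Largest absolute value among all 3x3 minors of a 4x4 matrix."""
    best = 0.0
    for rows in itertools.combinations(range(4), 3):
        for cols in itertools.combinations(range(4), 3):
            sub = [[matrix[r][c] for c in cols] for r in rows]
            best = max(best, abs(det3(sub)))
    return best


# Illustrative parameters only; they are not data from the database.
rng = random.Random(0)


def random_transition():
    return transition(rng.uniform(0.05, 0.3), rng.uniform(0.05, 0.3))


pi = [0.4, 0.6]
p = quartet_distribution(pi, *(random_transition() for _ in range(5)))

true_split = max_abs_3x3_minor(flattening(p, ((0, 1), (2, 3))))
wrong_split = max_abs_3x3_minor(flattening(p, ((0, 2), (1, 3))))

print("largest 3x3 minor, split 12|34:", true_split)   # zero up to rounding
print("largest 3x3 minor, split 13|24:", wrong_split)  # clearly nonzero
```

Each 3x3 minor is a polynomial in the leaf probabilities, so "all minors vanish" is the kind of statement an invariant table records. The database itself covers many more trees and models than this toy case.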
However, as the internet evolved and new technologies emerged, the database became outdated and difficult to navigate. The team behind Small Phylogenetic Trees realized that it was time to update their creation to ensure its longevity and usability for future generations of researchers.
The solution was a threefold strategy: creating a software package that lets users reproduce results from the database; setting up a user-friendly new website with cross-links to theoretical publications, code snippets, and serialized output of computations; and documenting every step of the process so that the lessons learned can be applied to similar projects.
The resulting Algebraic-Phylogenetics database is now available online at algebraicphylogenetics.org. It features a single large table that gathers the algebraic knowledge about small phylogenetic trees, along with links to computational content and theory. The website also includes extensive documentation to help researchers understand the full scope of the data.
One of the key innovations is the use of the MaRDI (Mathematical Research Data Initiative) file format to store the data in support of the FAIR principles: findable, accessible, interoperable, and reusable. This makes it easier for researchers to integrate the database into their own work, whether they are studying phylogenetics, algebraic statistics, or computer science.
The team behind Algebraic-Phylogenetics has also made a concerted effort to ensure the sustainability of the database. They have set up a collaborative software project, with a main maintainer responsible for updates and maintenance. The website is hosted by a trustworthy institution, and the data is backed up on Zenodo, a digital repository designed for long-term preservation.
The implications of this work are far-reaching. Researchers in phylogenetics can now easily access and reuse data to accelerate their own research.
Cite this article: “Unlocking the Power of Small Phylogenetic Trees: A New Era of Data Accessibility and Reusability”, The Science Archive, 2025.
Mathematics, Phylogenetics, Algebraic Statistics, Computational Biology, Data Repository, Database Management, Research Collaboration, Sustainability, FAIR Principles, Mathematical Research







