Transforming Categorical Data with Optimal Transport: A Breakthrough in Fairness and Decision-Making

Saturday 15 March 2025


The quest for fairness in decision-making has long been a contentious issue, particularly when it comes to algorithms that aim to predict outcomes based on complex data sets. One such algorithm, optimal transport, has emerged as a promising solution to this problem, but its application has been limited by the challenges of working with categorical data.


That was until now. A team of researchers has developed a new approach that transforms categorical variables into compositional data, allowing for the use of optimal transport to construct counterfactuals – hypothetical versions of reality that allow us to test what would have happened if certain conditions had been different.


The problem with traditional approaches is that they often rely on arbitrary assumptions about label ordering, which can lead to biased results. By transforming categorical variables into compositional data, the researchers were able to avoid this pitfall and create a more accurate representation of reality.


The team applied their method to two real-world datasets: the German Credit dataset, which contains information on loan applicants, and the Adult dataset, which contains demographic and employment data from the US Census. By using optimal transport to construct counterfactuals for categorical variables such as loan purpose and marital status, they were able to gain insights into how decisions would have changed under different conditions.


For example, in the German Credit dataset, the team found that if a woman had been given a loan with a certain purpose, she would likely have chosen a different category of loan. This suggests that women may be more likely to choose loans for specific purposes than men. Similarly, in the Adult dataset, the team discovered that if a person had been married or separated under different circumstances, their marital status might have changed.


The potential applications of this research are vast. By using optimal transport to construct counterfactuals, policymakers and businesses can gain valuable insights into how decisions would change under different conditions, allowing them to make more informed choices about policy and investment.


Moreover, the approach has implications for fairness in decision-making. By constructing counterfactuals that take into account the complex interplay between variables, researchers can identify biases and discrimination in algorithms and develop strategies to mitigate them.


The team’s work also highlights the importance of interdisciplinary collaboration. By combining insights from statistics, computer science, and economics, they were able to develop a novel approach that has the potential to transform our understanding of decision-making processes.


Cite this article: “Transforming Categorical Data with Optimal Transport: A Breakthrough in Fairness and Decision-Making”, The Science Archive, 2025.


Algorithm, Optimal Transport, Categorical Data, Compositional Data, Counterfactuals, Decision-Making, Fairness, Machine Learning, Statistics, Economics


Reference: Agathe Fernandes Machado, Arthur Charpentier, Ewen Gallic, “Optimal Transport on Categorical Data for Counterfactuals using Compositional Data and Dirichlet Transport” (2025).


Leave a Reply