Unlocking Transparency in Large Language Models for Improved Software Engineering

Wednesday 19 March 2025


A team of researchers has developed a new approach to improve the performance and transparency of large language models (LLMs) used in software engineering tasks. These LLMs are trained on vast amounts of code data, allowing them to generate code snippets and identify patterns that humans may miss.


However, their black-box nature makes it difficult for developers to understand how they arrive at certain conclusions or make predictions. This lack of transparency can lead to mistrust in the models and hinder their adoption in critical software development processes.


The researchers have addressed this issue by introducing a neurosymbolic approach, which combines the strengths of neural networks with symbolic reasoning techniques. They’ve developed a framework called Neurosymbolic Program Comprehension (NsPC), which uses SHAP values to identify patterns in model predictions and formalize them into symbolic rules.


In essence, NsPC is designed to provide developers with a more transparent understanding of how LLMs work, allowing them to build trust in their outputs. By analyzing the SHAP values, the framework can pinpoint specific code elements that contribute to a particular prediction or pattern, enabling developers to refine their understanding of the model’s behavior.
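To make the idea concrete, here is a minimal sketch of how attribution scores can be turned into a readable rule. It is not the NsPC implementation. The scoring function is a hypothetical stand-in for an LLM-based vulnerability classifier, the feature names are invented for illustration, and the Shapley values are computed by exact enumeration rather than with the approximations the SHAP library uses.

```python
from itertools import combinations
from math import factorial

# Hypothetical code features that a snippet may or may not contain.
FEATURES = ["string_concat_sql", "execute_query", "prepared_statement", "logging_call"]


def toy_vulnerability_score(present):
    """Assumed stand-in for a model: maps a set of present features to a score."""
    score = 0.1
    if "string_concat_sql" in present:
        score += 0.5
    if "string_concat_sql" in present and "execute_query" in present:
        score += 0.3
    if "prepared_statement" in present:
        score -= 0.2
    return score


def shapley_values(features, value_fn):
    """Exact Shapley value of each feature, enumerating every coalition."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def extract_rule(phi, threshold):
    """Formalize features with attribution >= threshold as an IF-THEN rule."""
    positives = sorted(f for f, v in phi.items() if v >= threshold)
    if not positives:
        return None
    return "IF " + " AND ".join(positives) + " THEN flag_vulnerable"


if __name__ == "__main__":
    phi = shapley_values(FEATURES, toy_vulnerability_score)
    for name, value in phi.items():
        print(f"{name}: {value:+.3f}")
    print(extract_rule(phi, threshold=0.1))
```

In this toy setting, features with a high positive attribution end up in the rule, while neutral or mitigating features are left out. A real pipeline would need many examples and a far more careful rule-formalization step than a single threshold.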


The researchers tested NsPC on a vulnerability detection task using Java code snippets and report promising results. Their approach identified meaningful patterns in the data, which were then formalized into symbolic rules that improved the accuracy of the LLM’s predictions.


The work has significant implications for software engineering, as it could help developers create more reliable and maintainable code. By making it clearer how LLMs reach their predictions, NsPC can help bridge the gap between human intuition and machine learning capabilities, ultimately supporting better decision-making in software development.


The team’s approach is also scalable and can be applied to various programming languages and tasks, making it a valuable tool for developers working on complex software projects. As the use of LLMs continues to grow, transparent and interpretable models will become increasingly important. NsPC is a significant step toward addressing this challenge, paving the way for more trustworthy and effective AI-powered software engineering tools.


Cite this article: “Unlocking Transparency in Large Language Models for Improved Software Engineering”, The Science Archive, 2025.


Large Language Models, Software Engineering, Neurosymbolic Approach, Transparency, Interpretability, SHAP Values, Symbolic Rules, Vulnerability Detection, Code Analysis, AI-Powered Tools


Reference: Alejandro Velasco, Aya Garryyeva, David N. Palacio, Antonio Mastropaolo, Denys Poshyvanyk, “Toward Neurosymbolic Program Comprehension” (2025).

