Unlocking Private Knowledge: A Novel Approach to Collaborative Reasoning with Local Data Protection

Wednesday 16 April 2025


A team of researchers has developed a framework that lets small local language models collaborate with large remote ones on numerical reasoning tasks while protecting local data and maintaining accuracy.


Traditionally, local models deployed on devices have struggled to solve complex problems due to their limited capacity. To overcome this challenge, remote black-box models like GPT-4 are often accessed via API calls. However, this approach raises significant concerns about data leakage and privacy breaches. Existing methods for mitigating these risks involve generating problem descriptions or examples for remote assistance, but these approaches have inherent limitations.


The new framework proposed by the researchers shifts the focus towards context-aware synthesis strategies that preserve logical consistency while protecting local data. The team has also developed a tool-based answer reconstruction approach that reuses the remote model’s problem-solving patterns in the form of code snippets. The aim is accurate solving without exposing the local data to the remote model.
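
As a rough illustration of tool-based answer reconstruction, the sketch below assumes the remote model returns its solution to a synthesized problem as a small Python function, which the local side then re-runs with the real values. The snippet text, the function name `solve`, and the example numbers are all hypothetical and are not taken from the paper.

```python
# Hypothetical code returned by the remote model for a *synthesized* problem.
# It captures the solving pattern (a percentage change), not the private data.
REMOTE_SNIPPET = """
def solve(current, previous):
    return (current - previous) / previous * 100
"""


def reconstruct_answer(snippet, private_values):
    """Run the remote model's snippet locally with the real (private) values."""
    namespace = {}
    # Emptying __builtins__ only limits what the snippet can reach;
    # it is NOT a real sandbox and is used here purely for illustration.
    exec(snippet, {"__builtins__": {}}, namespace)
    return namespace["solve"](**private_values)


# Made-up private figures that never leave the local machine.
private = {"current": 5829.0, "previous": 5735.0}
local_answer = reconstruct_answer(REMOTE_SNIPPET, private)
print(f"percentage change: {local_answer:.2f}%")
```

In this sketch only the synthesized problem would be sent to the remote model; the private figures are used solely when the returned code is executed locally.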


One of the key innovations is the use of evidence localization, which helps to reduce the burden of information hiding for lengthy texts. By prompting the local model to explicitly display the original sentence from the context as evidence, the researchers have been able to improve the accuracy of local inference and retrieval.
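
A minimal sketch of the evidence localization idea follows, assuming the local model is asked to quote supporting sentences word for word and that its output is then checked against the context. The prompt wording, the function names, and the sample model output are illustrative assumptions rather than the authors' implementation.

```python
from typing import List


def build_evidence_prompt(context: str, question: str) -> str:
    """Ask the local model to copy the sentences it relies on, verbatim."""
    return (
        "Quote, word for word, each sentence from the context that is needed "
        "to answer the question. Write one sentence per line.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nEvidence:"
    )


def keep_verbatim_evidence(context: str, model_output: str) -> List[str]:
    """Keep only the quoted lines that really occur in the context."""
    normalized_context = " ".join(context.split())
    evidence = []
    for line in model_output.splitlines():
        # Drop a leading list dash and collapse whitespace before matching.
        candidate = " ".join(line.strip().lstrip("- ").split())
        if candidate and candidate in normalized_context:
            evidence.append(candidate)
    return evidence


context = (
    "Revenue was $120 million in 2020. The company has 300 staff. "
    "Revenue was $100 million in 2019."
)
# Hypothetical local-model output: the second line misquotes the context.
model_output = (
    "- Revenue was $120 million in 2020.\n"
    "- Revenue was $90 million in 2019.\n"
)
print(keep_verbatim_evidence(context, model_output))
```

Filtering for exact matches means that only sentences actually present in the context are passed on, so a lengthy document can be reduced to a few lines before any information hiding is applied.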


Another significant advancement is the topic rewriter, which changes the context’s topic while keeping the numerical values unchanged. Because the numbers are untouched, the reasoning needed to solve the rewritten problem stays the same while the original subject matter is disguised.
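
To make the topic rewriter's constraint concrete, the sketch below swaps topic words using a fixed dictionary and then checks that every number survives the rewrite. The dictionary stands in for a language model rewriter, and the function names, regular expression, and sample sentence are assumptions made for illustration.

```python
import re

# Matches integers and decimals, with optional thousands separators.
NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def extract_numbers(text):
    """Return every numeric token in the text, in order of appearance."""
    return NUMBER.findall(text)


def rewrite_topic(text, term_map):
    """Toy rewriter: swap topic-specific terms, leave everything else alone."""
    for old, new in term_map.items():
        text = re.sub(rf"\b{re.escape(old)}\b", new, text)
    return text


def numbers_preserved(original, rewritten):
    """Check that the rewrite kept the same numbers in the same order."""
    return extract_numbers(original) == extract_numbers(rewritten)


original = (
    "Acme Bank reported revenue of $1,250.5 million in 2020, "
    "up from $1,100 million in 2019."
)
term_map = {"Acme Bank": "Orchard Farm", "revenue": "harvest value"}
rewritten = rewrite_topic(original, term_map)

print(rewritten)
print(numbers_preserved(original, rewritten))
```

A check like `numbers_preserved` could serve as a guard before a rewritten context is sent anywhere: if the rewriter altered a figure, the rewrite can be rejected and regenerated.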


The framework was evaluated on two datasets: FinQA and MultiHiertt. The results show that the method outperforms existing approaches in accuracy, with improvements of 16.2% to 43.6% over local self-consistency. The method also reduces data leakage by 2.3% to 44.6%.


The research has clear implications for numerical reasoning tasks in finance, healthcare, and other industries where data protection is paramount. By maintaining accuracy while protecting local data, the framework offers a strong reference point for future work on model collaboration.


In practical terms, the framework can be used to develop more secure and accurate language models that are capable of solving complex problems while keeping sensitive information confidential. This has significant implications for organizations that rely on data analysis and insights to inform their decision-making processes.


The researchers’ work represents a major step forward in the development of collaborative language models that balance accuracy, security, and privacy concerns.


Cite this article: “Unlocking Private Knowledge: A Novel Approach to Collaborative Reasoning with Local Data Protection”, The Science Archive, 2025.


Language Models, Data Protection, Numerical Reasoning, Collaboration, Framework, Local Models, Remote Black-Box Models, GPT-4, Evidence Localization, Topic Rewriter


Reference: Min Zhang, Yuzhe Lu, Yun Zhou, Panpan Xu, Lin Lee Cheong, Chang-Tien Lu, Haozhu Wang, “Collaborative LLM Numerical Reasoning with Local Data Protection” (2025).

