Data Therapist: A Novel Tool for Bridging the Gap Between Domain Experts and Data

Thursday 29 May 2025

Data Therapist is a new tool that aims to bridge the gap between domain experts and their data through structured, mixed-initiative interaction. The system elicits domain knowledge about a dataset through question-answering and user-driven annotation.

The concept of Data Therapist was born out of frustration with the current state of data visualization. Despite advances in technology, experts still struggle to communicate their findings to non-experts because datasets are complex and their domain-specific context is poorly understood.

To address this issue, the Data Therapist team developed a web-based tool that combines human expertise with large language models (LLMs). The system provides users with a series of questions that are generated based on the dataset they are working with. These questions are designed to help the user understand the underlying mechanisms and relationships within the data.

The first step in using Data Therapist is to upload a dataset. The system then analyzes the data and generates a set of questions tailored to the specific domain and context. Users respond to these questions, and their answers are used to generate annotations about the dataset.
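As a rough illustration of the upload-and-question step, the sketch below summarizes the columns of a CSV file and assembles a prompt that could be sent to an LLM. It is not the Data Therapist implementation: the function names (`summarize_columns`, `build_question_prompt`), the prompt wording, and the choice of CSV as the input format are all assumptions made for this example.

```python
import csv
import io


def _is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize_columns(csv_text, max_samples=3):
    """Return one summary dict per column: name, rough type, sample values.

    Hypothetical helper; the real tool's dataset analysis is not described
    in this article.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    summaries = []
    for name in rows[0].keys():
        # Skip missing or empty cells when sampling values.
        values = [row[name] for row in rows if row[name] not in (None, "")]
        is_numeric = bool(values) and all(_is_number(v) for v in values)
        summaries.append({
            "name": name,
            "type": "numeric" if is_numeric else "text",
            "samples": values[:max_samples],
        })
    return summaries


def build_question_prompt(summaries):
    """Build a plain-text prompt asking for one question per column (assumed wording)."""
    lines = [
        "You are interviewing a domain expert about their dataset.",
        "Columns:",
    ]
    for col in summaries:
        samples = ", ".join(col["samples"])
        lines.append(f"- {col['name']} ({col['type']}), e.g. {samples}")
    lines.append(
        "Ask one question per column about its meaning, provenance, or quality."
    )
    return "\n".join(lines)
```

The prompt text returned by `build_question_prompt` would then be passed to whichever language model is in use; that call is left out here because the article does not say which model or API the tool relies on.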

One of the key features of Data Therapist is its ability to recognize when a user’s answer requires further clarification. When that happens, the system asks follow-up questions to help the user provide more accurate information. This iterative process allows users to gradually build a comprehensive understanding of their data and identify areas where they need more insight.
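The clarification loop described above can be sketched in a few lines. This is an assumption-laden toy, not the tool's actual logic: in Data Therapist an LLM presumably judges whether an answer is clear, whereas here a simple word-count and keyword heuristic (`needs_clarification`) stands in for that judgment. The names `elicit`, `ask_expert`, and `make_follow_up` are hypothetical.

```python
def needs_clarification(answer, min_words=5):
    """Toy stand-in for the LLM's judgment: flag short or uncertain answers."""
    text = answer.strip().lower()
    vague_phrases = ("not sure", "i don't know", "it depends")
    too_short = len(text.split()) < min_words
    return too_short or any(phrase in text for phrase in vague_phrases)


def elicit(question, ask_expert, make_follow_up, max_follow_ups=2):
    """Ask a question, then follow up until the answer is clear or the cap is hit.

    ask_expert(question) -> answer string (e.g. collected from a web form).
    make_follow_up(question, answer) -> next question string (e.g. from an LLM).
    Returns the list of (question, answer) turns, which could later be
    turned into annotations on the dataset.
    """
    turns = []
    current = question
    for _ in range(max_follow_ups + 1):
        answer = ask_expert(current)
        turns.append((current, answer))
        if not needs_clarification(answer):
            break
        current = make_follow_up(current, answer)
    return turns
```

Capping the number of follow-ups with `max_follow_ups` is a design choice for this sketch: it keeps the loop from running indefinitely when an expert cannot answer a question.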

Data Therapist has been tested with four groups of domain experts, with each group sharing expertise in one field: accounting, molecular biology, political science, or usable security. The results showed that the system was effective in eliciting domain knowledge about datasets, even from users who were not familiar with data analysis.

The study also highlighted some limitations of Data Therapist. For example, while the system is able to generate questions similar to those asked by humans, it still struggles to accurately gauge the knowledge level of a lay audience. Additionally, users may require more time and guidance when working with complex datasets.

Despite these challenges, Data Therapist could change the way experts communicate their findings to non-experts. Its structured mixed-initiative approach can help bridge the gap between domain experts and data, making complex datasets easier for everyone to understand and work with.

The development of Data Therapist is an important step towards improving data visualization and communication.

Cite this article: “Data Therapist: A Novel Tool for Bridging the Gap Between Domain Experts and Data”, The Science Archive, 2025.

Data Therapist, Mixed-Initiative, Domain Experts, Datasets, Question-Answering, Annotation, Data Visualization, Large Language Models, Clarification, Communication.

Reference: Sungbok Shin, Hyeon Jeon, Sanghyun Hong, Niklas Elmqvist, “Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models” (2025).
