Learning Human Preferences from Robot Customization Interactions

Friday 28 February 2025


Finding a more intuitive way to teach robots how to behave has long been a challenge in artificial intelligence research. One of the biggest hurdles is figuring out what humans want from their robotic companions and how to convey those desires to the machines. A new paper proposes a solution to this problem based on analyzing how people interact with robots during a customization process.


The researchers developed a technique called Contrastive Learning from Exploratory Actions (CLEA), which uses data collected from users as they design custom signals for their robotic assistants. The idea is that by observing how humans explore different robot behaviors, the algorithm can learn to identify meaningful patterns and relationships between actions and preferences.
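The idea of learning from exploration can be illustrated with a triplet-style contrastive loss. The sketch below is not the authors' implementation. It assumes, for illustration only, that behaviors a user chose to explore should sit closer together in feature space than behaviors the user ignored. The function names, the margin value, and the two-dimensional embeddings are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the anchor is closer to the positive than to the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def make_triplets(explored, ignored):
    """Assumption: two explored behaviors form an (anchor, positive) pair,
    and each ignored behavior serves as a negative."""
    triplets = []
    for i, anchor in enumerate(explored):
        for j, positive in enumerate(explored):
            if i == j:
                continue
            for negative in ignored:
                triplets.append((anchor, positive, negative))
    return triplets

# Hypothetical embeddings of robot signals produced by some encoder.
explored = [[0.0, 0.0], [0.1, 0.0]]
ignored = [[3.0, 4.0]]

triplets = make_triplets(explored, ignored)
mean_loss = sum(triplet_loss(a, p, n) for a, p, n in triplets) / len(triplets)
print(len(triplets), mean_loss)
```

In a full training setup, this loss would be minimized with respect to the parameters of the encoder that produces the embeddings. Here the embeddings are fixed, so the code only shows how the loss is computed.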


To test CLEA, the team designed an experiment in which 42 participants interacted with a Kuri robot, a domestic helper bot capable of performing various tasks like finding lost items. Users were asked to design custom signals for the robot to use during certain behaviors, such as searching for objects or indicating that it had found something.


During this process, the researchers collected data on how users explored different robot actions and behaviors. They then used CLEA to analyze this data and generate feature representations intended to capture what users care about in a robot's behavior. These features then served as the basis for evaluating how well various machine learning algorithms predicted user preferences.
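To make "predicting user preferences from features" concrete, here is a minimal sketch of one common formulation: a linear reward over the learned features combined with a Bradley-Terry choice model. The article does not say which preference-learning algorithms were evaluated, so this model, the function names, the learning rate, and the feature values are all assumptions made for illustration.

```python
import math

def reward(weights, features):
    """Linear reward: dot product of preference weights and behavior features."""
    return sum(w * f for w, f in zip(weights, features))

def preference_probability(weights, feat_a, feat_b):
    """Bradley-Terry probability that behavior A is preferred over behavior B."""
    diff = reward(weights, feat_a) - reward(weights, feat_b)
    return 1.0 / (1.0 + math.exp(-diff))

def update_weights(weights, preferred, rejected, lr=0.1):
    """One gradient-ascent step on the log-likelihood of an observed choice."""
    p = preference_probability(weights, preferred, rejected)
    return [w + lr * (1.0 - p) * (fp - fr)
            for w, fp, fr in zip(weights, preferred, rejected)]

# Hypothetical two-dimensional features for two robot signals.
signal_a = [1.0, 0.0]
signal_b = [0.0, 1.0]

weights = [0.0, 0.0]                     # no preference information yet
before = preference_probability(weights, signal_a, signal_b)
weights = update_weights(weights, signal_a, signal_b)  # user chose signal A
after = preference_probability(weights, signal_a, signal_b)
print(before, after)
```

Under this formulation, the quality of the features matters directly: a linear reward can only express preferences along dimensions that the features already encode.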


The results showed that CLEA-based methods outperformed self-supervised approaches in three out of four evaluation criteria: completeness, minimality, and explainability. Completeness measures how accurately the algorithm predicts user preferences, while minimality assesses its ability to identify the most relevant features. Explainability examines how well the algorithm can provide insight into why certain behaviors are preferred.


The authors also compared CLEA with a direct reward learning approach, which involves training a neural network to learn a user’s reward function from raw data. While this method showed improvement over self-supervised approaches in some cases, it ultimately fell short of CLEA’s performance.


One of the key benefits of CLEA is its ability to learn meaningful features without requiring explicit human feedback or labeling. This could lead to more efficient and effective ways of teaching robots how to behave, especially in complex domains where user preferences are difficult to articulate.


The study’s findings are relevant to the development of autonomous systems that can interact with humans in a more intuitive and personalized manner.


Cite this article: “Learning Human Preferences from Robot Customization Interactions”, The Science Archive, 2025.


Artificial Intelligence, Robotics, Machine Learning, Human-Computer Interaction, Customization, Robot Behavior, Pattern Recognition, Preference Prediction, Autonomous Systems, Personalized Interfaces


Reference: Nathaniel Dennler, Stefanos Nikolaidis, Maja Matarić, “Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation” (2025).

