Unlocking the Meaning Behind Stickers: A New Dataset for AI Research

Thursday 26 June 2025

Stickers have become an integral part of modern digital communication, used by people all over the world to add a touch of personality and humor to their online interactions. Despite their ubiquity, little research has examined how people interpret the meaning of these small images.

A new study aims to change that by creating a dataset of sticker queries that can be used to train artificial intelligence models to better comprehend the nuances of sticker use. The researchers behind the project, from Tsinghua University in China, developed a gamified approach to gathering high-quality sticker queries, which they believe will significantly improve sticker generation and retrieval.

The dataset, called StickerQueries, contains over 1,700 sticker queries in both English and Chinese, annotated by more than 60 contributors across 60 hours of work. The researchers used a combination of machine learning algorithms and human annotation to ensure the quality and diversity of the data.

One of the key challenges facing AI models in understanding stickers is that a sticker's meaning is highly specific to its context and hard to pin down. Unlike words or sentences, which have clear meanings and grammatical structures, stickers are often used to convey complex emotions and ideas in a single image. This makes them difficult for machines to interpret accurately.

The StickerQueries dataset aims to address this challenge by providing AI models with a large and diverse set of sticker queries that can be used to learn the patterns and relationships between stickers and their meanings. The researchers believe that this will enable AI systems to better understand the context in which stickers are used, and to generate more accurate and relevant responses.
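To make the idea of matching queries to stickers concrete, here is a minimal sketch of how annotated queries could support sticker retrieval. It is not the researchers' method: the record layout (`StickerRecord`, `queries_en`, `queries_zh`), the sample data, and the word-overlap (Jaccard) scoring are all assumptions chosen for illustration, and a real system would more likely rely on learned models.

```python
from dataclasses import dataclass, field


@dataclass
class StickerRecord:
    # Hypothetical record layout; the real dataset's schema may differ.
    sticker_id: str
    queries_en: list[str] = field(default_factory=list)
    queries_zh: list[str] = field(default_factory=list)


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it on whitespace."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap score between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_stickers(user_query: str, records: list[StickerRecord]) -> list[tuple[str, float]]:
    """Score each sticker by its best-matching English query, highest first."""
    q = tokens(user_query)
    scored = []
    for record in records:
        best = max((jaccard(q, tokens(s)) for s in record.queries_en), default=0.0)
        scored.append((record.sticker_id, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented sample data, for illustration only.
records = [
    StickerRecord("cat_01", ["so tired today", "need a nap"], ["今天好累"]),
    StickerRecord("dog_02", ["happy birthday to you"], ["生日快乐"]),
]

ranking = rank_stickers("I am so tired", records)
print(ranking)
```

In this toy setup the message "I am so tired" ranks the hypothetical `cat_01` sticker first, because it shares two words with one of that sticker's queries. Word overlap cannot capture tone or irony, which is the kind of gap a dataset of human-written queries is meant to help models close.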

The dataset is also designed to be multilingual, allowing researchers to explore how sticker use varies across different cultures and languages. This could have important implications for the development of AI models that can communicate effectively with people from diverse backgrounds.

In addition to its potential applications in AI research, the StickerQueries dataset could also have practical uses in fields such as social media analysis and online customer service. By allowing machines to better understand the meaning behind stickers, it could improve the accuracy of sentiment analysis and enable more effective responses to customer queries.

The researchers are now making their dataset publicly available, along with two fine-tuned models for generating new sticker queries. They hope that this will stimulate further research into the use of stickers in digital communication, and help to develop more advanced AI systems that can better understand and respond to human emotions.

Cite this article: “Unlocking the Meaning Behind Stickers: A New Dataset for AI Research”, The Science Archive, 2025.

Sticker Queries, Artificial Intelligence, Digital Communication, Machine Learning, Dataset, Gamification, Annotation, Emotions, Sentiment Analysis, Online Customer Service

Reference: Heng Er Metilda Chee, Jiayin Wang, Zhiqiang Guo, Weizhi Ma, Min Zhang, “Small Stickers, Big Meanings: A Multilingual Sticker Semantic Understanding Dataset with a Gamified Approach” (2025).
