Addressing False Negatives in Self-Supervised Learning with GloFND

Monday 31 March 2025


As researchers continue to push the boundaries of self-supervised learning, a new approach tackles one of its most pressing challenges: false negatives. In contrastive learning, a false negative is a sample that the training objective treats as dissimilar to an anchor, even though the two images or texts are semantically similar. The model then pushes their embeddings apart, which hurts the quality of the learned representation.
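
To make the effect concrete, here is a minimal sketch of how a false negative changes the loss for one anchor. It assumes an InfoNCE-style contrastive loss with a temperature, which the article does not specify; the function names and all similarity values are hypothetical and chosen only for illustration.

```python
import math

def info_nce(pos_sim, neg_sims, temperature=0.5):
    """InfoNCE-style loss for one anchor (an assumed loss, not taken from the article)."""
    pos = math.exp(pos_sim / temperature)
    neg = sum(math.exp(s / temperature) for s in neg_sims)
    return -math.log(pos / (pos + neg))

def negative_weights(pos_sim, neg_sims, temperature=0.5):
    """Softmax weight of each negative. The gradient of the loss with respect to a
    negative's similarity is this weight divided by the temperature, so a larger
    weight means a stronger push away from the anchor."""
    pos = math.exp(pos_sim / temperature)
    negs = [math.exp(s / temperature) for s in neg_sims]
    total = pos + sum(negs)
    return [n / total for n in negs]

# Hypothetical cosine similarities for a single anchor image of a dog.
positive = 0.9                      # augmented view of the same image
true_negatives = [0.1, -0.2, 0.0]   # unrelated images
false_negative = 0.8                # a different photo of a dog

loss_with_fn = info_nce(positive, true_negatives + [false_negative])
loss_without_fn = info_nce(positive, true_negatives)
weights = negative_weights(positive, true_negatives + [false_negative])

print(f"loss with false negative:    {loss_with_fn:.3f}")
print(f"loss without false negative: {loss_without_fn:.3f}")
print("weights of negatives:", [round(w, 3) for w in weights])
```

Under these assumptions the false negative raises the loss and receives the largest weight among the negatives, so it is the sample pushed away from the anchor most strongly.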


The proposed solution is GloFND, short for Global False Negative Discovery. GloFND is an optimization-based approach that learns, on the fly during training, a threshold for each anchor data point and uses it to identify that anchor’s false negatives. Instead of relying on a fixed number of negative pairs or a batch-wise method, GloFND adapts dynamically to different datasets and tasks.
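
One way to picture learning a threshold on the fly is the sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes the threshold for an anchor should be the similarity value that only a fraction `alpha` of other samples exceed, and it updates that threshold with a stochastic subgradient step computed from a mini-batch. The names `update_threshold` and `simulate`, the synthetic similarities, and all hyperparameters are hypothetical.

```python
import random

def update_threshold(threshold, batch_sims, alpha, lr):
    """One subgradient step on f(t) = alpha * t + mean(max(s - t, 0)).

    The minimiser of f is the (1 - alpha)-quantile of the similarities, and the
    subgradient is alpha minus the fraction of similarities above t.
    """
    frac_above = sum(1 for s in batch_sims if s > threshold) / len(batch_sims)
    return threshold - lr * (alpha - frac_above)

def simulate(alpha=0.1, lr=0.05, steps=2000, batch_size=64, seed=0):
    rng = random.Random(seed)
    # Synthetic stand-in for one anchor's similarities to every other sample.
    population = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
    threshold = 0.0
    for _ in range(steps):
        # Each step only looks at a mini-batch, never the full population.
        batch = rng.sample(population, batch_size)
        threshold = update_threshold(threshold, batch, alpha, lr)
    # Exact quantile, computed here only to check the running estimate.
    exact = sorted(population)[int((1 - alpha) * len(population))]
    return threshold, exact

learned, exact = simulate()
print(f"learned threshold: {learned:.3f}, exact quantile: {exact:.3f}")
```

The point of the sketch is that each update costs only one mini-batch of similarities, yet the running threshold tracks a statistic of the whole population.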


The key insight behind GloFND is that the right definition of a false negative depends on the task at hand. In a dataset like ImageNet, for instance, two images of dogs of different breeds might count as dissimilar for one classification task but not for another. By letting the threshold adapt to each anchor data point, GloFND captures this variability and identifies false negatives more accurately.
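
The difference between a per-anchor threshold and a fixed number of picks per batch can be sketched as follows. The similarity values and thresholds are hypothetical, and the fixed top-k rule is only a simple stand-in for the "fixed number" approach mentioned above, not a specific published method.

```python
def false_negatives_by_threshold(sims, threshold):
    """Indices of candidates whose similarity to the anchor exceeds its threshold."""
    return [j for j, s in enumerate(sims) if s > threshold]

def false_negatives_top_k(sims, k):
    """Indices of the k most similar candidates, however similar they actually are."""
    order = sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)
    return sorted(order[:k])

# Hypothetical similarities of five in-batch candidates to two anchors.
sims_a = [0.92, 0.88, 0.85, 0.10, -0.30]   # anchor with three close neighbours
sims_b = [0.90, 0.20, 0.15, 0.05, -0.10]   # anchor with one close neighbour

# Hypothetical learned thresholds, one per anchor.
threshold_a, threshold_b = 0.8, 0.5

print(false_negatives_by_threshold(sims_a, threshold_a))  # flags candidates 0, 1, 2
print(false_negatives_by_threshold(sims_b, threshold_b))  # flags candidate 0 only
print(false_negatives_top_k(sims_a, 2))  # misses candidate 2
print(false_negatives_top_k(sims_b, 2))  # includes candidate 1 (similarity 0.20)
```

With a fixed count, the first anchor loses a likely false negative and the second gains an unlikely one; a per-anchor threshold lets the number of flagged candidates differ between anchors.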


To test GloFND’s effectiveness, the researchers conducted a series of experiments on various datasets, including ImageNet100, CIFAR-10, CIFAR-100, Food-101, Stanford Cars, Describable Textures Dataset (DTD), Oxford-IIIT Pets, Caltech-101, and Oxford 102 Flowers. They found that GloFND outperformed the baseline method in all cases, with statistically significant improvements on most datasets.


One standout result came from the semi-supervised learning evaluation, where GloFND’s improvements were statistically significant at the 1% level in both the 100% and 1% scenarios. This suggests that GloFND can identify false negatives effectively even when few labeled examples are available.


The researchers also provided visual examples of false negatives identified by GloFND for ImageNet100, showcasing how the approach can dynamically adapt to different anchor data points. These examples highlight the ability of GloFND to capture semantically similar images that would otherwise be considered dissimilar.


While self-supervised learning has made tremendous progress in recent years, the problem of false negatives remains a significant challenge.


Cite this article: “Addressing False Negatives in Self-Supervised Learning with GloFND”, The Science Archive, 2025.


Self-Supervised Learning, False Negatives, GloFND, Optimization-Based Approach, Threshold Adaptation, Anchor Data Points, ImageNet, CIFAR, Food-101, Semi-Supervised Learning, Statistical Significance.


Reference: Vicente Balmaseda, Bokun Wang, Ching-Long Lin, Tianbao Yang, “Discovering Global False Negatives On the Fly for Self-supervised Contrastive Learning” (2025).

