Saturday 15 March 2025
Federated learning, a technique that lets multiple devices or organizations jointly train an artificial intelligence model without sharing their individual data, has long had a major limitation: models trained this way often generalize poorly across domains. For instance, a model trained on photographs of dogs and cats may perform poorly when asked to classify sketches or cartoons of the same animals.
A new approach called FedAlign aims to address this issue with a mechanism that lets clients benefit from the variety in one another's data. The method, described in a recent paper, combines cross-client feature extension with dual-stage alignment to build a more robust, domain-invariant representation space.
The key idea behind FedAlign is to extend the local data on each device by incorporating features learned from other clients. This is achieved with MixStyle, a technique that mixes feature statistics between samples to synthesize examples in new styles while keeping their content. By combining these augmented samples with the original data, the model can learn more diverse and robust representations.
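To make the style-mixing idea concrete, here is a minimal NumPy sketch of the MixStyle operation as it is commonly described: each sample's channel-wise mean and standard deviation are interpolated with those of a randomly chosen partner. The article does not say how FedAlign exchanges statistics between clients, so this sketch simply pairs samples within one batch. The function name `mixstyle` and the parameters `alpha` and `eps` are illustrative assumptions, not the authors' code.

```python
import numpy as np

def mixstyle(features, rng, alpha=0.1, eps=1e-6):
    """Mix channel-wise feature statistics between random pairs in a batch.

    features: array of shape (batch, channels, height, width).
    rng: a numpy.random.Generator.
    alpha: Beta distribution parameter (assumed value, for illustration).
    """
    batch = features.shape[0]
    # Per-sample, per-channel statistics over the spatial dimensions.
    mu = features.mean(axis=(2, 3), keepdims=True)
    sigma = np.sqrt(features.var(axis=(2, 3), keepdims=True) + eps)
    normalized = (features - mu) / sigma

    # Pair every sample with a randomly chosen partner from the same batch.
    perm = rng.permutation(batch)
    # One mixing weight per sample, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=(batch, 1, 1, 1))

    mixed_mu = lam * mu + (1.0 - lam) * mu[perm]
    mixed_sigma = lam * sigma + (1.0 - lam) * sigma[perm]
    # Re-style the normalized content with the interpolated statistics.
    return normalized * mixed_sigma + mixed_mu

rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(4, 3, 5, 5))  # stand-in for CNN activations
augmented = mixstyle(feature_maps, rng)
```

In a federated setting, the partner statistics could in principle come from another client rather than from the same batch. That variation is an assumption here and is not detailed in the article.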
In addition to feature extension, FedAlign also employs a dual-stage alignment strategy to ensure that the learned features are not only diverse but also domain-invariant. The first stage involves aligning the feature representations across different domains using a supervised contrastive loss function. This step helps to reduce the distributional shift between different domains and ensures that the model can generalize well across them.
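The feature-level stage can be illustrated with a small sketch of a supervised contrastive loss in its standard form: embeddings that share a class label are pulled together, and all others are pushed apart. The exact formulation used in FedAlign is not given in the article, so the function name, the `temperature` value, and the toy data below are assumptions made for illustration.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Standard supervised contrastive loss over one batch.

    embeddings: array of shape (n, d); labels: array of shape (n,).
    Anchors with no same-class partner in the batch are skipped; the
    batch should contain at least one class with two or more samples.
    """
    labels = np.asarray(labels)
    # Compare L2-normalized embeddings with a temperature-scaled dot product.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # A sample is never contrasted against itself.
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: other samples with the same label, whatever their domain.
    positives = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0

    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())

# Toy batch: two classes, each seen in two hypothetical domains.
labels = np.array([0, 0, 1, 1])
aligned = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
loss = supervised_contrastive_loss(aligned, labels)
```

Because positives are defined by class label alone, samples of the same class from different domains are drawn toward each other, which is the intuition behind using this loss for domain alignment.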
The second stage of alignment focuses on the output predictions, rather than just the feature representations. By minimizing the Jensen-Shannon divergence between the predicted distributions of different domains, FedAlign encourages the model to produce more consistent and accurate outputs across different environments.
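As a rough illustration of the prediction-level stage, the Jensen-Shannon divergence between two predicted class distributions can be computed as follows. The article does not specify which pairs of predictions FedAlign compares, so the example compares the softmax outputs for one sample and a hypothetical style-augmented counterpart. All names and logit values are assumptions.

```python
import numpy as np

def softmax(logits):
    """Convert logits to probabilities along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) along the last axis, clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical logits for one image and its style-augmented version.
logits_original = np.array([2.0, 0.5, -1.0])
logits_augmented = np.array([1.5, 0.7, -0.8])
consistency_loss = js_divergence(softmax(logits_original),
                                 softmax(logits_augmented))
```

Unlike the plain Kullback-Leibler divergence, this quantity is symmetric in its two arguments and always finite, which makes it a convenient penalty for disagreement between two sets of predictions.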
To evaluate the effectiveness of FedAlign, the researchers conducted experiments on four benchmark datasets: PACS, OfficeHome, miniDomainNet, and Caltech-101. The results show that FedAlign consistently outperforms existing federated learning methods in terms of accuracy and robustness, particularly when generalizing to unseen domains.
FedAlign is also reported to scale to large numbers of clients while largely maintaining its performance. This matters for real-world deployments, which can involve many devices or organizations with varying levels of computational resources and data quality.
The authors also explored the sensitivity of FedAlign to different hyperparameters and found that it was relatively robust to changes in the learning rate, batch size, and number of communication rounds. However, they did observe that the performance of FedAlign degraded slightly when the number of clients increased beyond a certain threshold.
Cite this article: “Improving Federated Learning with Domain-Invariant Representations”, The Science Archive, 2025.
Federated Learning, Domain Adaptation, Cross-Client Feature Extension, Dual-Stage Alignment, MixStyle, Supervised Contrastive Loss, Jensen-Shannon Divergence, PACS, OfficeHome, miniDomainNet, Caltech-101







