Sunday 23 February 2025
The quest for efficient and private data analysis has led researchers to develop novel approaches that balance security, scalability, and performance. Federated learning, a distributed machine learning technique, allows multiple parties to collaborate on training models without sharing their raw data. However, this comes at the cost of increased communication overhead and potential privacy risks.
A recent paper proposes three algorithms for automated feature engineering (AutoFE) in federated learning settings, addressing these challenges by minimizing data exchange while preserving model performance. The authors present solutions for horizontal, vertical, and hybrid federated learning scenarios: in the horizontal case each client holds different samples with the same features, in the vertical case each client holds different features for the same samples, and the hybrid case mixes the two.
In the horizontal setting, each client performs AutoFE independently, sending only engineered feature strings to a central server. The server aggregates these strings and selects the most useful features using a resource-aware approach. This method reduces communication overhead while maintaining model performance.
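To make the horizontal flow concrete, here is a minimal Python sketch of the server side. It assumes each client has already run AutoFE locally and sends only a list of feature strings such as "log(x1)". The function name aggregate_feature_strings, the budget parameter, and the vote-counting rule are hypothetical stand-ins: the article does not specify how the paper's resource-aware selection works, so this only illustrates the idea of choosing a limited number of features from strings alone.

```python
from collections import Counter


def aggregate_feature_strings(client_features, budget):
    """Return up to `budget` feature strings, most widely proposed first.

    Hypothetical selection rule (an assumption, not the paper's method):
    rank features by how many clients proposed them.
    """
    votes = Counter()
    for features in client_features:
        # Each client counts once per feature, even if it repeats a string.
        votes.update(set(features))
    # Sort by vote count (descending), then by name so ties are deterministic.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:budget]]


# Only strings cross the network; no raw rows leave any client.
client_features = [
    ["log(x1)", "x1*x2", "x3/x4"],
    ["x1*x2", "sqrt(x5)"],
    ["x1*x2", "log(x1)"],
]
selected = aggregate_feature_strings(client_features, budget=2)
print(selected)
```

Because a feature string is a few bytes regardless of how many rows a client holds, the message size in this sketch does not grow with the size of the local dataset.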
The vertical setting involves homomorphic encryption, where encrypted feature vectors are combined across clients without revealing their contents. This ensures privacy while still enabling AutoFE. The authors demonstrate that this approach can be used to select the best engineered features without compromising data security.
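The idea of combining encrypted vectors can be illustrated with a toy additively homomorphic scheme. The sketch below implements a textbook Paillier cryptosystem in pure Python with deliberately tiny primes. Everything here is an assumption made for illustration: the article does not say which encryption scheme or which combination operator the paper uses, and Paillier supports only addition of ciphertexts. The function names and the sample vectors are hypothetical, and the key size is far too small for real use.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair. Real deployments use moduli of ~2048 bits."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because we use g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n (with g = n + 1)."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext using the private key (lam, mu)."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()

# Two clients hold different feature columns for the same three samples.
client_a = [3, 5, 7]
client_b = [10, 20, 30]

enc_a = [encrypt(n, v) for v in client_a]
enc_b = [encrypt(n, v) for v in client_b]

# The combining party sees only ciphertexts.
enc_sum = [add_encrypted(n, x, y) for x, y in zip(enc_a, enc_b)]

combined = [decrypt(n, private_key, c) for c in enc_sum]
print(combined)
```

The party that adds the ciphertexts never needs the private key, which is the property that lets an engineered feature be assembled from columns held by different clients without exposing those columns.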
In the hybrid setting, a combination of horizontal and vertical approaches is employed. Clients with overlapping feature sets collaborate on AutoFE, while those with non-overlapping sets communicate encrypted feature vectors.
Experimental results show that these algorithms perform similarly to centralized AutoFE, demonstrating their effectiveness in federated learning settings. The authors’ approach reduces communication overhead, preserves model performance, and safeguards data privacy, three properties that large-scale data analysis applications need at the same time.
As the demand for efficient and secure data processing continues to grow, researchers will likely build upon this work to develop even more sophisticated solutions. For now, these AutoFE algorithms offer a promising step forward in balancing the competing demands of federated learning.
Cite this article: “Federated Learning with Automated Feature Engineering”, The Science Archive, 2025.
Federated Learning, Feature Engineering, Machine Learning, Data Privacy, Communication Overhead, Homomorphic Encryption, Automated Feature Engineering, Horizontal Federated Learning, Vertical Federated Learning, Hybrid Federated Learning
Reference: Tom Overman, Diego Klabjan, “Federated Automated Feature Engineering” (2024).







