Addressing data quality decompensation in federated learning via dynamic client selection
Metadata
Author
Fei, Qinjun; Rodríguez Barroso, Nuria; Luzón García, María Victoria; Zhang, Zhongliang; Herrera Triguero, Francisco
Publisher
Elsevier
Subject
Federated learning; Reputation-based client selection; Data quality decompensation
Date
2026-03
Bibliographic reference
Fei, Q., Rodríguez-Barroso, N., Luzón, M. V., Zhang, Z., & Herrera, F. (2026). Addressing data quality decompensation in federated learning via dynamic client selection. Future Generation Computer Systems, 176, 108138. https://doi.org/10.1016/j.future.2025.108138
Sponsor
National Natural Science Foundation of China (Grant 72171065); Shaanxi Key Laboratory of Information Communication Network and Security (Open Fund Grant ICNS201807); Instituto Nacional de Ciberseguridad (INCIBE) – Universidad de Granada - Next Generation EU (Strategic Project IAFER-Cib C074/23); China Scholarship Council (Project ID: 202308330099); Universidad de Granada / CBUA (Open access)
Abstract
In cross-silo Federated Learning (FL), client selection is critical to ensure high model performance, yet it remains challenging due to data quality decompensation, budget constraints, and incentive compatibility. As training progresses, these factors exacerbate client heterogeneity and degrade global performance. Most existing approaches treat these challenges in isolation, making it difficult to optimize multiple factors in conjunction. To address this, we propose Shapley-Bid Reputation Optimized Federated Learning (SBRO-FL), a unified framework integrating dynamic bidding, reputation modeling, and cost-aware selection. Clients submit bids based on their perceived data quality, and their contributions are evaluated using Shapley values to quantify their marginal impact on the global model. A reputation system, inspired by prospect theory, captures historical performance while penalizing inconsistency. The client selection problem is formulated as a 0–1 integer program that maximizes reputation-weighted utility under budget constraints. Experiments on four benchmark datasets demonstrate the framework's effectiveness, improving final model accuracy by an average of 10.3% over random selection, with gains exceeding 19% on more complex datasets like CIFAR-10 and SVHN. Our results highlight the importance of balancing data reliability, incentive compatibility, and cost efficiency to enable scalable and trustworthy FL deployments.
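The selection step described in the abstract, maximizing reputation-weighted utility subject to a budget on accepted bids, has the shape of a 0–1 knapsack problem. The following is a minimal illustrative sketch, not the paper's implementation: the bid, utility, and budget values are hypothetical, and a brute-force search stands in for a proper integer-programming solver.

```python
from itertools import combinations

def select_clients(bids, utilities, budget):
    """Choose the subset of clients maximizing total reputation-weighted
    utility while the total bid cost stays within the budget.

    Brute-force search over all subsets; fine for small cross-silo
    settings, but a real deployment would use an ILP solver.
    """
    n = len(bids)
    best_set, best_util = (), 0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            cost = sum(bids[i] for i in subset)
            util = sum(utilities[i] for i in subset)
            if cost <= budget and util > best_util:
                best_set, best_util = subset, util
    return best_set, best_util

# Hypothetical example: utilities would come from Shapley values
# scaled by each client's reputation score.
chosen, total = select_clients(bids=[3, 2, 4, 1],
                               utilities=[5, 4, 8, 2],
                               budget=5)
# chosen == (2, 3): clients 2 and 3 fit the budget with utility 10
```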





