Energy-time Modelling of Distributed Multi-population Genetic Algorithms with Dynamic Workload in HPC Clusters
Metadata
Author
Escobar Pérez, Juan José; Sánchez-Cuevas, Pablo; Prieto Campos, Beatriz; Savran Kiziltepe, Rukiye; Díaz-del-Río, Fernando; Kimovski, Dragi
Publisher
Elsevier
Subject
Energy-time Modelling; Heterogeneous Clusters; Distributed Computing; Parameter Optimisation; Task Scheduling; Genetic Algorithms
Date
2025-06
Bibliographic reference
J.J. Escobar et al. 2025. Energy-time Modelling of Distributed Multi-population Genetic Algorithms with Dynamic Workload in HPC Clusters. Future Generation Computer Systems 167, (Jun 2025). https://doi.org/10.1016/j.future.2025.107753
Sponsor
Spanish Ministry of Science, Innovation, and Universities under grants PID2022–137461NB-C32 and PID2023-151065OB-I00; University of Granada under grant PPJIA2023-025; Spanish Ministry of Universities under grant CAS22/00332; Ministry of Economic Transformation, Industry, Knowledge and Universities of the Regional Government of Andalusia under grant PREDOC_01229
Abstract
Time and energy efficiency is a highly relevant objective in high-performance computing systems, where executing tasks is costly. Among these tasks, evolutionary algorithms are of particular interest because of their inherent parallel scalability and their usually costly fitness evaluation functions. In this respect, several scheduling strategies for workload balancing in heterogeneous systems have been proposed in the literature, with the goal of reducing runtime and energy consumption. Our hypothesis is that a dynamic workload distribution can be fitted with greater precision using metaheuristics, such as genetic algorithms, than with linear regression. Therefore, this paper proposes a new mathematical model to predict the energy-time behaviour of applications based on multi-population genetic algorithms that dynamically distribute the evaluation of individuals among the CPU-GPU devices of heterogeneous clusters. An accurate predictor would save time and energy by selecting the best resource set before running such applications. The workload distributed to each device has been estimated by simulation, while the model parameters have been fitted in a two-phase run using another genetic algorithm, with the experimental energy-time values of the target application as input. When the new model is compared with a baseline based on linear regression, it significantly improves on the baseline, showing normalised prediction errors of 0.081 for runtime and 0.091 for energy consumption, compared with 0.213 and 0.256 for the baseline.
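The general idea of fitting model parameters with a genetic algorithm against measured values can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy runtime model `a + b * workload ** c`, the error measure (root-mean-square error divided by the observed range), the synthetic measurements, and the GA operators (tournament selection, blend crossover, Gaussian mutation, elitism). The paper's actual model, its two-phase fitting procedure, and its definition of normalised error are not given in this record.

```python
import random


def model(params, workload):
    """Hypothetical runtime model: fixed overhead plus a power law in the workload."""
    a, b, c = params
    return a + b * workload ** c


def nrmse(predicted, observed):
    """RMSE divided by the observed range (one common normalisation, assumed here)."""
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    return mse ** 0.5 / (max(observed) - min(observed))


def fit_ga(workloads, observed, bounds, pop_size=40, generations=150, seed=0):
    """Fit the model parameters with a small real-coded genetic algorithm."""
    rng = random.Random(seed)

    def error(params):
        return nrmse([model(params, w) for w in workloads], observed)

    # Random initial population inside the parameter bounds.
    population = [[rng.uniform(low, high) for low, high in bounds]
                  for _ in range(pop_size)]
    history = []  # best error per generation
    for _ in range(generations):
        population.sort(key=error)
        history.append(error(population[0]))
        children = [population[0][:]]  # elitism: the best individual survives
        while len(children) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = min(rng.sample(population, 3), key=error)
            p2 = min(rng.sample(population, 3), key=error)
            child = []
            for g1, g2, (low, high) in zip(p1, p2, bounds):
                alpha = rng.random()
                gene = alpha * g1 + (1 - alpha) * g2       # blend crossover
                gene += rng.gauss(0, 0.05 * (high - low))  # Gaussian mutation
                child.append(max(low, min(high, gene)))    # keep inside bounds
            children.append(child)
        population = children
    best = min(population, key=error)
    history.append(error(best))
    return best, history


# Synthetic "measurements" generated from known parameters (illustrative only).
workloads = [10.0 * i for i in range(1, 11)]
observed = [model((2.0, 0.5, 1.3), w) for w in workloads]
bounds = [(0.0, 10.0), (0.0, 2.0), (0.5, 2.0)]

best, history = fit_ga(workloads, observed, bounds)
print("fitted parameters:", best)
print("normalised error:", history[-1])
```

Because the best individual is copied unchanged into each new generation, the best error in `history` can never increase from one generation to the next. In a real setting the synthetic `observed` list would be replaced by measured runtime or energy values, and a linear-regression fit on the same data would serve as the baseline for comparison.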