Real-Time Workload Prediction and Resource Optimization for Parallel Heterogeneous High-Performance Computing Systems Architectures
DOI: https://doi.org/10.71346/utj.v1i1.11

Keywords: Adaptive resource management, Heterogeneous parallel architectures, Task scheduling, Machine learning-driven optimization, Resource allocation, Data placement, Energy efficiency, Fault tolerance, Workload prediction, High-performance computing

Abstract
Rapid advances in heterogeneous parallel architectures comprising CPUs, GPUs, and FPGAs have introduced significant challenges for efficient resource management in high-performance computing systems. Static and heuristic-based approaches lack the adaptability required to handle varying workloads and hardware configurations, resulting in suboptimal performance and energy inefficiency. This research proposes a machine learning-driven adaptive resource management framework that dynamically optimizes task scheduling, resource allocation, and data placement. The framework employs regression models and reinforcement learning algorithms to predict workload behaviors, resource utilization, and task execution times in real time. Experimental results on a heterogeneous testbed demonstrate a 21% reduction in task execution time, an 18% improvement in energy efficiency, and a 38% decrease in fault recovery time compared with conventional methods. These findings highlight the framework's ability to improve resource utilization while maintaining reliability and minimizing energy overhead. The work advances the field by introducing a unified approach that integrates machine learning for runtime optimization across heterogeneous systems. Practical implications include applicability to large-scale scientific simulations and deep learning tasks, where adaptive resource management is critical. Future work can focus on improving prediction accuracy through advanced deep learning techniques and on extending the framework to emerging hardware accelerators and edge computing environments.
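To make the prediction-then-scheduling idea concrete, the sketch below fits a simple least-squares regression of task execution time on task size and device type, then greedily assigns a task to the device with the lowest predicted runtime. This is an illustrative toy, not the authors' implementation: the feature set, the CPU/GPU device labels, and the training data are all hypothetical, and the paper's actual models (including its reinforcement learning component) are not reproduced here.

```python
# Hypothetical sketch: least-squares runtime prediction feeding a greedy
# device-selection step. All data and feature choices are illustrative.
import numpy as np

# Toy training data: columns are (task_size, is_gpu); targets are
# observed runtimes in seconds.
X = np.array([
    [1.0, 0.0], [2.0, 0.0], [4.0, 0.0],   # CPU runs scale roughly linearly
    [1.0, 1.0], [2.0, 1.0], [4.0, 1.0],   # GPU runs are faster per unit size
], dtype=float)
y = np.array([1.1, 2.0, 4.2, 0.6, 0.9, 1.6])

# Fit runtime ~ w0 + w1*size + w2*is_gpu by ordinary least squares.
A = np.hstack([np.ones((len(X), 1)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(size: float, is_gpu: float) -> float:
    """Predicted execution time for a task of the given size on a device."""
    return float(w @ [1.0, size, is_gpu])

def schedule(task_size: float) -> tuple[str, float]:
    """Greedily pick the device with the lower predicted execution time."""
    cpu_t = predict(task_size, 0.0)
    gpu_t = predict(task_size, 1.0)
    return ("gpu", gpu_t) if gpu_t < cpu_t else ("cpu", cpu_t)

device, t = schedule(3.0)
print(device, t)
```

In a real system the model would be retrained or updated online as new execution measurements arrive, which is what lets the scheduler adapt to workload and hardware variation at runtime.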
License
Copyright (c) 2025 Ayesha Aslam, Zhumakhanova Darya Anuarovna

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors retain copyright for all articles published in CrossLink Studies journals. These articles are made freely available under a Creative Commons CC BY 4.0 license, which allows unrestricted downloading and reading by the public.