A Runtime Approach for Dynamic Load Balancing of OpenMP Parallel Loops in LLVM
Date Issued
2019-01-01
Author(s)
Iwainsky, Christian
Doerfert, Johannes
Finkel, Hal
Kale, Vivek
Klemm, Michael
Abstract
Load imbalance is the major source of performance degradation in computationally intensive applications that frequently consist of parallel loops. Efficient scheduling of parallel loops can improve the performance of such programs. OpenMP is the de facto standard for parallel programming on shared-memory systems. The current OpenMP specification provides only three choices for loop scheduling, which are insufficient in scenarios with irregular loops, system-induced interference, or both. Therefore, this work augments the LLVM implementation of the OpenMP runtime library with eleven state-of-the-art scheduling techniques plus three new ones, all ready to use. We tested the existing and the added loop scheduling strategies on several applications from the NAS, SPEC OMP 2012, and CORAL-2 benchmark suites. The experimental results show that each newly implemented scheduling technique outperforms the others in certain application and system configurations. We measured performance gains of up to 6% compared to the fastest previously available scheduling techniques. This work establishes the importance of beyond-standard scheduling options in OpenMP for the benefit of evolving applications executing on evolving multicore architectures.
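As a rough illustration of the scheduling choice the abstract refers to, the following C sketch defers the loop schedule to run time with the standard schedule(runtime) clause, so that a standard kind such as static, dynamic, or guided can be selected through the OMP_SCHEDULE environment variable without recompiling. The function names iteration_cost and irregular_sum and the triangular workload are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not use any of the scheduling techniques the work adds.

```c
#include <assert.h>

/* Hypothetical irregular workload (an assumption for illustration):
 * iteration i costs i + 1 units, so later iterations are more expensive. */
static long iteration_cost(long i)
{
    return i + 1;
}

/* Sums the cost of n iterations of an irregular loop.
 * schedule(runtime) lets the OpenMP runtime pick the schedule from the
 * OMP_SCHEDULE environment variable, e.g. "static", "dynamic,4" or "guided".
 * Without OpenMP support the pragma is ignored and the loop runs serially. */
long irregular_sum(long n)
{
    assert(n >= 0);
    long total = 0;

    #pragma omp parallel for schedule(runtime) reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        long work = 0;
        /* Simulated work: the trip count grows with i. */
        for (long k = 0; k < iteration_cost(i); ++k) {
            work += 1;
        }
        total += work;
    }
    return total;
}
```

Built with an OpenMP-enabled compiler (for example with the -fopenmp flag), a program calling irregular_sum can be run several times under different OMP_SCHEDULE settings to compare schedules. With an even static split of this workload, the thread that receives the last block of iterations does the most work, which is the kind of imbalance that dynamic schedules are meant to reduce.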