Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach

Abstract

A common strategy for improving the efficiency of deep learning training is to multiplex multiple tasks on a single GPU. To mitigate the interference caused by multiplexing, existing approaches primarily employ kernel-level solutions that regulate GPU kernel execution, or harness hardware-level techniques that explicitly partition GPU streaming multiprocessors and memory. Nevertheless, none of them perform satisfactorily in optimizing task completion time. In this paper, we present IADeep, a middleware solution designed to significantly improve multiplexing efficiency. The core idea is the co-optimization of task assignments across the cluster and interference mitigation on each device. IADeep coordinates the configurations of all co-located tasks in a less fine-grained fashion, effectively reducing interference and improving task training performance. Across the entire cluster, IADeep intelligently selects applications suitable for multiplexing to further amplify the benefits of optimizing task configurations. Evaluations on a cluster of 20 RTX 3090 GPUs demonstrate that IADeep significantly outperforms state-of-the-art multiplexing solutions.

Publication
In International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) 2023
Wenyan Chen
2021 - Current

My research interests include resource management and task scheduling in GPU clusters.

Zizhao Mo
2021 - Current
Huanle Xu
2021.01 - Current

I am currently an assistant professor in the Department of Computer and Information Science, University of Macau.