Cloud and Distributed Systems Lab
Paper-Conference
Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism
The significant resource demands in LLM serving prompt production clusters to fully utilize heterogeneous hardware by partitioning LLM …
Zizhao Mo, Jianxiong Liao, Huanle Xu, Zhi Zhou, Chengzhong Xu
Embracing Imbalance: Dynamic Load Shifting among Microservice Containers in Shared Clusters
In this paper, we utilize an alternative approach by leveraging load imbalance. The central concept involves the dynamic load shifting …
Shutian Luo, Jianxiong Liao, Chenyu Lin, Huanle Xu, Zhi Zhou, Chengzhong Xu
Fast and Fair Training for Deep Learning in Heterogeneous GPU Clusters
This paper presents FFT, a novel scheduling system designed for Fast and Fair deep learning Training in heterogeneous GPU clusters. …
Zizhao Mo, Huanle Xu, Wing Cheong Lau
Grad: Intelligent Microservice Scaling by Harnessing Resource Fungibility
This paper introduces Grad, an intelligent microservice scaling framework that harnesses resource fungibility between critical and …
Liao Chen, Chenyu Lin, Shutian Luo, Huanle Xu, Chengzhong Xu
Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters
In this paper, we introduce Mudi, a new SLO-aware system designed to optimize the utilization of GPU resources within large-scale …
Wenyan Chen, Chengzhi Lu, Huanle Xu, Kejiang Ye, Chengzhong Xu
Optimal Resource Efficiency with Fairness in Heterogeneous GPU Clusters
Ensuring the highest training throughput to maximize resource efficiency, while maintaining fairness among users, is critical for deep …
Zizhao Mo, Huanle Xu, Wing Cheong Lau
SMIless: Serving DAG-based Inference with Dynamic Invocations under Serverless Computing
The deployment of ML serving applications, featuring multiple inference functions on serverless platforms, has gained substantial …
Chengzhi Lu, Huanle Xu, Yudan Li, Wenyan Chen, Kejiang Ye, Chengzhong Xu
Derm: SLA-aware Resource Management for Highly Dynamic Microservices
In this paper, we present Derm, a new resource management system designed for microservice applications with highly dynamic graphs. Our …
Liao Chen, Shutian Luo, Chenyu Lin, Zizhao Mo, Huanle Xu, Kejiang Ye, Chengzhong Xu
Optimizing Dynamic Data Center Provisioning through Speed Scaling: A Primal-Dual Perspective
A significant proportion of energy consumed in modern data centers and clouds is dedicated to provisioning idle servers for maintaining …
Xiaosong Chen, Huanle Xu, Chengzhong Xu
Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
Modern GPU clusters inherently exhibit heterogeneity, encompassing various aspects such as computation and communication. This …
Zizhao Mo, Huanle Xu, Chengzhong Xu