Cloud and Distributed Systems Lab
Paper-Conference
SMIless: Serving DAG-based Inference with Dynamic Invocations under Serverless Computing
The deployment of ML serving applications, featuring multiple inference functions on serverless platforms, has gained substantial …
Chengzhi Lu, Huanle Xu, Yudan Li, Wenyan Chen, Kejiang Ye, Chengzhong Xu
Derm: SLA-aware Resource Management for Highly Dynamic Microservices
In this paper, we present Derm, a new resource management system designed for microservice applications with highly dynamic graphs. Our …
Liao Chen, Shutian Luo, Chenyu Lin, Zizhao Mo, Huanle Xu, Kejiang Ye, Chengzhong Xu
Optimizing Dynamic Data Center Provisioning through Speed Scaling: A Primal-Dual Perspective
A significant proportion of energy consumed in modern data centers and clouds is dedicated to provisioning idle servers for maintaining …
Xiaosong Chen, Huanle Xu, Chengzhong Xu
Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
Modern GPU clusters inherently exhibit heterogeneity, encompassing various aspects such as computation and communication. This …
Zizhao Mo, Huanle Xu, Chengzhong Xu
Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach
A common strategy for improving efficiency in training deep learning entails multiplexing tasks on a single GPU. To mitigate the …
Wenyan Chen, Zizhao Mo, Huanle Xu, Kejiang Ye, Chengzhong Xu
PERT-GNN: Latency Prediction for Microservice-based Cloud-Native Applications via Graph Neural Networks
Cloud-native applications using microservice architectures are rapidly replacing traditional monolithic applications. To meet …
Da Sun Handason Tam, Yang Liu, Huanle Xu, Siyue Xie, Wing Cheong Lau
Understanding and Optimizing Workloads for Unified Resource Management in Large Cloud Platforms
To fully utilize computing resources, cloud providers such as Google and Alibaba choose to co-locate online services with batch …
Chengzhi Lu, Huanle Xu, Kejiang Ye, Guoyao Xu, Liping Zhang, Guodong Yang, Chengzhong Xu
Erms: Efficient Resource Management for Shared Microservices with SLA Guarantees
A common approach to improving resource utilization in data centers is to adaptively provision resources based on the actual workload. …
Shutian Luo, Huanle Xu, Kejiang Ye, Guoyao Xu, Liping Zhang, Jian He, Guodong Yang, Chengzhong Xu
The Power of Prediction: Microservice Auto Scaling via Workload Learning
When deploying microservices in production clusters, it is critical to automatically scale containers to improve cluster utilization …
Shutian Luo, Huanle Xu, Kejiang Ye, Guoyao Xu, Liping Zhang, Guodong Yang, Chengzhong Xu
Multi Resource Scheduling with Task Cloning in Heterogeneous Clusters
To mitigate the straggler effect, today’s systems and computing frameworks have adopted redundancy to launch extra copies for …
Huanle Xu, Yang Liu, Wing Cheong Lau