Cloud and Distributed Systems Lab
Kejiang Ye
Latest
High Throughput and Low Latency LLM Serving via Adaptive KV Caching
Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters
SMIless: Serving DAG-based Inference with Dynamic Invocations under Serverless Computing
Derm: SLA-aware Resource Management for Highly Dynamic Microservices
Optimizing Resource Management for Shared Microservices: A Scalable System Design
Interference-aware Multiplexing for Deep Learning in GPU Clusters: A Middleware Approach
Erms: Efficient Resource Management for Shared Microservices with SLA Guarantees
The Power of Prediction: Microservice Auto Scaling via Workload Learning
An In-Depth Study of Microservice Call Graph and Runtime Performance
Characterizing Microservice Dependency and Performance: Alibaba Trace Analysis