Grad: Intelligent Microservice Scaling by Harnessing Resource Fungibility

Abstract

This paper introduces Grad, an intelligent microservice scaling framework that harnesses resource fungibility between critical and non-critical microservices. To address the challenges posed by the dynamic nature of resource fungibility during scaling, Grad incorporates three key components. First, Grad employs a modular learning approach to profile the latency of each individual microservice in relation to environmental conditions. Second, using gradients extracted from this profile, Grad applies a scalable optimization module that dynamically selects the optimal set of microservices to scale. Third, to rapidly mitigate SLA violations, Grad deploys an accurate end-to-end latency predictor that serves as a simulator providing real-time feedback. We evaluate Grad in our cluster using real microservice benchmarks and production traces, demonstrating that it reduces resource usage by 49.1% and lowers the probability of SLA violations by 3.7$\times$ compared to state-of-the-art solutions.
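
To make the idea of gradient-guided scaling concrete, the following is a minimal toy sketch, not Grad's actual algorithm. It assumes each microservice has a known latency curve over its CPU allocation, that end-to-end latency is simply the sum of per-service latencies, and that a greedy loop is an acceptable stand-in for the paper's optimization module. The service names, curves, and function names (`PROFILES`, `gradient`, `end_to_end`, `scale_until_sla`) are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical latency profiles: latency (ms) as a function of CPU cores.
# Grad learns such profiles per microservice; these are fixed toy curves.
PROFILES: Dict[str, Callable[[float], float]] = {
    "frontend": lambda cpu: 5.0 + 40.0 / cpu,
    "search": lambda cpu: 10.0 + 120.0 / cpu,
    "cache": lambda cpu: 2.0 + 8.0 / cpu,
}


def gradient(profile: Callable[[float], float], cpu: float, eps: float = 1e-3) -> float:
    """Central-difference estimate of d(latency)/d(cpu)."""
    return (profile(cpu + eps) - profile(cpu - eps)) / (2 * eps)


def end_to_end(alloc: Dict[str, float]) -> float:
    """Toy end-to-end latency: services run sequentially, so latencies add."""
    return sum(PROFILES[name](cpu) for name, cpu in alloc.items())


def scale_until_sla(
    alloc: Dict[str, float], sla_ms: float, step: float = 0.5, max_steps: int = 1000
) -> Dict[str, float]:
    """Greedily add CPU to the service with the steepest latency drop."""
    alloc = dict(alloc)  # do not mutate the caller's allocation
    for _ in range(max_steps):
        if end_to_end(alloc) <= sla_ms:
            return alloc
        # Most negative gradient = largest latency reduction per extra core.
        target = min(alloc, key=lambda name: gradient(PROFILES[name], alloc[name]))
        alloc[target] += step
    raise RuntimeError("SLA not reachable within max_steps")


if __name__ == "__main__":
    start = {"frontend": 1.0, "search": 1.0, "cache": 1.0}
    print(scale_until_sla(start, sla_ms=100.0))
```

In this toy setup the loop gives extra cores to `search` and `frontend`, whose latency curves are steepest, and leaves `cache` untouched, which illustrates why gradients help decide where additional resources pay off.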

Publication
In The International Symposium on High-Performance Computer Architecture (HPCA) 2025
Liao Chen
2022 - Current

My research interests include distributed systems and cloud computing.

Chenyu Lin
2022-2024 Master Student
Shutian Luo
2021-2023 PhD Student
Huanle Xu
2021.01 - Current

I am currently an assistant professor in the Department of Computer and Information Science, University of Macau.