Cremes: Cost-Efficient and Reliable Microservice Execution on Spot Instances

Abstract

While spot instances offer a cost-effective alternative to on-demand cloud resources, they introduce reliability challenges for latency-sensitive microservices due to preemption risks and unpredictable provisioning delays. Conventional resource management systems often assume that instances are available immediately and therefore fail to account for these operational realities, which increases the risk of SLO violations in spot-based environments. In this paper, we propose Cremes, an adaptive and cost-efficient scaling framework that ensures microservice recovery within the spot instance grace period. Cremes explicitly models both instance waiting time and microservice startup latency, leverages cloud-exposed availability metrics, and applies lightweight machine learning for end-to-end latency prediction. By integrating these components into a multi-dimensional optimization engine, Cremes minimizes cost while satisfying recovery and performance constraints. Evaluations on AWS instances using DeathStarBench, TrainTicket, and Alibaba trace-driven experiments show that Cremes reduces infrastructure cost by up to 37.1% and keeps SLO violation rates below 6.7% in preemptible environments.
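The recovery constraint described in the abstract can be illustrated with a minimal sketch. This is not the Cremes optimization engine. It only shows the general idea of picking the cheapest option whose waiting time plus startup latency fits inside the grace period and whose predicted latency meets the SLO. The `Candidate` type, the `pick_cheapest` helper, and all numbers below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical instance option; every field value is illustrative."""
    name: str
    hourly_cost: float           # USD per hour
    wait_s: float                # expected provisioning wait, in seconds
    startup_s: float             # microservice startup latency, in seconds
    predicted_latency_ms: float  # predicted end-to-end latency


def pick_cheapest(
    candidates: Sequence[Candidate], grace_s: float, slo_ms: float
) -> Optional[Candidate]:
    """Return the cheapest candidate that recovers in time and meets the SLO."""
    feasible = [
        c for c in candidates
        if c.wait_s + c.startup_s <= grace_s      # recovery fits in the grace period
        and c.predicted_latency_ms <= slo_ms      # performance constraint holds
    ]
    # min() with default=None returns None when nothing is feasible.
    return min(feasible, key=lambda c: c.hourly_cost, default=None)


# Assumed example data, not measurements from the paper.
options = [
    Candidate("small-spot", 0.03, 90.0, 45.0, 180.0),     # 135 s recovery, too slow
    Candidate("medium-spot", 0.05, 40.0, 30.0, 150.0),    # feasible
    Candidate("large-ondemand", 0.20, 5.0, 30.0, 120.0),  # feasible but pricier
]

best = pick_cheapest(options, grace_s=120.0, slo_ms=200.0)
print(best.name if best else "no feasible option")
```

With a 120-second grace period, the cheapest option is rejected because its combined waiting and startup time exceeds the window, so the next cheapest feasible option is selected.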

Publication
In The 35th International Symposium on High-Performance Parallel and Distributed Computing
Liao Chen
2022 - Current

My research interests include distributed systems and cloud computing.

Chenyu Lin
2022-2024 Master Student
Junlin Chen
2025 - Current

Shutian Luo
2021-2023 PhD Student
Huanle Xu
2021 - Current

I am currently an assistant professor in the Department of Computer and Information Science, University of Macau.

Chengzhong Xu
2019 - Current

I am currently a Chair Professor in the Department of Computer and Information Science and serve as the Dean of the Faculty of Science and Technology at the University of Macau.