Cremes: Cost-Efficient and Reliable Microservice Execution on Spot Instances

Liao Chen, Chenyu Lin, Junlin Chen, Shutian Luo, Huanle Xu, Chengzhong Xu

April 2026

Abstract

Spot instances offer a cost-effective alternative to on-demand cloud resources, but preemption risk and unpredictable provisioning delays make them unreliable for latency-sensitive microservices. Conventional resource management systems often assume that instances are available immediately and fail to account for these operational realities, which increases the risk of SLO violations in spot-based environments. In this paper, we propose Cremes, an adaptive and cost-efficient scaling framework that ensures microservice recovery within the spot instance grace period. Cremes explicitly models both instance waiting time and microservice startup latency, leverages cloud-exposed availability metrics, and applies lightweight machine learning for end-to-end latency prediction. By integrating these components into a multi-dimensional optimization engine, Cremes minimizes cost while satisfying recovery and performance constraints. Evaluations on AWS instances using DeathStarBench, TrainTicket, and Alibaba trace-driven experiments show that Cremes reduces infrastructure cost by up to 37.1% and keeps the SLO violation rate below 6.7% in preemptible environments.
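The recovery constraint described in the abstract can be illustrated with a minimal sketch. This is not the Cremes implementation: the `SpotOption` type, its fields, the helper names, and all numbers are hypothetical, and the real system uses a multi-dimensional optimization engine and learned latency predictions rather than this simple filter. The sketch only shows the idea that instance waiting time plus microservice startup latency must fit inside the grace period, and that cost is minimized among the options that satisfy it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SpotOption:
    """Hypothetical candidate instance type; every field is illustrative."""
    name: str
    hourly_cost: float       # USD per hour (made-up values in the example)
    expected_wait_s: float   # assumed estimate of provisioning wait time
    startup_s: float         # assumed estimate of microservice startup latency


def recovery_time_s(option: SpotOption) -> float:
    """Time to recover on this option: wait for the instance, then start the service."""
    return option.expected_wait_s + option.startup_s


def cheapest_feasible(
    options: Sequence[SpotOption], grace_period_s: float
) -> Optional[SpotOption]:
    """Return the cheapest option whose recovery time fits in the grace period.

    Returns None when no option can recover in time.
    """
    feasible = [o for o in options if recovery_time_s(o) <= grace_period_s]
    return min(feasible, key=lambda o: o.hourly_cost, default=None)


if __name__ == "__main__":
    candidates = [
        SpotOption("cheap-but-slow", 0.03, expected_wait_s=100.0, startup_s=40.0),
        SpotOption("balanced", 0.05, expected_wait_s=30.0, startup_s=40.0),
        SpotOption("fast-but-pricey", 0.08, expected_wait_s=10.0, startup_s=20.0),
    ]
    # 120 s is used here as an assumed grace period for illustration.
    choice = cheapest_feasible(candidates, grace_period_s=120.0)
    print(choice.name if choice else "no feasible option")
```

In this toy setup the cheapest option is rejected because its combined waiting and startup time exceeds the assumed grace period, so the next cheapest option is chosen instead.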

Type

Conference paper

Publication

In The 35th International Symposium on High-Performance Parallel and Distributed Computing

Liao Chen

2022 - Current

Chenyu Lin

2022-2024 Master Student

Junlin Chen

2025 - Current

Shutian Luo

2021-2023 PhD Student

Huanle Xu

2021 - Current

Chengzhong Xu

2019 - Current