In this paper, we take an alternative approach that exploits load imbalance. The central idea is to shift load dynamically across microservice containers in an imbalance-aware manner. However, integrating load shifting seamlessly with resource scaling, while accommodating partial connections between upstream and downstream containers, remains a challenge. To address this challenge, we introduce Imbres, a new microservice system that jointly optimizes load shifting, connection management, and resource scaling. A significant advantage of Imbres is its rapid responsiveness: it relies solely on online latency gradients and therefore requires no offline profiling. Evaluation on real microservice benchmarks shows that Imbres reduces resource allocation by up to 62% and decreases the SLA violation probability by up to 82% compared to state-of-the-art systems.
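
To give a rough intuition for how online latency gradients could drive load shifting, the following Python sketch estimates each container's latency gradient from its two most recent (load, latency) observations and moves a small amount of load from the container with the steepest gradient to the one with the flattest. It is an illustration under stated assumptions, not the Imbres algorithm: the `Container` fields, the finite-difference estimator, the fixed `step` size, and the function name `shift_load` are all hypothetical, and connection management and resource scaling are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Container:
    """Hypothetical per-container state; field names are illustrative only."""
    name: str
    load: float          # current request rate routed to this container
    prev_load: float     # request rate at the previous observation
    latency: float       # latency measured at `load`
    prev_latency: float  # latency measured at `prev_load`

    def latency_gradient(self) -> float:
        """Finite-difference estimate of d(latency)/d(load) from two online samples."""
        delta_load = self.load - self.prev_load
        if delta_load == 0:
            return 0.0  # no load change observed; assume a flat gradient
        return (self.latency - self.prev_latency) / delta_load


def shift_load(containers: List[Container], step: float) -> Dict[str, float]:
    """Move up to `step` load from the steepest-gradient container to the flattest.

    Returns the new load per container; the total load is conserved.
    """
    new_loads = {c.name: c.load for c in containers}
    if len(containers) < 2:
        return new_loads
    steepest = max(containers, key=lambda c: c.latency_gradient())
    flattest = min(containers, key=lambda c: c.latency_gradient())
    if steepest.latency_gradient() <= flattest.latency_gradient():
        return new_loads  # all gradients equal: shifting would gain nothing
    moved = min(step, steepest.load)  # never shift more load than the source has
    new_loads[steepest.name] -= moved
    new_loads[flattest.name] += moved
    return new_loads
```

For example, with container `a` at load 100 and an estimated gradient of 1.0, and container `b` at load 50 and a gradient of 0.1, calling `shift_load` with `step=5.0` moves 5 units of load from `a` to `b`. Shifting toward the flatter gradient is chosen here because, to first order, the latency added at the receiving container is smaller than the latency removed at the source.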