Combinatorial Multi-Armed Bandits with Fairness Constraints: An Online Convex Optimization Perspective

Xiaosong Chen, Hanqin Zhuang, Yang Liu, Huanle Xu, Wing Cheong Lau

January 2025

Abstract

The problem of multi-armed bandit (MAB) with fairness constraints has emerged as an important research topic recently. For such problems, one common objective is to maximize the total rewards within a fixed number of pull rounds, while satisfying the fairness requirement of a minimum selection fraction for each individual arm in the long run. Previous works have made substantial advancements in designing various online selection solutions for MAB, however, when incorporating such fairness constraints, they fail to achieve a sublinear regret bound. In this paper, we study a combinatorial MAB problem with concave objective and fairness constraints. In particular, we design a new selection algorithm that solves MAB problems from an online convex optimization perspective. Our algorithm is computationally efficient, and more importantly, manages to achieve a sublinear regret bound of $\mathcal{O}(\sqrt{T \ln{T}})$ with high probability guarantees in $T$ selection rounds. We also extend our framework to include more general knapsack constraints. Finally, we assess the performance of our algorithm through extensive simulations and real dataset applications, demonstrating its significant advantages over baseline schemes.

Type

Journal article

Publication

Journal of Artificial Intelligence Research, 2025

Combinatorial Multi-Armed Bandits with Fairness Constraints: An Online Convex Optimization Perspective

Abstract

Xiaosong Chen

2022 - Current

Hanqin Zhuang

2023 - Current

Huanle Xu

2021 - Current