A verification theorem for indexability of real state restless bandits
The multi-armed restless bandit problem provides a powerful modelling framework for dynamic priority allocation among multiple stochastic projects vying for a common resource. Although the problem is in general computationally intractable, suffering from the curse of dimensionality, Whittle (1988) proposed a tractable heuristic index policy, whose empirical performance has been typically found to be nearly optimal. Yet a fundamental roadblock for application of Whittle's heuristic is the need to establish indexability (existence of the index) and to efficiently compute the index. In this talk I will present recent results of my research giving sufficient conditions for indexability of general real-state discrete-time restless bandit projects. The main contribution is a verification theorem establishing that, if performance metrics and an explicitly defined marginal productivity (MP) index satisfy three conditions, then the project is indexable with its Whittle index being given by the MP index, in a form implying optimality of threshold policies for dynamic project control. The proof is based on partial conservation laws and infinite-dimensional linear programming duality. Further contributions include characterisations of the index as a Radon-Nikodym derivative and as a shadow price, and a recursive index-computing scheme.