Scholars@Duke publication: On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits

On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits

Publication , Journal Article

Shahrampour, S; Noshad, M; Tarokh, V

Published in: IEEE Transactions on Signal Processing

August 15, 2017

We consider the best-arm identification problem in multi-armed bandits, which focuses purely on exploration. A player is given a fixed budget to explore a finite set of arms, and the rewards of each arm are drawn independently from a fixed, unknown distribution. The player aims to identify the arm with the largest expected reward. We propose a general framework to unify sequential elimination algorithms, where the arms are dismissed iteratively until a unique arm is left. Our analysis reveals a novel performance measure expressed in terms of the sampling mechanism and number of eliminated arms at each round. Based on this result, we develop an algorithm that divides the budget according to a nonlinear function of remaining arms at each round. We provide theoretical guarantees for the algorithm, characterizing the suitable nonlinearity for different problem environments described by the number of competitive arms. Matching the theoretical results, our experiments show that the nonlinear algorithm outperforms the state-of-the-art. We finally study the side-observation model, where pulling an arm reveals the rewards of its related arms, and we establish improved theoretical guarantees in the pure-exploration setting.

Duke Scholars

Author Vahid Tarokh Electrical and Computer Engineering

Published In

IEEE Transactions on Signal Processing

DOI

10.1109/TSP.2017.2706192

ISSN

1053-587X

Publication Date

August 15, 2017

Volume

Issue

Start / End Page

4281 / 4292

Related Subject Headings

Networking & Telecommunications

Citation

APA

Chicago

ICMJE

MLA

NLM

Shahrampour, S., Noshad, M., & Tarokh, V. (2017). On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits. IEEE Transactions on Signal Processing, 65(16), 4281–4292. https://doi.org/10.1109/TSP.2017.2706192

Shahrampour, S., M. Noshad, and V. Tarokh. “On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits.” IEEE Transactions on Signal Processing 65, no. 16 (August 15, 2017): 4281–92. https://doi.org/10.1109/TSP.2017.2706192.

Shahrampour S, Noshad M, Tarokh V. On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits. IEEE Transactions on Signal Processing. 2017 Aug 15;65(16):4281–92.

Shahrampour, S., et al. “On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits.” IEEE Transactions on Signal Processing, vol. 65, no. 16, Aug. 2017, pp. 4281–92. Scopus, doi:10.1109/TSP.2017.2706192.

Shahrampour S, Noshad M, Tarokh V. On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits. IEEE Transactions on Signal Processing. 2017 Aug 15;65(16):4281–4292.

Published In

IEEE Transactions on Signal Processing

DOI

10.1109/TSP.2017.2706192

ISSN

1053-587X

Publication Date

August 15, 2017

Volume

Issue

Start / End Page

4281 / 4292

Related Subject Headings

Networking & Telecommunications