Optimal sequential exploration: Bandits, clairvoyants, and wildcats

Publication, Journal Article

Brown, DB; Smith, JE

Published in: Operations Research

May 1, 2013

This paper was motivated by the problem of developing an optimal policy for exploring an oil and gas field in the North Sea. Where should we drill first? Where do we drill next? In this and many other problems, we face a trade-off between earning (e.g., drilling immediately at the sites with maximal expected values) and learning (e.g., drilling at sites that provide valuable information) that may lead to greater earnings in the future. These "sequential exploration problems" resemble a multiarmed bandit problem, but probabilistic dependence plays a key role: outcomes at drilled sites reveal information about neighboring targets. Good exploration policies will take advantage of this information as it is revealed. We develop heuristic policies for sequential exploration problems and complement these heuristics with upper bounds on the performance of an optimal policy. We begin by grouping the targets into clusters of manageable size. The heuristics are derived from a model that treats these clusters as independent. The upper bounds are given by assuming each cluster has perfect information about the results from all other clusters. The analysis relies heavily on results for bandit superprocesses, a generalization of the multiarmed bandit problem. We evaluate the heuristics and bounds using Monte Carlo simulation and, in the North Sea example, we find that the heuristic policies are nearly optimal. ©2013 INFORMS.
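The idea of bounding a policy's performance with perfect information can be illustrated on a toy problem. The sketch below is not the authors' model or their cluster-based bound: it assumes a hypothetical field of exchangeable sites whose success probabilities depend on one shared, unobserved "good geology" state, a simple myopic drilling rule with Bayesian updating, and a clairvoyant who sees every outcome in advance. All names and parameter values (`simulate`, `p_good`, `q_good`, `q_bad`, `payoff`, `cost`) are illustrative assumptions.

```python
import random


def simulate(n_sites=5, p_good=0.4, q_good=0.7, q_bad=0.1,
             payoff=10.0, cost=3.0, n_paths=20000, seed=0):
    """Monte Carlo estimate of (myopic heuristic value, clairvoyant bound).

    Toy assumptions: a hidden state is 'good' with probability p_good;
    each site succeeds with probability q_good if good, q_bad otherwise.
    Drilling a site costs `cost` and pays `payoff` on success.
    """
    rng = random.Random(seed)
    heuristic_total = 0.0
    clairvoyant_total = 0.0

    for _ in range(n_paths):
        # Sample one scenario: hidden state, then dependent site outcomes.
        good = rng.random() < p_good
        q = q_good if good else q_bad
        outcomes = [rng.random() < q for _ in range(n_sites)]

        # Clairvoyant: knows all outcomes, so drills only profitable sites.
        clairvoyant_total += sum(max(0.0, payoff * x - cost) for x in outcomes)

        # Myopic heuristic: drill while the next site's expected value is
        # positive, updating the belief in the good state after each result.
        belief = p_good
        for x in outcomes:
            p_success = belief * q_good + (1.0 - belief) * q_bad
            if payoff * p_success - cost <= 0.0:
                break  # stop exploring; remaining sites look unprofitable
            heuristic_total += payoff * x - cost
            # Bayes' rule for the hidden state given this site's outcome.
            like_good = q_good if x else 1.0 - q_good
            like_bad = q_bad if x else 1.0 - q_bad
            belief = (belief * like_good
                      / (belief * like_good + (1.0 - belief) * like_bad))

    return heuristic_total / n_paths, clairvoyant_total / n_paths


if __name__ == "__main__":
    heuristic_value, upper_bound = simulate()
    print(f"myopic heuristic value: {heuristic_value:.3f}")
    print(f"clairvoyant upper bound: {upper_bound:.3f}")
```

On every simulated scenario the clairvoyant earns at least as much as the heuristic, so the gap between the two averages bounds how far the heuristic can be from optimal in this toy setting. A full-information bound like this one is typically loose; the abstract describes a tighter construction in which each cluster has perfect information only about the results from other clusters.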

Duke Scholars

Author: David B. Brown, Fuqua School of Business

Published In

Operations Research

DOI

10.1287/opre.2013.1164

EISSN

1526-5463

ISSN

0030-364X

Publication Date

May 1, 2013

Volume

61

Issue

3

Start / End Page

644 / 665

Related Subject Headings

Operations Research
3507 Strategy, management and organisational behaviour
1503 Business and Management
0802 Computation Theory and Mathematics
0102 Applied Mathematics

Citation

APA

Brown, D. B., & Smith, J. E. (2013). Optimal sequential exploration: Bandits, clairvoyants, and wildcats. Operations Research, 61(3), 644–665. https://doi.org/10.1287/opre.2013.1164

Chicago

Brown, D. B., and J. E. Smith. “Optimal sequential exploration: Bandits, clairvoyants, and wildcats.” Operations Research 61, no. 3 (May 1, 2013): 644–65. https://doi.org/10.1287/opre.2013.1164.

ICMJE

Brown DB, Smith JE. Optimal sequential exploration: Bandits, clairvoyants, and wildcats. Operations Research. 2013 May 1;61(3):644–65.

MLA

Brown, D. B., and J. E. Smith. “Optimal sequential exploration: Bandits, clairvoyants, and wildcats.” Operations Research, vol. 61, no. 3, May 2013, pp. 644–65. Scopus, doi:10.1287/opre.2013.1164.

NLM

Brown DB, Smith JE. Optimal sequential exploration: Bandits, clairvoyants, and wildcats. Operations Research. 2013 May 1;61(3):644–665.
