Bandits for BMO Functions
Conference publication
Wang, T; Rudin, C
Published in: 37th International Conference on Machine Learning, ICML 2020
January 1, 2020
We study the bandit problem in which the underlying expected reward is a Bounded Mean Oscillation (BMO) function. BMO functions may be discontinuous and unbounded, and are useful for modeling signals with infinities in the domain. We develop a toolset for BMO bandits and provide an algorithm that achieves poly-log δ-regret, a regret measured against an arm that is optimal after removing a δ-sized portion of the arm space.
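For readers unfamiliar with the terms in the abstract, the sketch below writes out the standard BMO seminorm and one plausible way to formalize δ-regret. The notation (cubes Q, the average ⟨f⟩_Q, the measure μ, the threshold f^δ, and the played arm x_t) is assumed here for illustration and is not taken from this page; the paper's own definitions may differ in detail.

```latex
% Standard BMO seminorm: the supremum, over cubes Q in the domain,
% of the mean absolute deviation of f from its average on Q.
\[
  \|f\|_{\mathrm{BMO}}
  = \sup_{Q} \frac{1}{|Q|} \int_{Q} \bigl| f(x) - \langle f \rangle_{Q} \bigr| \, dx,
  \qquad
  \langle f \rangle_{Q} = \frac{1}{|Q|} \int_{Q} f(x) \, dx .
\]

% Classic example: f(x) = -\log|x| is unbounded near x = 0,
% yet its BMO seminorm is finite.

% Assumed formalization of delta-regret: compare against the best value
% that remains after discarding a set of arms of measure at most delta.
\[
  f^{\delta} = \inf \bigl\{ a \in \mathbb{R} : \mu(\{ x : f(x) > a \}) \le \delta \bigr\},
  \qquad
  r_t^{\delta} = \max \bigl\{ 0,\; f^{\delta} - f(x_t) \bigr\}.
\]
```

Under this reading, an arm whose expected reward is infinite on a small region does not make the regret infinite, because the comparison value f^δ ignores the top δ-sized portion of the arm space.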
Published In
37th International Conference on Machine Learning, ICML 2020
Publication Date
January 1, 2020
Volume
PartF168147-13
Start / End Page
9938 / 9948
Citation
APA: Wang, T., & Rudin, C. (2020). Bandits for BMO functions. In 37th International Conference on Machine Learning, ICML 2020 (Vol. PartF168147-13, pp. 9938–9948).
Chicago: Wang, T., and C. Rudin. “Bandits for BMO Functions.” In 37th International Conference on Machine Learning, ICML 2020, PartF168147-13:9938–48, 2020.
ICMJE: Wang T, Rudin C. Bandits for BMO functions. In: 37th International Conference on Machine Learning, ICML 2020. 2020. p. 9938–48.
MLA: Wang, T., and C. Rudin. “Bandits for BMO Functions.” 37th International Conference on Machine Learning, ICML 2020, vol. PartF168147-13, 2020, pp. 9938–48.
NLM: Wang T, Rudin C. Bandits for BMO functions. 37th International Conference on Machine Learning, ICML 2020. 2020. p. 9938–9948.