Robust Exploration with an Adversary via Langevin Monte Carlo
For Deep Q-Networks (DQNs), many exploration strategies have proven effective in controlled environments, yet they struggle in real-world settings subject to unpredictable disturbances, and how to explore efficiently under such disturbances remains under-investigated. To address these challenges, this work introduces a versatile reinforcement learning (RL) framework that jointly treats exploration and robustness in dynamic, unpredictable environments. Specifically, we propose a robust RL method framed as a two-player max-min adversarial game and cast as a Probabilistic Action Robust Markov Decision Process (MDP), motivated by a cyber-physical perspective. Our method leverages Langevin Monte Carlo (LMC) for Q-function exploration, with iterative updates that allow both the protagonist and the adversary to explore efficiently. We further extend this adversarial training scheme to provide robustness against episodes with delayed feedback. Empirical evaluation on benchmark problems, including N-Chain and deep brain stimulation, shows that our method consistently outperforms baseline approaches across diverse perturbation scenarios and under delayed feedback.
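For concreteness, the two ingredients named above admit standard formulations; the following is a sketch in common notation rather than the paper's own, where $\alpha$ is the perturbation probability, $\pi$ and $\bar{\pi}$ are the protagonist's and adversary's policies, $\theta_k$ are the Q-network parameters, $\eta_k$ a step size, $\beta$ an inverse temperature, $L$ a temporal-difference loss, and $\xi_k$ standard Gaussian noise (all introduced here for illustration). The probabilistic action-robust objective mixes the two policies and pits them against each other, while the LMC exploration step is a noisy, SGLD-style gradient update:
\[
\pi^{\mathrm{mix}}_{\alpha}(a \mid s) = (1-\alpha)\,\pi(a \mid s) + \alpha\,\bar{\pi}(a \mid s),
\qquad
\max_{\pi}\,\min_{\bar{\pi}}\;
\mathbb{E}_{\pi^{\mathrm{mix}}_{\alpha}}\!\Big[\textstyle\sum_{t \ge 0}\gamma^{t} r_t\Big],
\]
\[
\theta_{k+1} = \theta_k - \eta_k \nabla_{\theta} L(\theta_k) + \sqrt{2\eta_k \beta^{-1}}\,\xi_k,
\qquad \xi_k \sim \mathcal{N}(0, I).
\]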