DeepMellow: Removing the need for a target network in deep q-learning

Publication, Conference

Kim, S; Asadi, K; Littman, M; Konidaris, G

Published in: IJCAI International Joint Conference on Artificial Intelligence

January 1, 2019

Deep Q-Network (DQN) is an algorithm that achieves human-level performance in complex domains like Atari games. One of the important elements of DQN is its use of a target network, which is necessary to stabilize learning. We argue that using a target network is incompatible with online reinforcement learning, and it is possible to achieve faster and more stable learning without a target network when we use Mellowmax, an alternative softmax operator. We derive novel properties of Mellowmax, and empirically show that the combination of DQN and Mellowmax, but without a target network, outperforms DQN with a target network.
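The abstract names Mellowmax but does not define it. As a minimal sketch, the operator is commonly written as mm_ω(x) = log((1/n) Σ exp(ω x_i)) / ω, where ω is a temperature parameter: large ω approaches the maximum and ω near zero approaches the mean. The function names, the value of ω, and the way the operator is substituted for the max in a one-step bootstrap target are illustrative assumptions here, not the authors' implementation.

```python
import math


def mellowmax(values, omega):
    """Mellowmax: log(mean(exp(omega * x))) / omega.

    Computed with the log-sum-exp shift for numerical stability.
    `omega` is a temperature parameter (its value is an assumption here).
    """
    if omega == 0:
        raise ValueError("omega must be non-zero")
    scaled = [omega * v for v in values]
    shift = max(scaled)  # subtract the largest exponent to avoid overflow
    total = sum(math.exp(s - shift) for s in scaled)
    return (shift + math.log(total / len(values))) / omega


def mellowmax_target(reward, next_q_values, gamma, omega, done):
    """Hypothetical one-step target that uses Mellowmax in place of max.

    `next_q_values` would come from the online network itself, since the
    sketch assumes no separate target network is kept.
    """
    if done:
        return reward
    return reward + gamma * mellowmax(next_q_values, omega)


if __name__ == "__main__":
    q_next = [1.0, 2.0, 3.0]
    for omega in (0.01, 1.0, 100.0):
        print(omega, mellowmax(q_next, omega))
```

Running the script prints the Mellowmax of the same Q-values at three temperatures, which shows the operator moving from roughly the mean toward the maximum as ω grows.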

Published In

IJCAI International Joint Conference on Artificial Intelligence

DOI

10.24963/ijcai.2019/379

ISSN

1045-0823

Publication Date

January 1, 2019

Volume

2019-August

Start / End Page

2733 / 2739

Citation

APA

Kim, S., Asadi, K., Littman, M., & Konidaris, G. (2019). DeepMellow: Removing the need for a target network in deep q-learning. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 2733–2739). https://doi.org/10.24963/ijcai.2019/379

Chicago

Kim, S., K. Asadi, M. Littman, and G. Konidaris. “DeepMellow: Removing the need for a target network in deep q-learning.” In IJCAI International Joint Conference on Artificial Intelligence, 2019-August:2733–39, 2019. https://doi.org/10.24963/ijcai.2019/379.

ICMJE

Kim S, Asadi K, Littman M, Konidaris G. DeepMellow: Removing the need for a target network in deep q-learning. In: IJCAI International Joint Conference on Artificial Intelligence. 2019. p. 2733–9.

MLA

Kim, S., et al. “DeepMellow: Removing the need for a target network in deep q-learning.” IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, 2019, pp. 2733–39. Scopus, doi:10.24963/ijcai.2019/379.

NLM

Kim S, Asadi K, Littman M, Konidaris G. DeepMellow: Removing the need for a target network in deep q-learning. IJCAI International Joint Conference on Artificial Intelligence. 2019. p. 2733–2739.
