DeepMellow: Removing the need for a target network in deep q-learning
Conference Publication
Kim, S; Asadi, K; Littman, M; Konidaris, G
Published in: IJCAI International Joint Conference on Artificial Intelligence
January 1, 2019
Deep Q-Network (DQN) is an algorithm that achieves human-level performance in complex domains like Atari games. One of the important elements of DQN is its use of a target network, which is necessary to stabilize learning. We argue that using a target network is incompatible with online reinforcement learning, and it is possible to achieve faster and more stable learning without a target network when we use Mellowmax, an alternative softmax operator. We derive novel properties of Mellowmax, and empirically show that the combination of DQN and Mellowmax, but without a target network, outperforms DQN with a target network.
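The abstract does not spell out the operator, so the following is a minimal sketch rather than the authors' implementation. It assumes the usual definition of Mellowmax from Asadi and Littman's earlier work, mm_ω(x) = log((1/n) Σ exp(ω·x_i)) / ω, and shows how it could replace the max in a Q-learning target computed from the online network's own estimates. The function names, the `done` flag, and the hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def mellowmax(values: Sequence[float], omega: float) -> float:
    """Mellowmax of `values` with temperature `omega` (assumed nonzero).

    Computes log(mean(exp(omega * x))) / omega, shifting by the largest
    scaled value so the exponentials cannot overflow.
    """
    if omega == 0:
        raise ValueError("omega must be nonzero")
    scaled = [omega * v for v in values]
    shift = max(scaled)
    total = sum(math.exp(s - shift) for s in scaled)
    return (shift + math.log(total / len(values))) / omega


def mellow_td_target(
    reward: float,
    next_q_values: Sequence[float],
    gamma: float,
    omega: float,
    done: bool = False,
) -> float:
    """Hypothetical one-step target: r + gamma * mellowmax(Q(s', .)).

    `next_q_values` is assumed to come from the online network itself,
    i.e. no separate target network is consulted.
    """
    if done:
        return reward
    return reward + gamma * mellowmax(next_q_values, omega)


# Usage: action-value estimates for the next state, from the online network.
next_q = [1.0, 2.0, 3.0]
target = mellow_td_target(reward=0.5, next_q_values=next_q, gamma=0.99, omega=5.0)
```

For large positive ω the operator approaches the maximum of its inputs, and as ω approaches 0 it approaches their mean, so ω controls how soft the backup is.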
Published In
IJCAI International Joint Conference on Artificial Intelligence
DOI
10.24963/ijcai.2019/379
ISSN
1045-0823
Publication Date
January 1, 2019
Volume
2019-August
Start / End Page
2733 / 2739
Citation
APA
Kim, S., Asadi, K., Littman, M., & Konidaris, G. (2019). DeepMellow: Removing the need for a target network in deep q-learning. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 2733–2739). https://doi.org/10.24963/ijcai.2019/379
Chicago
Kim, S., K. Asadi, M. Littman, and G. Konidaris. “DeepMellow: Removing the need for a target network in deep q-learning.” In IJCAI International Joint Conference on Artificial Intelligence, 2019-August:2733–39, 2019. https://doi.org/10.24963/ijcai.2019/379.
ICMJE
Kim S, Asadi K, Littman M, Konidaris G. DeepMellow: Removing the need for a target network in deep q-learning. In: IJCAI International Joint Conference on Artificial Intelligence. 2019. p. 2733–9.
MLA
Kim, S., et al. “DeepMellow: Removing the need for a target network in deep q-learning.” IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, 2019, pp. 2733–39. Scopus, doi:10.24963/ijcai.2019/379.
NLM
Kim S, Asadi K, Littman M, Konidaris G. DeepMellow: Removing the need for a target network in deep q-learning. IJCAI International Joint Conference on Artificial Intelligence. 2019. p. 2733–2739.