
Policy evaluation using the Ω-return

Publication, Conference
Thomas, PS; Niekum, S; Theocharous, G; Konidaris, G
Published in: Advances in Neural Information Processing Systems
January 1, 2015

We propose the Ω-return as an alternative to the λ-return currently used by the TD(λ) family of algorithms. The benefit of the Ω-return is that it accounts for the correlation of different length returns. Because it is difficult to compute exactly, we suggest one way of approximating the Ω-return. We provide empirical studies that suggest that it is superior to the λ-return and γ-return for a variety of problems.
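To make the abstract's terms concrete, the following is a minimal Python sketch of how n-step returns are formed for one finished episode and how the λ-return combines them with fixed geometric weights. The function names (n_step_returns, lambda_return, min_variance_weights) and the toy inputs are hypothetical and do not come from the paper. The last function shows one standard way to weight correlated estimates given a covariance matrix (minimum-variance weights that sum to one). It is included only as an illustration of what "accounting for correlation" can mean, and it is an assumption rather than the approximation the authors propose.

```python
import numpy as np


def n_step_returns(rewards, values, gamma):
    """Return the n-step returns G^(1), ..., G^(T) from the start of an episode.

    rewards[k] is the reward received after step k (length T).
    values[k] is the current value estimate of the state at step k
    (length T + 1, with values[T] = 0 for the terminal state).
    """
    T = len(rewards)
    returns = []
    discounted_sum = 0.0
    for n in range(1, T + 1):
        discounted_sum += gamma ** (n - 1) * rewards[n - 1]
        # Bootstrap from the value estimate n steps ahead.
        returns.append(discounted_sum + gamma ** n * values[n])
    return np.array(returns)


def lambda_return(returns, lam):
    """Episodic lambda-return: fixed geometric weights over the n-step returns."""
    T = len(returns)
    weights = np.array(
        [(1 - lam) * lam ** (n - 1) for n in range(1, T)] + [lam ** (T - 1)]
    )
    return float(weights @ returns)


def min_variance_weights(cov):
    """Hypothetical illustration: minimum-variance weights for correlated
    estimates with covariance matrix `cov`, normalized to sum to one."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()


# Toy episode of length 3 with made-up rewards and value estimates.
returns = n_step_returns([1.0, 1.0, 1.0], [0.0, 0.5, 0.5, 0.0], gamma=1.0)
print(returns)
print(lambda_return(returns, lam=0.5))
```

With lam=0 the λ-return reduces to the one-step return, and with lam=1 it reduces to the full Monte Carlo return. In both cases the weights depend only on the return length, not on how the returns co-vary.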


Published In

Advances in Neural Information Processing Systems

ISSN

1049-5258

Publication Date

January 1, 2015

Volume

2015-January

Start / End Page

334 / 342

Related Subject Headings

  • 1702 Cognitive Sciences
  • 1701 Psychology
 

Citation

APA: Thomas, P. S., Niekum, S., Theocharous, G., & Konidaris, G. (2015). Policy evaluation using the Ω-return. In Advances in Neural Information Processing Systems (Vol. 2015-January, pp. 334–342).

Chicago: Thomas, P. S., S. Niekum, G. Theocharous, and G. Konidaris. “Policy evaluation using the Ω-return.” In Advances in Neural Information Processing Systems, 2015-January:334–42, 2015.

ICMJE: Thomas PS, Niekum S, Theocharous G, Konidaris G. Policy evaluation using the Ω-return. In: Advances in Neural Information Processing Systems. 2015. p. 334–42.

MLA: Thomas, P. S., et al. “Policy evaluation using the Ω-return.” Advances in Neural Information Processing Systems, vol. 2015-January, 2015, pp. 334–42.

NLM: Thomas PS, Niekum S, Theocharous G, Konidaris G. Policy evaluation using the Ω-return. Advances in Neural Information Processing Systems. 2015. p. 334–342.
