Steering Decision Transformers via Temporal Difference Learning

Publication, Conference
Hsu, HL; Bozkurt, AK; Dong, J; Gao, Q; Tarokh, V; Pajic, M
Published in: IEEE International Conference on Intelligent Robots and Systems
January 1, 2024

Decision Transformers (DTs) have been highly effective for offline reinforcement learning (RL) tasks, successfully modeling the sequences of actions in a given set of demonstrations. However, DTs may perform poorly in stochastic environments, which are prevalent in robotics scenarios. In this paper, we identify that the root cause of this performance degradation is the growing variance of returns-to-go, the signal used by DTs to predict actions, accumulated over the horizon. Building upon this insight, we propose an extension to DTs that allows them to be steered toward high-reward regions, where the expected returns are estimated using temporal difference learning. This way, we not only mitigate the growing variance problem but also eliminate the need for DTs to have access to returns-to-go during evaluation and deployment phases. We show that our method outperforms state-of-the-art offline RL methods in both simulated and real-world robotic arm environments.
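
The variance argument in the abstract can be illustrated with a toy sketch. The code below is not the authors' method or environment: it assumes a hypothetical undiscounted task in which every step yields an independent Gaussian reward, and all names in it (`simulate_rewards`, `returns_to_go`, `td0_values`) are made up for illustration. Under that assumption, the sampled return-to-go at step t sums the remaining noisy rewards, so its variance grows with the remaining horizon, whereas a tabular TD(0) update learns a single expected return per step.

```python
import numpy as np


def simulate_rewards(num_episodes, horizon, mean=1.0, std=1.0, seed=0):
    """Draw i.i.d. Gaussian per-step rewards (a toy stochastic environment)."""
    rng = np.random.default_rng(seed)
    return rng.normal(mean, std, size=(num_episodes, horizon))


def returns_to_go(rewards):
    """Undiscounted return-to-go: sum of rewards from step t to the end."""
    return np.cumsum(rewards[:, ::-1], axis=1)[:, ::-1]


def td0_values(rewards, lr=0.05, epochs=5):
    """Tabular TD(0) over time-indexed states; values[horizon] is terminal."""
    _, horizon = rewards.shape
    values = np.zeros(horizon + 1)
    for _ in range(epochs):
        for episode in rewards:
            for t in range(horizon):
                # Bootstrap from the next step's estimate instead of
                # summing the whole noisy tail of the episode.
                td_target = episode[t] + values[t + 1]
                values[t] += lr * (td_target - values[t])
    return values


rewards = simulate_rewards(num_episodes=2000, horizon=20)
rtg = returns_to_go(rewards)

# Per-step variance of the sampled returns-to-go across episodes.
rtg_variance = rtg.var(axis=0)

# TD(0) estimate of the expected return from each step.
values = td0_values(rewards)

print("return-to-go variance at first step:", rtg_variance[0])
print("return-to-go variance at last step: ", rtg_variance[-1])
print("TD(0) expected return at first step:", values[0])
```

In this toy setting the return-to-go variance at the first step is roughly the horizon times the per-step reward variance, while the TD(0) estimate settles near the expected return. How the paper uses such estimates to steer the transformer is not shown here.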

Published In

IEEE International Conference on Intelligent Robots and Systems

DOI

10.1109/IROS58592.2024.10801303

EISSN

2153-0866

ISSN

2153-0858

Publication Date

January 1, 2024

Start / End Page

7477 / 7483
 

Citation

APA: Hsu, H. L., Bozkurt, A. K., Dong, J., Gao, Q., Tarokh, V., & Pajic, M. (2024). Steering Decision Transformers via Temporal Difference Learning. In IEEE International Conference on Intelligent Robots and Systems (pp. 7477–7483). https://doi.org/10.1109/IROS58592.2024.10801303

Chicago: Hsu, H. L., A. K. Bozkurt, J. Dong, Q. Gao, V. Tarokh, and M. Pajic. “Steering Decision Transformers via Temporal Difference Learning.” In IEEE International Conference on Intelligent Robots and Systems, 7477–83, 2024. https://doi.org/10.1109/IROS58592.2024.10801303.

ICMJE: Hsu HL, Bozkurt AK, Dong J, Gao Q, Tarokh V, Pajic M. Steering Decision Transformers via Temporal Difference Learning. In: IEEE International Conference on Intelligent Robots and Systems. 2024. p. 7477–83.

MLA: Hsu, H. L., et al. “Steering Decision Transformers via Temporal Difference Learning.” IEEE International Conference on Intelligent Robots and Systems, 2024, pp. 7477–83. Scopus, doi:10.1109/IROS58592.2024.10801303.

NLM: Hsu HL, Bozkurt AK, Dong J, Gao Q, Tarokh V, Pajic M. Steering Decision Transformers via Temporal Difference Learning. IEEE International Conference on Intelligent Robots and Systems. 2024. p. 7477–7483.
