Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning.
Publication, Preprint
Gong, Z; Duarte, F; Mooney, R; Pearson, J
August 19, 2025
Duke Scholars
Citation
APA: Gong, Z., Duarte, F., Mooney, R., & Pearson, J. (2025). Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning. https://doi.org/10.1101/2025.07.18.665446
Chicago: Gong, Ziyi, Fabiola Duarte, Richard Mooney, and John Pearson. “Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning,” August 19, 2025. https://doi.org/10.1101/2025.07.18.665446.
ICMJE: Gong Z, Duarte F, Mooney R, Pearson J. Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning. 2025.
MLA: Gong, Ziyi, et al. Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning. 19 Aug. 2025. Pubmed, doi:10.1101/2025.07.18.665446.
NLM: Gong Z, Duarte F, Mooney R, Pearson J. Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning. 2025.