67. Kostrikov I, Yarats D, Fergus R. Image augmentation is all you need: regularizing deep reinforcement learning from pixels. arXiv preprint arXiv:2004.13649 2020. DOI
68. Yarats D, Fergus R, Lazaric A, Pinto L. Mastering visual continuous control: improved data-augmented reinforcement learning. arXiv preprint arXiv:2107.09645 2021. DOI
69. Ahmed O, Träuble F, Goyal A, et al. CausalWorld: a robotic manipulation benchmark for causal structure and transfer learning. arXiv preprint arXiv:2010.04296 2020. DOI
70. Dittadi A, Träuble F, Wüthrich M, et al. The role of pretrained representations for the OOD generalization of RL agents. arXiv preprint arXiv:2107.05686 2021. DOI
71. Hsu K, Kim MJ, Rafailov R, Wu J, Finn C. Vision-based manipulators need to also see from their hands. arXiv preprint arXiv:2203.12677 2022. DOI
72. Eysenbach B, Asawa S, Chaudhari S, Levine S, Salakhutdinov R. Off-dynamics reinforcement learning: training for transfer with domain classifiers. arXiv preprint arXiv:2006.13916 2020. DOI
73. Liu J, Zhang H, Wang D. DARA: dynamics-aware reward augmentation in offline reinforcement learning. arXiv preprint arXiv:2203.06662 2022. DOI
74. Lee K, Seo Y, Lee S, Lee H, Shin J. Context-aware dynamics model for generalization in model-based reinforcement learning. In: Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event. vol. 119 of Proceedings of Machine Learning Research. PMLR; 2020. pp. 5757–66. Available from: http://proceedings.mlr.press/v119/lee20g.html [Last accessed on 30 Aug 2022].
75. Yu W, Tan J, Liu CK, Turk G. Preparing for the unknown: learning a universal policy with online system identification. arXiv preprint arXiv:1702.02453 2017. DOI
76. Chen D, Zhou B, Koltun V, Krähenbühl P. Learning by cheating. In: Conference on Robot Learning. PMLR; 2020. pp. 66–75. Available
from: http://proceedings.mlr.press/v100/chen20a.html [Last accessed on 30 Aug 2022].
77. Tobin J, Fong R, Ray A, et al. Domain randomization for transferring deep neural networks from simulation to the real world; 2017. DOI
78. Peng XB, Andrychowicz M, Zaremba W, Abbeel P. Sim-to-real transfer of robotic control with dynamics randomization; 2017. DOI
79. Kaiser L, Babaeizadeh M, Milos P, et al. Model-based reinforcement learning for Atari. arXiv preprint arXiv:1903.00374 2019. DOI
80. Schrittwieser J, Antonoglou I, Hubert T, et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 2020;588:604–9. DOI
81. Ye W, Liu S, Kurutach T, Abbeel P, Gao Y. Mastering Atari games with limited data. Adv Neural Inform Proc Syst 2021;34:25476–88. Available from: https://proceedings.neurips.cc/paper/2021/hash/d5eca8dc3820cad9fe56a3bafda65ca1-Abstract.html [Last accessed on 30 Aug 2022].
82. Chua K, Calandra R, McAllister R, Levine S. Deep reinforcement learning in a handful of trials using probabilistic dynamics models. Adv Neural Inform Proc Syst 2018;31. Available from: https://proceedings.neurips.cc/paper/2018/hash/3de568f8597b94bda53149c7d7f5958c-Abstract.html [Last accessed on 30 Aug 2022].
83. Janner M, Fu J, Zhang M, Levine S. When to trust your model: model-based policy optimization. Adv Neural Inform Proc Syst 2019;32. Available from: https://proceedings.neurips.cc/paper/2019/hash/5faf461eff3099671ad63c6f3f094f7f-Abstract.html [Last accessed on 30 Aug 2022].
84. Hansen N, Wang X, Su H. Temporal difference learning for model predictive control. arXiv preprint arXiv:2203.04955 2022. DOI
85. Tassa Y, Doron Y, Muldal A, et al. DeepMind control suite. arXiv preprint arXiv:1801.00690 2018. DOI
86. Peng XB, Ma Z, Abbeel P, Levine S, Kanazawa A. AMP: adversarial motion priors for stylized physics-based character control. ACM Trans Graph (TOG) 2021;40:1–20. DOI
87. Peng XB, Guo Y, Halper L, Levine S, Fidler S. ASE: large-scale reusable adversarial skill embeddings for physically simulated characters. arXiv preprint arXiv:2205.01906 2022. DOI
88. Merel J, Hasenclever L, Galashov A, et al. Neural probabilistic motor primitives for humanoid control. arXiv preprint arXiv:1811.11711 2018. DOI
89. Hasenclever L, Pardo F, Hadsell R, Heess N, Merel J. CoMic: complementary task learning & mimicry for reusable skills. In: International Conference on Machine Learning. PMLR; 2020. pp. 4105–15. Available from: https://proceedings.mlr.press/v119/hasenclever20a.html [Last accessed on 30 Aug 2022].
90. Liu S, Lever G, Wang Z, et al. From motor control to team play in simulated humanoid football. arXiv preprint arXiv:2105.12196 2021. DOI
91. Escontrela A, Peng XB, Yu W, et al. Adversarial motion priors make good substitutes for complex reward functions. arXiv preprint arXiv:2203.15103 2022. DOI
92. Levine S, Kumar A, Tucker G, Fu J. Offline reinforcement learning: tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643 2020. DOI
93. Duan Y, Schulman J, Chen X, et al. RL^2: fast reinforcement learning via slow reinforcement learning. arXiv preprint arXiv:1611.02779 2016. DOI
94. Nichol A, Achiam J, Schulman J. On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999 2018. DOI
95. Rakelly K, Zhou A, Finn C, Levine S, Quillen D. Efficient off-policy meta-reinforcement learning via probabilistic context variables. In: International Conference on Machine Learning. PMLR; 2019. pp. 5331–40. Available from: http://proceedings.mlr.press/v97/rakelly19a.html [Last accessed on 30 Aug 2022].
96. Seo Y, Lee K, James SL, Abbeel P. Reinforcement learning with action-free pre-training from videos. In: International Conference on