67. Kostrikov I, Yarats D, Fergus R. Image augmentation is all you need: regularizing deep reinforcement learning from pixels. arXiv preprint arXiv:2004.13649 2020. DOI
68. Yarats D, Fergus R, Lazaric A, Pinto L. Mastering visual continuous control: improved data-augmented reinforcement learning. arXiv preprint arXiv:2107.09645 2021. DOI
69. Ahmed O, Träuble F, Goyal A, et al. CausalWorld: a robotic manipulation benchmark for causal structure and transfer learning. arXiv preprint arXiv:2010.04296 2020. DOI
70. Dittadi A, Träuble F, Wüthrich M, et al. The role of pretrained representations for the OOD generalization of RL agents. arXiv preprint arXiv:2107.05686 2021. DOI
71. Hsu K, Kim MJ, Rafailov R, Wu J, Finn C. Vision-based manipulators need to also see from their hands. arXiv preprint arXiv:2203.12677 2022. DOI
72. Eysenbach B, Asawa S, Chaudhari S, Levine S, Salakhutdinov R. Off-dynamics reinforcement learning: training for transfer with domain classifiers. arXiv preprint arXiv:2006.13916 2020. DOI
73. Liu J, Zhang H, Wang D. DARA: dynamics-aware reward augmentation in offline reinforcement learning. arXiv preprint arXiv:2203.06662 2022. DOI
74. Lee K, Seo Y, Lee S, Lee H, Shin J. Context-aware dynamics model for generalization in model-based reinforcement learning. In: Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event. vol. 119 of Proceedings of Machine Learning Research. PMLR; 2020. pp. 5757–66. Available from: http://proceedings.mlr.press/v119/lee20g.html [Last accessed on 30 Aug 2022].
75. Yu W, Tan J, Liu CK, Turk G. Preparing for the unknown: learning a universal policy with online system identification. arXiv preprint arXiv:1702.02453 2017. DOI
76. Chen D, Zhou B, Koltun V, Krähenbühl P. Learning by cheating. In: Conference on Robot Learning. PMLR; 2020. pp. 66–75. Available from: http://proceedings.mlr.press/v100/chen20a.html [Last accessed on 30 Aug 2022].
77. Tobin J, Fong R, Ray A, et al. Domain randomization for transferring deep neural networks from simulation to the real world. 2017. DOI
78. Peng XB, Andrychowicz M, Zaremba W, Abbeel P. Sim-to-real transfer of robotic control with dynamics randomization. 2017. DOI
79. Kaiser L, Babaeizadeh M, Milos P, et al. Model-based reinforcement learning for Atari. arXiv preprint arXiv:1903.00374 2019. DOI
80. Schrittwieser J, Antonoglou I, Hubert T, et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 2020;588:604–9. DOI
81. Ye W, Liu S, Kurutach T, Abbeel P, Gao Y. Mastering Atari games with limited data. Adv Neural Inform Proc Syst 2021;34:25476–88. Available from: https://proceedings.neurips.cc/paper/2021/hash/d5eca8dc3820cad9fe56a3bafda65ca1-Abstract.html [Last accessed on 30 Aug 2022].
82. Chua K, Calandra R, McAllister R, Levine S. Deep reinforcement learning in a handful of trials using probabilistic dynamics models. Adv Neural Inform Proc Syst 2018;31. Available from: https://proceedings.neurips.cc/paper/2018/hash/3de568f8597b94bda53149c7d7f5958c-Abstract.html [Last accessed on 30 Aug 2022].
83. Janner M, Fu J, Zhang M, Levine S. When to trust your model: model-based policy optimization. Adv Neural Inform Proc Syst 2019;32. Available from: https://proceedings.neurips.cc/paper/2019/hash/5faf461eff3099671ad63c6f3f094f7f-Abstract.html [Last accessed on 30 Aug 2022].
84. Hansen N, Wang X, Su H. Temporal difference learning for model predictive control. arXiv preprint arXiv:2203.04955 2022. DOI
85. Tassa Y, Doron Y, Muldal A, et al. DeepMind control suite. arXiv preprint arXiv:1801.00690 2018. DOI
86. Peng XB, Ma Z, Abbeel P, Levine S, Kanazawa A. AMP: adversarial motion priors for stylized physics-based character control. ACM Trans Graph (TOG) 2021;40:1–20. DOI
87. Peng XB, Guo Y, Halper L, Levine S, Fidler S. ASE: large-scale reusable adversarial skill embeddings for physically simulated characters. arXiv preprint arXiv:2205.01906 2022. DOI
88. Merel J, Hasenclever L, Galashov A, et al. Neural probabilistic motor primitives for humanoid control. arXiv preprint arXiv:1811.11711 2018. DOI
89. Hasenclever L, Pardo F, Hadsell R, Heess N, Merel J. CoMic: complementary task learning & mimicry for reusable skills. In: International Conference on Machine Learning. PMLR; 2020. pp. 4105–15. Available from: https://proceedings.mlr.press/v119/hasenclever20a.html [Last accessed on 30 Aug 2022].
90. Liu S, Lever G, Wang Z, et al. From motor control to team play in simulated humanoid football. arXiv preprint arXiv:2105.12196 2021. DOI
91. Escontrela A, Peng XB, Yu W, et al. Adversarial motion priors make good substitutes for complex reward functions. arXiv preprint arXiv:2203.15103 2022. DOI
92. Levine S, Kumar A, Tucker G, Fu J. Offline reinforcement learning: tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643 2020. DOI
93. Duan Y, Schulman J, Chen X, et al. RL^2: fast reinforcement learning via slow reinforcement learning. arXiv preprint arXiv:1611.02779 2016. DOI
94. Nichol A, Achiam J, Schulman J. On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999 2018. DOI
95. Rakelly K, Zhou A, Finn C, Levine S, Quillen D. Efficient off-policy meta-reinforcement learning via probabilistic context variables. In: International Conference on Machine Learning. PMLR; 2019. pp. 5331–40. Available from: http://proceedings.mlr.press/v97/rakelly19a.html [Last accessed on 30 Aug 2022].
96. Seo Y, Lee K, James SL, Abbeel P. Reinforcement learning with action-free pre-training from videos. In: International Conference on Machine Learning. PMLR; 2022.