Page 80 - Read Online
P. 80

Lv et al. Intell Robot 2022;2(2):168­79  I http://dx.doi.org/10.20517/ir.2022.09    Page 178


               REFERENCES
               1.  Littman ML. Markov games as a framework for multi­agent reinforcement learning. Machine Learning Proceedings 1994. Elsevier; 1994.
                  pp. 157­63. DOI
               2.  Busoniu L, Babuska R, De Schutter B. A comprehensive survey of multiagent reinforcement learning. IEEE Trans Syst, Man, Cybern C
                  2008;38:156­72. DOI
               3.  Vinyals O, Babuschkin I, Czarnecki WM, et al. Grandmaster level in StarCraft II using multi­agent reinforcement learning. Nature
                  2019;575:350­4. DOI
               4.  Hernandez­Leal P, Kaisers M, Baarslag T, de Cote EM. A survey of learning in multiagent environments: Dealing with non­stationarity.
                  arXiv preprint arXiv:1707.09183, 2017. DOI
               5.  Papoudakis G, Christianos F, Rahman A, Albrecht SV. Dealing with non­stationarity in multi­agent deep reinforcement learning. arXiv
                  preprint arXiv:1906.04737, 2019. DOI
               6.  Mnih V, Kavukcuoglu K, Silver D, et al. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013. DOI
               7.  Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347,
                  2017. DOI
               8.  Haarnoja T, Zhou A, Abbeel P, Levine S. Soft actor­critic: Off­policy maximum entropy deep reinforcement learning with a stochastic
                  actor. In: International Conference on Machine Learning, PMLR, 2018. p. 1861­70.
               9.  Hernandez­leal P, Kartal B, Taylor ME. A survey and critique of multiagent deep reinforcement learning. Auton Agent Multi­Agent Syst
                  2019;33:750­97. DOI
               10. Hernandez­Leal P, Taylor ME, Rosman B, et al. Identifying and tracking switching, non­stationary opponents: A bayesian approach. In:
                  Workshops at the Thirtieth AAAI Conference on Artificial Intelligence, 2016.
               11. Zheng Y, Meng Z, Hao J, et al. A deep bayesian policy reuse approach against non­stationary agents. In: Proceedings of the 32nd
                  International Conference on Neural Information Processing Systems, 2018. p. 962–972.
               12. Yang T, Hao J, Meng Z, et al. Towards efficient detection and optimal response against sophisticated opponents. In: Twenty­Eighth
                  International Joint Conference on Artificial Intelligence, 2019.
               13. He H, Boyd­Graber J, Kwok K, Daumé III H. Opponent modeling in deep reinforcement learning. In: International Conference on
                  Machine Learning, PMLR, 2016. p. 1804­13. DOI
               14. Hong ZW, Su SY, Shann TY, Chang YH, Lee CY. A deep policy inference q­network for multi­agent systems. In: Proceedings of the 17th
                  International Conference on Autonomous Agents and MultiAgent Systems, 2018. p. 1388­.96.
               15. Raileanu R, Denton E, Szlam A, Fergus R. Modeling others using oneself in multi­agent reinforcement learning. In: International
                  Conference on Machine Learning, PMLR, 2018. p. 4257­66.
               16. Foerster J, Chen RY, Al­Shedivat M, Whiteson S, Abbeel P, Mordatch I. Learning with opponent­learning awareness. In: Proceedings of
                  the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018. p. 122­30.
               17. Lin LJ. Reinforcement learning for robots using neural networks. Carnegie Mellon University, 1992. Available from: https://dl.acm.org
                  /doi/book/10.5555/168871 [Last accessed on 7 Jun 2022]
               18. Chaudhry A, Rohrbach M, Elhoseiny M, et al. On tiny episodic memories in continual learning. arXiv preprint arXiv:1902.10486, 2019.
                  DOI
               19. Chrysakis A, Moens MF. Online continual learning from imbalanced data. In: International Conference on Machine Learning, PMLR,
                  2020. p. 1952­61.
               20. Khetarpal K, Riemer M, Rish I, Precup D. Towards continual reinforcement learning: a review and perspectives. arXiv preprint
                  arXiv:2012.13490, 2020. DOI
               21. van den Oord A, Li Y, Vinyals O. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.
                  DOI
               22. Chen T, Kornblith S, Norouzi M, Hinton G. A simple framework for contrastive learning of visual representations. In: International
                  conference on machine learning, PMLR, 2020. p. 1597–1607.
               23. He K, Fan H, Wu Y, Xie S, Girshick R. Momentum contrast for unsupervised visual representation learning. In: Proceedings of the
                  IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. p. 9729­38.
               24. Laskin M, Srinivas A, Abbeel P. Curl: Contrastive unsupervised representations for reinforcement learning. In: International Conference
                  on Machine Learning, PMLR, 2020. p. 5639­50.
               25. Koopmans T. Activity analysis of production and allocation. 1951. Available from: https://cowles.yale.edu/sites/default/files/files/pub/
                  mon/m13­all.pdf [Last accessed on 7 Jun 2022]
               26. Carmel D, Markovitch S. Model­based learning of interaction strategies in multi­agent systems. Journal of Experimental & Theoretical
                  Artificial Intelligence1998;10:309­32. DOI
               27. Rosman B, Hawasly M, Ramamoorthy S. Bayesian policy reuse. Mach Learn 2016;104:99­127. DOI
               28. de Weerd H, Verbrugge R, Verheij B. How much does it help to know what she knows you know? an agent­based simulation study.
                  Artificial Intelligence 2013;199­200:67­92. DOI
               29. Chen X, Fan H, Girshick R, He K. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020.
                  DOI
               30. Chen T, Kornblith S, Swersky K, Norouzi M, Hinton GE. Big self­supervised models are strong semi­supervised learners. Advances in
                  Neural Information Processing Systems 2020;33:22243­55.
               31. Grill JB, Strub F, Altché F, et al. Bootstrap your own latent ­ a new approach to self­supervised learning. Advances in Neural Information
   75   76   77   78   79   80   81   82   83   84   85