Page 80 - Read Online
P. 80
Lv et al. Intell Robot 2022;2(2):16879 I http://dx.doi.org/10.20517/ir.2022.09 Page 178
REFERENCES
1. Littman ML. Markov games as a framework for multiagent reinforcement learning. Machine Learning Proceedings 1994. Elsevier; 1994.
pp. 15763. DOI
2. Busoniu L, Babuska R, De Schutter B. A comprehensive survey of multiagent reinforcement learning. IEEE Trans Syst, Man, Cybern C
2008;38:15672. DOI
3. Vinyals O, Babuschkin I, Czarnecki WM, et al. Grandmaster level in StarCraft II using multiagent reinforcement learning. Nature
2019;575:3504. DOI
4. HernandezLeal P, Kaisers M, Baarslag T, de Cote EM. A survey of learning in multiagent environments: Dealing with nonstationarity.
arXiv preprint arXiv:1707.09183, 2017. DOI
5. Papoudakis G, Christianos F, Rahman A, Albrecht SV. Dealing with nonstationarity in multiagent deep reinforcement learning. arXiv
preprint arXiv:1906.04737, 2019. DOI
6. Mnih V, Kavukcuoglu K, Silver D, et al. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013. DOI
7. Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347,
2017. DOI
8. Haarnoja T, Zhou A, Abbeel P, Levine S. Soft actorcritic: Offpolicy maximum entropy deep reinforcement learning with a stochastic
actor. In: International Conference on Machine Learning, PMLR, 2018. p. 186170.
9. Hernandezleal P, Kartal B, Taylor ME. A survey and critique of multiagent deep reinforcement learning. Auton Agent MultiAgent Syst
2019;33:75097. DOI
10. HernandezLeal P, Taylor ME, Rosman B, et al. Identifying and tracking switching, nonstationary opponents: A bayesian approach. In:
Workshops at the Thirtieth AAAI Conference on Artificial Intelligence, 2016.
11. Zheng Y, Meng Z, Hao J, et al. A deep bayesian policy reuse approach against nonstationary agents. In: Proceedings of the 32nd
International Conference on Neural Information Processing Systems, 2018. p. 962–972.
12. Yang T, Hao J, Meng Z, et al. Towards efficient detection and optimal response against sophisticated opponents. In: TwentyEighth
International Joint Conference on Artificial Intelligence, 2019.
13. He H, BoydGraber J, Kwok K, Daumé III H. Opponent modeling in deep reinforcement learning. In: International Conference on
Machine Learning, PMLR, 2016. p. 180413. DOI
14. Hong ZW, Su SY, Shann TY, Chang YH, Lee CY. A deep policy inference qnetwork for multiagent systems. In: Proceedings of the 17th
International Conference on Autonomous Agents and MultiAgent Systems, 2018. p. 1388.96.
15. Raileanu R, Denton E, Szlam A, Fergus R. Modeling others using oneself in multiagent reinforcement learning. In: International
Conference on Machine Learning, PMLR, 2018. p. 425766.
16. Foerster J, Chen RY, AlShedivat M, Whiteson S, Abbeel P, Mordatch I. Learning with opponentlearning awareness. In: Proceedings of
the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018. p. 12230.
17. Lin LJ. Reinforcement learning for robots using neural networks. Carnegie Mellon University, 1992. Available from: https://dl.acm.org
/doi/book/10.5555/168871 [Last accessed on 7 Jun 2022]
18. Chaudhry A, Rohrbach M, Elhoseiny M, et al. On tiny episodic memories in continual learning. arXiv preprint arXiv:1902.10486, 2019.
DOI
19. Chrysakis A, Moens MF. Online continual learning from imbalanced data. In: International Conference on Machine Learning, PMLR,
2020. p. 195261.
20. Khetarpal K, Riemer M, Rish I, Precup D. Towards continual reinforcement learning: a review and perspectives. arXiv preprint
arXiv:2012.13490, 2020. DOI
21. van den Oord A, Li Y, Vinyals O. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.
DOI
22. Chen T, Kornblith S, Norouzi M, Hinton G. A simple framework for contrastive learning of visual representations. In: International
conference on machine learning, PMLR, 2020. p. 1597–1607.
23. He K, Fan H, Wu Y, Xie S, Girshick R. Momentum contrast for unsupervised visual representation learning. In: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. p. 972938.
24. Laskin M, Srinivas A, Abbeel P. Curl: Contrastive unsupervised representations for reinforcement learning. In: International Conference
on Machine Learning, PMLR, 2020. p. 563950.
25. Koopmans T. Activity analysis of production and allocation. 1951. Available from: https://cowles.yale.edu/sites/default/files/files/pub/
mon/m13all.pdf [Last accessed on 7 Jun 2022]
26. Carmel D, Markovitch S. Modelbased learning of interaction strategies in multiagent systems. Journal of Experimental & Theoretical
Artificial Intelligence1998;10:30932. DOI
27. Rosman B, Hawasly M, Ramamoorthy S. Bayesian policy reuse. Mach Learn 2016;104:99127. DOI
28. de Weerd H, Verbrugge R, Verheij B. How much does it help to know what she knows you know? an agentbased simulation study.
Artificial Intelligence 2013;199200:6792. DOI
29. Chen X, Fan H, Girshick R, He K. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020.
DOI
30. Chen T, Kornblith S, Swersky K, Norouzi M, Hinton GE. Big selfsupervised models are strong semisupervised learners. Advances in
Neural Information Processing Systems 2020;33:2224355.
31. Grill JB, Strub F, Altché F, et al. Bootstrap your own latent a new approach to selfsupervised learning. Advances in Neural Information