Page 55                                                                  Qi et al. Intell Robot 2021;1(1):18-57 | http://dx.doi.org/10.20517/ir.2021.02



                   from: http://arxiv.org/abs/1512.04455.
               52.  Foerster J, Nardelli N, Farquhar G, et al. Stabilising experience replay for deep multi-agent reinforcement learning. In: Precup
                   D, Teh YW, editors. Proceedings of the 34th International Conference on Machine Learning. vol. 70 of Proceedings of Machine Learning
                   Research. PMLR; 2017. pp. 1146–55. Available from: https://proceedings.mlr.press/v70/foerster17b.html.
               53.  Van der Pol E, Oliehoek FA. Coordinated deep reinforcement learners for traffic light control. Proceedings of Learning, Inference and
                   Control of Multi-Agent Systems (at NIPS 2016) 2016. Available from:
                   https://www.elisevanderpol.nl/papers/vanderpolNIPSMALIC2016.pdf.
               54.  Foerster J, Farquhar G, Afouras T, Nardelli N, Whiteson S. Counterfactual multi-agent policy gradients. In: Proceedings of the AAAI
                   Conference on Artificial Intelligence. vol. 32; 2018. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/11794.
               55.  Lowe R, Wu Y, Tamar A, et al. Multi-agent actor-critic for mixed cooperative-competitive environments. CoRR 2017;abs/1706.02275.
                   Available from: http://arxiv.org/abs/1706.02275.
               56.  Nadiger C, Kumar A, Abdelhak S. Federated reinforcement learning for fast personalization. In: 2019 IEEE Second International
                   Conference on Artificial Intelligence and Knowledge Engineering (AIKE); 2019. pp. 123–27.
               57.  Liu B, Wang L, Liu M, Xu C. Lifelong federated reinforcement learning: a learning architecture for navigation in cloud robotic
                   systems. CoRR 2019;abs/1901.06455. Available from: http://arxiv.org/abs/1901.06455.
               58.  Ren J, Wang H, Hou T, Zheng S, Tang C. Federated learning-based computation offloading optimization in edge computing-supported
                   internet of things. IEEE Access 2019;7:69194–201.
               59.  Wang X, Wang C, Li X, Leung VCM, Taleb T. Federated deep reinforcement learning for internet of things with decentralized
                   cooperative edge caching. IEEE Internet Things J 2020;7:9441–55.
               60.  Chen J, Monga R, Bengio S, Józefowicz R. Revisiting Distributed Synchronous SGD. CoRR 2016;abs/1604.00981. Available from:
                   http://arxiv.org/abs/1604.00981.
               61.  Mnih V, Badia AP, Mirza M, et al. Asynchronous methods for deep reinforcement learning. In: Balcan MF, Weinberger KQ,
                   editors. Proceedings of The 33rd International Conference on Machine Learning. vol. 48 of Proceedings of Machine Learning
                   Research. New York, New York, USA: PMLR; 2016. pp. 1928–37. Available from: https://proceedings.mlr.press/v48/mniha16.html.
               62.  Espeholt L, Soyer H, Munos R, et al. IMPALA: scalable distributed deep-RL with importance weighted actor-learner
                   architectures. In: Dy J, Krause A, editors. Proceedings of the 35th International Conference on Machine Learning. vol. 80 of
                   Proceedings of Machine Learning Research. PMLR; 2018. pp. 1407–16. Available from:
                   http://proceedings.mlr.press/v80/espeholt18a.html.
               63.  Horgan D, Quan J, Budden D, et al. Distributed prioritized experience replay. CoRR 2018;abs/1803.00933. Available from:
                   http://arxiv.org/abs/1803.00933.
               64.  Liu T, Tian B, Ai Y, et al. Parallel reinforcement learning: a framework and case study. IEEE/CAA Journal of Automatica Sinica
                   2018;5:827–35.
               65.  Zhuo HH, Feng W, Xu Q, Yang Q, Lin Y. Federated reinforcement learning. CoRR 2019;abs/1901.08277. Available from:
                   http://arxiv.org/abs/1901.08277.
               66.  Canese L, Cardarilli GC, Di Nunzio L, et al. Multi-agent reinforcement learning: a review of challenges and applications.
                   Applied Sciences 2021;11:4948. Available from: https://doi.org/10.3390/app11114948.
               67.  Busoniu L, Babuska R, De Schutter B. A comprehensive survey of Multiagent Reinforcement Learning. IEEE Transactions on Systems,
                   Man, and Cybernetics, Part C (Applications and Reviews) 2008;38:156–72.
               68.  Zhang K, Yang Z, Başar T. Multi-agent reinforcement learning: a selective overview of theories and algorithms. Handbook of
                   Reinforcement Learning and Control 2021:321–84.
               69.  Stone P, Veloso M. Multiagent systems: A survey from a machine learning perspective. Auton Robot 2000;8:345–83.
               70.  Szepesvári C, Littman ML. A unified analysis of value-function-based reinforcement-learning algorithms. Neural Comput
                   1999;11:2017–60.
               71.  Littman ML. Value-function reinforcement learning in Markov games. Cogn Syst Res 2001;2:55–66.
               72.  Tan M. Multi-agent reinforcement learning: Independent vs. cooperative agents. In: Proceedings of the tenth international conference
                   on machine learning; 1993. pp. 330–37.
               73.  Lauer M, Riedmiller M. An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: Proceedings of
                   the Seventeenth International Conference on Machine Learning. Citeseer; 2000. Available from:
                   http://citeseerx.ist.psu.edu/viewdoc/summary.
               74.  Monahan GE. State of the art—a survey of partially observable Markov decision processes: theory, models, and algorithms. Manage Sci
                   1982;28:1–16.
               75.  Oroojlooyjadid A, Hajinezhad D. A review of cooperative multi-agent deep reinforcement learning. CoRR 2019;abs/1908.03963.
                   Available from: http://arxiv.org/abs/1908.03963.
               76.  Bernstein DS, Givan R, Immerman N, Zilberstein S. The complexity of decentralized control of Markov decision processes. Math Oper
                   Res 2002;27:819–40.
               77.  Omidshafiei S, Pazis J, Amato C, How JP, Vian J. Deep decentralized multi-task multi-agent reinforcement learning under partial
                   observability. In: Precup D, Teh YW, editors. Proceedings of the 34th International Conference on Machine Learning. vol. 70 of