Qi et al. Intell Robot 2021;1(1):18-57 | http://dx.doi.org/10.20517/ir.2021.02 Page 52
for evaluating the contribution of agents in FRL.
6.7. Peer-to-peer cooperation
FRL applications can adopt either a centralized server-client model or a distributed peer-to-peer model.
A distributed model not only eliminates the single point of failure but can also improve energy efficiency
significantly by allowing models to be exchanged directly between two agents. In a typical application, two
adjacent cars use D2D communications to share experience learned from road conditions, in the form of
models, to assist autonomous driving. However, distributed cooperation increases the complexity of the
learning system and imposes stricter requirements on application scenarios. Research in this direction
should include, but not be limited to, the method for selecting agents for model exchange, the mechanism
for triggering the model exchange, the improvement of algorithm adaptability, and the convergence
analysis of the aggregation algorithm.
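To make the open questions above concrete, the following is a minimal sketch of one possible peer-to-peer exchange loop. It is not taken from any scheme surveyed in this paper. Every design choice in it is an assumption made for illustration: agents are selected uniformly at random among neighbors, an exchange is triggered when two models differ by more than a threshold, and aggregation is a plain pairwise average. The agent names and all function names are hypothetical.

```python
import random


def average_models(a, b):
    """Pairwise aggregation: element-wise mean of two parameter vectors."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def model_distance(a, b):
    """Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def gossip_round(models, neighbors, threshold, rng):
    """Run one peer-to-peer round and return the number of exchanges.

    Assumed (hypothetical) rules:
      - agent selection: each agent picks one neighbor uniformly at random;
      - trigger: exchange only if the two models differ by more than `threshold`;
      - aggregation: both agents adopt the average of their two models.
    """
    exchanges = 0
    for agent in sorted(models):
        if not neighbors[agent]:
            continue
        peer = rng.choice(neighbors[agent])
        if model_distance(models[agent], models[peer]) > threshold:
            merged = average_models(models[agent], models[peer])
            models[agent] = merged
            models[peer] = list(merged)
            exchanges += 1
    return exchanges


# Three agents connected in a line; there is no central server.
models = {"car_a": [0.0, 0.0], "car_b": [1.0, 1.0], "car_c": [2.0, 2.0]}
neighbors = {
    "car_a": ["car_b"],
    "car_b": ["car_a", "car_c"],
    "car_c": ["car_b"],
}
rng = random.Random(0)
for _ in range(50):
    gossip_round(models, neighbors, threshold=1e-6, rng=rng)
```

Because pairwise averaging preserves the sum of the parameters, the agents in this toy setting drift toward the average of their initial models as long as the communication graph stays connected. A convergence analysis for a realistic FRL aggregation rule, where models keep changing through local training, would be considerably more involved.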
7. CONCLUSION
As a new and promising branch of RL, FRL can make learning safer and more efficient while leveraging the
benefits of FL. In this paper, we have discussed the basic definitions of FL and RL as well as our thoughts on
their integration. In general, FRL algorithms can be classified into two categories, i.e., HFRL and VFRL.
Accordingly, the definitions and general frameworks of these two categories have been given, and the
differences between HFRL and VFRL have been highlighted. Then, many existing FRL schemes have been
summarized and analyzed according to their applications. Finally, the potential challenges in the
development of FRL algorithms have been explored. Several open issues of FRL have been identified, which
we hope will encourage further research efforts in FRL.
DECLARATIONS
Authors’ contributions
Made substantial contributions to the research and investigation process, reviewed and summarized the
literature, wrote and edited the original draft: Qi J, Zhou Q
Performed oversight and leadership responsibility for the research activity planning and execution, as well as
developed ideas and evolution of overarching research aims: Lei L
Performed critical review, commentary and revision, as well as provided administrative, technical, and
material support: Zheng K
Availability of data and materials
Not applicable.
Financial support and sponsorship
This work was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada
(Discovery Grant No. 401718) and the CARE-AI Seed Fund at the University of Guelph.
Conflicts of interest
The authors declared that there are no conflicts of interest.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.