Page 177 | Lv et al. Intell Robot 2022;2(2):168-79 | http://dx.doi.org/10.20517/ir.2022.09
Figure 5. t-SNE projection of the embeddings in the soccer environment: (a) the two colors represent the two base opponent policies $\pi_{-1}^{1}$ and $\pi_{-1}^{2}$; and (b) the different colors represent the classes of trajectory representations encoded by the contrastive learning.
a non-stationary opponent. In particular, we use only the observation information of the opponent, which is a
looser requirement than in other opponent modeling algorithms. In the future, we would like to combine multi-task
learning algorithms to learn different opponent policies as different tasks and explore more efficient ways to
distinguish opponent policies.
DECLARATIONS
Authors’ contributions
Designed and ran the experiments: Lv Y
Made substantial contributions to conception and design of the study: Zheng Y
Provided administrative, technical, and material support: Hao J
Availability of data and materials
Not applicable.
Financial support and sponsorship
None.
Conflicts of interest
All authors declared that they have no conflicts of interest related to this work.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Copyright
© The Author(s) 2022.