
Zhang et al. Intell Robot 2022;2(3):275-97 | http://dx.doi.org/10.20517/ir.2022.20 | Page 291



Robot (rows 1-8, in publication order): 1. ANYmal-B, ANYmal-C; 2. Jueying (see footnote 3); 3. Jueying-Mini; 4. ANYmal; 5. Black Panther; 6. Unitree A1; 7. Unitree Laikago; 8. Unitree A1







Simulator (rows 1-8, in publication order): 1. RaiSim; 2. PyBullet; 3. RaiSim; 4. MuJoCo; 5. RaiSim; 6. PyBullet; 7. PyBullet; 8. PyBullet





                                Velocity,  and  Target  Velocity,  Veloc-  Ref-  State,  Contacts,  Torque  Position  unifor-  Pitch  and  Position,  Base Lin-  and  Velocity,  Contact.  and  Joint  Tar-  to



                                Angular  Reward  Foot Clearance,  Torque.  and  and  Height  (Torque,  Body  and  Joint  Gait,  Torque  Base  Velocities,  Desired  Foot  and  Velocities,  Velocity,  Torque.  and  Velocity



                                and  Motion      Pose,  Regularisation  State,  Foot  Positions  Position.  Velocity,  Velocity,  Smoothness,  Positions,  End-Effector Positions,  Angular  Torque,  Balance,  Base  Forward  Count,  Forward  Orientation.
                                Linear  Base  Collision,  Smoothness,  Base  ity),  erence  Goal  Base  and  uniformity,  mity,  Limitations.  Joint  and  ear  Quaternion.  Joint  Base  Torques,  Survival.  Base  Limit  Desired  and  get




Action space (rows 1-8, in publication order):
1. Leg Frequencies and Foot Position Residuals.
2. Expert: Desired Joint Positions. Gating: Variable Weights.
3. Desired Joint Positions.
4. Desired Joint Positions.
5. Desired Joint Positions.
6. Desired Leg Frequency, Swing Cutoff (Swing and Stance Phase), Phase Offset.
7. Desired Joint Torques.
8. Goal Base Velocity, Height and Orientation.


                           Base  Joint  FTG  Joint  Vector,  and  States,  Ve-  Base  Goal  Direction,  Lin-  Posi-  (Base,  Ac-  Linear  Base  An-  Po-  Gait

                           Gravity,  Frequency,  Velocities,  Frequencies,  Normal  Forces  Contact  Contact  Force.  Gravity,  and  Vector,  Gravity  and  Velocity  Joint  and  Velocity.  End-Effector,  Velocities  Previous  Base  Height,  Joint  and  Motors  Map,  and


                           Direction,  and  and  and  Terrain  Height,  History,  External  Position,  Phase  Height,  Angular  Acceleration,  Angular  (Base,  State  CoM),  Orientation,  Command.  VelocityCommand,SineandCo- sine Values (4 phases), Joint Posi- tion and Velocity, Angular Veloc-  Gravity.  Actual  and  Base  Velocities  (12-dim).  Height  Orientation,




                           Goal  Velocities  States  Phases  History,  Foot  Target  Friction,  Joint  locities,  Position.  Base  Base  ear  and  tion  Image,  Joint,  Joint),  tion,  ities,  Desired  Velocity  Orientation,  Linear  gles  Global  sitions,  Phase.




| # | Publication | Venue | Year | Contribution | Algorithm |
|---|---|---|---|---|---|
| 1 | Learning Quadrupedal Locomotion over Challenging Terrain [7] | Science Robotics | 2020 | A novel Sim2Real solution incorporating proprioception, showing remarkable zero-shot generalization. | TRPO |
| 2 | Multi-expert learning of adaptive legged locomotion (MEL) [9] | Science Robotics | 2020 | A MEL architecture to generate adaptive skills from a group of expert skills. | SAC |
| 3 | Efficient Learning of Control Policies for Robust Quadruped Bounding using Pretrained Neural Networks [57] | ArXiv | 2021 | A training method for learning bounding gaits, which combines pre-training and DRL. | PPO2 |
| 4 | Learning Coordinated Terrain-Adaptive Locomotion by Imitating a Centroidal Dynamics Planner [27] | ArXiv | 2021 | A terrain-adaptive controller obtained by training policies to reproduce trajectories planned by a non-linear solver. | V-MPO [25], MO-VMPO [26] |
| 5 | Learning Free Gait Transition for Quadruped Robots via Phase-Guided Controller [58] | IEEE Robotics and Automation Letters | 2021 | A novel quadrupedal framework for training a control policy to locomote in various gaits. | PPO |
| 6 | Fast and Efficient Locomotion via Learned Gait Transitions [52] | CoRL | 2021 | A hierarchical learning framework in which gait transitions automatically emerge with a reward of minimum energy. | ARS [23] |
| 7 | SimGAN: Hybrid Simulator Identification for Domain Adaptation via Adversarial RL [104] | ICRA | 2021 | A framework for domain adaptation by identifying a simulator to match the simulated trajectories to the target ones. | PPO |
| 8 | Hierarchical Terrain-Aware Control (HTC) for Quadrupedal Locomotion by Combining DRL and Optimal Control [54] | IROS | 2021 | A novel HTC framework leveraging DRL for the high-level and optimal control for the low-level. | SAC |

3 https://www.deeprobotics.cn/