
Page 292    Zhang et al. Intell Robot 2022;2(3):275-97 | http://dx.doi.org/10.20517/ir.2022.20



Publication | Venue | Year | Contribution | Algorithm | Simulator | Robot
A Hierarchical Framework for Quadruped Locomotion Based on RL [105] | IROS | 2021 | A well-performing quadruped robot system for learning locomotion in real-world terrains without pre-training. | SAC | Webots [106] | Yobogo
Terrain-Aware Risk-Assessment-Network-Aided (RAN) DRL for Quadrupedal Locomotion in Tough Terrain [51] | IROS | 2021 | A terrain-aware DRL-based controller integrating a RAN to guarantee the action stability. | SAC | Pybullet | Unitree A1
Real-Time Trajectory Adaptation for Quadrupedal Locomotion using DRL [53] | ICRA | 2021 | A policy using DRL to get noisy reference trajectory in order to generate a quadrupedal tracking system. | PPO | RaiSim | ANYmal
Coupling Vision and Proprioception for Navigation of Legged Robots [11] | CVPR workshop | 2021 | Incorporating vision and proprioception in navigation tasks of legged robots. | PPO | RaiSim | Unitree A1
Minimising Energy Consumption Leads to the Emergence of Gaits in Legged Robots [107] | CoRL | 2021 | Energy constraints leading to the emergence of natural locomotion, and the choice is related to the desired speed. | PPO | RaiSim | Unitree A1
RMA: Rapid Motor Adaptation for Legged Robots [60] | RSS | 2021 | RMA algorithm for real-time online adaptation problems in quadruped robots. | PPO | RaiSim | Unitree A1
Human Motion Control of Quadrupedal Robots using DRL [108] | RSS | 2022 | A quadrupedal motion control system allowing human operation. | PPO | RaiSim | Unitree A1
Learning Torque Control for Quadrupedal Locomotion [44] | ArXiv | 2022 | A quadrupedal torque control framework predicting high-frequency joint torques via RL. | PPO | Isaac, Pybullet | Unitree A1
Model-free RL for Robust Locomotion using Demonstrations from Trajectory Optimization [109] | ArXiv | 2022 | An RL approach to create robust policies deployable on real robots without additional training using a single optimised demonstration. | PPO | Pybullet | Solo8
Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World [30] | ICRA | 2022 | A robot RL system for fine-tuning real-world locomotion policies. | REDQ [29] | Pybullet | Unitree A1