
Zhang et al. Intell Robot 2022;2(3):275-97 | http://dx.doi.org/10.20517/ir.2022.20 | Page 293



RLOC: Terrain-Aware Legged Locomotion using RL and Optimal Control [42]
    Venue: IEEE Transactions on Robotics, 2022
    Solution: A unified model-based and data-driven approach for quadrupedal locomotion over uneven terrain.
    Algorithm: TD3, SAC, GCPO [28]
    Action: Planning: Coordinates. Adaption: Joint Torques. Recovery: Desired Joint Positions.
    Simulator: RaiSim
    Robot: ANYmal B, ANYmal C

Rapid Locomotion via Reinforcement Learning [36]
    Venue: RSS, 2022
    Solution: An MIT Mini Cheetah controller achieving record robotic agility.
    Algorithm: PPO
    Action: Desired Joint Positions.
    Simulator: Isaac Gym
    Robot: MIT Mini Cheetah

Learning to Walk in Minutes Using Massively Parallel DRL [35]
    Venue: CoRL, 2022
    Solution: A training framework achieving fast policy generation via parallelism.
    Algorithm: PPO
    Action: Desired Joint Positions.
    Simulator: Isaac Gym
    Robot: ANYmal B, ANYmal C, Unitree A1

Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions [91]
    Venue: arXiv, 2022
    Solution: Substituting reward functions with stylish rewards learned from motion captures.
    Algorithm: PPO
    Action: Torques for Desired Joint Positions.
    Simulator: Isaac Gym
    Robot: Unitree A1

Advanced Skills through Multiple Adversarial Motion Priors in RL [56]
    Venue: arXiv, 2022
    Solution: An adversarial motion prior-based RL approach to allow for multiple, discretely switchable styles.
    Algorithm: PPO
    Action: Desired Joint Positions.
    Simulator: Isaac Gym
    Robot: Quadruped Humanoid Transformer

Learning robust perceptive locomotion for quadrupedal robots in the wild [8]
    Venue: Science Robotics, 2022
    Solution: A quadrupedal locomotion solution integrating exteroceptive and proprioceptive perception.
    Algorithm: PPO
    Action: Phase Offset, Joint Position Target.
    Simulator: RaiSim
    Robot: ANYmal-C

Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors [12]
    Venue: arXiv, 2022
    Solution: Learning reusable locomotion skills for real legged robots using prior knowledge of human and animal movement.
    Algorithm: V-MPO, MO-VMPO
    Action: High-Level: Latent Command. Low-Level: Desired Joint Positions.
    Simulator: MuJoCo
    Robot: ANYmal

Learning vision-guided quadrupedal locomotion end-to-end with cross-modal transformers [13]
    Venue: ICLR, 2022
    Solution: An end-to-end RL method leveraging both proprioceptive states and visual observations for locomotion control.
    Algorithm: PPO
    Action: Desired Joint Positions.
    Simulator: PyBullet
    Robot: Unitree A1

RL with Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion [62]
    Venue: IEEE Robotics Autom, 2022
    Solution: A novel RL-based approach containing an evolutionary foot trajectory generator.
    Algorithm: SAC
    Action: Desired Joint Positions.
    Simulator: PyBullet
    Robot: Unitree A1

Table 2. More information about publications (Supplement to Table 1)
Publication | Others | Solution to Reality Gap | Open-Source Package