Lower Partial Moment Proximal Policy Optimization: Improving Robot Stability on a Balance Board
Date
2025
Abstract
This study proposes an improved variant of the Proximal Policy Optimization (PPO) algorithm that incorporates the Lower Partial Moment (LPM) method. An additional loss term penalizes low advantage values, with the aim of improving both the policy's performance and its robustness to noise. The resulting LPM-PPO algorithm is compared against established methods, including SAC, DDPG, TRPO, and RPO, across multiple Isaac Gym simulation environments to verify its effectiveness. For Sim2Real transfer, the balance board task is deployed on a real humanoid robot, a process that must account for complex physical factors such as friction, inertia, mass distribution, and motor dynamics. To collect observations accurately, the study uses OpenCV for vision-based tracking and forward kinematics for position estimation. Noise is added during training to mimic real-world sensor errors, which is intended to improve the robot's adaptability and robustness in the real world.
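The abstract does not give the exact form of the added loss, so the following is only a minimal sketch of how a lower-partial-moment penalty might be combined with the standard PPO clipped surrogate. It assumes the sample LPM of order n about a threshold, mean(max(threshold - x, 0)^n), applied to the ratio-weighted advantages. The function names, the coefficient `lpm_coef`, the threshold of 0, and the order of 2 are all hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np


def lower_partial_moment(x, threshold=0.0, order=2):
    """Sample lower partial moment: mean of max(threshold - x, 0) ** order."""
    shortfall = np.maximum(threshold - np.asarray(x, dtype=float), 0.0)
    return float(np.mean(shortfall ** order))


def ppo_clip_loss(ratio, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, returned as a loss to minimize."""
    ratio = np.asarray(ratio, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return float(-np.mean(np.minimum(unclipped, clipped)))


def lpm_ppo_loss(ratio, advantages, lpm_coef=0.5, threshold=0.0, order=2,
                 clip_eps=0.2):
    """Hypothetical combined loss: PPO surrogate plus an LPM penalty.

    Assumption: the penalty acts on ratio-weighted advantages so that it
    depends on the current policy; the study may define it differently.
    """
    weighted = np.asarray(ratio, dtype=float) * np.asarray(advantages, dtype=float)
    penalty = lower_partial_moment(weighted, threshold, order)
    return ppo_clip_loss(ratio, advantages, clip_eps) + lpm_coef * penalty


if __name__ == "__main__":
    adv = [1.0, -1.0, 2.0, -3.0]
    ratio = [1.0, 1.0, 1.0, 1.0]
    print(lpm_ppo_loss(ratio, adv))
```

Only samples that fall below the threshold contribute to the penalty, so large negative advantages are weighted more heavily than in the plain surrogate, while samples above the threshold are left untouched. In an actual training loop the same expressions would be written in an automatic-differentiation framework so that gradients flow through the probability ratio.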
Keywords
Humanoid Robots, LPM-PPO, Reinforcement Learning, Sim2Real, Balance Board