Lower Partial Moment Proximal Policy Optimization: Improving Robot Stability on a Balance Board (下偏矩近端策略最佳化：提升機器人在平衡板上的穩定性)

廖翊承; Liao, Yi-Cheng

Files

202500047262-109628.pdf (7.04 MB)

Date

2025

Authors

廖翊承

Liao, Yi-Cheng

Abstract

This study proposes LPM-PPO, a variant of the Proximal Policy Optimization (PPO) algorithm that incorporates the Lower Partial Moment (LPM) method. An added loss term penalizes low advantage values, with the aim of improving both the policy's robustness to noise and its performance. LPM-PPO is compared with SAC, DDPG, TRPO, and RPO across multiple Isaac Gym simulation environments to verify its effectiveness. For Sim2Real transfer, the balance board task is deployed on a real humanoid robot, a process that must account for physical factors such as friction, inertia, mass distribution, and motor dynamics. To collect observations accurately, the study uses OpenCV for vision-based tracking and forward kinematics for position estimation, and it adds noise during training to mimic real-world sensor errors, with the goal of improving the robot's adaptability and robustness in the real world.
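
The abstract does not state the exact form of the added loss, so the following is only a minimal sketch of how a lower-partial-moment penalty could be combined with the standard PPO clipped objective. The lower partial moment of order n about a target t is taken in its usual form, the mean of max(t - x, 0)^n. Everything else is an assumption made for illustration and may differ from the thesis: the function names, the choice to penalize the ratio-weighted advantage, and the parameters `lpm_coef`, `target`, and `order` are all hypothetical.

```python
def lower_partial_moment(values, target=0.0, order=2):
    """Mean of max(target - x, 0) ** order: only shortfalls below target count."""
    shortfalls = [max(target - v, 0.0) ** order for v in values]
    return sum(shortfalls) / len(shortfalls)


def ppo_clip_loss(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, returned as a loss (negated objective)."""
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(r * a, clipped_r * a))
    return -sum(terms) / len(terms)


def lpm_ppo_loss(ratios, advantages, lpm_coef=0.5, target=0.0,
                 order=2, clip_eps=0.2):
    """Hypothetical combination: PPO clipped loss plus an LPM penalty.

    Assumption: the penalty is applied to the ratio-weighted advantage so
    that, in an autodiff framework, it would depend on the policy.
    """
    weighted = [r * a for r, a in zip(ratios, advantages)]
    penalty = lower_partial_moment(weighted, target, order)
    return ppo_clip_loss(ratios, advantages, clip_eps) + lpm_coef * penalty
```

In this sketch, samples whose weighted advantage is at or above `target` add nothing to the penalty, so the extra term only pushes against poor outcomes. A real training loop would compute the same quantities on tensors in a framework with automatic differentiation.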

Keywords

Humanoid Robots, LPM-PPO, Reinforcement Learning, Sim2Real, Balance Board

URI

https://etds.lib.ntnu.edu.tw/thesis/detail/56bc277e0be6270e6a86aad3a13f69d1/
http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/125044

Collections

Theses and Dissertations
