Reinforcement Learning in Humanoid Robotics: Deploying Deep Policy Networks for Agile Motion Control
Abstract
This work examines deep reinforcement learning (DRL) methods for agile motion control of humanoid robots. We evaluate proximal policy optimization (PPO), soft actor-critic (SAC), and DQL-CL on bipedal locomotion, object manipulation, and disturbance recovery, both in simulation and on real hardware. Results indicate that PPO excels at velocity and trajectory tracking, SAC is the most efficient, and DQL-CL offers superior adaptability and transferability. Residual torque control and safety supervision are then integrated to ensure robust operation. Overall, hybrid control architectures and curriculum learning substantially improve the stability, resilience, and generalization of next-generation humanoid robots, offering the greatest potential for real-world deployment.
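To make the hybrid control idea concrete, the following is a minimal sketch of residual torque control with a safety supervisor: a learned policy adds a bounded residual to a model-based baseline command, which is then clipped to hardware limits. All names, gains, and limits here (baseline_torque, TORQUE_LIMIT, RESIDUAL_SCALE) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

TORQUE_LIMIT = 80.0      # assumed per-joint torque bound [N·m]
RESIDUAL_SCALE = 0.2     # assumed fraction of the limit the policy may add

def baseline_torque(q, qd, q_ref, qd_ref, kp=120.0, kd=6.0):
    """Model-based PD baseline tracking a reference trajectory."""
    return kp * (q_ref - q) + kd * (qd_ref - qd)

def residual_controller(policy, obs, q, qd, q_ref, qd_ref):
    """Add a learned residual to the baseline, then apply safety supervision."""
    tau_base = baseline_torque(q, qd, q_ref, qd_ref)
    # Policy outputs a normalized residual in [-1, 1] per joint.
    residual = RESIDUAL_SCALE * TORQUE_LIMIT * policy(obs)
    tau = tau_base + residual
    # Safety supervisor: clip the commanded torque to hardware limits.
    return np.clip(tau, -TORQUE_LIMIT, TORQUE_LIMIT)
```

In this arrangement the baseline controller guarantees a reasonable behavior even if the policy output is poor, while the clipping step acts as a simple safety layer between the policy and the actuators.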