Simulation-Based Development of a Non-linear Excavator System: CasADi-based MPC and reinforcement learning control
Hou, Jiancai (2026)
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-202604025605
Abstract
This work presents a simulation-based development and evaluation of a non-linear excavator system, focusing on two complementary tasks: end-effector pose tracking and contact-rich object manipulation (log grasping and lifting). The study was conducted entirely in simulation, following a simulation-first approach aimed at supporting future sim-to-real deployment.
For pose tracking, a non-linear Model Predictive Control (MPC) framework was implemented in Python using CasADi (Andersson et al. 2019) with the IPOPT (Interior Point Optimizer) solver. The controller was based on a planar kinematic model aligned with the URDF (Unified Robot Description Format) description of the machine, incorporating link orientations and tool-center-point (TCP) offsets, and it ran in real time in a receding-horizon manner.
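As a rough illustration of what a planar kinematic model with link orientation and a TCP offset can look like, the sketch below computes the TCP pose of a three-link planar chain. It is only a minimal sketch under stated assumptions: the function name planar_tcp_pose, the link lengths, and the offset values are hypothetical and are not taken from the thesis or its URDF, and the sketch uses plain Python rather than the CasADi symbolic expressions an MPC formulation would need.

```python
import math

# Hypothetical link lengths in metres (boom, arm, bucket); not from the thesis.
LINK_LENGTHS = (3.0, 1.5, 0.8)
# Hypothetical TCP offset in the last link's frame: (along the link, normal to it).
TCP_OFFSET = (0.2, -0.1)


def planar_tcp_pose(joint_angles, link_lengths=LINK_LENGTHS, tcp_offset=TCP_OFFSET):
    """Return (x, z, pitch) of the TCP for a planar serial chain.

    joint_angles are relative joint rotations in radians, so the
    orientation of each link is the running sum of the angles before it.
    """
    x = 0.0
    z = 0.0
    pitch = 0.0
    for q, length in zip(joint_angles, link_lengths):
        pitch += q                      # accumulated link orientation
        x += length * math.cos(pitch)   # advance along the rotated link
        z += length * math.sin(pitch)
    # Rotate the TCP offset from the last link's frame into the base frame.
    ox, oz = tcp_offset
    x += ox * math.cos(pitch) - oz * math.sin(pitch)
    z += ox * math.sin(pitch) + oz * math.cos(pitch)
    return x, z, pitch
```

In a receding-horizon controller, a function of this kind would typically appear inside the cost that penalizes the distance between the predicted TCP pose and the target pose at each step of the horizon.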
For object manipulation, a reinforcement learning (RL) environment was built in NVIDIA Isaac Sim/Isaac Lab. A PPO (Proximal Policy Optimization) policy trained with RSL-RL (a learning library for robotics research) controlled the excavator arm through a differential inverse kinematics (IK) controller under full 3D rigid-body dynamics with contact interaction.
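To make the idea of differential IK concrete, the sketch below shows one common formulation, a damped least-squares update, on a hypothetical two-link planar arm. It does not reproduce the Isaac Lab controller or the excavator model used in the thesis: the link lengths, the damping and gain values, and the function names fk, jacobian, and dls_step are all assumptions made for illustration.

```python
import math

# Hypothetical link lengths in metres; not from the thesis.
L1, L2 = 1.0, 1.0


def fk(q):
    """End-effector position (x, y) of a two-link planar arm."""
    return (L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1]),
            L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1]))


def jacobian(q):
    """2x2 position Jacobian d(x, y)/d(q0, q1)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [L1 * c1 + L2 * c12, L2 * c12]]


def dls_step(q, target, damping=0.05, gain=0.5):
    """One damped least-squares step: dq = gain * J^T (J J^T + damping^2 I)^-1 e."""
    x, y = fk(q)
    ex, ey = target[0] - x, target[1] - y
    J = jacobian(q)
    # A = J J^T + damping^2 I is symmetric positive definite, so det > 0.
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b
    # Solve A w = e with the closed-form 2x2 inverse.
    wx = (d * ex - b * ey) / det
    wy = (-b * ex + a * ey) / det
    # Map back to joint space with J^T and apply the step gain.
    dq0 = gain * (J[0][0] * wx + J[1][0] * wy)
    dq1 = gain * (J[0][1] * wx + J[1][1] * wy)
    return (q[0] + dq0, q[1] + dq1)


def solve_ik(q, target, steps=200):
    """Repeat the differential step to drive the end effector toward target."""
    for _ in range(steps):
        q = dls_step(q, target)
    return q
```

In an RL setting of the kind described above, the policy would typically output a desired end-effector displacement or pose, and a step like dls_step would convert it into joint commands at every control tick. The damping term keeps the update bounded near singular configurations at the cost of slightly slower convergence.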
The results showed that the MPC controller achieved stable and smooth convergence to different target poses. Meanwhile, the RL policy learned multi-stage manipulation behavior, allowing the robot to approach, grasp, and lift the log successfully in most test cases.
