  • Universities of Applied Sciences
  • Savonia-ammattikorkeakoulu (Savonia University of Applied Sciences)
  • Theses (Open collection)

Training quadruped robot controllers in Isaac Sim using reinforcement learning : exploring quadruped robot performance in different simulated surfaces

Jubaer, A S M (2026)

 
Open file
Jubaer_ASM.pdf (5.799 MB)


All rights reserved. This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-202605049154
Abstract
This thesis presents a deep reinforcement learning approach that uses the Proximal Policy Optimization (PPO) algorithm to obtain robust quadrupedal locomotion for the Unitree Go2 robot in the Isaac Lab simulation environment. The research examined the learning performance, convergence and generalization ability of the learned policy across various terrains, including flat ground, slope, stairs and an obstacle course. The experimental data indicated that the same PPO approach produced similar learning behavior on each terrain, but stable and predictable convergence was achieved on the flat and slope terrains. On these easier terrains the method provided a basis for smooth and efficient locomotion behaviors, and it showed adaptability by changing foot placement and balance strategies as terrain conditions changed. Conversely, variance increased and performance degraded significantly on the stair and obstacle course terrains because of their high degree of complexity and non-continuous contact dynamics. In terms of training efficiency, the simpler terrains required significantly less training time, whereas the more complex terrains trained more slowly and with higher variability. The results therefore highlight the need for progressive training methods, such as curriculum learning, to improve the robustness of the learned policy. Although the results are encouraging and demonstrate that PPO is applicable to terrain-adaptive quadrupedal locomotion, some limitations remain: locomotion can become unstable on irregular terrain, and visual perception for balance and navigation was not considered.
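For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective that gives the algorithm its name. It is a generic illustration and not the thesis's training code. The function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions chosen for the example; the abstract does not state the hyperparameters that were used.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (a quantity to maximize).

    clip_eps is an assumed illustrative value, not one taken from the thesis.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the smaller term removes the incentive for overly large policy updates.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Two hypothetical samples: one with a positive and one with a negative advantage.
    value = ppo_clipped_objective(
        new_log_probs=[math.log(1.5), math.log(0.5)],
        old_log_probs=[0.0, 0.0],
        advantages=[1.0, -1.0],
    )
    print(value)
```

In practice the loss minimized by an optimizer is the negative of this value, usually combined with value-function and entropy terms that are omitted here for brevity.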
Collections
  • Theses (Open collection)
Theses and publications of the universities of applied sciences