  • Universities of Applied Sciences
  • Kaakkois-Suomen ammattikorkeakoulu
  • Theses

Reinforcement Learning using Autonomous Driving

Bellizzi, Matthias (2025)

 
Open file
Bellizzi_Matthias.pdf (1.541 MB)


All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The publication's permanent address is
https://urn.fi/URN:NBN:fi:amk-2025060219039
Abstract
This thesis investigated Proximal Policy Optimization (PPO) to enhance
autonomous driving agent adaptability and generalization within simulated
environments of varying complexity. Key objectives included evaluating PPO's
learning on static tracks with diverse geometric challenges, assessing its
generalization to unseen static environments, and analysing performance when
transitioning to a simple dynamic scenario with a pedestrian obstacle.
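For reference, the algorithm under study is defined by the clipped surrogate objective of Schulman et al. (2017); the formulation below is the standard one and is not restated in the thesis abstract itself:

```latex
% PPO's clipped surrogate objective (standard formulation, not taken from the thesis).
% r_t(theta) is the probability ratio between new and old policies,
% \hat{A}_t the advantage estimate, and \epsilon the clipping range.
\[
  r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
  \qquad
  L^{\mathrm{CLIP}}(\theta) =
    \hat{\mathbb{E}}_t\!\left[
      \min\bigl(r_t(\theta)\,\hat{A}_t,\;
      \operatorname{clip}\bigl(r_t(\theta),\,1-\epsilon,\,1+\epsilon\bigr)\,\hat{A}_t\bigr)
    \right]
\]
```

Clipping the ratio keeps each policy update close to the previous policy, which is what makes PPO comparatively stable to tune in the simulated settings the thesis evaluates.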
The study employed a simulation-based experimental approach, utilizing the Unity
game engine and the ML-Agents toolkit to train and evaluate PPO agents. Custom-designed tracks were developed, including three distinct static training
environments, two unseen static test tracks with novel features, and a modified
training track incorporating a dynamic pedestrian. Key performance metrics,
including lap completion rate, crash rate, average lap time, and timeout rate,
were recorded and analysed.
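The reported rates can be computed by aggregating per-episode outcomes. A minimal sketch of that aggregation follows; the `Episode` fields and outcome labels are hypothetical illustrations, not taken from the thesis:

```python
from dataclasses import dataclass

@dataclass
class Episode:
    outcome: str      # hypothetical labels: "completed", "crashed", or "timeout"
    lap_time: float   # seconds; meaningful only for completed laps

def summarize(episodes):
    """Aggregate per-episode outcomes into the kinds of rates the thesis reports."""
    n = len(episodes)
    completed = [e for e in episodes if e.outcome == "completed"]
    return {
        "lap_completion_rate": len(completed) / n,
        "crash_rate": sum(e.outcome == "crashed" for e in episodes) / n,
        "timeout_rate": sum(e.outcome == "timeout" for e in episodes) / n,
        # Average lap time is only defined over completed laps.
        "avg_lap_time": (sum(e.lap_time for e in completed) / len(completed)
                         if completed else None),
    }
```

For example, two completed laps out of four episodes yields a 50% lap completion rate, mirroring how the 89%, 75%, and 57% figures below summarize evaluation runs.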
The results demonstrate that PPO, after tuning, learned robust and effective
policies for specific static training tracks, achieving high average lap completion
(89%). On unseen static tracks, the agent showed a degree of generalization (75%
average completion), though performance decreased with greater environmental
complexity. However, introducing even a simple dynamic obstacle significantly
impacted performance despite retraining (57% lap completion on a previously
mastered track), underscoring dynamic adaptation challenges.
This study concludes that PPO can master specific simulated navigation tasks, yet
broad generalization and robust dynamic adaptation remain significant hurdles.
Findings highlight PPO's sensitivity to environmental changes and difficulties in
transferring learned behaviours. Further research into advanced sensors, more
complex dynamic training, and refined reward structures is crucial to bridge
simulation achievements with real-world autonomous driving demands.
Collections
  • Theses
Theses and publications of the Finnish universities of applied sciences