AI-Assisted Annotation for Panoptic Segmentation
Ivanov, Miroslav (2025)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025120332055
Abstract
This thesis presents the design and implementation of AI-assisted image annotation workflows for panoptic segmentation, developed in support of the ROADVIEW project, which focuses on autonomous driving in extreme weather conditions. Panoptic segmentation requires detailed pixel-level annotations that combine semantic and instance segmentation, making manual labelling both time-consuming and resource-intensive. To address this challenge, the research integrated state-of-the-art foundation models, including the Segment Anything Model (SAM), Grounding DINO, Mask2Former, and Depth Anything V2, into semi-automated annotation pipelines.
The methodology involved leveraging zero-shot detection and segmentation capabilities, depth-informed mask refinement, and temporal mask propagation to accelerate annotation while preserving label quality. Custom tools were developed to automate bounding box generation, guide segmentation tasks, and streamline manual corrections. The workflows were tested on large-scale sequential image datasets, demonstrating significant improvements in annotation speed, consistency, and scalability.
Results indicate that AI-assisted workflows can reduce annotation time by up to 70%, with minimal compromise on precision. The thesis highlights practical strategies for integrating foundation models into real-world annotation tasks and offers insights into the limitations and future potential of semi-automated labelling systems. The author utilized OpenAI ChatGPT (GPT-5) for routine writing support, such as identifying relevant sources, checking grammar, and improving clarity, while retaining full responsibility for all technical decisions, analyses, and claims presented in this thesis.
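To illustrate what a panoptic annotation combines, the following minimal sketch merges a per-pixel semantic map and a per-pixel instance map into a single panoptic label map. It is not taken from the thesis: the function name merge_panoptic, the class ids, and the encoding class_id * 1000 + instance_id (a Cityscapes-style convention) are assumptions chosen for illustration, and plain Python lists stand in for image arrays.

```python
# Assumed encoding for illustration: panoptic_id = class_id * 1000 + instance_id.
LABEL_DIVISOR = 1000


def merge_panoptic(semantic, instance, thing_classes):
    """Combine class ids and instance ids into one panoptic label map.

    semantic: 2D list of per-pixel class ids
    instance: 2D list of per-pixel instance ids (0 means "no instance")
    thing_classes: set of class ids that denote countable objects
    """
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for class_id, inst_id in zip(sem_row, inst_row):
            if class_id in thing_classes and inst_id > 0:
                # Countable object: keep both the class and the instance.
                row.append(class_id * LABEL_DIVISOR + inst_id)
            else:
                # Uncountable region (e.g. road, sky): keep only the class.
                row.append(class_id * LABEL_DIVISOR)
        panoptic.append(row)
    return panoptic


# Hypothetical 2x4 image: class 0 = road, class 1 = car, two car instances.
semantic = [[0, 0, 1, 1],
            [0, 1, 1, 1]]
instance = [[0, 0, 1, 2],
            [0, 1, 1, 2]]
panoptic = merge_panoptic(semantic, instance, thing_classes={1})
```

Under this assumed encoding, every pixel carries both its class and, for countable objects, its instance identity, which is why each pixel of each frame has to be labelled and why manual annotation is so costly.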
