Military Threat Detection: Addressing the Data Problem for Training the Threat Classification Model
Kothari, Chetan (2025)
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025052214432
Abstract
This thesis addresses the problem of generating structured datasets for automated threat detection using Vision-Language Models (VLMs) and Large Language Models (LLMs). The primary goal is to establish an automated pipeline based on Agentic AI, in which specialized agents, guided by structured prompts, produce detailed, parsable image descriptions along with associated threat-level annotations. The resulting data is intended primarily to support the training of multimodal classification models, such as CLIP, that classify threat levels directly from images and extensive textual embeddings. To validate the dataset's effectiveness, experiments were performed using classical machine learning models (Gradient Boosting, Random Forest) and a neural network approach (Multilayer Perceptron, MLP). Results indicated that the MLP classifier and Gradient Boosting offered reliable threat-level predictions. The thesis concludes that Agentic AI pipelines are effective for generating structured datasets, enabling more accurate and scalable threat-detection applications.
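As an illustration of what "parsable" agent output could mean in practice, the following is a minimal sketch, not the thesis's actual implementation. It assumes each agent returns a JSON object with `description` and `threat_level` fields. The field names, the label set `ALLOWED_LEVELS`, and the names `ThreatRecord` and `parse_agent_output` are hypothetical and are not taken from the thesis.

```python
import json
from dataclasses import dataclass

# Hypothetical label set: the abstract does not list the thesis's threat levels.
ALLOWED_LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class ThreatRecord:
    """One dataset row: an image, its text description, and its label."""
    image_id: str
    description: str
    threat_level: str


def parse_agent_output(image_id: str, raw: str) -> ThreatRecord:
    """Parse one agent reply (assumed to be a JSON object) into a record.

    Raises ValueError when the reply cannot be parsed or the label is
    unknown, so that malformed generations can be dropped or retried.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"unparsable agent output for {image_id}") from exc
    if not isinstance(payload, dict):
        raise ValueError(f"expected a JSON object for {image_id}")

    # Normalise both fields before validating them.
    description = str(payload.get("description", "")).strip()
    level = str(payload.get("threat_level", "")).strip().lower()

    if not description:
        raise ValueError(f"missing description for {image_id}")
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unknown threat level {level!r} for {image_id}")
    return ThreatRecord(image_id, description, level)
```

Under these assumptions, validating at parse time keeps free-form or mislabeled generations out of the dataset before any classifier is trained on it.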