News Articles Topic Classification Using Transformer Model : An Interactive AI Application

Shrestha, Bishal Ram

Shrestha, Bishal Ram (2025)

The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2025121135215

Abstract

This thesis presents an interactive AI system that automatically classifies news articles into twelve categories, taking a headline and a description as input. The system assists editors in making quick editorial decisions. Categorising news articles is an important step in the workflow before publishing because it makes editorial decisions faster and clearer. A balanced dataset of synthetic news samples covering all twelve categories was generated using the GPT-Neo 2.7B model. The dataset covers topics such as World, Politics, Business, Technology and Science, which supports consistent representation of news patterns.
A RoBERTa-base transformer model was fine-tuned on this dataset. Accuracy, macro precision, macro recall and macro F1-score were used in the evaluation to check that performance is fair across the twelve categories. On unseen test data the model reached an accuracy of 51.87 percent and a macro F1-score of 0.52, which shows that it identifies slightly more than half of the articles correctly and that performance is not skewed towards particular categories.
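The macro-averaged metrics named above can be illustrated with a minimal sketch. It is not the thesis code: the function name `macro_scores` and the toy labels are assumptions made for illustration, and only the Python standard library is used. Each metric is computed per category and then averaged with equal weight, so a rarely predicted category counts as much as a frequent one.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Hypothetical helper for illustration; y_true and y_pred are
    equal-length lists of category names.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gets a false positive
            fn[truth] += 1  # true category gets a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Toy example with three of the categories (made-up predictions).
y_true = ["World", "World", "Politics", "Politics", "Business", "Business"]
y_pred = ["World", "Politics", "Politics", "Politics", "Business", "World"]
scores = macro_scores(y_true, y_pred)
```

On a balanced test set, accuracy equals macro recall, so a macro F1-score close to the accuracy (0.52 against 51.87 percent) is consistent with the claim that no small group of categories dominates the result.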
A Gradio web interface was used to show how the model would work in real life. The application displays the predicted topic, a confidence score and a routing suggestion based on a confidence threshold. When the prediction is uncertain or falls below the given threshold, the system recommends manual review by an editor. The tool is intended to support editorial judgement, not to fully automate it.
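The confidence-based routing described above can be sketched as follows. This is an illustrative assumption, not the application's actual code: the function names `softmax` and `route`, the threshold of 0.6, the decision strings and the example logits are all hypothetical. The sketch takes the raw model scores (logits), converts them to probabilities, and suggests manual review whenever the top probability is below the threshold.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(logits, labels, threshold=0.6):
    """Return (predicted topic, confidence, routing suggestion).

    The threshold of 0.6 is an assumed value for illustration;
    the abstract does not state which threshold the thesis uses.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence >= threshold:
        decision = "auto-assign"
    else:
        decision = "manual review"
    return labels[best], confidence, decision


# Hypothetical logits for three of the twelve categories.
labels = ["World", "Politics", "Business"]
confident = route([4.0, 0.5, 0.1], labels)
uncertain = route([1.0, 0.9, 0.8], labels)
```

In the first call one category clearly dominates, so the suggestion is to assign the topic automatically. In the second the scores are nearly equal, the top probability is well below the threshold, and the article is routed to an editor.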
As this project demonstrates, transformer-based models can help editors in their daily news editing tasks by improving efficiency and making the classification of news articles more uniform. Despite some limitations, especially for categories with similar wording, the system shows that a transformer-based model can provide consistent and good news classification in an editorial workflow.

Collections

Theses