Inappropriate content classification: Natural Language Processing
Sergeev, Yury (2023)
All rights reserved. This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2023112631677
Abstract
The main objective of this work was to investigate the effectiveness of various Natural Language Processing (NLP) techniques in processing and analyzing text data. The study focused on identifying and detecting toxic comments, a task of growing importance given the massive amount of user-generated content on social media and online platforms. Robust methods are needed to monitor and filter harmful or inappropriate language in order to create a safer and more respectful online environment. This research aimed to contribute to that effort by developing methods that identify toxic comments efficiently.
The study used the Jigsaw Multilingual Toxic Comment dataset to train several models. This dataset is valuable because it contains a diverse and comprehensive collection of comments that simulates real-world online interactions. The research involved experimenting with various text-representation techniques, such as Word2Vec, TF-IDF, GloVe, FastText, and trainable embedding layers, to represent text data effectively. These techniques play a crucial role in translating human language into formats that machine learning models can process. For the classification task, a diverse set of models was employed, including traditional machine learning algorithms such as Naive Bayes, Random Forest, Logistic Regression, linear SVM, and XGBoost, as well as deep learning models such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Bidirectional Encoder Representations from Transformers (BERT), a distilled version of BERT (DistilBERT), and XLM-RoBERTa (XLM-R). This comprehensive approach was designed to assess and compare how accurately the different models identify toxic comments.
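To make the traditional pipeline concrete, the following is a minimal sketch of one of the combinations the abstract lists (TF-IDF features with Logistic Regression), built with scikit-learn. The tiny toy corpus and its labels are purely illustrative assumptions for this sketch; the thesis itself trains on the Jigsaw Multilingual Toxic Comment dataset.

```python
# Sketch of a TF-IDF + Logistic Regression toxicity classifier.
# The toy comments and labels below are hypothetical -- the actual
# study uses the Jigsaw Multilingual Toxic Comment dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: 1 = toxic, 0 = non-toxic (illustrative only)
comments = [
    "you are an idiot",
    "I completely disagree with you",
    "what a stupid, worthless post",
    "thanks for the helpful explanation",
    "go away, nobody wants you here",
    "great point, I learned something new",
]
labels = [1, 0, 1, 0, 1, 0]

# TfidfVectorizer turns each comment into a sparse weighted
# bag-of-words vector; LogisticRegression then fits a linear
# decision boundary over those features.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(comments, labels)

preds = clf.predict(["what an idiot", "thank you for the explanation"])
print(list(preds))
```

In the study, the same pattern applies at scale: the vectorizer or embedding method is swapped for Word2Vec, GloVe, or FastText, and the classifier for any of the other listed models.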
XLM-RoBERTa was the most effective model, detecting toxic comments with an accuracy of 96% and an F1 score of 88%. This level of performance indicates the model's robustness and reliability across diverse contexts. To further confirm its practical applicability, the best-performing model was then tested on real-world data obtained from Twitter, with the aim of detecting inappropriate tweets in a live environment.
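Since the results are reported as accuracy and F1, the following short sketch shows how the two metrics are derived from a binary confusion matrix. The counts used here are made up for illustration and are not the thesis's actual confusion matrix.

```python
# Accuracy and F1 from a binary confusion matrix.
# The counts below are illustrative, not the thesis's real results.
tp, fp, fn, tn = 40, 6, 5, 49  # true/false positives and negatives

precision = tp / (tp + fp)          # share of flagged comments that are toxic
recall = tp / (tp + fn)             # share of toxic comments that were flagged
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(round(f1, 3), round(accuracy, 2))
```

F1 is the harmonic mean of precision and recall, which is why it is reported alongside accuracy: on imbalanced data such as toxic-comment corpora, a classifier can score high accuracy while missing most toxic examples, and F1 exposes that.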