  • Universities of Applied Sciences
  • Jyväskylän ammattikorkeakoulu
  • Theses (Open collection)

Fake news detection using natural language processing and machine learning: a comparative study of supervised algorithms and text representation techniques

Nguyen, Loi (2025)

 
All rights reserved. This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2025101125915
Abstract
The spread of fake news on digital platforms causes serious social problems, which highlights the need for accurate and scalable detection methods. This thesis is a comparative review of existing studies that assesses how supervised machine learning algorithms and text representation techniques perform in fake news detection. Its main purpose was to identify the feature extraction techniques and algorithm combinations that give the best classification performance. Relevant peer-reviewed publications were retrieved from databases such as IEEE Xplore, ScienceDirect, and SpringerLink using search phrases such as “supervised learning”, “fake news detection”, “Bag-of-Words”, and “word-embeddings”, and were assessed with a comprehensive evaluation methodology. Bag-of-Words (BoW) and word embeddings (Word2Vec, GloVe) were the two main text representations investigated. Each was compared in combination with supervised algorithms such as Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), and Naïve Bayes (NB). Comparative tables were used to code the studies by dataset, feature extraction technique, classifier, and performance indicator. The results showed that embedding-based techniques outperformed BoW. More sophisticated classifiers, such as SVM and ensemble models, were more effective in the majority of scenarios. Pre-trained embeddings outperformed standard lexical models in accuracy and F1-score on datasets such as LIAR and FEVER.
The degree of preprocessing, the sophistication of the algorithm, and the quality of the dataset all had a major effect on detection performance. Although the results are encouraging, limitations remain, including inconsistent evaluation criteria, imbalanced datasets, and preprocessing techniques that differ between studies. One practical implication is that developers should choose a model according to the data available and their computing capacity. Future studies should investigate deep learning and unsupervised learning techniques and establish standardised benchmark data for more reliable evaluation.
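The point about imbalanced datasets and inconsistent evaluation criteria can be made concrete with a small sketch that computes accuracy and F1-score by hand. The label lists below are invented for illustration and are not results from the thesis. The function names are likewise hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class: 2*TP / (2*TP + FP + FN); 0.0 if undefined."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Invented, imbalanced toy labels: nine "real" items and one "fake" item.
y_true = ["real"] * 9 + ["fake"]

# Classifier A always answers "real" and never finds the fake item.
pred_a = ["real"] * 10

# Classifier B finds the fake item but raises one false alarm.
pred_b = ["fake"] + ["real"] * 8 + ["fake"]

acc_a, f1_a = accuracy(y_true, pred_a), f1_score(y_true, pred_a, "fake")
acc_b, f1_b = accuracy(y_true, pred_b), f1_score(y_true, pred_b, "fake")
```

Both classifiers reach the same accuracy here, yet their F1-scores on the "fake" class differ sharply. Studies that report only one of these metrics on imbalanced data are therefore hard to compare directly.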