  • Universities of Applied Sciences
  • Metropolia University of Applied Sciences
  • Theses

Machine Learning-Based Ransomware Detection Through Static Analysis of PE File Features

Ali, Muhammad (2025)

 


The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025080623818
Abstract
This thesis presents a machine learning framework for ransomware detection based on static analysis of executable file attributes. Its primary contribution is a practical detection system with a low false positive rate, which narrows the gap between the performance claims of academic research and the deployment needs of real-world cybersecurity applications. Using the EMBER 2018 dataset, a benchmark collection of real malware samples, the thesis tests several machine learning models on a balanced set of 50,000 samples (25,000 benign and 25,000 malicious). The novelty of the work lies in its extensive feature engineering: 50 structural features are extracted from PE files, including entropy distribution, import features, header features, and histogram statistics, with a deliberate emphasis on minimizing false positives.
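As an illustration of the entropy and histogram features named in the abstract, the following is a minimal sketch of how such values can be computed from raw file bytes. The function names, the window size, and the step size are hypothetical choices made for this example; the abstract does not state how the thesis computes its 50 features.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list[int]:
    """Count how often each of the 256 possible byte values occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def windowed_entropy(data: bytes, window: int = 2048, step: int = 1024) -> list[float]:
    """Entropy of sliding windows over the file.

    The window and step sizes are illustrative assumptions, not values
    taken from the thesis.
    """
    if len(data) <= window:
        return [shannon_entropy(data)]
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data) - window + 1, step)
    ]
```

Packed or encrypted payloads tend to push entropy toward 8 bits per byte, which is why entropy statistics are a common static indicator in malware detection.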

This thesis shows, through cross-validation and a large holdout set (7,500 samples), that ensemble methods, XGBoost in particular, significantly outperform traditional methods. The optimized XGBoost model achieved 94.5% accuracy, 94.6% precision, and 94.4% recall, with a false positive rate of 5.4%, a significant improvement over previous methods, which were plagued by excessive false alarms.
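The reported metrics can be checked against each other with a little arithmetic. The sketch below reconstructs an approximate confusion matrix from the stated recall and precision, assuming the 7,500-sample holdout set is split evenly into 3,750 malicious and 3,750 benign samples. That even split is an assumption of this example (the abstract only states that the 50,000-sample set is balanced), and the function names are hypothetical.

```python
def confusion_from_rates(n_pos: int, n_neg: int, recall: float, precision: float):
    """Rebuild approximate confusion-matrix counts from recall and precision."""
    tp = round(recall * n_pos)          # recall = tp / (tp + fn)
    fn = n_pos - tp
    # precision = tp / (tp + fp)  =>  fp = tp * (1 - precision) / precision
    fp = round(tp * (1 - precision) / precision)
    tn = n_neg - fp
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Assumed split: 3,750 malicious and 3,750 benign holdout samples.
tp, fp, tn, fn = confusion_from_rates(3750, 3750, recall=0.944, precision=0.946)
result = metrics(tp, fp, tn, fn)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption the reconstruction gives roughly 94.5% accuracy and a 5.4% false positive rate, so the four figures in the abstract are mutually consistent. In absolute terms, a 5.4% false positive rate on 3,750 benign files corresponds to about 200 false alarms.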

According to the feature importance analysis, the three strongest predictors for ransomware detection were the GUI application flag, entropy in certain byte ranges, and import features. These findings call into question long-held beliefs about which file attributes are the strongest indicators of malicious code and present new challenges for security practitioners.
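To make the "GUI application flag" concrete, here is a minimal sketch that reads the Subsystem field from a PE file's optional header and tests it against the Windows GUI value. It is an assumption of this example that the thesis derives its flag from this field; the abstract does not say how the flag is defined, and the function names are hypothetical.

```python
import struct

IMAGE_SUBSYSTEM_WINDOWS_GUI = 2  # value defined in the PE/COFF specification


def pe_subsystem(data: bytes) -> int:
    """Return the Subsystem field of the PE optional header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = pe_offset + 4 + 20
    if len(data) < optional_header + 70:
        raise ValueError("optional header truncated")
    # Subsystem sits at offset 68 in both PE32 and PE32+ optional headers.
    (subsystem,) = struct.unpack_from("<H", data, optional_header + 68)
    return subsystem


def is_gui_application(data: bytes) -> bool:
    """True if the file declares the Windows GUI subsystem (assumed flag definition)."""
    return pe_subsystem(data) == IMAGE_SUBSYSTEM_WINDOWS_GUI
```

A typical call would pass the file contents, for example `is_gui_application(open(path, "rb").read())`, where `path` is a placeholder for a PE file on disk.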

This thesis contributes to cybersecurity research by establishing realistic performance standards for static analysis-based malware detection, providing a systematic comparison of six machine learning algorithms under identical conditions, and showing that current ensemble methods can reach practically deployable detection rates with manageable false positive rates. The proposed approach and findings connect theoretical research to operational security considerations and offer a pragmatic way to detect new ransomware variants that traditional signature-based methods do not recognize.