Using Open Source LLM Model for Medical Transcription

Chowdhury, Mohammed Nowshad Ruhani

Chowdhury, Mohammed Nowshad Ruhani (2025)

The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025052716517

Abstract

In modern healthcare, clinical documentation is paramount for patient safety, accurate diagnoses, and continuity of care. However, the growing overhead of electronic health record (EHR) systems has contributed to physician burnout and leaves less time for real human interaction. The challenge is even greater in less-resourced languages such as Finnish, for which natural language processing (NLP) tools are only beginning to emerge. This thesis investigates fine-tuning the open-source LLaMA 3.1–8B language model on simulated Finnish clinical conversations, that is, transcribed clinical dialogues created by Metropolia UAS students. The aim is to verify whether a domain-aligned large language model (LLM) can reliably convert spoken Finnish medical discourse into formal clinical reports. Under 7-fold cross-validation, the fine-tuned model achieved a BLEU score of 0.1242, a ROUGE-L score of 0.4982, and a BERTScore F1 of 0.8373. These results indicate satisfactory semantic performance on a small dataset and suggest that privacy-oriented NLP tools can scale to Finnish medical environments.
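To make the evaluation setup in the abstract more concrete, the following is a minimal Python sketch of two of its ingredients: splitting a dataset into 7 folds and scoring a generated report against a reference with a ROUGE-L style F1. It is an illustration only, not the thesis's implementation. The function names (`lcs_length`, `rouge_l_f1`, `k_fold_indices`), the whitespace tokenization, the contiguous fold layout, and the example strings are all assumptions made for this sketch; the abstract does not say which libraries or settings were used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L style F1 from LCS-based precision and recall.

    Assumption: lowercased whitespace tokens; real toolkits may
    tokenize, stem, or weight precision and recall differently.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items, k=7):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: no shuffling; the first n_items % k folds get one
    extra item so every item is tested exactly once.
    """
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


if __name__ == "__main__":
    # Hypothetical strings, not taken from the thesis dataset.
    reference = "patient reports chest pain since monday"
    candidate = "patient reports pain in chest since monday"
    print(round(rouge_l_f1(reference, candidate), 4))
    for train, test in k_fold_indices(15, k=7):
        print(len(train), test)
```

In a cross-validation run, a model would be fine-tuned on each `train` split and the metric averaged over the reports generated for the matching `test` split. BLEU and BERTScore would be computed with their own tooling and are not reproduced here.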

Collections

Theses