AI tool for scientific literature data extraction
Ronkainen, Justiina (2025)
Ronkainen, Justiina
2025
All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-2025060420152
Abstract
The aim of this project was to explore artificial intelligence (AI) by developing a tool that leverages large language models (LLMs) to extract structured information from scientific articles. Systematic literature reviews and meta-analyses depend on accurate data extraction, a process that is traditionally manual, time-consuming and error-prone. This project investigated the potential of LLMs to automate and streamline this task, focusing on extracting key study elements such as article metadata, study design, statistical methods and results.
The project was implemented using Python’s LangChain framework and the OpenAI API. Key techniques included prompt engineering, text processing and chunking to adapt article content for the LLM. The GPT-3.5 and GPT-4.1 models were tested and evaluated against each other and against a human-extracted gold standard to assess performance. The models demonstrated potential in extracting some information, such as article metadata and study design, but struggled to reliably extract all relevant results from the articles.
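To make the described pipeline concrete, the sketch below shows a minimal chunk-and-extract step of the kind the abstract outlines, assuming the langchain-openai and langchain-text-splitters packages and an OPENAI_API_KEY in the environment. The prompt wording, chunk sizes and field list are illustrative assumptions, not the project’s exact configuration.

```python
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Hypothetical extraction prompt; the project's actual prompts are not
# reproduced here.
PROMPT = (
    "You are extracting data for a systematic literature review.\n"
    "From the article text below, return JSON with these fields:\n"
    "title, authors, study_design, statistical_methods, key_results.\n"
    "Use null for any field not present in this excerpt.\n\n"
    "Article text:\n{chunk}"
)

def extract_from_article(article_text: str, model: str = "gpt-4.1") -> list[str]:
    # Split the full text into overlapping chunks that fit the model's
    # context window; the sizes here are placeholder values.
    splitter = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=200)
    chunks = splitter.split_text(article_text)

    llm = ChatOpenAI(model=model, temperature=0)  # deterministic extraction
    # Query the model once per chunk; the per-chunk JSON answers would be
    # merged and de-duplicated downstream.
    return [llm.invoke(PROMPT.format(chunk=c)).content for c in chunks]
```

Swapping the model argument between "gpt-3.5-turbo" and "gpt-4.1" is one way the two models could be run against the same gold standard for comparison.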
Despite current limitations, LLMs hold promise for automating aspects of scientific data extraction. Improvements such as section-specific extraction and fine-tuning may enhance performance and offer a scalable solution to one of the most labour-intensive steps in systematic literature reviews.
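As a hedged illustration of the section-specific extraction idea mentioned above, articles could be pre-split on common IMRaD headings so that, for example, only the Results section is passed to the extraction prompt. The heading list and regular expression below are assumptions, since real article layouts vary.

```python
import re

# Assumed section headings; real articles use many variants of these.
SECTION_HEADINGS = ["Abstract", "Introduction", "Methods", "Results", "Discussion"]

def split_by_section(article_text: str) -> dict[str, str]:
    """Split plain article text on lines consisting of an IMRaD heading."""
    pattern = r"^({})\s*$".format("|".join(SECTION_HEADINGS))
    parts = re.split(pattern, article_text, flags=re.MULTILINE | re.IGNORECASE)
    sections: dict[str, str] = {}
    current = None
    for part in parts:
        name = part.strip().title()
        if name in SECTION_HEADINGS:
            current = name  # a heading line starts a new section
        elif current is not None:
            sections[current] = sections.get(current, "") + part
    return sections

# Example: send only the Results section to the extraction step.
# results_text = split_by_section(full_text).get("Results", "")
```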
