Extracting and Structuring Text from Photos of Receipts and Notes
Leskinen, Waltteri (2025)
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2025053018626
Abstract
In this study, extracting and structuring text from photos of receipts and notes
was examined to address the challenges in digitizing handwritten, unstructured
documents. This study was motivated by the prevalence of paper-based records
in business environments, especially in less developed regions.
Traditional OCR models struggle to process irregular handwriting and
unstructured formatting. Because of these limitations, the thesis sought to
improve the accuracy of these models by integrating human validation into the
OCR workflow.
The objective of the thesis was to develop a system that utilizes an existing OCR
engine combined with a user-friendly interface for manual correction. The aim
was to convert unstructured data into a more structured digital format that would
be better suited to digital recordkeeping.
A multi-phase methodology was used in which text extraction was performed
using established OCR tools, followed by human verification through a graphical
user interface. The backend was developed with Python, and the frontend was
developed in TypeScript using the Next.js framework. Various OCR models were
also evaluated.
The study showed that, when integrated with human-in-the-loop validation,
traditional OCR systems can give satisfactory results on unstructured data and
make such data significantly easier to digitize. Improved reliability in
document digitization was also shown, with potential benefits for future
research and workflows.