
Extracting and Structuring Text from Photos of Receipts and Notes

Leskinen, Waltteri (2025)

 
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2025053018626
Abstract
This study examined how to extract and structure text from photos of receipts
and notes, in order to address the challenges of digitizing handwritten,
unstructured documents. It was motivated by the prevalence of paper-based
records in business environments, especially in less developed regions.

Traditional OCR models struggle to process irregular handwriting and
unstructured formatting. Because of these limitations, the thesis sought to
improve the accuracy of these models by integrating human validation into the
OCR workflow.

The objective of the thesis was to develop a system that utilizes an existing OCR
engine combined with a user-friendly interface for manual correction. The aim
was to convert unstructured data into a more structured digital format that would
be more suited for digital recordkeeping.

A multi-phase methodology was used in which text extraction was performed
using established OCR tools, followed by human verification through a graphical
user interface. The backend was developed in Python, and the frontend was
developed in TypeScript using the Next.js framework. Various OCR models were
also evaluated.
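
The two phases described above (automatic extraction, then human verification) can be illustrated with a minimal Python sketch. It is not the implementation from the thesis: the `OcrLine` type, the `flag_for_review` and `apply_corrections` helpers, the 0.80 confidence threshold, and the sample values are all hypothetical, and the OCR step is represented only by pre-filled results rather than a call to a real engine.

```python
from dataclasses import dataclass


@dataclass
class OcrLine:
    """One recognized line; confidence is assumed to be in the range 0.0 to 1.0."""
    text: str
    confidence: float


def flag_for_review(lines, threshold=0.80):
    """Return the indices of lines whose confidence is below the threshold."""
    return [i for i, line in enumerate(lines) if line.confidence < threshold]


def apply_corrections(lines, corrections):
    """Return a new list where human-corrected lines replace the OCR output.

    corrections maps a line index to the text entered by the reviewer.
    Corrected lines are marked as fully trusted (confidence 1.0).
    """
    result = []
    for i, line in enumerate(lines):
        if i in corrections:
            result.append(OcrLine(corrections[i], 1.0))
        else:
            result.append(line)
    return result


# Hypothetical OCR output for a two-line receipt photo.
ocr_output = [
    OcrLine("Milk 2.50", 0.95),
    OcrLine("Br3ad 1,2O", 0.41),
]

to_review = flag_for_review(ocr_output)  # only the second line is flagged
verified = apply_corrections(ocr_output, {1: "Bread 1.20"})
```

In a setup like this, only the flagged lines would need to be shown to the user in the correction interface, while high-confidence lines pass through unchanged.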

The study showed that, when integrated with Human-in-the-Loop validation,
traditional OCR systems can give satisfactory results on unstructured data and
make such data significantly easier to digitize. Improved reliability in
document digitization was also shown, with potential benefits for future
research and workflows.