Optical character recognition
Shrestha, Pramoj (2018)
Turun ammattikorkeakoulu
All rights reserved
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2018061213671
Abstract
Optical character recognition (OCR) is the process of extracting characters from a digital image. The idea is to take a document in image or PDF format, extract the characters it contains, and present them to the user in an editable form.
The author of this thesis tested an artificial neural network (ANN), a mathematical model inspired by the functioning of the human brain, trained with the back-propagation algorithm on test-case files containing letters of the English alphabet. The purpose of the thesis was to test systems capable of recognizing English letters in different fonts, and to become familiar with ANNs and digital image processing and apply them to character recognition.
Scientific journals and reports were used to research the information required for the thesis project. The chosen software was then trained and tested with image files of both computer-generated and handwritten letters. The tests showed that the OCR software can recognize both kinds of letters and that its recognition improves with each training iteration.
The study shows that, although the system needs more training for handwritten characters than for computer fonts, using an ANN in OCR is of great benefit and allows quicker and better character recognition.
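As a rough illustration of the approach the abstract describes, the following minimal sketch trains a small neural network with back-propagation to recognize three letters. It is not the software tested in the thesis: the 5x3 bitmaps, the network size, the learning rate, and the function names `train` and `recognize` are all assumptions chosen for the example.

```python
import math
import random

# Hypothetical 5x3 binary bitmaps (row by row) for three letters.
# Illustrative data only; not taken from the thesis.
GLYPHS = {
    "I": [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1],
    "L": [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
}
LABELS = sorted(GLYPHS)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(rng, n_in, n_out):
    # One weight row per unit; the last entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward_layer(weights, inputs):
    # zip stops at len(inputs), so the bias (row[-1]) is added separately.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def train(epochs=2000, rate=0.5, hidden_units=6, seed=0):
    rng = random.Random(seed)
    w_hidden = init_layer(rng, 15, hidden_units)
    w_out = init_layer(rng, hidden_units, len(LABELS))
    losses = []  # summed squared error per epoch
    for _ in range(epochs):
        total = 0.0
        for index, label in enumerate(LABELS):
            pixels = GLYPHS[label]
            target = [1.0 if i == index else 0.0 for i in range(len(LABELS))]
            hidden = forward_layer(w_hidden, pixels)
            output = forward_layer(w_out, hidden)
            total += sum((t - o) ** 2 for t, o in zip(target, output))
            # Output deltas for the error 1/2 * sum((t - o)^2) with sigmoid units.
            d_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
            # Hidden deltas: propagate the output deltas back through w_out
            # before those weights are updated.
            d_hidden = [
                h * (1 - h) * sum(d_out[k] * w_out[k][j]
                                  for k in range(len(d_out)))
                for j, h in enumerate(hidden)
            ]
            # Gradient-descent updates for both layers.
            for k, row in enumerate(w_out):
                for j, h in enumerate(hidden):
                    row[j] -= rate * d_out[k] * h
                row[-1] -= rate * d_out[k]
            for j, row in enumerate(w_hidden):
                for i, x in enumerate(pixels):
                    row[i] -= rate * d_hidden[j] * x
                row[-1] -= rate * d_hidden[j]
        losses.append(total)
    return w_hidden, w_out, losses


def recognize(w_hidden, w_out, pixels):
    # Pick the label whose output unit has the highest activation.
    output = forward_layer(w_out, forward_layer(w_hidden, pixels))
    return LABELS[output.index(max(output))]


if __name__ == "__main__":
    w_hidden, w_out, losses = train()
    print("first epoch error:", losses[0], "last epoch error:", losses[-1])
    for label, pixels in GLYPHS.items():
        print(label, "->", recognize(w_hidden, w_out, pixels))
```

Comparing the first and last entries of `losses` shows how the error changes as training proceeds. A real OCR system would also need image preprocessing and far more varied training data, particularly for handwritten characters.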