Vidyut Singhania
Divyanshu Sagar
Ahmed Zaid
Contents
Problem Definition
Introduction to OCR
Applications of OCR
Platform Used
Steps in OCR
Working of OCR
Future Enhancements & Prospects
Summary
 Humans inevitably make errors, especially when performing
mundane, repetitive tasks such as digitization or security
monitoring for long stretches.
 Digits are often hard to perceive because of motion, poor
digit clarity, poor illumination and similar factors.
 These problems are what primarily led us to this topic.
1 2 3 4 5 6 7 8 9 0
1. An ingenious piece of software.
2. Involves the mechanical or electronic
conversion of scanned images of
typewritten or printed text into
machine-encoded (computer-readable) text.
3. Heavily used in industry.
 Common method of digitizing printed texts
 Software that is as often overlooked as it is simple.
 Numerous applications: editing, scanning, searching,
comparison, compact storage and many more.
 OCR is a field of research in pattern
recognition, artificial intelligence and computer
vision.
OCR
 Translation of colour images into machine-readable
format
 Conversion of printed or written digits to
machine-legible form
 Reduction of human error
 Reduction of human effort on daily mundane tasks
 MATLAB, a high-level, cross-platform,
multi-paradigm programming language.
 Releases used: MATLAB R2013a (8.1) and MATLAB R2014a (8.3)
Pre-processing
Glyph Recognition
Classification
App-specific optimization
Pre-Processing
Feature extraction
Classification
 Improves the quality of the image so that the system
can recognize it better.
 Consists of noise removal, deblurring,
binarization and edge detection.
 Accepts any image, synthetic or handwritten, of any
size, in any format supported by MATLAB.
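
The noise-removal step can be illustrated with a small median filter. This is a minimal sketch in plain Python rather than the MATLAB used in the project; the function name `median_filter`, the 3x3 window and the border handling (edge pixels are replicated) are assumptions made for illustration only.

```python
from statistics import median

def median_filter(image, size=3):
    """Filter a 2-D grayscale image (list of rows) with a size x size median window."""
    height, width = len(image), len(image[0])
    half = size // 2
    output = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            # Collect the neighbourhood, clamping coordinates at the borders
            # (assumed border handling: replicate the nearest edge pixel).
            window = [
                image[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            output[r][c] = median(window)
    return output

# A flat grey patch with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [100, 100, 100, 100],
    [100, 255, 100, 100],
    [100, 100,   0, 100],
    [100, 100, 100, 100],
]
clean = median_filter(noisy)
```

Because isolated salt or pepper pixels are never the middle value of their neighbourhood, the median discards them while leaving flat regions unchanged.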
 Feature extraction transforms the input data into a
set of features.
 It is performed on the raw data before the k-NN
algorithm is applied to the transformed data in
feature space.
 Feature extraction serves two purposes: first, to
extract properties that identify a character
uniquely; second, to extract properties that
differentiate between similar characters.
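
As one possible illustration of turning a glyph into a feature vector, the sketch below uses zoning: the glyph is divided into blocks and the fraction of ink in each block becomes a feature. The slides do not say which features the project used, so the zoning scheme, the function name `zone_features` and the toy 4x4 glyphs are all assumptions.

```python
def zone_features(glyph, zones=2):
    """Split a binary glyph (rows of 0/1) into zones x zones blocks
    and return the ink fraction of each block, row by row."""
    height, width = len(glyph), len(glyph[0])
    features = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * height // zones, (zr + 1) * height // zones
            c0, c1 = zc * width // zones, (zc + 1) * width // zones
            ink = sum(glyph[r][c] for r in range(r0, r1) for c in range(c0, c1))
            features.append(ink / ((r1 - r0) * (c1 - c0)))
    return features

# Toy 4x4 glyphs (hypothetical): a vertical bar ("1") and a ring ("0").
one = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
zero = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
features_one = zone_features(one)
features_zero = zone_features(zero)
```

The two glyphs map to clearly different points in feature space, which is what lets a classifier tell them apart.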
Eg
 Once the features are extracted, we can train a
neural network on training data whose true classes
are already known. After training, recognizing a new
scanned image involves:
1. Reading the image
2. Segmenting the image into lines
3. Segmenting each line into glyphs
4. Classifying each glyph by extracting its feature
set and using the neural network to predict its class.
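
Steps 2 and 3 above can be sketched with projection profiles: rows that contain no ink separate lines, and columns that contain no ink separate glyphs within a line. The slides do not describe the segmentation method actually used, so this approach and the names `runs` and `segment` are assumptions.

```python
def runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment(page):
    """Split a binary page (rows of 0/1, 1 = ink) into lines, then each line into glyphs."""
    glyphs = []
    row_profile = [sum(row) for row in page]          # ink per row
    for top, bottom in runs(row_profile):
        line = page[top:bottom]
        col_profile = [sum(row[c] for row in line) for c in range(len(line[0]))]
        for left, right in runs(col_profile):         # ink per column within the line
            glyphs.append([row[left:right] for row in line])
    return glyphs

# Toy page (hypothetical): two glyphs on the first line, one on the second.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
glyphs = segment(page)
```

Each returned glyph is a cropped binary sub-image that can then be passed to feature extraction and classification.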
WORKING OF OCR
 Image Acquisition:
Take any image whose format is supported by
MATLAB.
Steps as follows:
 Noise Removal: Add salt-and-pepper noise to the
image, then remove it with a median filter
 Deblurring: Wiener deconvolution filter
 Conversion of Image:
The resulting image is converted to a binary image
using Otsu's method via the graythresh() function
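
To show what Otsu's method computes, here is a minimal plain-Python sketch that picks the grey level maximizing the between-class variance. It is an illustration, not MATLAB's implementation: `otsu_threshold` is a hypothetical name, and it returns an integer grey level, whereas graythresh() returns a normalized level between 0 and 1.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance;
    pixels with value > t are treated as foreground."""
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_t, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += histogram[t]            # number of pixels with value <= t
        sum_bg += t * histogram[t]           # sum of those pixel values
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg        # number of pixels with value > t
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:         # keep the first level reaching the maximum
            best_t, best_variance = t, variance
    return best_t

# Hypothetical pixel values: a dark cluster near 10 and a bright cluster near 200.
pixels = [10, 12, 11, 200, 205, 198, 10, 202]
threshold = otsu_threshold(pixels)
binary = [1 if p > threshold else 0 for p in pixels]
```

The threshold falls between the two clusters, so the dark pixels map to 0 and the bright pixels to 1 without any manually chosen cut-off.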
 Edge Detection: Three different filters are applied
separately to the binary image:
a. Canny edge detector
b. Prewitt edge detector
c. Zero-cross filter
The three resulting images are sent to the
second phase for character recognition.
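
Of the three detectors, Prewitt is the simplest to sketch by hand: two 3x3 kernels estimate the horizontal and vertical gradients, and pixels with a large gradient magnitude are marked as edges. The code below is an illustrative plain-Python version; the name `prewitt_edges`, the threshold value and the choice to leave border pixels unmarked are assumptions.

```python
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]    # responds to vertical edges
PREWITT_Y = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]    # responds to horizontal edges

def prewitt_edges(image, threshold=2):
    """Mark interior pixels whose Prewitt gradient magnitude exceeds the threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = sum(PREWITT_X[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(PREWITT_Y[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[r][c] = 1
    return edges

# Binary image (hypothetical): left half background (0), right half ink (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edges = prewitt_edges(image)
```

Only the pixels next to the 0-to-1 transition are marked, which is the outline information the recognition phase works from.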
Final Report on Optical Character Recognition
Method I
o Apply the k-nearest neighbour (k-NN) method to the best
edge-detected image from the previous step
 Supervised learning: we provide data sets with the correct
answers and ask the algorithm to predict the correct values
for new data on the basis of the existing data sets.
 Classification learning: the output is predicted as a discrete
value, not a continuous one.
E.g. a cancer is either malignant or benign; it cannot be both.
 Thus, k-NN is an example of supervised classification: we ask
the algorithm to identify the character as one of the
possible digits on the basis of the existing
training data sets.
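
A minimal k-NN classifier can be sketched as follows: measure the distance from the query to every labelled training vector, take the k closest, and let them vote. This is an illustration in plain Python, not the project's MATLAB code; `knn_predict`, the Euclidean distance, k = 3 and the toy feature vectors are assumptions.

```python
from collections import Counter
import math

def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to the query."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: four-element feature vectors for the digits "1" and "0".
train_features = [
    [0.50, 0.00, 0.50, 0.00],
    [0.40, 0.10, 0.50, 0.00],
    [0.50, 0.00, 0.40, 0.10],
    [0.75, 0.75, 0.75, 0.75],
    [0.70, 0.80, 0.75, 0.70],
    [0.80, 0.70, 0.70, 0.75],
]
train_labels = ["1", "1", "1", "0", "0", "0"]

prediction_a = knn_predict(train_features, train_labels, [0.45, 0.05, 0.50, 0.05])
prediction_b = knn_predict(train_features, train_labels, [0.70, 0.70, 0.80, 0.70])
```

k-NN has no separate training phase: the labelled examples themselves are the model, which is why the output is always one of the known discrete classes.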
 Using k-nearest neighbour on the MNIST data set
yields an accuracy of 96.91%, a major achievement
given that we are still novices in this field.
 We have validated the software by testing it on
numerous data items: the MNIST test set, MATLAB's
built-in image sets and numerous downloaded
scanned images.
 Currently, our engine recognizes one character at a time.
 We propose to extend it to predict multiple characters
accurately and simultaneously, enabling real-time
character recognition.
 We shall also delve further into the implementation of
neural networks and devise methods to increase our
accuracy.
 Last but not least, we shall develop a GUI to improve
usability.
 We see this engine as a stepping stone towards
the future.
 It can be implemented in an Automatic Number Plate
Recognition (ANPR) system.
 Such a system can be deployed in commercial
buildings, IT parks, and high-end or niche
buildings as a security measure and/or
as part of home automation.
 OCR technology provides fast, automated
data capture, which can save organisations
considerable time and labour costs.
 The system's advantages include automation of
mundane tasks, low time complexity, a very small
database, and high adaptability to untrained
inputs, with only a small number of features
to calculate.

Editor's Notes

  1. History: in the early 1900s, Edmund Fournier d'Albe invented the Optophone, which produced tones corresponding to different characters. OCR is the electronic conversion of a scanned image into machine-readable format.
  2. Common method for digitising printed text. Most important aspect of OCR? Why are we working on it? What are its applications? Future scope? Applications: data entry (passports, CTS), ANPR, business card readers, Google Books (making electronic images of printed documents searchable).
  3. Advantages: Widely used in academic and research institutes all over the world. Easy for beginners to understand. Well-written manuals. Partly written in Java; runs on multiple platforms. Large library of built-in algorithms for image processing.
  4. Application-specific optimization: tweaking the system to better deal with specific or different inputs. Segmentation: includes two important phases, 1) obtaining training samples and 2) recognizing new images after training. Feature extraction: features of the character are extracted and then compared with the glyph. Classification: after extraction, the neural network is trained using the training data.
  5. De-skew: if the document was not aligned properly when scanned, it may need to be tilted a few degrees clockwise or counterclockwise to make lines of text perfectly horizontal or vertical. Despeckle: remove positive and negative spots, smoothing edges. Binarization: convert an image from colour or greyscale to black-and-white (called a "binary image" because there are two colours). In some cases, this is necessary for the character recognition algorithm; in other cases, the algorithm performs better on the original image and so this step is skipped.
  6. A character can be written in a variety of ways and still be recognized easily by a human. There must therefore be a set of principles that holds across all these variations, and the features used by the system build on such properties, which are close to the psychology of the characters.
  7. System performance can be increased further by: 1) enlarging the database used for training the ANN, so that it can also recognize stylized fonts; 2) using better algorithms for training the ANN, to reduce time complexity when handling larger databases; 3) using better feature extraction techniques, to increase the precision of the results.