Optical Character Recognition

The aim of this project is to use the Raspberry Pi camera module to capture images and then scan them for text using OCR [1].


Install the OCR and image manipulation tools:

$ sudo apt-get install tesseract-ocr imagemagick

Tesseract is the OCR tool and ImageMagick is a very powerful suite of image manipulation tools.
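
Before going further it can help to confirm that both tools are actually on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not part of either package.

```shell
# Print the name of every listed command that is not found on the PATH.
# missing_tools is a hypothetical helper, not something the packages provide.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools tesseract convert` should print nothing once both packages are installed; any name it prints still needs installing.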

Basic OCR

Provided the characters have good contrast, the basic functionality of tesseract works well.

Images: UK front numberplate; UK numberplate font for rear plates

$ tesseract numberplate_UK_front.JPG numberplate_UK_front
$ cat numberplate_UK_front.txt
$ tesseract numberplate_UK_rear.JPG numberplate_UK_rear
$ cat numberplate_UK_rear.txt

The accuracy isn't terrible considering that tesseract is using only its default training data.
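
With more than a couple of images, the tesseract/cat pair can be wrapped in a loop. This is a rough sketch only; `ocr_images` is a hypothetical name, and it assumes each file name has an extension to strip.

```shell
# OCR each image given as an argument and print the recognised text.
# ocr_images is a hypothetical helper name.
ocr_images() {
    for image in "$@"; do
        base="${image%.*}"      # strip the extension: plate.JPG -> plate
        # tesseract adds ".txt" to the output base name itself
        tesseract "$image" "$base" >/dev/null 2>&1 || {
            echo "tesseract failed on $image" >&2
            continue
        }
        echo "== $image"
        cat "$base.txt"
    done
}
```

For the two plates above this would be called as `ocr_images numberplate_UK_front.JPG numberplate_UK_rear.JPG`.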

Image Manipulation

Image: RPi camera image from Motion

If this image is passed to tesseract, nothing is recognised.

$ tesseract 04-20130623173123-00.jpg 04-20130623173123-00

The resulting txt file is empty, indicating that nothing was detected.

One improvement is to increase the contrast by thresholding the image:

$ convert 04-20130623173123-00.jpg -threshold 20% 04-20130623173123-00bw.jpg

Image: BW image

However, this doesn't work either!

$ tesseract 04-20130623173123-00bw.jpg 04-20130623173123-00bw
$ cat 04-20130623173123-00bw.txt
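
The right threshold depends on the image, so rather than guessing a single value it may be worth trying a few. The sketch below is only an illustration: `threshold_sweep` is a hypothetical name, the `_tNN` output naming is my own, and the character count is a crude signal that something was read, not a measure of accuracy.

```shell
# Threshold an image at several levels and report how much text
# tesseract finds at each one. threshold_sweep is a hypothetical helper.
threshold_sweep() {
    src="$1"; shift
    base="${src%.*}"
    for level in "$@"; do
        out="${base}_t${level}"     # e.g. image_t20 (naming is an assumption)
        convert "$src" -threshold "${level}%" "$out.jpg" || continue
        tesseract "$out.jpg" "$out" >/dev/null 2>&1 || continue
        # Count the non-whitespace characters in the OCR output.
        chars=$(tr -d '[:space:]' < "$out.txt" | wc -c | tr -d '[:space:]')
        echo "threshold ${level}%: $chars characters"
    done
}
```

For example, `threshold_sweep 04-20130623173123-00.jpg 20 40 60 80` writes one thresholded copy per level and prints a line for each.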

Reducing the scope

Perhaps the OCR programme can't cope with the clutter around the text. The next attempt was to crop the image down to the text, saved as 04-20130623173123-00bwcut.jpg.

$ tesseract 04-20130623173123-00bwcut.jpg 04-20130623173123-00bwcut
$ cat 04-20130623173123-00bwcut.txt
um Km}

Not a terrific result; there is definite room for improvement.
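
The crop itself can be done with 'convert' as well. This is a sketch under assumptions: `crop_for_ocr` is a hypothetical name and the geometry in the usage line is made up, since the region actually used for the image above isn't recorded here.

```shell
# Crop a region out of an image and print the name of the cropped file.
# Geometry is WIDTHxHEIGHT+XOFFSET+YOFFSET in pixels.
# crop_for_ocr is a hypothetical helper name.
crop_for_ocr() {
    src="$1"
    geometry="$2"
    cut="${src%.*}cut.jpg"      # image-00bw.jpg -> image-00bwcut.jpg
    # +repage resets the canvas so the result is just the cropped area
    convert "$src" -crop "$geometry" +repage "$cut" && printf '%s\n' "$cut"
}
```

Something like `crop_for_ocr 04-20130623173123-00bw.jpg 300x80+150+200` (illustrative numbers only) would write 04-20130623173123-00bwcut.jpg, ready to pass to tesseract.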

Document Scanning with the Raspberry Pi Camera

  1. Take a photo of the document page. This results in a colour picture.
  2. Threshold the image with 'convert' to produce a B&W image.
  3. Pass the image through 'tesseract'.
  4. Review the results.

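This can be done in one go with something like:

$ raspistill -o image.jpg && convert image.jpg -threshold 20% imagebw.jpg && tesseract imagebw.jpg imagebw && cat imagebw.txt

The steps are chained with && so that each one runs only after the previous one has finished successfully. A pipe (|) would start all four at once, before the files they need exist.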
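
For repeated scans the same chain could be wrapped in a small function. This is a sketch rather than a finished script; `scan_page` and its two optional arguments (base name and threshold percentage) are hypothetical.

```shell
# Capture a page, threshold it, OCR it and print the text.
# Each step runs only if the previous one succeeded.
# scan_page and its arguments are hypothetical, for illustration.
scan_page() {
    name="${1:-image}"      # base name for the files (assumed default)
    level="${2:-20}"        # threshold percentage
    raspistill -o "$name.jpg" &&
    convert "$name.jpg" -threshold "${level}%" "${name}bw.jpg" &&
    tesseract "${name}bw.jpg" "${name}bw" >/dev/null 2>&1 &&
    cat "${name}bw.txt"
}
```

`scan_page page1 70` would then leave page1.jpg, page1bw.jpg and page1bw.txt behind and print the recognised text.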

Test Case

Here I'm using a scanned image from the internet:

$ wget http://www.mattmahoney.net/ocr/stock_gs200.jpg
$ convert stock_gs200.jpg -threshold 70% stock_gs200bw.jpg
$ tesseract stock_gs200bw.jpg stock_gs200bw
$ cat stock_gs200bw.txt


.; USA Track your invesmwnts with our continuously
Iouqy updated stocks. Vlsit us on the web at
~‘°'“ mnney.usatoday.mm

This is reasonably good, with room for improvement.


  1. en.wikipedia.org/wiki/OCR