While I wait for a £1.75 USB LED light to solve my cupboard illumination problem, I thought I would investigate digit recognition for an LCD screen.
Turns out some clever people before me have considered the same problem from the point of view of allowing the blind to read displays.
I found some good ideas in this publication – http://www.ski.org/rerc/HShen/Publications/embedded.pdf
It turns out the problem is distinct from classic OCR, and bespoke algorithms give better results. Both papers have Symbian implementations, so they look like a good fit for the Raspberry Pi. The second publication looks slightly easier for a novice like me.
This leads me to sketch out the following rough process for a C++ program:
- Take PGM file from uvccapture as input
- Read PGM file in C++ – http://stackoverflow.com/questions/8126815/how-to-read-in-data-from-a-pgm-file-in-c
- Then binarize as per algorithm in second article.
- Then recognise blobs as per Wikipedia pseudo code for connected component analysis – http://en.wikipedia.org/wiki/Connected-component_labeling
- Then filter (e.g. by size, height/width ratio).
- Then detect digits (look up Hough voting).
- Then check digits are sensible.
- Output a short or byte number reading (e.g. 4.4).
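As a rough illustration of the binarize, blob and filter steps in the list above, here is a minimal C++ sketch. It makes several assumptions that are mine and not from the papers: the greyscale image is already in memory (PGM parsing is left out), binarization is a plain fixed threshold standing in for the algorithm in the second article, and the blobs are found with a simple flood fill (one component at a time) in place of the two-pass pseudo code on the Wikipedia page. The names `Image`, `Blob`, `binarize`, `findBlobs` and `filterBlobs` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for illustration only.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;  // row-major, one byte per pixel
};

struct Blob {
    int minX, minY, maxX, maxY;  // inclusive bounding box
    int area;                    // number of foreground pixels
    int width() const { return maxX - minX + 1; }
    int height() const { return maxY - minY + 1; }
};

// Fixed global threshold: pixels darker than `threshold` become foreground (1).
// Assumption: dark digit segments on a light background.
Image binarize(const Image& grey, std::uint8_t threshold) {
    assert(grey.pixels.size() ==
           static_cast<std::size_t>(grey.width) * grey.height);
    Image out = grey;
    for (auto& p : out.pixels) p = (p < threshold) ? 1 : 0;
    return out;
}

// 4-connected component labelling by flood fill with an explicit stack.
std::vector<Blob> findBlobs(const Image& bin) {
    std::vector<Blob> blobs;
    std::vector<bool> seen(bin.pixels.size(), false);
    std::vector<int> stack;
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    const int total = static_cast<int>(bin.pixels.size());
    for (int start = 0; start < total; ++start) {
        if (!bin.pixels[start] || seen[start]) continue;
        const int sx = start % bin.width, sy = start / bin.width;
        Blob b{sx, sy, sx, sy, 0};
        seen[start] = true;
        stack.push_back(start);
        while (!stack.empty()) {
            const int idx = stack.back();
            stack.pop_back();
            const int x = idx % bin.width, y = idx / bin.width;
            ++b.area;
            b.minX = std::min(b.minX, x);
            b.maxX = std::max(b.maxX, x);
            b.minY = std::min(b.minY, y);
            b.maxY = std::max(b.maxY, y);
            for (int k = 0; k < 4; ++k) {
                const int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || ny < 0 || nx >= bin.width || ny >= bin.height)
                    continue;
                const int n = ny * bin.width + nx;
                if (bin.pixels[n] && !seen[n]) {
                    seen[n] = true;  // mark on push so each pixel is queued once
                    stack.push_back(n);
                }
            }
        }
        blobs.push_back(b);
    }
    return blobs;
}

// Keep blobs with enough pixels and a height/width ratio inside the given range.
std::vector<Blob> filterBlobs(const std::vector<Blob>& blobs, int minArea,
                              double minAspect, double maxAspect) {
    std::vector<Blob> kept;
    for (const Blob& b : blobs) {
        const double aspect = static_cast<double>(b.height()) / b.width();
        if (b.area >= minArea && aspect >= minAspect && aspect <= maxAspect)
            kept.push_back(b);
    }
    return kept;
}
```

Calling `filterBlobs(findBlobs(binarize(grey, 128)), 2, 1.0, 5.0)` would keep only tall-ish blobs of at least two pixels. The threshold and ratio limits here are placeholders and would need tuning against real captures of the display.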
There may be scope for adding in edge detection (as per the first publication), maybe as an extra input for blob detection or filtering. Edge detection in Linux: http://linux.about.com/library/cmd/blcmdl1_pgmedge.htm
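If edge detection were done inside the program instead of with an external tool, a Sobel filter is one common choice. The sketch below is an assumption on my part: it is not taken from the first publication, and I am not claiming it matches what pgmedge does internally. The function name `sobelMagnitude` is hypothetical. It returns |Gx| + |Gy| for each interior pixel and leaves the one-pixel border at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sobel gradient magnitude, approximated as |Gx| + |Gy|.
// `grey` is row-major, one byte per pixel. Border pixels stay at 0.
std::vector<int> sobelMagnitude(const std::vector<std::uint8_t>& grey,
                                int width, int height) {
    assert(grey.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> mag(grey.size(), 0);
    auto at = [&](int x, int y) {
        return static_cast<int>(grey[y * width + x]);
    };
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            // Horizontal gradient: right column minus left column.
            const int gx =
                (at(x + 1, y - 1) + 2 * at(x + 1, y) + at(x + 1, y + 1)) -
                (at(x - 1, y - 1) + 2 * at(x - 1, y) + at(x - 1, y + 1));
            // Vertical gradient: bottom row minus top row.
            const int gy =
                (at(x - 1, y + 1) + 2 * at(x, y + 1) + at(x + 1, y + 1)) -
                (at(x - 1, y - 1) + 2 * at(x, y - 1) + at(x + 1, y - 1));
            mag[y * width + x] = std::abs(gx) + std::abs(gy);
        }
    }
    return mag;
}
```

Thresholding the result would give an edge map that could be compared against the blob bounding boxes, for example to reject blobs with no strong edges around them.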
This thesis also has some useful ideas about using the OpenCV library: http://repositories.tdl.org/ttu-ir/bitstream/handle/2346/ETD-TTU-2011-05-1485/LI-THESIS.pdf?sequence=1 (although I don’t think Tesseract would work very well, and it hasn’t been ported to the Pi as far as I am aware). However, for now, loading 3GB of code (for OpenCV) may be overkill for my task.