character_recognition / NaturalTextNet / 0.2.1

This OCR algorithm can extract text not only from documents but also from natural scene images. It is particularly well suited to noisy images. Give it a try, and don't forget to leave a comment if you like it or have a suggestion.

Note: This algorithm works best when the image contains only text on a single line. It's recommended to crop everything else out of the image first, for example with a text detection algorithm.
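To illustrate the cropping step, here is a minimal sketch that turns a detected text box into a crop rectangle with a little padding, clamped to the image bounds. The function name `padded_crop_box`, the `(left, top, right, bottom)` box layout, and the default padding are assumptions made for this example; they are not part of this algorithm or of any particular text detection algorithm.

```python
def padded_crop_box(box, image_size, pad=4):
    """Expand a text box by `pad` pixels per side without leaving the image.

    box        -- (left, top, right, bottom) in pixels (assumed layout)
    image_size -- (width, height) of the full image in pixels
    """
    left, top, right, bottom = box
    width, height = image_size
    return (
        max(0, left - pad),        # clamp to the left edge
        max(0, top - pad),         # clamp to the top edge
        min(width, right + pad),   # clamp to the right edge
        min(height, bottom + pad), # clamp to the bottom edge
    )


# Hypothetical box for one line of text in a 120x52 image.
print(padded_crop_box((10, 20, 110, 50), (120, 52)))
```

The resulting tuple can be handed to whatever image library does the actual cropping before the cropped image is hosted and passed to this algorithm.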

I/O

{  
   "image": String
}
  • image - (required) - a hosted image file; may be a web URL (http, https) or a data connector URI (data://, s3://, etc.).

Alternatively, you can pass a URL directly to the algorithm as a string.
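Since both input forms are accepted, a caller may want to normalize them before sending a request. The sketch below is one way to do that; `build_input` is a hypothetical helper written for this example, not part of the algorithm, and its only check is that the value has a URL scheme.

```python
from urllib.parse import urlparse


def build_input(image):
    """Return the JSON-object form of the input.

    Accepts either a bare URL/URI string or a dict of the form
    {"image": <string>}, mirroring the two input styles described above.
    """
    if isinstance(image, str):
        payload = {"image": image}
    elif isinstance(image, dict) and isinstance(image.get("image"), str):
        payload = {"image": image["image"]}
    else:
        raise ValueError('expected a URL string or {"image": <string>}')

    # A web URL or data connector URI always has a scheme (http, data, s3, ...).
    if not urlparse(payload["image"]).scheme:
        raise ValueError("image must be a web URL or a data connector URI")
    return payload


print(build_input("http://i.imgur.com/qaaIRVa.jpg"))
```

The returned dictionary can then be serialized to JSON and sent with whichever client you use to call the algorithm.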

Examples

Example 1 - Restaurant sign

Input

{"image":"http://i.imgur.com/qaaIRVa.jpg"}

Output

{"prediction":"toastbox"}

Example 2 - Rusty Stop sign

Input

{"image":"http://i.imgur.com/OKhDnVt.gif"}

Output

{"prediction":"grill"}

FAQ

Question: Is this the perfect general-purpose OCR algorithm?

No, it has flaws just like any other OCR tool on the market. It can only parse single lines of text, and only when the image is cropped appropriately; don't expect to throw it an uncropped stop sign and have it return "stop". Because of how the underlying model was trained, there is also a character limit beyond which the results start to get jumbled. This starts to set in at around 8-9 characters.
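One simple way to act on the character limit is to flag long predictions for review. The following is a minimal sketch; the threshold of 8 is an assumption taken from the rough figure above, and `needs_review` is a hypothetical helper, not something the algorithm provides.

```python
# Assumed threshold, based on the approximate 8-9 character limit noted above.
RELIABLE_MAX_CHARS = 8


def needs_review(prediction, max_chars=RELIABLE_MAX_CHARS):
    """Return True when a prediction is longer than the assumed reliable length."""
    return len(prediction) > max_chars


for text in ("toastbox", "grill", "international"):
    print(text, needs_review(text))
```

A flagged prediction is not necessarily wrong; the check only marks results long enough that jumbling becomes more likely.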

Question: In that case, what is this algorithm good for?

This algorithm is great at extracting text from natural scene images, or from images that are too noisy for an algorithm like Tesseract to understand. Tesseract is generally more accurate on less noisy images, however, which makes this algorithm an excellent fallback.
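The fallback arrangement can be sketched as a small wrapper that tries a primary OCR engine first and only calls the second one when the first result looks unusable. Everything here is illustrative: `ocr_with_fallback`, the `primary` and `fallback` callables, and the default "non-empty text" acceptance rule are assumptions for this example, and the stand-in functions do not call Tesseract or this algorithm.

```python
def ocr_with_fallback(image, primary, fallback,
                      is_acceptable=lambda text: bool(text.strip())):
    """Run `primary` on the image; use `fallback` if its result is rejected.

    primary, fallback -- callables taking the image and returning a string
    is_acceptable     -- placeholder rule deciding whether to keep the
                         primary result (default: any non-blank text)
    Returns a (text, source) tuple, where source names the engine used.
    """
    text = primary(image)
    if is_acceptable(text):
        return text, "primary"
    return fallback(image), "fallback"


# Stand-ins for real OCR calls, for demonstration only.
def fake_primary(image):
    return ""  # pretend the primary engine found nothing in a noisy image


def fake_fallback(image):
    return "toastbox"


print(ocr_with_fallback("http://i.imgur.com/qaaIRVa.jpg", fake_primary, fake_fallback))
```

In practice the acceptance rule would likely be stricter than a blank check, for instance a confidence score if the primary engine exposes one.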

Question: Is this algorithm available in my language?

At present, the underlying Torch model has been trained exclusively on English. However, if your language closely resembles English, you should get acceptable results. For an OCR algorithm that works in most languages, take a look at Tesseract.

Credits

This algorithm uses the model described in "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition" for most of its processing. The code can be found here.

All images are sourced from the Wikimedia Foundation under a Creative Commons license.