What preprocessing operations are performed by Tesseract OCR?

- September 15, 2011

I couldn't find any detailed documentation, and I don't feel like browsing the source code. I don't want to redo, for example, Canny edge detection if it is already done by the Tesseract engine.

This document provides an overview of the engine: https://github.com/tesseract-ocr/docs/blob/master/tesseracticdar2007.pdf

So it looks like you don't need to implement Canny edge detection.

Tesseract uses Otsu thresholding to binarize the image before processing: https://github.com/tesseract-ocr/tesseract/blob/master/ccstruct/otsuthr.h
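To give a feel for what Otsu thresholding does, here is a minimal pure-Python sketch of the textbook algorithm on a flat list of 8-bit grayscale values. This is not Tesseract's own code; the function names `otsu_threshold` and `binarize` are made up for illustration, and the real implementation in otsuthr.h may differ in details.

```python
def otsu_threshold(pixels):
    """Return threshold t maximizing between-class variance.

    pixels: iterable of ints in 0..255. Values <= t form the dark class.
    """
    pixels = list(pixels)
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0        # number of pixels in the dark class so far
    sum0 = 0      # sum of gray values in the dark class so far
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break             # bright class empty, nothing left to split
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, t):
    """Map values above t to 255 (white) and the rest to 0 (black)."""
    return [255 if p > t else 0 for p in pixels]
```

For an image with two clearly separated gray levels, the threshold lands between them, so the dark pixels become 0 and the bright ones 255.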

Edit: If you want to see the binarized image, create a new config file in "\tessdata\configs\", add the line: tessedit_write_images true, and then process the image: tesseract your_image out your_config_file. Tesseract saves the binarized image as tessinput.tif.
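The steps in the edit can be scripted. The following is only a sketch: the helper names `write_debug_config` and `tesseract_command` are hypothetical, the config directory is whatever your installation uses, and the command is built but not executed here.

```python
from pathlib import Path


def write_debug_config(configs_dir, name="your_config_file"):
    """Create a config file containing the single line from the answer."""
    path = Path(configs_dir) / name
    path.write_text("tessedit_write_images true\n")
    return path


def tesseract_command(image, output_base, config_name):
    """Build the command line quoted in the answer as an argument list."""
    return ["tesseract", image, output_base, config_name]
```

You could pass the resulting list to `subprocess.run` once the config file sits in your tessdata configs directory.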
