What preprocessing operations are performed by Tesseract OCR?


I couldn't find detailed documentation, and I don't feel like browsing the source code. I don't want to redo, for example, Canny edge detection if it is already done by the Tesseract engine.

This document provides an overview of the engine: https://github.com/tesseract-ocr/docs/blob/master/tesseracticdar2007.pdf

So it looks like you don't need to implement Canny edge detection yourself.

Tesseract uses Otsu thresholding to binarize the image before processing it: https://github.com/tesseract-ocr/tesseract/blob/master/ccstruct/otsuthr.h
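To give a rough idea of what Otsu thresholding does, here is a minimal pure-Python sketch. It is a textbook version of the method operating on a flat list of 8-bit grayscale values, not Tesseract's own code (that is C++ and may differ in details such as how the image is split up). The names otsu_threshold and binarize are made up for this example.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t are treated as one class, the rest as the other.
    `pixels` is a flat iterable of integers in the range 0..255.
    """
    pixels = list(pixels)
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # nothing in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break               # nothing left in the upper class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 if it is above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    sample = [10] * 50 + [200] * 50   # two clearly separated gray levels
    t = otsu_threshold(sample)
    print(t, binarize([10, 200], t))
```

The point is only that the threshold is picked automatically from the histogram, so a separate manual thresholding step before calling Tesseract is usually not required.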

Edit: If you want to see the binarized image, create a new config file in "\tessdata\configs\", add the line: tessedit_write_images true, and then process your image: tesseract your_image out your_config_file. Tesseract saves the binarized image as tessinput.tif.
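The steps from the edit can also be scripted. The following is only a sketch: the config file name write_images and the two helper functions are hypothetical, and the configs directory has to be the one inside your own tessdata folder. The script only writes the config file and builds the command line; it does not run Tesseract itself.

```python
import os
import tempfile


def write_debug_config(configs_dir, name="write_images"):
    """Create a config file that asks Tesseract to save the binarized image.

    `name` is a hypothetical file name; any name works as long as the same
    name is passed on the command line.
    """
    path = os.path.join(configs_dir, name)
    with open(path, "w") as f:
        f.write("tessedit_write_images true\n")
    return path


def build_command(image_path, output_base, config_name):
    """Build the command: tesseract your_image out your_config_file."""
    return ["tesseract", image_path, output_base, config_name]


if __name__ == "__main__":
    # A temporary directory stands in for the real tessdata configs folder
    # (assumption: replace it with your own installation path).
    with tempfile.TemporaryDirectory() as configs_dir:
        print(write_debug_config(configs_dir))
    print(" ".join(build_command("your_image", "out", "write_images")))
```

To actually run it, pass the list from build_command to subprocess.run and then look for tessinput.tif.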

