OCR_RegionMode


OCR_RegionMode sets the region mode of the OCR process. Page segmentation, one of the first stages of the OCR process, determines the layout of text lines and paragraphs. Choosing a region mode that matches a layout known ahead of time can improve the accuracy of this stage, for example in zonal OCR where each zone contains a single line or word. If full-page OCR is being performed then automatic layout analysis (OCR_Auto) is often the best setting to use. Please note that constants are case-sensitive:

 

 

CONSTANT                   VALUE   MEANING

OCR_Auto                   1       The OCR Module determines text layout automatically. This is the best option for most tasks, especially when the format of input data is not known.

OCR_SingleColumn           4       A single column of text.

OCR_VerticalText           5       Vertical text (horizontal upright characters arranged in a vertical line).

OCR_Block                  6       A paragraph of text.

OCR_Line                   7       A single line of input text.

OCR_Word                   8       A single word.

OCR_Symbol                 10      One character.

OCR_AutoRotateImageOnly    999     This constant is a special flag that OCR_MakeSearchable uses. It specifies that recognition is used only to straighten/rotate input pages and then output them to a new PDF document (OCR is not performed). New documents will be copies of originals (or the subset that the input PXO_Pagelist specified) that feature pages optimized to best fit horizontal lines of text. This is a useful feature for the pre-processing of image-based PDFs prior to performing other tasks, such as zonal OCR.
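
The constants and values listed above can be mirrored in calling code so that a region mode is chosen by name rather than by raw number. The following Python sketch is illustrative only and is not part of the SDK: the class OCRRegionMode, the helper choose_region_mode and the layout hint strings are hypothetical names introduced for this example. Only the constant names and numeric values are taken from this topic.

```python
from enum import IntEnum


class OCRRegionMode(IntEnum):
    """Hypothetical mirror of the OCR_RegionMode constants (values from this topic)."""
    OCR_Auto = 1
    OCR_SingleColumn = 4
    OCR_VerticalText = 5
    OCR_Block = 6
    OCR_Line = 7
    OCR_Word = 8
    OCR_Symbol = 10
    OCR_AutoRotateImageOnly = 999


# Hypothetical layout hints; the keys are assumptions made for this example.
_LAYOUT_HINTS = {
    "column": OCRRegionMode.OCR_SingleColumn,
    "vertical": OCRRegionMode.OCR_VerticalText,
    "paragraph": OCRRegionMode.OCR_Block,
    "line": OCRRegionMode.OCR_Line,
    "word": OCRRegionMode.OCR_Word,
    "character": OCRRegionMode.OCR_Symbol,
}


def choose_region_mode(layout_hint=None):
    """Return the mode for a known layout, falling back to OCR_Auto when unknown."""
    if layout_hint is None:
        return OCRRegionMode.OCR_Auto
    return _LAYOUT_HINTS.get(layout_hint.lower(), OCRRegionMode.OCR_Auto)


def mode_from_name(name):
    """Look up a constant by its exact name; the lookup is case-sensitive."""
    return OCRRegionMode[name]  # raises KeyError for e.g. "ocr_line"
```

In this sketch, a zone known to hold a single line would map to choose_region_mode("line"), and its integer value (7) is what would be passed wherever the SDK expects an OCR_RegionMode value. Falling back to OCR_Auto for unrecognized hints follows the guidance above that automatic layout analysis suits input of unknown format.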