OCR_GetText



 

OCR_GetText (PRO SDK)

 


 

OCR_GetText runs OCR on the input document, then formats the recognized text and returns it as plain text. Note that all elements are case-sensitive:

 

HRESULT OCR_GetText(

PXODocument Doc,

PXO_Options* pOptions,

BSTR* bstrTextOut,

PXO_Pagelist Pagelist = NULL,

LPWSTR delim = L"\n"

);

 

Parameters

 

Doc

Specifies the PXODocument that OCR_Init created and OCR_LoadW or OCR_LoadA loaded.

 

pOptions

This parameter is an input pointer to a PXO_Options structure that contains the OCR parameters.

 

bstrTextOut

This parameter is a pointer to the BSTR variable that receives the allocated text. The caller owns the returned string and must release it with SysFreeString() when the text is no longer needed.

 

Pagelist

This parameter is an optional input PXO_Pagelist structure that lists the PDF pages to include in the OCR. If it is NULL, the function performs OCR on all pages of the document.

 

delim

This parameter is an optional text delimiter that is inserted between recognized pages of text. The default value is L"\n", which is a newline.
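
The effect of the delimiter can be illustrated with a small stand-alone sketch. JoinPages is a hypothetical helper, not part of the SDK, and it assumes the delimiter is placed only between pages and not after the last one. The documentation above does not state how the final page is terminated, so treat that detail as an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an SDK function): joins per-page text in the way
// the delim parameter is described, with the delimiter between pages only.
std::wstring JoinPages(const std::vector<std::wstring>& pages,
                       const std::wstring& delim = L"\n")
{
    std::wstring out;
    for (std::size_t i = 0; i < pages.size(); ++i) {
        if (i > 0)
            out += delim;      // separator goes before every page except the first
        out += pages[i];
    }
    return out;
}
```

With the default delimiter each page starts on a new line. Passing a different string, such as a form feed or a visible marker, would make page boundaries easier to locate in the returned text.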

 

Return Values

 

If the function succeeds then the return value is OCR_OK.

 

If the function fails then the return value is an error code.
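
The call pattern described on this page (call the function, compare the result with OCR_OK, copy the text, release the BSTR) might look like the sketch below. Everything in the stand-in section is an assumption made so that the snippet compiles without the SDK or the Windows headers: the type definitions, the value of OCR_OK, the failure code and the fake OCR_GetText body are placeholders, and GetDocumentText is a hypothetical wrapper. In a real project the SDK header and the Windows headers provide the actual declarations.

```cpp
#include <cwchar>
#include <string>

// ---- Stand-ins (assumptions) so the sketch compiles without the SDK. ----
// Real code takes these from the SDK header and the Windows headers.
typedef long     HRESULT;
typedef wchar_t* BSTR;
typedef void*    PXODocument;
typedef void*    PXO_Pagelist;
struct PXO_Options {};
const HRESULT OCR_OK = 0;                    // assumed value

void SysFreeString(BSTR s) { delete[] s; }   // placeholder for the Win32 call

// Fake OCR_GetText: returns fixed text for two pages instead of running OCR.
// delim is const wchar_t* here so the literal default compiles in standard C++.
HRESULT OCR_GetText(PXODocument Doc, PXO_Options* pOptions, BSTR* bstrTextOut,
                    PXO_Pagelist /*Pagelist*/ = nullptr,
                    const wchar_t* delim = L"\n")
{
    if (!Doc || !pOptions || !bstrTextOut)
        return -1;                           // arbitrary failure code for the fake
    const std::wstring fake = std::wstring(L"page 1") + delim + L"page 2";
    *bstrTextOut = new wchar_t[fake.size() + 1];
    std::wmemcpy(*bstrTextOut, fake.c_str(), fake.size() + 1);
    return OCR_OK;
}
// ---- End of stand-ins. ----

// Hypothetical wrapper: copies the OCR text into a std::wstring and releases
// the BSTR. Returns false and leaves 'out' untouched unless the call
// returns OCR_OK.
bool GetDocumentText(PXODocument doc, PXO_Options* options, std::wstring& out)
{
    BSTR text = nullptr;
    const HRESULT hr = OCR_GetText(doc, options, &text);  // all pages, default delim
    if (hr != OCR_OK)
        return false;
    out = text ? text : L"";
    if (text)
        SysFreeString(text);                 // the caller owns the returned BSTR
    return true;
}
```

Copying the text into a std::wstring before calling SysFreeString() keeps the ownership rule in one place, so callers of the wrapper never handle the raw BSTR.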