This activity reads the contents of the PDF, including headers, and extracts the text.
- From Page Number – Set the page extraction mode to "Range" and specify the page number at which to start the extraction.
- Image Format – Specify the image format in which to save the extracted images.
- Image Resize Percentage – Allows you to rescale an image by the specified percentage.
- OCR Engine – Specify the OCR engine instance returned by the Create Tesseract OCR Engine activity.
- Page Extraction Mode – Set the page extraction mode to "All", "Single", or "Range" to choose which pages are extracted.
- Password – Sets the password for the PDF file, if necessary.
- PDF File Path – Specify the path of the PDF file from which to extract text.
- Single Page Number – Set the page extraction mode to "Single" and specify the page number from which to extract text.
- To Page Number – Set the page extraction mode to "Range" and specify the page at which to stop the extraction.
- DisplayName – Add a display name to your activity.
- Private – By default, the activity logs the values of your properties inside your workflow. If Private is selected, logging stops.
- Continue On Error – Specifies if the automation should continue even when the activity throws an error. This field only supports Boolean values (True, False). The default value is False.
Note: If this activity is included in a Try Catch and the value of this property is True, no error is caught when the project is executed.
- Result – Holds the text extracted from the PDF file using the OCR engine.
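
The way Page Extraction Mode interacts with Single Page Number, From Page Number, and To Page Number can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the activity's actual implementation: the function name `resolve_pages` and its parameters are hypothetical, and it assumes page numbers are 1-based and that a range includes both its start and end page.

```python
from typing import List, Optional


def resolve_pages(
    mode: str,
    total_pages: int,
    single_page: Optional[int] = None,
    from_page: Optional[int] = None,
    to_page: Optional[int] = None,
) -> List[int]:
    """Return the page numbers to process for a given extraction mode.

    Hypothetical helper: assumes 1-based page numbers and an inclusive range.
    """
    normalized = mode.strip().lower()

    if normalized == "all":
        # Every page in the document.
        return list(range(1, total_pages + 1))

    if normalized == "single":
        # Exactly one page, which must exist in the document.
        if single_page is None or not 1 <= single_page <= total_pages:
            raise ValueError("Single Page Number must be between 1 and the page count")
        return [single_page]

    if normalized == "range":
        # From Page Number through To Page Number, both included.
        if from_page is None or to_page is None:
            raise ValueError("Range mode needs both From Page Number and To Page Number")
        if not 1 <= from_page <= to_page <= total_pages:
            raise ValueError("Page range must lie within the document and be ordered")
        return list(range(from_page, to_page + 1))

    raise ValueError(f"Unknown page extraction mode: {mode!r}")


if __name__ == "__main__":
    print(resolve_pages("All", total_pages=3))
    print(resolve_pages("Single", total_pages=10, single_page=4))
    print(resolve_pages("Range", total_pages=10, from_page=2, to_page=5))
```

In this sketch, the page number properties that do not belong to the selected mode are simply ignored, which mirrors how each property above is described as applying only to its own mode.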