Get Capture Document

The Get Capture Document action task is used after the Capture Fields Processor to retrieve all documents that have been updated.

Input

A data file in PGC or PDF format, accompanied by valid Metadata. The Metadata must contain Capture information, which is generally available after a Capture Fields Processor or Find Capture Documents task. It is also possible to retrieve the required information directly from a specific Document ID. When a specific ID is used, the task ignores both the data file and the Metadata and uses the database information instead.

Processing

One PDF, corresponding to the information either present in the Metadata or specified in the task, is extracted from the Capture database.

When documents are retrieved from the database, the PDF from which a document is obtained remains in the database until every document it contains has been retrieved. For example, if a 10-page PDF contains 5 documents, the 10 pages remain in that PDF until all 5 documents have received ink, been closed and been retrieved from the database. This can lead to storage space issues if too many PDF files remain in your database.
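
The retention rule described above can be pictured with a short Python sketch. The data model (a dictionary mapping document IDs to a "retrieved" flag) and the function name pdf_can_be_removed are hypothetical and serve only as an illustration; they do not reflect how the Capture database is actually structured.

```python
def pdf_can_be_removed(documents):
    """Return True only when every document in the PDF has been retrieved."""
    return all(documents.values())

# Hypothetical 10-page PDF holding 5 documents; 4 have been retrieved so far.
pdf_documents = {"doc1": True, "doc2": True, "doc3": True, "doc4": True, "doc5": False}
still_stored = not pdf_can_be_removed(pdf_documents)

# Once the last document is retrieved, the whole PDF is no longer needed.
pdf_documents["doc5"] = True
removable = pdf_can_be_removed(pdf_documents)
```

In this model a single outstanding document is enough to keep all pages of the PDF in storage.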

Performance-wise, retrieving a document from a 10,000-page PDF in the database takes more time than retrieving it from a 100-page PDF.

Output

The Get Capture Document action task is a loop that outputs a PDF version of the Capture Document. The PDF contains the original document and any ink added by the Capture Fields Processor action task.

In addition, any ICR information available (when using PlanetPress Capture ICR) is added to the Metadata at the Page level, as follows:

  • ICR_[FieldName]_Val : The value of the text that was recognized by the ICR engine, for the field named [FieldName]. If the field is not an ICR field or if that field contains no ink, the value will be empty.
  • ICR_[FieldName]_Cfd : The confidence value (in percentage) of the engine for the value provided.
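
As an illustration of the naming convention above, the following Python sketch groups Page-level entries into one record per Capture field. The input dictionary, the field names (Quantity, Signature), and the helper name collect_icr_fields are hypothetical; this is not part of any PlanetPress API, and the sketch assumes the confidence is stored as a number in text form.

```python
import re

# Matches names such as ICR_Quantity_Val or ICR_Quantity_Cfd.
# Field names may contain underscores, so the suffix is anchored at the end.
ICR_NAME = re.compile(r"^ICR_(?P<field>.+)_(?P<kind>Val|Cfd)$")

def collect_icr_fields(page_fields):
    """Group ICR_[FieldName]_Val / ICR_[FieldName]_Cfd entries by field name."""
    result = {}
    for name, raw in page_fields.items():
        match = ICR_NAME.match(name)
        if match is None:
            continue  # not an ICR entry
        entry = result.setdefault(match["field"], {"value": "", "confidence": None})
        if match["kind"] == "Val":
            entry["value"] = raw
        elif raw.strip():
            # Assumption: the confidence is a plain number expressed as text.
            entry["confidence"] = float(raw)
    return result

# Hypothetical Page-level entries for one page.
sample = {
    "ICR_Quantity_Val": "12",
    "ICR_Quantity_Cfd": "87",
    "ICR_Signature_Val": "",
    "PageNumber": "1",
}
icr = collect_icr_fields(sample)
```

A record with an empty value, such as Signature here, corresponds to a field that is not an ICR field or that contains no ink.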

Task properties

General Tab
  • Document Origin group:
    • Document to process: Determines where the document information is read from.
      • From Metadata: Select to use the current document available in the Metadata generated by the Capture Fields Processor.
      • From Specific ID: Select to specify an exact Document ID from the database. This document does not need to be loaded as a data file or its Metadata manually obtained, as this task simply looks up the information directly in the PlanetPress Capture database.
  • Document Type group:
    • Get all documents: Get all the documents that have been updated, according to the Metadata.
    • Get closed documents only: Get only the documents that have been closed in this process, according to the Metadata.
  • Close document after retrieval: Once the task has retrieved the document from the Capture database, the document will be closed even if it is incomplete.
  • Annotate PDF: Add annotations to the PDF that describe each Capture field and the ink that is included in those fields. Note that not all PDF readers support annotations.
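
The difference between the two Document Type options can be sketched in Python as a simple filter. The record layout (id, updated and closed keys) and the function name select_documents are hypothetical and only model the behavior described above; the task itself reads this information from the Metadata.

```python
def select_documents(documents, closed_only=False):
    """Return the IDs of updated documents, optionally only the closed ones."""
    selected = [doc for doc in documents if doc["updated"]]
    if closed_only:
        # Models "Get closed documents only".
        selected = [doc for doc in selected if doc["closed"]]
    return [doc["id"] for doc in selected]

# Hypothetical documents referenced by the Metadata.
documents = [
    {"id": "A", "updated": True, "closed": True},
    {"id": "B", "updated": True, "closed": False},
    {"id": "C", "updated": False, "closed": False},
]

all_updated = select_documents(documents)                     # "Get all documents"
closed = select_documents(documents, closed_only=True)        # "Get closed documents only"
```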

On Error Tab

For a description of the options on the On Error tab, see Using the On Error tab.

Miscellaneous Tab

The Miscellaneous tab is common to all tasks.

It contains a text area (Task comments) that lets you write comments about the task. These comments are saved when the dialog is closed with the OK button and are displayed in The Task Comments Pane.

Check the option Use as step description to display the text next to the icon of the plugin in the Process area.

The tab also provides an option to highlight the task in The Process area with the default color, set in the Preferences (see Colors), or the color selected or defined under Highlight color on this tab.
To revert the selected highlight color to the default color, open this tab, turn the Highlight option off and close the dialog with the OK button; then turn highlighting back on.
Highlighting can also be turned on and off via the task's contextual menu and with the Highlight button on the View ribbon.