Extract step properties
The Extract step takes information from the data source and places it in the record set that is the result of the extraction workflow. For more information see Extract step and Extracting data.
Description
This subsection is collapsed by default in the interface, to give more screen space to other important parts.
Name: The name of the step. This name will be displayed on top of the step's icon in the Steps pane.
Comments: The text entered here will be displayed in the tooltip that appears when hovering over the step in the Steps pane.
Extraction Definition
- Data Table: Defines where the data will be placed in the extracted record. The root table is record; any other table inside the record is a detail table. For more information see Extracting transactional data.
- Append values to current record: When the Extract step is inside a loop, check this to ensure that the extraction will be done in the same detail table as any previous extractions within the same loop. This ensures that, if multiple extracts are present, only one detail table is created.
Field Definition
The following field definition settings are identical for all fields.
- Field List: The Field List displays each of the fields that belong to the selected step in a drop-down. Fields can be re-ordered and renamed within the Order and rename fields dialog (see Order and rename fields dialog). Select one of the fields to configure further settings for that field.
- Add Unique ID to extraction field: Check to add a unique numerical set of characters to the end of the extracted value. This ensures no two values are identical in this field in the record set.
- Mode: Determines the origin of the data. Fields always belong to an Extract step, but they don't necessarily contain extracted data. See Fields for more information.
- Location: The contents of the data selection determine the value of the extracted field. The settings for location-based fields are listed separately, per file type, further below.
- JavaScript: The result of the JavaScript Expression written below the drop-down will be the value of the extracted field. If the expression contains multiple lines, the last value assignment (variable = "value";) will be the value of the field. See DataMapper API.
- Properties: The value of the property selected below will be the value of the selected field.
- Property: This drop-down lists all the currently defined properties (including system properties).
Custom properties can be defined in the Preprocessor step; see Preprocessor step. For an explanation of the objects to which the properties belong, see DataMapper Scripts API.
- Choose a property button: Click this button to open a filter dialog that lets you find a property based on the first few letters that you type.
- Type: The data type of the selected data; see Data types. Make sure that the data format that the DataMapper expects matches the actual format of the data in the data source; see Data Format.
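As a rough illustration of the multi-line rule described for the JavaScript mode, the sketch below uses plain JavaScript's eval, whose result is the value of the last assignment in the evaluated script. It runs outside the DataMapper. The variable names (rawName, parts, fullName) are hypothetical, and treating the DataMapper's evaluation as equivalent to eval is an assumption, not documented behavior.

```javascript
// Hypothetical multi-line expression, as it might be typed below the Mode drop-down.
// rawName is a made-up sample value; it is not a DataMapper object.
const expression = `
var rawName = "  doe, JANE ";
var parts = rawName.trim().split(", ");
fullName = parts[1] + " " + parts[0];
`;

// Declared up front so the final assignment has a target in any evaluation mode.
var fullName;

// Plain JavaScript: eval returns the value of the last assignment statement.
const fieldValue = eval(expression);

console.log(fieldValue); // JANE doe
```

Only the final assignment determines the result here; the two earlier var statements merely prepare intermediate values.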
Settings for location-based fields in a Text file
- Left: Defines the start of the data selection to extract.
- Right: Defines the end of the data selection to extract.
- Top offset: The vertical offset from the current pointer location in the Data Sample (Viewer).
- Height: The height of the selection box. When set to 0, this instructs the DataMapper to extract all lines starting from the given position until the end of the record and store them in a single field.
- Use selection: Click to use the value (Left, Right, Top offset and Height) of the current data selection (in the Data Viewer) for the extraction. If the selection contains multiple lines, only the first line is extracted.
- Post Function: Enter a JavaScript expression to be run after the extraction. A Post function script operates directly on the extracted data, and its results replace the extracted data. For example, the Post function script replace("-", ""); would remove the first dash character that occurs inside the extracted string.
- Use JavaScript Editor: Click to display the Script Editor dialog.
- Trim: Select to trim empty characters at the beginning or the end of the field.
- Concatenation string:
- Split: Separate the selection into individual fields based on the Concatenation string defined above.
Settings for location-based fields in a PDF File
These are the settings for location-based fields in a PDF file.
Settings for location-based fields in CSV and Database files
These are the settings for location-based fields in CSV and Database files.
- Column: A drop-down listing all fields in the Data Sample; the value of the selected field will be used.
- Top offset: The vertical offset from the current pointer location in the Data Sample (Viewer).
- Post Function: Enter a JavaScript expression to be run after the extraction. For example, replace("-","") would remove the first dash character inside the extracted string.
- Use JavaScript Editor: Click to display the Script Editor dialog.
- Trim: Select to trim empty characters at the beginning or the end of the field.
Settings for location-based fields in an XML File
These are the settings for location-based fields in an XML file.
- XPath: The path to the XML field that is extracted.
- Post Function: Enter a JavaScript expression to be run after the extraction. For example, replace("-","") would remove the first dash character inside the extracted string.
- Use JavaScript Editor: Click to display the Script Editor dialog.
- Trim: Select to trim empty characters at the beginning or the end of the field.
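The replace("-","") examples in the Post Function descriptions rely on standard JavaScript string behavior: with a string pattern, replace changes only the first match. The sketch below, which runs outside the DataMapper, contrasts that with a global regular expression. The sample value is made up, and the way a Post function is bound to the extracted string inside the DataMapper is not shown here.

```javascript
// Made-up sample standing in for an extracted value.
const extracted = "2012-04-11";

// String pattern: only the first dash is removed.
const firstOnly = extracted.replace("-", "");

// Regular expression with the g flag: every dash is removed.
const allDashes = extracted.replace(/-/g, "");

console.log(firstOnly); // 201204-11
console.log(allDashes); // 20120411
```

If every occurrence has to go, the regular-expression form is probably the one to use in a Post function.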
Data Format
Format settings can be defined in three places: in the user preferences (DataMapper preferences), in the current data mapping configuration (Data format settings) and per field via the Step properties. Any format settings specified per field are always used, regardless of the user preferences or data source settings.
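A minimal sketch of the override rule described above, written as plain JavaScript objects. The property names and values are hypothetical, not actual DataMapper identifiers. The text only states that per-field settings always win; the relative order of the other two levels used here (configuration over user preferences) is an assumption.

```javascript
// Hypothetical settings at the three levels; names and values are illustrative only.
const userPreferences = { decimalSeparator: ".", thousandSeparator: ",", dateFormat: "yyyy-MM-dd" };
const configurationSettings = { decimalSeparator: ",", thousandSeparator: "." };
const fieldSettings = { decimalSeparator: "." };

// Later spreads overwrite earlier ones, so field-level values always win.
// Assumption: configuration settings take precedence over user preferences.
const effectiveSettings = { ...userPreferences, ...configurationSettings, ...fieldSettings };

console.log(effectiveSettings);
// { decimalSeparator: '.', thousandSeparator: '.', dateFormat: 'yyyy-MM-dd' }
```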
Data format settings tell the DataMapper how certain types of data are formatted in the data source. They don't determine how these data are formatted in the Data Model or in a template. In the Data Model, data are converted to the native data type. Dates, for example, are converted to a DateTime object in the Data Model, and will always be shown as "year-month-day" plus the time stamp, for example: 2012-04-11 12.00 AM.
- Negative Sign Before: A negative sign will be displayed before any negative value.
- Decimal Separator: Set the decimal separator for a numerical value.
- Thousand Separator: Set the thousand separator for a numerical value.
- Currency Sign: Set the currency sign for a currency value.
- Date Format: Set the date format for a date value.
- Date Language: Set the date language for a date value (for example, if English is selected, the term May will be identified as the month of May).
- Treat empty as 0: A numerical empty value is treated as a 0 value.
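To make the numeric settings listed above more concrete, the sketch below shows how such settings could be applied when turning source text into a number. The helper parseFormattedNumber and its option names are hypothetical; this is not the DataMapper's implementation. It also assumes that when Negative Sign Before is off, the sign trails the value.

```javascript
// Hypothetical helper, not part of the DataMapper API.
function parseFormattedNumber(raw, fmt) {
  let text = raw.trim();

  // Treat empty as 0: an empty numeric value becomes 0, otherwise null.
  if (text === "") {
    return fmt.treatEmptyAsZero ? 0 : null;
  }

  // Currency Sign: drop the sign wherever it appears.
  if (fmt.currencySign) {
    text = text.split(fmt.currencySign).join("").trim();
  }

  // Negative Sign Before: look for the sign in front of the value,
  // otherwise (assumption) at the end of the value.
  let negative = false;
  if (fmt.negativeSignBefore && text.startsWith("-")) {
    negative = true;
    text = text.slice(1);
  } else if (!fmt.negativeSignBefore && text.endsWith("-")) {
    negative = true;
    text = text.slice(0, -1);
  }

  // Thousand Separator is removed first, then Decimal Separator becomes ".".
  if (fmt.thousandSeparator) {
    text = text.split(fmt.thousandSeparator).join("");
  }
  if (fmt.decimalSeparator) {
    text = text.split(fmt.decimalSeparator).join(".");
  }

  const value = Number(text);
  return negative ? -value : value;
}

// Illustrative settings for a source that writes numbers like -€1.234,56.
const europeanFormat = {
  negativeSignBefore: true,
  decimalSeparator: ",",
  thousandSeparator: ".",
  currencySign: "€",
  treatEmptyAsZero: true,
};

console.log(parseFormattedNumber("-€1.234,56", europeanFormat)); // -1234.56
console.log(parseFormattedNumber("", europeanFormat)); // 0
```

Removing the thousand separator before normalizing the decimal separator matters when the two characters are "." and ",": done in the other order, the decimal point would be stripped along with the thousand separators.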