Data splitters

Splitter action tasks are used to split single data files into multiple data files.

Splitters initiate a recurring cycle that stops only when the original file has been completely processed. When a splitter creates a file, it hands it down to the task that follows, and all the tasks on the same branch are performed, up to and including the output task. The splitter then creates another file, which is again handed down to the next task, and so on until no data remains in the original file.
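The cycle described above can be pictured with a minimal Python sketch. It is only an illustration, not how PlanetPress Workflow is implemented: the names split_every and run_branch are hypothetical, the split is made every N lines purely for simplicity, and each downstream task is modeled as a plain function.

```python
from typing import Callable, Iterable, Iterator, List

Chunk = List[str]


def split_every(lines: Iterable[str], lines_per_file: int) -> Iterator[Chunk]:
    """Yield one chunk at a time; each chunk stands for one split file."""
    chunk: Chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == lines_per_file:
            yield chunk
            chunk = []
    if chunk:
        # Whatever remains at the end of the original file forms the last chunk.
        yield chunk


def run_branch(lines: Iterable[str], lines_per_file: int,
               tasks: List[Callable[[Chunk], Chunk]]) -> List[Chunk]:
    """Hand each split file down the branch before the next one is created."""
    outputs: List[Chunk] = []
    for piece in split_every(lines, lines_per_file):
        for task in tasks:
            piece = task(piece)
        outputs.append(piece)  # stands in for the output task
    return outputs


# Hypothetical branch with a single task that upper-cases every line.
result = run_branch(["a", "b", "c", "d", "e"], 2,
                    [lambda piece: [s.upper() for s in piece]])
```

Because split_every is a generator, the second chunk is not produced until the first one has gone through every task, which mirrors the order of operations described above.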

In PlanetPress Suite, such tasks can be used, for example, to split files that contain statements for multiple clients into smaller files that each contain a single client statement. Each statement can then be printed and sent by snail mail, or emailed directly from PlanetPress Workflow, to each individual client.

Note that if the process merges the split data with a PlanetPress Design document, the splitter must not alter the structure of the data file. In other words, each split file must have the same structure as the original file. Otherwise, the Design document to which it is sent will not be able to extract the data correctly and the merging process will fail.

In OL Connect jobs, data is normally extracted from a data file using the Execute Data Mapping task. That task stores the extracted data in records which can then be merged with a template.

Splitters do not modify the Metadata that is currently active within your process. This means that, if you intend to use Metadata in a process that uses splitters, you can either use the Metadata Sequencer instead of a splitter, or (re)create the Metadata after the splitter.

About using emulations with data splitters

An emulation specifies how to interpret a data file (see About data emulation). When an emulation is used with a splitter action task, the job file is emulated, cut into pieces and de-emulated. In most cases, the emulation/de-emulation process is completely transparent. However, in some cases, there may be minute differences.

When using the ASCII or Channel Skip emulation, if there are missing line feed characters (when lines end with a single carriage return in ASCII, or when lines start with a No line feed channel in Channel Skip), the output data will be different from the input data, but the change will not be significant.

Let us imagine that a splitter action task processes the following data file using the ASCII emulation:

Data line1 of page 1<cr><lf>

Data line2 of page 1<cr>

Last data line of page 1<cr><lf>

Data line1 of page 2<cr><lf>

...and so forth...

Once split, the first file generated by the action task would look like this:

Data line1 of page 1<cr><lf>

Data line2 of page 1<cr><lf>

Last data line of page 1<cr>

But when opened with PlanetPress Design or PlanetPress Workflow using the ASCII emulation, the data in the generated file would look exactly like the data in the original. The same would hold true for the Channel Skip emulation.
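The following Python sketch illustrates why such a difference is harmless. It assumes, as the example above suggests, that a reader using the ASCII emulation treats <cr><lf> and a lone <cr> as equivalent line endings. The functions read_lines and write_lines are hypothetical stand-ins for the emulation and de-emulation steps, and write_lines simply ends every line with <cr><lf>, which is a simplification of what the splitter actually writes.

```python
import re
from typing import List


def read_lines(data: bytes) -> List[bytes]:
    """Emulate: split on CR LF, a lone CR, or a lone LF (assumed behavior)."""
    parts = re.split(rb"\r\n|\r|\n", data)
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final line ending
    return parts


def write_lines(lines: List[bytes]) -> bytes:
    """De-emulate: write every line back with a CR LF ending (simplified)."""
    return b"".join(line + b"\r\n" for line in lines)


original = (b"Data line1 of page 1\r\n"
            b"Data line2 of page 1\r"
            b"Last data line of page 1\r\n")

lines = read_lines(original)
rewritten = write_lines(lines)

bytes_differ = rewritten != original          # the missing LF was added
same_when_read = read_lines(rewritten) == lines  # but the lines are identical
```

In this sketch the bytes of the rewritten file differ from the original, yet reading it back yields the same three lines.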

Note the following details about emulations and their options:

  • With most emulations, if a file is split on a form feed, the form feed will not be appended to the output file.
  • With the ASCII emulation, tabs within the input data file are replaced by spaces (the number of spaces is determined within the configuration of the emulation).
  • With the ASCII emulation, if the Remove HP PCL Escapes option is selected, the data coming out of the splitter will have no escape sequences.
  • The Goto column option of the Channel Skip emulation is not supported.
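The first two points in the list above can be pictured with a short Python sketch. Both functions are hypothetical illustrations rather than the product's code. In particular, the sketch assumes that each tab becomes a fixed number of spaces; the emulation may instead align text to tab stops.

```python
from typing import List


def split_on_form_feed(data: str) -> List[str]:
    """Split on form feeds; the form feed itself is not kept in any piece."""
    return [piece for piece in data.split("\f") if piece]


def replace_tabs(line: str, spaces_per_tab: int = 8) -> str:
    """Replace each tab with a fixed number of spaces (assumed behavior)."""
    return line.replace("\t", " " * spaces_per_tab)


pages = split_on_form_feed("page 1\fpage 2\f")
expanded = replace_tabs("Name\tAmount", spaces_per_tab=4)
```

Here pages holds two pieces, neither of which contains a form feed, and expanded holds the line with its tab turned into four spaces.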