|
Engine configuration
The Connect Server cooperates with different engines to handle specific tasks. Connect allows for the parallelization of jobs: you can allocate one or more engines to process jobs. The number of engines of each type is configurable, as are the number of engines that can work together on the same job (determined by job size: small, medium or large) and the maximum speed at which they do so. This gives you, as a solution developer or application manager, full control over how to apply a machine's power. For example, you can share the available resources to process multiple jobs at once, allocate all resources to process one job as fast as possible, or anything in between.
Connect categorizes incoming print jobs by the number of records they contain. The boundaries between small and medium jobs, and between medium and large jobs, can be configured per server (see below, Allocating processing power to jobs). This topic explains all of these settings and the principles behind them, and provides guidelines for letting the Server manage the workload so as to achieve the highest possible output speeds. Factors to take into account are:
Other ways to enhance performance are described in another topic: Performance considerations.
Speed quota: PPM and speed units
The highest possible output speed depends first and foremost upon your licence. An engine needs at least one free speed unit to be able to create output. It is important to note that only output operations are limited by this quota.
In situations where Print and Email and/or Web output are created at the same time, only the Merge engines that create Email/Web output count towards the maximum number of speed units for that type of output. Spare speed units are distributed proportionally. Since the number of engines is configurable, and jobs may run concurrently, the number of engines in use may not match the exact number of available speed units.
Output speed is the speed at which the output is created by the engine in question. Data mapping and other steps in a production process are not taken into account. Throughput speed is the speed of the entire production process. It will always be lower than the output speed.
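The difference between output speed and throughput speed can be illustrated with a small calculation. This is only a sketch: the page count and timings are hypothetical figures chosen for the example, not measurements, and `pages_per_minute` is a helper name invented here.

```python
def pages_per_minute(pages: int, seconds: float) -> float:
    """Return a speed in pages per minute (PPM)."""
    return pages / seconds * 60


# Hypothetical figures for a single job (assumptions, not measurements).
pages = 12_000
output_seconds = 120.0       # time the engine spends creating output
other_steps_seconds = 60.0   # data mapping and other steps in the process

# Output speed only looks at the output step.
output_speed = pages_per_minute(pages, output_seconds)

# Throughput speed looks at the entire production process.
throughput_speed = pages_per_minute(pages, output_seconds + other_steps_seconds)

print(f"Output speed:     {output_speed:.0f} PPM")
print(f"Throughput speed: {throughput_speed:.0f} PPM")
```

Because the other steps always take some time, the throughput figure ends up below the output figure, whatever the actual numbers are.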
Launching multiple engines
A single engine can only process one job at a time and runs mostly single-threaded. To benefit from multi-core systems, it is recommended that several engines run in parallel. Modern hardware typically has both full cores and hyper-threading or logical cores. A logical core should not be counted as a full core when determining how many engines to use. As a guide, count each logical core as only 25%-50% of a full core. To configure the number of engines:
It is advised that you do not configure more engines than can be backed by actual processing power. Doing so adds overhead without adding processing power.
Deciding how many engines of each type to launch
When jobs run in parallel, different types of engines may run at the same time. Which type of engine has the biggest impact on performance depends on the usage situation. The more operations of a given kind have to run simultaneously with smaller operations, and the larger those operations are, the sooner you will see a performance increase when using multiple engines.
Merge engine
Generally, launching a relatively high number of Merge engines results in better performance, as Merge engines are involved in the creation of output of all kinds (Print, Email and Web) and because content creation is relatively time-consuming.
DataMapper engine
Adding a DataMapper engine might be useful when large data mapping operations have to run simultaneously with many smaller jobs, and also:
The Connect MySQL database needs a fast storage system (SSD or other fast devices) to be able to keep up with two or more DataMapper engines. When the database is installed on a system with a slow hard drive, adding a DataMapper engine may not increase the overall performance.
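The guideline for counting logical cores (see Launching multiple engines) can be turned into a rough upper bound for the total number of engines. The sketch below is an illustration only: the function name, its parameters and the example core counts are hypothetical, and the result is a starting point for tuning rather than a recommendation.

```python
def max_engines(physical_cores: int, logical_cores: int,
                logical_weight: float = 0.25) -> int:
    """Estimate how many engines the CPU can back.

    logical_cores is the total number of cores the operating system
    reports, including the physical ones. logical_weight is the fraction
    of a full core that each additional logical core counts for; the
    guideline puts it between 0.25 and 0.5.
    """
    if not 0.25 <= logical_weight <= 0.5:
        raise ValueError("logical_weight should be between 0.25 and 0.5")
    extra_logical = max(logical_cores - physical_cores, 0)
    effective_cores = physical_cores + extra_logical * logical_weight
    return max(int(effective_cores), 1)


# Hypothetical machine: 8 full cores, 16 logical cores with hyper-threading.
print(max_engines(8, 16))        # conservative estimate (25%)
print(max_engines(8, 16, 0.5))   # optimistic estimate (50%)
```

The estimate covers all engine types together; how to divide that total between Merge, DataMapper and Weaver engines depends on the workload, as described in this section.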
Weaver engine
Adding an extra Weaver engine might be useful when large Print jobs have to run simultaneously with smaller Print jobs.
Memory per engine
By default, each engine is set to use 640 MB of RAM. To make optimum use of the machine's capabilities, it might be useful to increase the amount of memory that an engine can use.
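Before raising the memory per engine, it can help to check that the total still fits in the machine's RAM. The following sketch assumes that an engine's total footprint is somewhat larger than its heap and that the Connect Server and the operating system need memory of their own. The `overhead_factor` and `reserved_mb` values are placeholders invented for this example, not documented figures; only the 640 MB default comes from this topic.

```python
def memory_budget_mb(engine_count: int, heap_mb: int = 640,
                     overhead_factor: float = 1.25,
                     reserved_mb: int = 4096) -> float:
    """Estimate the RAM (in MB) needed for a given engine configuration.

    heap_mb         -- maximum heap per engine (640 MB is the default)
    overhead_factor -- ASSUMED multiplier for memory used beyond the heap
    reserved_mb     -- ASSUMED memory kept free for the Server and the OS
    """
    return engine_count * heap_mb * overhead_factor + reserved_mb


def fits_in_ram(engine_count: int, total_ram_mb: int, **kwargs) -> bool:
    """Return True if the estimated budget fits in the available RAM."""
    return memory_budget_mb(engine_count, **kwargs) <= total_ram_mb


# Hypothetical server with 16 GB of RAM and 6 engines.
print(memory_budget_mb(6))                     # default 640 MB heap
print(fits_in_ram(6, 16 * 1024))               # default heap
print(fits_in_ram(6, 16 * 1024, heap_mb=2048)) # larger heap per engine
```

Measure the actual memory use on your own system and replace the assumed values before relying on such an estimate.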
The Maximum memory per engine setting is found in the scheduling preferences of each engine type; see Merge engine scheduling and Weaver engine scheduling. Note that this setting only controls the maximum size of the Java heap memory that an engine can use; the total amount of memory used by an engine is actually somewhat higher. Also keep in mind that the Connect Server and the operating system itself need memory to keep running.
Allocating processing power to jobs
Which engine configuration is most efficient in your case depends on how Connect is used. What kind of output is needed: Print, Email, and/or Web? How often? How big are those jobs? Do they have to be handled at the same time or in sequence? Would it be useful to give priority to small, medium or large jobs, and/or to jobs of a certain kind? Depending on the answers to these questions, you can allocate processing power to jobs in order to run them as fast as possible, and/or in the order of your preference.
Job size
Connect lets you define job sizes by setting the maximum number of records in a small job and the minimum number of records in a large job. Jobs that are neither small nor large are medium-sized. (Note that the term 'records' refers to top-level records only. Detail records are not considered.) No general recommendation can be made for the number of records in a small, medium or large job. This setting needs to be based on an assessment of the actual (or expected) workload of Connect. Job size is a relative concept: in a small service company a job may be considered large when it counts 1,000 records, whereas in a large insurance company the same job may be seen as small. Also take into account that jobs with fewer records could actually be medium or large if each individual record outputs 10,000 pages. To set the job sizes:
Running a job as fast as possible
Number of parallel engines per Print job
Two or more engines of a kind can be combined to work on the same Print job. Generally, jobs will run faster with more than one engine, because sharing the workload saves time. To select a number of parallel engines per Print job size:
When each individual record in a job is composed of a very large number of pages, the Memory per engine setting and the machine's hard drive speed are probably more important than the number of Merge engines, since one record cannot be split over multiple cores (see Memory per engine).
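To make the relation between job size and parallel engines concrete, the sketch below classifies a job by its number of top-level records and looks up how many engines a job of that size may use. It is an illustration only: the class and function names, the thresholds and the engine counts are hypothetical, and the actual values are set in the Server's preferences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSizeSettings:
    """Hypothetical stand-in for the job size settings."""
    max_small_records: int   # maximum number of records in a small job
    min_large_records: int   # minimum number of records in a large job


def classify(records: int, settings: JobSizeSettings) -> str:
    """Classify a job by its number of top-level records."""
    if records <= settings.max_small_records:
        return "small"
    if records >= settings.min_large_records:
        return "large"
    return "medium"   # neither small nor large


# Example values only; base the real ones on your own workload.
settings = JobSizeSettings(max_small_records=100, min_large_records=1000)
parallel_engines = {"small": 1, "medium": 2, "large": 4}

for records in (50, 100, 500, 1000, 20_000):
    size = classify(records, settings)
    print(f"{records:>6} records: {size:<6} -> {parallel_engines[size]} engine(s)")
```

Note that the classification only looks at the record count. A job with few records but very many pages per record would still be classified as small here, which is why the thresholds need to reflect the actual workload.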
Number of speed units per Print job
If a Print job of a specific size has more than one parallel speed unit assigned to it, that multiplies its speed; however, it also reduces the number of Print jobs that can run simultaneously. When no other Print output operations run at the same time, a single job will get all available speed units, or the maximum number of speed units reserved for jobs of that size (see Dividing processing power over jobs). To set a number of speed units per Print job:
Number of speed units for Email and Web
Although assigning parallel speed units to Email and HTML jobs is possible (on the Merge Engine settings page), it is advised to use only one speed unit per job, mainly because these jobs are usually small.
Dividing processing power over jobs
There are a number of ways in which you can divide processing power over output operations of a certain kind and/or size.
All of these engine configuration settings are found in the Scheduling Preferences:
How the Server decides if a job can be handled
In summary, this is how jobs are handled when they can run in parallel.
The following limitations apply at all times:
Examples
Here are a few examples of use cases and the settings that would be appropriate for them.
Batch processing. In a batch processing situation, jobs don't have to be handled simultaneously. All jobs, whether big or small, are processed one after another. Every job should be handled as quickly as possible. It is therefore recommended to assign the maximum number of engines and speed units to all jobs. Do not reserve engines or speed units for certain jobs.
Web requests. In online communication, response times are critical. If the Server receives a lot of Web requests, it should handle as many as possible, as quickly as possible, at the same time. It is recommended to launch as many Merge engines as possible and to reserve most of them for HTML output. The jobs will generally be small and can do with just one Merge engine.
Mixed jobs that are processed in parallel. In a situation where small, medium and large jobs can come in at any time and should be handled in parallel, the challenge is to find a balance between how much power can be allocated to jobs (to minimize the time they take) and how long they can wait. No single job should require all of the processing power, unless it is acceptable for it to wait until the maximum number of engines finally becomes available, after which all other jobs will have to wait. |
|