Parallel Job


This job type is used to execute several loads or jobs simultaneously. Parallel jobs come in two forms:

  • Several jobs that are started at the same time and are executed simultaneously. By default, a job of any other type is executed in single mode: no other job runs at the same time, and a job that is started later is queued. Note that data previews and component tests are always executed simultaneously.
  • Loads or sub-jobs that are executed in parallel, or sequentially, as in a process chain.

The definition of the job is similar to that of the standard job, with an additional “parallel” flag for each load or sub-job that defines its synchronization behavior. A job of any other type (e.g. Groovy or switch) can be parallelized by creating a new parallel job and adding the other job to it. However, jobs of type “parallel” cannot be executed from Groovy jobs.

Example:

Job with three loads. An “X” marks a load that is defined as parallel; “-” marks a load that is not.

Load1   Load2   Load3   Explanation
-       -       -       Executed sequentially
X       X       X       Executed in parallel
X       X       -       Load1 and Load2 in parallel, and when both are finished, Load3 is started
-       X       X       When Load1 is finished, Load2 and Load3 are executed in parallel
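The flag combinations in the example above can be read as a grouping rule. The following Python sketch is only an illustration of that reading, not the product's scheduler: it assumes that consecutive loads flagged as parallel form one group that runs together, and that every other load runs alone after everything before it has finished. The function name `plan_stages` and the tuple layout are hypothetical.

```python
from typing import List, Tuple


def plan_stages(loads: List[Tuple[str, bool]]) -> List[List[str]]:
    """Group loads into stages; loads inside one stage run at the same time.

    Assumption (inferred from the example table, not confirmed by the product):
    a load joins the previous stage only if both it and the load directly
    before it carry the parallel flag.
    """
    stages: List[List[str]] = []
    previous_parallel = False
    for name, is_parallel in loads:
        if is_parallel and previous_parallel:
            stages[-1].append(name)   # continue the current parallel group
        else:
            stages.append([name])     # start a new stage
        previous_parallel = is_parallel
    return stages


if __name__ == "__main__":
    # Third row of the example: Load1 and Load2 flagged, Load3 not flagged.
    print(plan_stages([("Load1", True), ("Load2", True), ("Load3", False)]))
```

Under this assumption the four rows of the example map to three single-load stages, one stage with all three loads, `[Load1, Load2]` followed by `[Load3]`, and `[Load1]` followed by `[Load2, Load3]`.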

Parallel jobs or loads are useful when several jobs or loads process distinct data, for example multiple cube-type loads that each write into a distinct data area (such as different cubes, or different areas within the same cube). If the parallelized loads use the same source system, the performance gain also depends on how well that source system handles parallel requests.

Jobs or loads that are defined as parallel will use multiple CPU cores. However, the processing inside a single component (e.g. a complex transform) is always single-threaded. Memory consumption is also higher when parallel jobs are used, because the memory required by the individual jobs is allocated at the same time. By default, the number of parallel jobs that are executed at the same time is limited to 5; this limit can be changed via a configuration option. If the number of running parallel jobs reaches the configured limit, further parallel jobs are queued and started when execution slots become available.
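The slot limit described above behaves like a bounded worker pool. The Python sketch below is a generic illustration using the standard library, not the product's implementation: the constant `MAX_PARALLEL_JOBS`, the function `run_with_limit` and the dummy job are hypothetical names, and only the default value 5 comes from the text.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_JOBS = 5  # default limit stated in the text; the name is hypothetical


def run_with_limit(job_count: int, limit: int = MAX_PARALLEL_JOBS) -> int:
    """Run job_count dummy jobs and return the highest number that ran at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # stand-in for the real work of a job
        with lock:
            running -= 1

    # The pool never runs more than `limit` jobs at once; additional
    # submissions wait in its internal queue until a slot is free.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        for _ in range(job_count):
            pool.submit(job)
    return peak


if __name__ == "__main__":
    print("peak concurrency:", run_with_limit(12))
```

However many jobs are submitted, the measured peak stays at or below the limit, which is the behavior the paragraph describes for queued parallel jobs.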

If a parallel job (i.e., a job of type “parallel”) and a single job (i.e., a job of any other type) are queued, the parallel job is executed first, regardless of the order in which the jobs were started. If several single jobs or several parallel jobs are queued, the starting order is respected. Note that a parallel job P can be included in a single job S, in which case the parallel sub-jobs of P are executed in parallel. However, if another parallel job is started independently while S is running, it is queued until S has finished.
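The ordering rule for queued jobs (parallel jobs before single jobs, starting order within each kind) can be sketched as a priority queue. This is a minimal Python illustration under that reading only; the class `JobQueue` and its methods are hypothetical and do not correspond to any product API.

```python
import heapq
from itertools import count
from typing import List, Tuple


class JobQueue:
    """Queue that releases parallel jobs first and keeps start order per kind."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._sequence = count()  # increasing number that records start order

    def enqueue(self, name: str, is_parallel: bool) -> None:
        priority = 0 if is_parallel else 1  # lower value is dequeued first
        heapq.heappush(self._heap, (priority, next(self._sequence), name))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    queue = JobQueue()
    for name, is_parallel in [("S1", False), ("P1", True), ("S2", False), ("P2", True)]:
        queue.enqueue(name, is_parallel)
    print([queue.dequeue() for _ in range(len(queue))])
```

With the jobs started in the order S1, P1, S2, P2, the sketch releases P1 and P2 first and then S1 and S2, each pair in its starting order.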

Advanced settings

Fail on status: if the job executes several loads or sub-jobs, the selected option defines the behavior in case of a warning or an error message in one of the loads or sub-jobs. The options are described below:

none:

All subsequent loads or sub-jobs are executed. The job terminates with “Completed with warnings” or “Completed with errors”.

error:

In case of a warning message, the subsequent loads or sub-jobs are executed and the job terminates with status “Completed with warnings”. In case of an error message, the job terminates without executing subsequent loads or sub-jobs with status “Failed”.

warning:

In case of a warning message, the job terminates with status “Failed” without executing subsequent loads or sub-jobs.
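The three options can be compared with a small simulation. The Python sketch below is an illustration under stated assumptions, not the product's logic: it runs the steps strictly one after another, it treats an error as at least as severe as a warning under the “warning” option (the text above does not spell out that case), it reports the worst status seen when both warnings and errors occur, and it uses the placeholder status “Completed” for a run without messages. The function `run_job` and the result labels `ok`, `warning` and `error` are hypothetical.

```python
from typing import List, Tuple

SEVERITY = {"ok": 0, "warning": 1, "error": 2}

# Lowest severity at which each option stops the job; "none" never stops.
STOP_AT = {"none": 3, "error": 2, "warning": 1}

# "Completed" is a placeholder name for a run without warnings or errors.
FINAL_STATUS = ["Completed", "Completed with warnings", "Completed with errors"]


def run_job(step_results: List[str], fail_on: str) -> Tuple[int, str]:
    """Return (number of steps executed, final job status).

    step_results holds the outcome of each load or sub-job in execution order.
    """
    threshold = STOP_AT[fail_on]
    worst = 0
    executed = 0
    for result in step_results:
        executed += 1
        level = SEVERITY[result]
        worst = max(worst, level)
        if level >= threshold:
            return executed, "Failed"  # remaining steps are skipped
    return executed, FINAL_STATUS[worst]


if __name__ == "__main__":
    for option in ("none", "error", "warning"):
        print(option, run_job(["ok", "warning", "error", "ok"], option))
```

For the step outcomes `ok, warning, error, ok`, the option “none” runs all four steps, “error” stops after the third step, and “warning” stops after the second.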

 
