SURENDRA PULAGAM: Datastage Runtime Column propagation

Sunday, January 24, 2010

Datastage Runtime Column propagation

When the runtime column propagation (RCP) feature is enabled, stages in
parallel jobs can handle undefined columns that they encounter when the
job is run, and propagate these columns through to the rest of the job.

Enable Runtime Column Propagation for Parallel Jobs
If you enable this feature, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the job. This check box only makes the feature available; to actually use it, you need to explicitly select the option on each stage.

If runtime column propagation is enabled in the DataStage Administrator, you can select the Runtime column propagation option on a stage to specify that columns encountered by that stage in a parallel job can be used even if they are not explicitly defined in the meta data. You should always ensure that runtime column propagation is turned on if you want to use schema files to define column meta data.
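The two-level switch described above can be pictured with a small Python sketch. This is only a conceptual model, not DataStage code: the function names `rcp_in_effect` and `stage_output` and the sample columns are hypothetical, and the sketch assumes that columns not defined in the stage's meta data are simply dropped when RCP is not in effect.

```python
def rcp_in_effect(project_enabled, stage_enabled):
    """RCP is used only if it is enabled in the Administrator AND selected on the stage."""
    return project_enabled and stage_enabled


def stage_output(row, defined_columns, project_enabled, stage_enabled):
    """Model the columns a stage passes on for one input row.

    Assumption for this sketch: without RCP, only columns defined in the
    stage's meta data are passed on; with RCP, undefined columns are
    propagated as well.
    """
    if rcp_in_effect(project_enabled, stage_enabled):
        return dict(row)
    return {name: value for name, value in row.items() if name in defined_columns}


# "region" arrives at run time but is not defined in the design-time meta data.
row = {"id": 1, "name": "Ann", "region": "EU"}
defined = ["id", "name"]

with_rcp = stage_output(row, defined, project_enabled=True, stage_enabled=True)
stage_off = stage_output(row, defined, project_enabled=True, stage_enabled=False)
```

In this model, `with_rcp` still carries the undefined `region` column, while `stage_off` carries only `id` and `name`, because the option was not selected on the stage even though it was enabled for the project.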

There are some special considerations when using runtime column propagation with certain stage types:

# Sequential File
# File Set
# External Source
# External Target

RCP Set at DataStage Administrator:

RCP Set at DataStage Stage Output:

Merge Stage: Ensure required column meta data has been specified (this may be done in another stage, or may be omitted altogether if you are relying on Runtime Column Propagation).

Lookup Stage: Ensure required column meta data has been specified (this may be omitted altogether if you are relying on Runtime Column Propagation).

Shared Container: When inserted in a job, a shared container instance already has meta data defined for its various links. This meta data must match exactly, in all properties, the meta data on the link that the job uses to connect to the container. The Inputs page enables you to map meta data as required. The only exception to this is where you are using runtime column propagation (RCP) with a parallel shared container. If RCP is enabled for the job, and specifically for the stage whose output connects to the shared container input, then meta data will be propagated at run time, so there is no need to map it at design time.

For parallel shared containers, you can take advantage of runtime column propagation to avoid the need to map the meta data. If you enable runtime column propagation, then, when the job runs, meta data will be automatically propagated across the boundary between the shared container and the stage(s) to which it connects in the job.
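The shared container rule can be summarised as a simple decision, sketched here in Python. Again this is an illustrative model rather than anything DataStage exposes: the function `needs_design_time_mapping` and the column tuples are hypothetical, and link meta data is reduced to (name, type) pairs for brevity.

```python
def needs_design_time_mapping(job_link, container_link, rcp_enabled):
    """Return True if the link meta data must be mapped at design time.

    With RCP enabled for the job and for the connecting stage, meta data is
    propagated at run time, so no mapping is needed. Otherwise the job link
    must match the container link exactly in all properties.
    """
    if rcp_enabled:
        return False
    return job_link != container_link


# Hypothetical link definitions: the "name" column differs in length.
container_link = [("id", "int32"), ("name", "string[50]")]
job_link = [("id", "int32"), ("name", "string[255]")]

mapping_without_rcp = needs_design_time_mapping(job_link, container_link, rcp_enabled=False)
mapping_with_rcp = needs_design_time_mapping(job_link, container_link, rcp_enabled=True)
```

Here `mapping_without_rcp` is `True` because the two links differ in one property, while `mapping_with_rcp` is `False` because the meta data would be propagated across the container boundary at run time.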