
Managing pipelines
Running new data

Quantemplate pipelines can be used to perform the same set of transformations on data sources that are regularly updated. Once you’ve configured a Quantemplate pipeline to transform a particular set of data, it’s easy to feed in similarly formatted new data as you receive it.

This process is typically managed using Feeds. Once a Feed is connected to a pipeline, it can be set to bring in the new data and either automatically retain or deselect prior data. See how.

To run new data through a pipeline manually (a conceptual code sketch follows these steps):

  1. Open the stage you wish to bring the new data into. Click ‘Add inputs’ and drag and drop the new files into the popup. Click ‘Apply’ to select the new files as inputs to the stage.
  2. If desired, deselect the previous inputs. Do this after bringing in the new files so as not to lose any column mappings.
  3. If the importing stage is linked to the next stage, the new files will flow in automatically. Otherwise, open the next stage and select the outputs from the previous stage as inputs. If this stage is a Union, the files will be renamed and flow through all connected downstream stages.
  4. Run the pipeline to generate your output data, validation and mapping reports.
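
Quantemplate handles all of the above through its interface, but if it helps to see the pattern in code, the Python sketch below shows the underlying idea: the same fixed set of transformations is applied to each new file as it arrives. Everything in it (pandas, the file names, the clean_file function) is an illustrative assumption, not part of Quantemplate.

    # Illustrative sketch only: Quantemplate performs these steps in its UI.
    # All names below (clean_file, file paths, column rules) are hypothetical.
    import pandas as pd

    def clean_file(path: str) -> pd.DataFrame:
        """Apply the same fixed set of transformations to each new file."""
        df = pd.read_csv(path)  # read_csv infers the header row by default
        df.columns = [c.strip().lower() for c in df.columns]  # map to a common schema
        return df.dropna(how="all")                           # e.g. remove blank rows

    # As new files arrive, feed them through the unchanged pipeline and
    # combine them with (or swap them in for) the previous inputs.
    new_files = ["submissions_2024_02.csv", "submissions_2024_03.csv"]
    output = pd.concat([clean_file(f) for f in new_files], ignore_index=True)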

Best Practice

If you’re creating a pipeline that will frequently run new data, consider using this structure (a conceptual sketch follows the list):

  1. Start with a Transform stage to Remove Rows, Detect Headers and Map Columns to a common schema.
  2. Once mapped to a common schema, add in a date cleanse and validation operation.
  3. Now add a Union stage to combine the source data. This creates a single output with a name based on the stage name, meaning the rest of the pipeline can flow in the new data automatically.
  4. Use the linked stages feature to link the outputs of Stage 1 to the inputs of Stage 2.
  5. Add subsequent stages to cleanse values, enrich, etc.
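
As a rough mental model of this structure, the sketch below expresses the same flow in Python with pandas. It is an assumption-laden illustration (the schema mapping, function names and file names are all invented), not Quantemplate’s implementation: Stage 1 maps each source to the common schema and cleanses dates, and the Union stage produces the single output that downstream stages consume.

    # Illustrative sketch of the recommended structure; every name is hypothetical.
    import pandas as pd

    # Map Columns: per-source column names -> one common schema (assumed mapping)
    COMMON_SCHEMA = {"Pol No": "policy_id", "Prem": "premium", "Incept": "inception_date"}

    def stage_1_transform(df: pd.DataFrame) -> pd.DataFrame:
        """Stage 1: Remove Rows and Map Columns to the common schema."""
        df = df.dropna(how="all")                 # Remove Rows (blank rows)
        return df.rename(columns=COMMON_SCHEMA)   # Map Columns

    def cleanse_and_validate_dates(df: pd.DataFrame) -> pd.DataFrame:
        """Date cleanse and validation, applied once the schema is common."""
        df["inception_date"] = pd.to_datetime(df["inception_date"], errors="coerce")
        return df

    def stage_2_union(frames: list) -> pd.DataFrame:
        """Stage 2: Union all sources into one output for downstream stages."""
        return pd.concat(frames, ignore_index=True)

    sources = [pd.read_csv(p) for p in ["broker_a.csv", "broker_b.csv"]]
    unioned = stage_2_union(
        [cleanse_and_validate_dates(stage_1_transform(s)) for s in sources]
    )
    # Subsequent stages (value cleansing, enrichment, ...) operate on `unioned`.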