Pipeline inputs

Data flow through Quantemplate

Quantemplate imports raw data via Integrate pipelines, cleanses and harmonises it, then outputs it to the Data Repo for downstream processes such as querying in Quantemplate Analyse or sending to external systems.

Adding raw data to a pipeline

Bordereaux or submission data that requires cleansing should be uploaded to a pipeline, rather than to the Data Repo. Data can be uploaded to a pipeline via the Uploads tab, or directly to a stage via the Input Selector.

Upload via the Input Selector

The fastest way to get new data into a pipeline is via the Input Selector.

  1. Open a stage.
  2. Click add inputs.
  3. Go to the Uploads tab and add files via drag-and-drop or the ‘choose files’ button.
  4. Select the uploaded files using the checkbox and click ‘Apply Selection’.

Uploaded files are made available to all stages in the pipeline. Uploads can be reviewed and managed in the Uploads tab.

Supported file formats

Quantemplate supports XLS, XLSX, CSV and GZipped CSV files.

GZipping a CSV file can significantly reduce its file size and upload time. 7-Zip is a useful tool for applying GZip compression. Mac users can also use a Terminal command.
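If you prefer to script the compression, the following is a minimal Python sketch using only the standard library. The function name gzip_csv and the ".csv.gz" naming are illustrative choices, not a Quantemplate requirement.

```python
import gzip
import shutil
from pathlib import Path


def gzip_csv(csv_path: str) -> Path:
    """Write a GZip-compressed copy next to csv_path and return the new path."""
    source = Path(csv_path)
    # Hypothetical naming choice: claims.csv -> claims.csv.gz
    target = source.with_name(source.name + ".gz")
    # Stream the bytes through gzip so large files are not held in memory.
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Calling gzip_csv("claims.csv") leaves the original file in place and writes claims.csv.gz alongside it.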

We are aware of an issue where CSV files which contain a Byte Order Mark and have GZip compression applied may fail to upload. In this case, use an uncompressed CSV format or open the file in a text editor and export as UTF-8 with no BOM.
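As an alternative to a text editor, the Byte Order Mark can be removed with a short script. This is a sketch that assumes the file is UTF-8 encoded and only handles the UTF-8 BOM; the function name strip_utf8_bom is illustrative.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    # Reads the whole file into memory, which is fine for moderately sized CSVs.
    data = Path(src_path).read_bytes()
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        data = data[len(codecs.BOM_UTF8):]
    Path(dst_path).write_bytes(data)
    return had_bom
```

The output file can then be compressed and uploaded as usual.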

Input tabs

The main panel of the pipeline workspace contains three input tabs:

  • Uploads
    Upload and manage raw data datasets.
  • Reference data
    View reference datasets currently used in the pipeline.
  • Partner datasets
    View partner organisations’ pipeline outputs currently used in the pipeline.

Uploads tab

In the Pipeline view, the right panel is the area to view and manage raw data files. Files can also be uploaded here, but once uploaded they must be added to the correct pipeline stages via the Input Selector.

The uploads tab shows:

  • Filename plus the names of any worksheets in an Excel file
  • Stage in which an uploaded dataset has been used as an input
  • File size
  • Uploaded date

The uploads tab can display up to 500 items. To display more items, apply filters or archive some files.


Sort the uploads list

By default, files are sorted by Uploaded date, with most recent uploads at the top. The list can also be sorted by Name and Size. Click on a column header to sort by that column. Click again to reverse the sort direction.

Excel files with multiple worksheets

In the uploads view, when an Excel file contains multiple worksheets, they are presented as a subset of the file. The data in each worksheet can be previewed by clicking on it. Clicking on the main file name will show a preview of the first worksheet.

Elsewhere in Quantemplate, worksheets in Excel files are represented by joining the filename to the worksheet name.

Example

Filename: Acme-claims-April-2018.xlsx
Worksheet name: Sheet1
Filename in Quantemplate: Acme-claims-April-2018.xlsx: Sheet1
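If you need to predict these names outside Quantemplate, for instance when matching a list of expected inputs, the convention shown in the example can be sketched as below. The function name is hypothetical, and the ": " separator is taken from the example above.

```python
def worksheet_display_name(filename: str, worksheet: str) -> str:
    """Join an Excel filename and worksheet name as shown in the example."""
    return f"{filename}: {worksheet}"


print(worksheet_display_name("Acme-claims-April-2018.xlsx", "Sheet1"))
# prints: Acme-claims-April-2018.xlsx: Sheet1
```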

Search and filter uploads

Filter buttons for Used and Unused uploads allow the files uploaded to a pipeline to be easily managed.

  • Used shows only the datasets which have been used in the pipeline. If an Excel file contains multiple worksheets, only the used worksheet is shown, along with the Excel filename. This view gives you a compact summary of the uploads used in your pipeline.
  • Unused shows all uploaded files which have not been used in the pipeline. If an Excel file contains a worksheet which has been used, it will not appear. This view helps you quickly archive unused files, without accidentally removing a worksheet used in the pipeline.
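The two views can be modelled roughly as follows. This is an illustrative sketch, not Quantemplate's implementation: the Upload class and both function names are hypothetical, and it assumes that an Excel file with any used worksheet is hidden from the Unused view as a whole.

```python
from dataclasses import dataclass, field


@dataclass
class Upload:
    filename: str
    # Maps worksheet name -> True if that worksheet is an input to a stage.
    # A CSV file is modelled here as a single sheet keyed by "".
    sheets: dict[str, bool] = field(default_factory=dict)


def used_view(uploads: list[Upload]) -> list[tuple[str, list[str]]]:
    """Files with at least one used sheet, listing only the used sheets."""
    rows = []
    for upload in uploads:
        used = [name for name, is_used in upload.sheets.items() if is_used]
        if used:
            rows.append((upload.filename, used))
    return rows


def unused_view(uploads: list[Upload]) -> list[str]:
    """Files in which no sheet is used, so archiving them affects no stage."""
    return [u.filename for u in uploads if not any(u.sheets.values())]
```

Under this model, a workbook with one used and one unused worksheet appears in the Used view with only its used worksheet, and does not appear in the Unused view at all.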

Click a filter button once to toggle its state from showing to hiding, or vice-versa. Double-click a filter button to make it the only filter type showing.

Text filters allow you to search filenames to quickly find a file or worksheet.

Add or remove uploads

Adding uploads

To upload files to a pipeline, click the + button on the top right to open the browser file chooser, or drag-and-drop files directly onto the Uploads tab.

If a Used filter is applied at the time of upload, the upload progress will be shown, but the file will be filtered out once uploaded. This will also be the case if a text filter is applied which does not match the name of the uploaded file.

Adding files to the Uploads tab is useful if the uploads need to be previewed in Quantemplate before adding to the pipeline. Otherwise, the most efficient way to add uploads is via the stage Input selector, where they can be quickly selected as inputs to the stage.


Archiving uploads

Archiving an uploaded file will remove it from the Uploads tab and any stages which use the file. The file will not be permanently deleted, since all uploads are preserved as part of the pipeline’s history. Restoring a pipeline to a previous revision number will also restore all uploads that were present at that revision point.

To remove an uploaded file, click on the cross which appears on the right when hovering over an item. A confirmation button will appear. Click on this to archive the file. This cannot be undone.

Multi-select and batch archive

To archive multiple files at once, first select the files:

  • To select individual files, use the selection checkboxes which appear on the right of a row when hovering. Once one checkbox is selected, the checkbox will appear on the other rows.
  • To select all files, click on the checkbox on the top right of the table. If a filter is active, only the filtered items will be selected. If a filter is cleared, the selection remains in place.
  • To select all files within a range, select one item, then hold down shift whilst selecting another item. All items between them will also be selected.
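The selection rules above can be summarised in a small sketch. Row positions are represented as integers, and the function names are hypothetical; this only illustrates the behaviour described, not how Quantemplate implements it.

```python
def select_range(selected: set[int], first: int, second: int) -> set[int]:
    """Selection after a shift-click: both rows plus every row between them."""
    low, high = sorted((first, second))
    return selected | set(range(low, high + 1))


def select_all_visible(selected: set[int], visible_rows: list[int]) -> set[int]:
    """Selection after ticking the header checkbox while a filter is active."""
    # Only rows that pass the filter are added; the earlier selection is kept.
    return selected | set(visible_rows)


print(select_range({2}, 2, 5))
# prints: {2, 3, 4, 5}
```

Because the functions return a plain set of rows, clearing a filter afterwards does not change the selection, which matches the behaviour described above.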

When one or more items are selected, the bulk edit bar appears. Click the Archive button in the bulk edit bar to archive the selected files.

Archiving Excel files with multiple worksheets

If an Excel file has multiple worksheets, individual worksheets cannot be archived. This preserves the integrity of the uploaded data. The whole file can be archived by clicking the Archive button next to the filename.

Previewing uploads

Click on an item on the uploads tab to see a preview of the first 1,000 rows.

Once in the dataset preview, navigate between uploads via the file navigator dropdown, above the data grid. The currently previewed input is highlighted blue.

The list can be filtered to show only used or unused uploads, as described above.

To return to the Uploads view, click the ‘Uploads’ navigation button in the top left.

Filter a dataset preview

Apply filters to the data preview to understand your data better. Click the filter button on the top right, or press the F key, to open the filter bar.

Read more about filter bars in Quantemplate.

Download the original file

Data that is imported to Quantemplate is pre-processed, removing all visual formatting and annotations. Excel files with multiple worksheets are split out into separate files. The original file, preserving formatting, annotations, tabs, etc., is retained.

To download the original file: navigate to the dataset preview for the upload, click on the download button on the top right and select ‘Download Original file’.

Reference data tab

About reference data

The Data Repo is the storage area for clean datasets and reference data.

Datasets with a single row of headers can be uploaded directly to the Data Repo to create a reference dataset for use in pipelines, for example a target header schema to map to. The upload process will ignore any blank rows above or below the data, and any blank columns either side of the data. The first line of data will be interpreted as column headers. Read more about uploading data to the Data Repo.
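To check in advance how a sheet is likely to be read, the trimming rule described above can be approximated locally. The sketch below is an assumption-laden illustration, not the Data Repo's actual upload logic: the function name is hypothetical, cells are treated as blank when empty or whitespace-only, and blank rows inside the data are left in place.

```python
def trim_blank_edges(rows: list[list[str]]) -> tuple[list[str], list[list[str]]]:
    """Drop blank rows above/below and blank columns either side of the data.

    Returns (headers, data_rows), taking the first remaining row as headers.
    """
    def is_blank(cell) -> bool:
        return cell is None or str(cell).strip() == ""

    # Keep only the span between the first and last non-blank rows.
    filled = [i for i, row in enumerate(rows) if not all(is_blank(c) for c in row)]
    if not filled:
        return [], []
    body = rows[filled[0]: filled[-1] + 1]

    # Pad short rows so every row has the same number of columns.
    width = max(len(row) for row in body)
    body = [list(row) + [""] * (width - len(row)) for row in body]

    # Keep only the span between the first and last non-blank columns.
    cols = [j for j in range(width) if not all(is_blank(row[j]) for row in body)]
    left, right = cols[0], cols[-1] + 1
    trimmed = [row[left:right] for row in body]
    return trimmed[0], trimmed[1:]
```

For a sheet with an empty first row and an empty first column, the function returns the headers from the first populated row and the remaining rows as data.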

Cleansed outputs from pipelines can also be exported to the Data Repo for onward sharing via API, sharing within Quantemplate, or reporting on in Analyse.

About the reference data tab

The reference data tab shows reference datasets currently used in the pipeline. Datasets can be added to or removed from the pipeline via the stage input selector.

Alongside the dataset name, the tab shows the stage the dataset is used in, the row count, last updated date, and whether the dataset is used in an Automap Values operation. The view can be sorted by name, rows and updated date.

The Dataset Information popup shows information about how a dataset has been created and updated, and where else it is used. Read more about it here.

Click the Automap Values icon to show the names of the stages and operations the dataset is used in.

Click anywhere on the row to view the reference dataset in a new tab.

Using multiple browser tabs

If your organisation does not use SSO login, you may need to use a browser incognito tab to prevent automatic logout when using multiple tabs.

Dataset missing error

If a dataset is unavailable to the pipeline, a notification will appear in the reference dataset tab.

Missing dataset: archived file

The dataset has been archived, so may not be accessible to the pipeline. The dataset should be replaced, or the dataset owner should restore the file from the archive. If you are the dataset owner, you can click the ‘Restore file’ button which appears beneath the error message.

Missing dataset: permission required

The user does not have permission to view the dataset, so the pipeline cannot access it. The dataset owner should share the dataset with the user.

Notify an owner about a missing dataset

If a dataset has been archived, or you do not have permission to view a reference dataset used in the pipeline, you can send an email notification to the owner by clicking the ‘Notify owner’ button which appears beneath the error message.

Removing archived reference datasets from a pipeline

If a reference dataset is used in a pipeline but has been archived, or you do not have permission to access it, it will appear in the reference data tab, but not in the stage inputs. To remove an archived dataset, first restore it so that it appears in the stage inputs. Next, remove it from the stage inputs. It can then be archived again in the Data tab.

Partner data tab

Quantemplate allows organisations to make pipeline outputs available to partner organisations, where they can be used as inputs to a pipeline.

To add a partner organisation’s pipeline output to the current pipeline, use the Partners tab in the stage input selector.

The partner data tab shows all the partner organisation’s pipeline outputs which are being used in the current pipeline, alongside the stage they are used in and the last updated date. The view can be sorted by partner name.

Click on the row to preview a partner dataset.

Learn more about partner sharing →