The most performance-intensive operations are Automap Values and Aggregate.
Use only for aggregation; do not use it simply to rename or subset columns.
A one-to-many join occurs when an item in one dataset matches multiple items in the other dataset, creating an output row for every matching combination.
Consider whether the full set of exploded values is required. Could the same result be achieved by joining to a simple lookup table? See the Company Name Matching solution deep dive video, 23:50 onwards.
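To make the row explosion concrete, here is a minimal sketch in plain Python rather than in the pipeline tool itself. The data, column names, and the `join_one_to_many` helper are all hypothetical and only illustrate the idea: joining to a table with several rows per key multiplies the output, while joining to a lookup table with one row per key does not.

```python
from collections import defaultdict


def join_one_to_many(left, right, key):
    """Inner join: emit one output row per matching (left, right) pair."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical example data (not taken from the product).
companies = [
    {"company_id": 1, "name": "Acme Ltd"},
    {"company_id": 2, "name": "Globex"},
]
aliases = [
    {"company_id": 1, "alias": "Acme"},
    {"company_id": 1, "alias": "Acme Limited"},
    {"company_id": 1, "alias": "ACME LTD"},
    {"company_id": 2, "alias": "Globex Corp"},
]

# One-to-many join: company 1 matches three alias rows, so it appears three times.
exploded = join_one_to_many(companies, aliases, "company_id")

# Alternative: reduce the right-hand side to one row per key first
# (here, simply the first alias seen), then join to that lookup table.
first_alias = {}
for row in aliases:
    first_alias.setdefault(row["company_id"], row["alias"])
lookup = [{"company_id": k, "alias": v} for k, v in first_alias.items()]
compact = join_one_to_many(companies, lookup, "company_id")

print(len(exploded), len(compact))
```

Whether a single-row lookup is acceptable depends on whether downstream stages really need every matched value; if they do, the exploded join is unavoidable.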
Creates a table for each distinct value in the partition column, so use with care. Limited to 5000 tables.
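Before partitioning, it can help to check how many tables a column would produce. The sketch below is plain Python, not part of the product; the function names and sample data are hypothetical, and the only figure taken from the text above is the 5000-table limit.

```python
MAX_TABLES = 5000  # limit stated in the text above


def partition_table_count(rows, column):
    """Number of tables partitioning would create: one per distinct value."""
    return len({row[column] for row in rows})


def is_safe_to_partition(rows, column, limit=MAX_TABLES):
    """True if the column's distinct-value count stays within the limit."""
    return partition_table_count(rows, column) <= limit


# Hypothetical data: 6000 rows, three countries, a unique id per row.
rows = [{"id": i, "country": ["GB", "FR", "DE"][i % 3]} for i in range(6000)]

print(partition_table_count(rows, "country"), is_safe_to_partition(rows, "country"))
print(partition_table_count(rows, "id"), is_safe_to_partition(rows, "id"))
```

A low-cardinality column such as `country` is a reasonable partition key in this example, whereas a near-unique column such as `id` would exceed the limit.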
Disable outputs from all other stages so they do not appear in the outputs view. This will speed up the pipeline run, since these outputs do not have to be created.