This operation consecutively numbers the rows within a group of matching values.
A table of transactions has been sorted by Insured Name and Transaction Date. Using Number Within Groups, we can derive a sequential transaction count for each transaction in the ‘Insured Name’ group.
Insured Name |
Transaction Date |
Transaction Count |
George |
20230101 |
1 |
George |
20230102 |
2 |
George |
20230103 |
3 |
John |
20230101 |
1 |
John |
20230102 |
2 |
John |
20230103 |
3 |
Paul |
20230101 |
1 |
Paul |
20230102 |
2 |
Adjacent rows with the same value in the the Group Column form a contiguous group.
If the contiguous grouping option is switched on, rows with matching values which are grouped together will be numbered consecutively. If the same values are encountered later in the dataset, numbering will restart at 1. In the example above, contiguous grouping is enabled.
If the contiguous grouping is switched off, rows with matching values will be numbered consecutively, even if they are not grouped together.
The table in the example above shows a set of January data. To demonstrate contiguous and non-contiguous group-numbering, we can append a set of February data and apply two instances of the Number Within Groups operation.
The first instance of Number Within Groups is set for contiguous grouping to derive a transaction count within each monthly dataset.
The second instance is set for non-contiguous grouping to derive an all-time transaction count.
Contiguous |
Non-contiguous |
||
Insured Name |
Transaction Date |
Count within month |
Count all time |
George |
20230101 |
1 |
1 |
George |
20230102 |
2 |
2 |
George |
20230103 |
3 |
3 |
John |
20230101 |
1 |
1 |
John |
20230102 |
2 |
2 |
John |
20230103 |
3 |
3 |
Paul |
20230101 |
1 |
1 |
Paul |
20230102 |
2 |
2 |
George |
20230201 |
1 |
4 |
John |
20230201 |
1 |
4 |
Paul |
20230201 |
1 |
3 |
If a stage uses multiple inputs, Number Within Groups will count within each group individually. If non-contiguous grouping across all inputs is desired, the inputs should be unioned prior to inputting to the stage.
To add a Number Within Groups operation to your pipeline: