Help Centre


About the Regex operation

The Regex operation allows you to find data matching a certain pattern which can then be extracted or changed. For example:

Patterns are defined using a language known as 'Regular Expressions', commonly called Regexes.

Regex tutorial

Simpler ways to achieve the same result
We're always trying to make it easier to configure a pipeline. Since we made the Regex video tutorial above, we've introduced simpler ways to do some of the examples shown.
  • Substitute ‘Rd’ for ‘Road’
    You can now use the Substitute function in Calculate
  • Change a four-digit zip code to a five-digit zip code
    In Calculate, you can now use the expression:
    RIGHT(CONCATENATE("00000", 'Column-name'), 5)
  • Change a date format
    Use the Date Cleanse and Date Output operations



Select the source column to search.


Enter the search term or regex pattern.

Output column

Either name a new output column, or select the source column to overwrite it. If no column name is entered, an unnamed column will be created.

Configure output

To remove regex matches, select ‘Remove matched value’.
To replace regex matches select ‘Advanced’ and enter a text string or a result pattern
(e.g. $1, $2, $3).

Search patterns

Enter the search pattern in the Regex field. This could be a specific sequence of text or numbers, such as the word ‘Rd’, though this is usually better dealt with using the Substitute function in Calculate.

The Regex operation is best used to search for patterns, using the Regular Expressions syntax. For example:

See the Syntax reference section for details and examples of the most commonly used Regex syntax.

Testing regexes
Rapidly develop your syntax by testing on a regex utility website, before using it in a pipeline.
The site allows you to enter a regex, then test it against a few sample values.
The site Java Regular Expressions Tester also allows you to test a regex, including the output group replacement functionality.

Result patterns

Once a pattern has been matched, it can be either be removed, replaced, modified or repeated. The result can overwrite the source column or can be written to a new column. The $ symbol is used to reference groups within the regex pattern.


1. Extract any numbers at the start of a string

We want to find the building number from a column of addresses and extract it to a new column. We can do this by writing a regex to identify any sequence of numbers at the start of an address.

  1. Select the source column
  2. Enter a regex pattern:

    anchors to the start of the string.
    looks for a sequence of digits of any length.
    indicates that the digits should be followed by any number of characters.

  3. Output to a new column called
    Building No.
  4. Select ‘Advanced’ and enter the result pattern
    to write the matched patterns to the new column.

2. Remove non-alphanumeric characters

We want to strip out non-alphanumeric characters, such as punctuation or symbols, to help standardise a list of company names. This regex looks any characters that are not words, whitespaces or digits and removes them.

  1. Select the source column
    Column 1
  2. Enter a regex pattern:

    [ ]
    matches characters not contained within the brackets
    matches any word character
    matches any whitespace character
    matches any digit
  3. Select the output column
    Column 1
    to overwrite the original column.
  4. Select ‘Remove matched value’.

3. Combine initials

In a list of company names, where initials are separated by a space, we want to remove the space. This regex looks for a single non-whitespace character, followed by one or more spaces. It then writes only the matched non-whitespace characters followed by the rest of the string.

  1. Select the source column
    Column 1
  2. Enter a regex pattern:

    asserts position at a word boundary
    matches any single non-whitespace character and defines the capturing group
    matches any whitespace character, one or more times
    Positive lookahead for a non-whitespace character at a word boundary. This marks the start of the rest of the string.

  3. Select the output column
    Column 1
    to overwrite the original column.
  4. Select ‘Advanced’ and enter the result pattern
    to write the matched non-whitespace characters, followed by rest of the string, to the original column.

Syntax reference

Quantemplate uses the Java version of regex. For full syntax details see the Java Regex Documentation.


Matches either the expression before or the expression after
matches either “gray” or “grey”
( )
both match “gray” or “grey”

[ ]
Matches a single character that
contained within the brackets
matches “a”, “b”, or “c”

matches “gray” or “grey”, but not “graey”, “graay”, etc.

specifies a range which matches any lowercase letter
from “a” to “z”

matches “a”, “b”, “c”, “x”, “y”, or “z”

matches “hat" and “cat”

matches only “a”, “.”, or “c”

[^ ]
Matches a single character that
is not
contained within the brackets
matches any character other than “a”, “b”, or “c”

matches any character that is not a lowercase letter
from “a” to “z”

matches all strings matched by .at except “bat”

matches “qu” in “question”. It does not match “Iraq” since there is no character after the q for the negated character class to match

Preceding item is matched exactly n times
matches only “aaa”

matches a number between 1000 and 9999

Preceding item is matched min or more times
matches “aaa”, “aaaa”, “aaaaa”, “aaaaaa”, etc.

matches “aaa”, “aaaa” and “aaaaa”

Preceding item is matched at least min times, but not more than max times
matches only “aaa”, “aaaa” and “aaaaa”.

matches a number between 100 and 99999

Matches any character
matches “abc”, “acc”, “adc”, etc.

matches “gray”, “grey”, “gr%y”

matches any three-character string ending with “at”, including “hat”, “cat”, and “bat”

Matches the preceding character
zero or one
matches “ac”, “abc”

matches “color” and “colour”

matches “hat”, “cat”, and “at”

Matches the preceding character
zero or more
matches “ac”, “abc”, “abbc”, “abbbc”, etc.

matches “s” followed by zero or more characters, for example: “s” and “saw” and “seed”

matches “hat”, “cat”, “hhat”, “chat”, “hcat”, “cchchat”, “at”, etc.

Matches the preceding character
one or more
matches “abc”, “abbc”, “abbbc”, but not “ac”

matches “hat”, “cat”, “hhat”, “chat”, “hcat”, “cchchat”, etc. but not “at”

Anchors the
position in a string
matches “hat” and “cat”, but only at the beginning of the string

Anchors the
position in a string
matches “hat” and “cat”, but only at the end of the string


Treats the escaped character as a literal rather than a regex command
matches any single character surrounded by “[” and “]” since the brackets are escaped, for example: “[a]” and “[b]”

matches “1+1=2”

Character classes

Any digit
Not a digit
Any word character (Alphahnumeric characters, plus the underscore “_”)
Not a word character
Whitespace character (tab, carriage return, space, etc.)
Not a whitespace
First character in a string
Last character in a string
Uppercase letters
Lowercase letters
Visible characters and the space character
Word boundaries
Non-word boundaries


The character x
The backslash character
The tab character
The carriage-return character
The escape character

Easy-to-use syntax testing environment for regular expressions.

Java Regex Documentation
Full documentation of the Java Regex version used by Quantemplate.

Java Regular Expressions Tester
Another environment to test the validity and output of a regular expression, including output group replacement functionality.