Using Regular Expressions (Regex)

Regular expressions (regex) are a powerful tool for matching patterns in strings. They can look cryptic at first, but they are straightforward once understood, and for well-defined patterns they can be considerably faster and more accurate than an ML model.

Benefits of Using Regex

Accuracy

Regex provides high accuracy in pattern recognition for defined patterns.

Speed

A regex is usually quicker to write than an ML model is to train, and it runs faster at match time.

Efficiency

Regex simplifies many pattern-matching tasks once learned.

Implementing Regex in Parloa Bots

To use regex within a Storage block in Parloa, follow these steps to extract the desired entities from input text.

Step-by-Step Implementation

Step 1 – Initialize a Variable

Create an empty variable to hold the match, if one is found.

let match = "";

Step 2 – Retrieve Text to Check

Obtain the text from another storage variable or direct platform input.

let textCheck = storage.utterance;

Step 3 – Optional Text Cleaning

Simplify your regex by removing non-alphanumeric characters.

textCheck = textCheck.replaceAll(/[^A-Za-z0-9]/g, '');

Step 4 – Define the Regex Pattern

Store the regex pattern in a variable for readability.

const flightNum_pattern = /([A-Za-z]\s?){2}([\d]\s?){2,4}/;

Step 5 – Test for Pattern Match

Check if the text contains a pattern match.

if (flightNum_pattern.test(textCheck)) {
    match = flightNum_pattern.exec(textCheck)[0];
}

Step 6 – Return the Match

Make match the last expression in the block so that its value is what the block returns.

match;

Example Code

// Initialize a variable to hold the match
let match = "";

// Retrieve the text to be checked
let textCheck = storage.utterance;

// Optional: Remove non-alphanumeric characters for simplicity
textCheck = textCheck.replaceAll(/[^A-Za-z0-9]/g, '');

// Define the regex pattern: two letters followed by two to four digits
const flightNum_pattern = /([A-Za-z]\s?){2}([\d]\s?){2,4}/;

// Check if the text contains a pattern match
if (flightNum_pattern.test(textCheck)) {
    match = flightNum_pattern.exec(textCheck)[0];
}

// Return the match
match;
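
To try these steps outside a Storage block, they can be wrapped in a plain function, as in the sketch below. The name extractFlightNumber and the sample utterances are illustrative assumptions and not part of Parloa. The letter class is written as [A-Za-z] because [A-z] also matches a few punctuation characters.

```javascript
// Hypothetical helper that mirrors the Storage block steps above.
function extractFlightNumber(utterance) {
  // Two letters followed by two to four digits, each optionally followed by whitespace
  const flightNumPattern = /([A-Za-z]\s?){2}(\d\s?){2,4}/;

  // Optional cleaning: drop everything that is not an ASCII letter or digit
  const textCheck = utterance.replace(/[^A-Za-z0-9]/g, '');

  // exec returns null when nothing matches; otherwise result[0] is the matched text
  const result = flightNumPattern.exec(textCheck);
  return result ? result[0] : '';
}

// Made-up sample utterances
console.log(extractFlightNumber('LH-459'));
console.log(extractFlightNumber('UA 59'));
console.log(extractFlightNumber('no number here'));
```

Because the cleaning step also removes spaces, neighbouring words run together: "flight 459" becomes "flight459", and the pattern then picks up "ht459". Cleaning is therefore safest when the utterance is expected to contain little more than the code itself.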

Common Regex Patterns

Here are some common regex patterns you might encounter:

  • Flight Numbers (IATA): [A-Za-z]\s?\w\s?([\d]\s?){2,4}

  • Flight Numbers (ICAO): ([A-Za-z]\s?){3}([\d]\s?){2,4}

  • German License Plates: ([A-Za-zÄÖÜäöü]\s?){1,3}([A-Za-z]\s?){2}([\d]\s?){2,4}[EeHh]?

  • IBAN (German format): ([A-Za-z]\s?){2}([\d]\s?){20}

  • Insurance Numbers: ([\d]\s?){9}\w
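
A quick way to sanity-check patterns like these is to run each one against a known-good input. The sketch below does that with made-up sample values. The object names are assumptions for illustration, and the patterns are written with [A-Za-z] and \d, which match the intended letters and digits without the extra punctuation that [A-z] lets through.

```javascript
// Patterns from the list above, keyed by hypothetical names.
const patterns = {
  iataFlight: /[A-Za-z]\s?\w\s?(\d\s?){2,4}/,
  icaoFlight: /([A-Za-z]\s?){3}(\d\s?){2,4}/,
  germanPlate: /([A-Za-zÄÖÜäöü]\s?){1,3}([A-Za-z]\s?){2}(\d\s?){2,4}[EeHh]?/,
  iban: /([A-Za-z]\s?){2}(\d\s?){20}/,
  insuranceNumber: /(\d\s?){9}\w/,
};

// Made-up sample inputs, one per pattern.
const samples = {
  iataFlight: 'U2 1234',
  icaoFlight: 'DLH 459',
  germanPlate: 'M AB 1234',
  iban: 'DE89 3704 0044 0532 0130 00',
  insuranceNumber: '123456789A',
};

// Report whether each pattern finds a match in its sample.
for (const [name, pattern] of Object.entries(patterns)) {
  console.log(name, pattern.test(samples[name]));
}
```

None of the patterns is anchored with ^ or $, so each one reports a match as soon as the pattern occurs anywhere in the input.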

Practical Examples

Example 1 – Flight Numbers

To handle flight numbers such as LH459, UA 59, TA1982, use the pattern:

  • Pattern: 2 letters + 2-4 digits

  • Example Regex: ([A-Za-z]\s?){2}([\d]\s?){2,4}
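
The sketch below runs this pattern, without any prior text cleaning, against the three flight numbers named above and against one made-up sentence. findFlightNumber is a hypothetical helper name, and the letter class is written as [A-Za-z] so that only letters are accepted.

```javascript
const flightNumPattern = /([A-Za-z]\s?){2}(\d\s?){2,4}/;

// Hypothetical helper: returns the first flight number in the text, or '' if none
function findFlightNumber(text) {
  const result = flightNumPattern.exec(text);
  // trim() drops a trailing space that the final \s? may have captured
  return result ? result[0].trim() : '';
}

for (const input of ['LH459', 'UA 59', 'TA1982', 'my flight is LH 459 today']) {
  console.log(JSON.stringify(input), '->', JSON.stringify(findFlightNumber(input)));
}
```

The optional \s? after each character is what lets the spaced form "UA 59" match. It can also pull in the space that follows the last digit, which is why the helper trims the result.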

Example 2 – Customer Numbers

For customer numbers starting with A, F, or H followed by a nine-digit number:

  • Pattern: [AFH][\d]{9}
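
The sketch below uses [AFH] for the leading letter. Inside a character class, | is a literal pipe rather than an alternation, so the commonly seen variant [A|F|H] would also accept "|" as the first character. The customer numbers are made up for illustration.

```javascript
// [AFH] is a character class: exactly one of A, F or H
const customerPattern = /[AFH]\d{9}/;

// Inside a class, | is a literal character, so this variant also accepts a pipe
const withPipes = /[A|F|H]\d{9}/;

console.log(customerPattern.test('F123456789')); // valid prefix
console.log(customerPattern.test('B123456789')); // prefix not in the class
console.log(customerPattern.test('|123456789')); // pipe rejected
console.log(withPipes.test('|123456789'));       // pipe accepted by the variant
```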

Example 3 – Deconstructing Regex

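Two patterns used with project.json files, both matching text wrapped in curly braces:

  • Pattern 1: \{\{.*?\}\} matches text in double curly braces. The lazy quantifier .*? stops at the first closing }}.

  • Pattern 2: \{.*?\} matches text in single curly braces, stopping at the first closing }.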
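
To see what the lazy quantifier .*? contributes, the sketch below applies both patterns to made-up strings and compares Pattern 1 with a greedy variant. The sample text is an assumption for illustration and is not taken from a real project.json file.

```javascript
// Made-up sample text containing two double-brace placeholders
const sample = 'Hello {{firstName}}, your flight is on {{date}}.';

// Pattern 1: lazy match between double braces, one result per placeholder
const doubleBraces = /\{\{.*?\}\}/g;
console.log(sample.match(doubleBraces));

// Greedy variant for comparison: runs from the first {{ to the last }}
const greedy = /\{\{.*\}\}/g;
console.log(sample.match(greedy));

// Pattern 2: lazy match between single braces
const singleBraces = /\{.*?\}/g;
console.log('Booking {bookingId} for {name}'.match(singleBraces));
```

With the lazy form, each placeholder is returned separately. The greedy form returns a single match that spans everything between the first opening and the last closing braces.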
