NLU Training Best Practices


Last updated 1 year ago


Introduction

Understanding human language involves complex challenges that require sophisticated techniques and models. In this domain, two key concepts emerge: Natural Language Processing (NLP) and Natural Language Understanding (NLU). NLP involves techniques for breaking down natural language into components machines can learn from. In contrast, NLU focuses on interpreting the semantic meanings of these components.

At Parloa, our approach to NLU uses the RASA DIET classifier, a versatile model that handles both intent classification and entity extraction. Training means giving the model varied examples, such as utterances that distinguish between "cats" and "ponies," so that it can map new inputs to the correct intents based on learned patterns.

Training Best Practices

For effective training of an NLU system, adherence to several best practices is crucial. These practices ensure the model learns accurately and remains adaptable.

Training Data: Relevance, Diversity, and Accuracy


Your training examples must be closely aligned with the real-world scenarios the AI is expected to handle. Ensuring the relevance of these examples is crucial for the AI to accurately recognize and act upon the intents you need it to understand.

Examples

  • "Book a flight to New York" directly connects to a common travel-related request.

  • "Schedule a meeting for next Wednesday" ties into typical calendar management tasks.

  • "Order a large pepperoni pizza from the nearest pizzeria" reflects a frequent food ordering scenario.

Include a wide range of expressions and linguistic styles to reflect the variety in how people actually phrase things. This diversity helps the AI understand the different ways someone might express the same intent.

Examples

  • "I need to reserve a plane ticket to NYC" shows a formal request.

  • "Can you help me book a flight to New York City?" offers a polite query.

  • "How do I get a flight ticket to New York?" poses a question in another common form.

  • "NYC flight booking" exemplifies a terse, keyword-based request.

The training data must accurately represent the intended meanings to avoid biases and misinterpretations. Ensuring sentences are clear and directly related to the intents they're meant to teach the AI is crucial.

Examples

  • "Can you help me find a flight to New York City?" clearly seeks assistance in flight booking.

  • "Show me the weather in New York for the next week" targets a different intent, focusing on weather information.

  • "I want to cancel my flight to New York" illustrates the need for handling cancellation requests accurately.

By focusing on relevance, diversity, and accuracy and providing clear, distinct examples for each, you ensure the AI is well-prepared to understand and act on the intents it will encounter in real-world scenarios.
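One way to make the accuracy guideline checkable is to look for the same utterance listed under more than one intent, since such an example teaches the model contradictory labels. The sketch below assumes the training data is held as a plain mapping from intent name to utterances. The intent names (`book_flight`, `get_weather`, `cancel_flight`) and the helper `find_conflicting_utterances` are hypothetical and are not Parloa's data format or API.

```python
from collections import defaultdict

# Hypothetical training data: intent name -> example utterances.
TRAINING_DATA = {
    "book_flight": [
        "Book a flight to New York",
        "I need to reserve a plane ticket to NYC",
        "Can you help me book a flight to New York City?",
        "NYC flight booking",
    ],
    "get_weather": [
        "Show me the weather in New York for the next week",
    ],
    "cancel_flight": [
        "I want to cancel my flight to New York",
        "NYC flight booking",  # deliberately mislabeled duplicate for the demo
    ],
}


def normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(utterance.lower().split())


def find_conflicting_utterances(data: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return normalized utterances that appear under more than one intent."""
    intents_by_utterance = defaultdict(set)
    for intent, utterances in data.items():
        for utterance in utterances:
            intents_by_utterance[normalize(utterance)].add(intent)
    return {
        text: sorted(intents)
        for text, intents in intents_by_utterance.items()
        if len(intents) > 1
    }


if __name__ == "__main__":
    for text, intents in find_conflicting_utterances(TRAINING_DATA).items():
        print(f"{text!r} is labeled with several intents: {intents}")
```

A check like this only catches exact duplicates after normalization. Near-duplicates with different wording still need manual review.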

Balancing the Training Data

Balance is essential in training data. Each request type or intent should be represented by a similar number of examples, with a minimum of 50 utterances per intent, to keep the model stable and prevent overfitting.

Avoiding Over-Engineering and Overfitting

Achieving the right balance between system complexity and adaptability is key. Avoid overly detailed examples that may not generalize well to variations of a query. For example, a training example like "Book a luxury suite with a panoramic view of the Eiffel Tower in a five-star hotel in Paris for three nights" could restrict the system's ability to understand related but slightly different queries.

Out-of-Scope vs. Fallback Intent

Implementing special responses for queries that fall outside the bot’s capabilities (Out-of-Scope) or are unclear (Fallback) improves user interactions by providing clear guidance when the bot cannot fulfill a request.

Maintaining and Updating Training Data

Regular Updates

Regularly update the training data with new phrases and expressions that reflect evolving language trends and adjust for specific changes. Simplifying training data to focus on the most relevant information facilitates effective learning, making the AI more adaptable.

Avoiding Overcomplicated Phrases

Ensure training examples are straightforward, focusing on the main information. This approach helps the system learn more effectively by reducing confusion.

Simplify Training Data

Simplifying user queries for training purposes, by focusing on the essential elements, enables the AI to learn more efficiently. For example, "Book a five-star hotel in Miami" is more effective for training than a complex sentence with multiple specifications.
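The balance and simplicity guidelines in this section can be checked with a short audit script each time the training data changes. The following is a minimal sketch, assuming the data is available as a mapping from intent name to utterances. The minimum of 50 per intent comes from the guideline above. The 12-word cutoff for "overly detailed" examples, the function `audit_training_data`, the intent names, and the synthetic placeholder utterances are assumptions made for illustration only.

```python
MIN_UTTERANCES_PER_INTENT = 50  # minimum stated in the guideline above
MAX_WORDS_PER_UTTERANCE = 12    # assumed cutoff, tune for your own data

LONG_EXAMPLE = (
    "Book a luxury suite with a panoramic view of the Eiffel Tower "
    "in a five-star hotel in Paris for three nights"
)

# Synthetic placeholder data, only to exercise the audit.
SAMPLE_DATA = {
    "book_hotel": [f"Book a hotel in city {i}" for i in range(60)] + [LONG_EXAMPLE],
    "cancel_booking": [f"Cancel booking number {i}" for i in range(20)],
}


def audit_training_data(data, min_count=MIN_UTTERANCES_PER_INTENT,
                        max_words=MAX_WORDS_PER_UTTERANCE):
    """Report intents with too few examples, overall imbalance, and long utterances."""
    counts = {intent: len(utterances) for intent, utterances in data.items()}

    # Intents below the minimum number of examples.
    underrepresented = sorted(i for i, n in counts.items() if n < min_count)

    # Ratio between the largest and smallest intent; 1.0 means perfectly balanced.
    largest = max(counts.values(), default=0)
    smallest = min(counts.values(), default=0)
    if smallest == 0:
        imbalance_ratio = float("inf") if largest else 1.0
    else:
        imbalance_ratio = largest / smallest

    # Utterances whose word count exceeds the assumed cutoff.
    too_long = sorted(
        (intent, utterance)
        for intent, utterances in data.items()
        for utterance in utterances
        if len(utterance.split()) > max_words
    )

    return {
        "counts": counts,
        "underrepresented": underrepresented,
        "imbalance_ratio": imbalance_ratio,
        "too_long": too_long,
    }


if __name__ == "__main__":
    report = audit_training_data(SAMPLE_DATA)
    print("Below minimum:", report["underrepresented"])
    print("Largest/smallest intent ratio:", round(report["imbalance_ratio"], 2))
    for intent, utterance in report["too_long"]:
        print(f"Consider simplifying ({intent}): {utterance}")
```

Word count is only a rough proxy for complexity. A flagged utterance is a candidate for review, not necessarily one to delete.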
