Want to make your bot smarter with Natural Language Understanding (NLU)?
Great! You’re in the right place. To help you get started, we've created a video series that gives a brief overview of the NLU Training Dashboard and its functions. Feel free to check out the Certainly NLU tutorials on Certainly's YouTube channel.
The NLU Training Dashboard can be found in the right sidebar menu. In this article we will:
- Define the words used in the NLU Training Dashboard
- Explain how to prepare for implementing NLU in your bot
- Show how to use the NLU Training Dashboard
Getting the vocabulary right
Before you can use the NLU Training Dashboard, you need to understand a few terms that appear throughout it.
Domains are made up of intents and can be thought of as the overall theme or subject area to be covered. An example could be “Shipping”, a domain that covers many different sub-categories, or intents.
Intents express what your users want to do and can be considered sub-categories of your domains. An intent can, for example, be shipping price, shipping location, shipping time or shipping tracking.
Example sentences are different ways of expressing an intent. Examples could be: How much do I have to pay for shipping? Do you provide free freight? What will we be charged for getting the goods delivered? Tell me about your shipping prices.
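If it helps to picture how the three terms relate, the sketch below models them as nested data in Python. The class names `Domain` and `Intent` are our own illustration and are not part of the NLU Training Dashboard. The domain, intents and sentences are the ones used in the definitions above.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """What a user wants to do, expressed through example sentences."""
    name: str
    example_sentences: list[str] = field(default_factory=list)


@dataclass
class Domain:
    """An overall theme that groups related intents."""
    name: str
    intents: list[Intent] = field(default_factory=list)


# Hypothetical in-memory representation of the "Shipping" example.
shipping = Domain(
    name="Shipping",
    intents=[
        Intent(
            name="shipping price",
            example_sentences=[
                "How much do I have to pay for shipping?",
                "Do you provide free freight?",
                "What will we be charged for getting the goods delivered?",
                "Tell me about your shipping prices.",
            ],
        ),
        Intent(name="shipping location"),
        Intent(name="shipping time"),
        Intent(name="shipping tracking"),
    ],
)

for intent in shipping.intents:
    print(intent.name, len(intent.example_sentences))
```

One domain holds several intents, and each intent holds its own example sentences. This is the same hierarchy you will later build as folders in the dashboard.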
Preparation for using the NLU Training Dashboard
Before working in the NLU Training Dashboard, it is important to have an overview of what you want to focus on. Data analysis is an important step here, as it quickly gives you a sense of which topics and questions your end users ask most frequently.
The best way to get started is to perform data analysis on your pre-existing data. Traditional chat services such as Facebook Messenger, Zendesk, ServiceNow, LiveGuide, Dixa and SnapEngage store logs of conversations. If you already have a bot that is live, you can also use data from Reports Messages. By analyzing these logs, you will be able to see trends and use them to determine a structure for your NLU implementation. The logs are especially valuable if they are categorized by user intent. Logs from some chat services, such as Zendesk, often include such categorization. Taking advantage of this “labeled data” can help you decide what to focus on and, in that way, improve the initial performance of the NLU. This includes:
- Determining which intents are most frequent and important
- Determining how intents should be separated
- Determining which intents can be answered automatically, and which require human takeover
- Gaining insight into which domain-specific phrases are important for triggering certain intents
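As a rough illustration of the first point, the Python sketch below counts how often each intent label occurs in a set of labeled messages. The function name `intent_frequencies` and the sample messages are made up for this example. It assumes you have already exported your logs into pairs of message and intent label; the actual export format depends on your chat service.

```python
from collections import Counter


def intent_frequencies(labeled_messages):
    """Return (label, count, share) tuples, most frequent label first."""
    counts = Counter(label for _message, label in labeled_messages)
    total = sum(counts.values())
    return [(label, n, n / total) for label, n in counts.most_common()]


# Hypothetical labeled data; real logs would come from your chat service.
sample_logs = [
    ("How much is shipping?", "shipping price"),
    ("Where is my parcel?", "shipping tracking"),
    ("Is delivery free?", "shipping price"),
    ("Do you ship to Norway?", "shipping location"),
    ("What does freight cost?", "shipping price"),
    ("Can I track my order?", "shipping tracking"),
]

for label, count, share in intent_frequencies(sample_logs):
    print(f"{label}: {count} messages ({share:.0%})")
```

A ranking like this gives you a first idea of which intents deserve attention, but it does not tell you which intents can be answered automatically. That still requires a manual review.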
For Zendesk users, macros are often a good indicator of intents that occur frequently and can be answered in a standardized manner. We recommend extracting statistics about the most frequently used shared macros. Some customer support departments may also have libraries of personal macros. In some cases, tags are used to indicate the intent of tickets. If tags are used consistently, they carry important information that can be used to classify the sentences.
If you are unsure about how to export data, please feel free to reach out to us.
From the data analysis, you will be able to define your list of intents. We recommend that you:
- Try to find questions that can be covered by the same intent
- Try to divide your intents into what, where, why and how questions to get an overview of the different types of questions
- Define between 10 and 100 intents to get an overview of the areas you should focus on
- Give your intents short, easily interpretable names
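To get a first overview of the types of questions in your data, you could sort them by their opening question word. The Python sketch below is one simple way to do that. The function name `group_by_question_word` is our own, and the approach is deliberately naive: it only looks at the first word, so a question such as "What's the price?" ends up under "other".

```python
from collections import defaultdict

QUESTION_WORDS = ("what", "where", "why", "how")


def group_by_question_word(questions):
    """Group questions under their first word if it is a known question word."""
    groups = defaultdict(list)
    for question in questions:
        words = question.strip().lower().split()
        first = words[0] if words else ""
        key = first if first in QUESTION_WORDS else "other"
        groups[key].append(question)
    return dict(groups)


# Hypothetical questions taken from a conversation log.
questions = [
    "How much is shipping?",
    "Where do you deliver?",
    "Do you offer free freight?",
    "How long does delivery take?",
]

for word, items in group_by_question_word(questions).items():
    print(word, items)
```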
If you need more than 100 intents, you should consider either merging some intents or narrowing the scope of your chatbot. For example, asking whether shipping is free and asking how much it costs should be one intent, as both are questions about shipping price. You can also narrow the use case for the bot. Your chatbot is not an actual human and should only cover a narrow topic. It might be preferable to create multiple chatbots for different domains or to leave out some intents entirely. At the other end of the scale, you must add at least two intents, as we do not allow the NLU to operate with fewer than two.
Defining a good set of intents is important for the NLU engine, and we recommend that you review your chosen list of intents a few times before moving on to the next step.
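When reviewing your list, a small script can flag the most obvious issues before you enter anything in the dashboard. The Python sketch below checks the limits mentioned above: at least two intents, and a recommended range of 10 to 100. The function name `review_intent_list` and the `max_name_words` parameter are our own assumptions. In particular, treating "short" as at most three words is only a rule of thumb.

```python
def review_intent_list(intent_names, max_name_words=3):
    """Return a list of remarks about a proposed list of intent names."""
    remarks = []
    count = len(intent_names)
    if count < 2:
        remarks.append("Add more intents: the NLU does not operate with fewer than two.")
    elif count > 100:
        remarks.append("More than 100 intents: consider merging intents or narrowing the scope.")
    elif count < 10:
        remarks.append("Fewer than 10 intents: the recommended range is 10 to 100.")

    for name in intent_names:
        # max_name_words is our own rule of thumb for a "short" name.
        if len(name.split()) > max_name_words:
            remarks.append(f"Consider a shorter name for: {name!r}")
    return remarks


# Hypothetical draft list with too few intents and one long name.
draft = ["shipping price", "questions about how much shipping costs"]
for remark in review_intent_list(draft):
    print(remark)
```

An empty result only means the list passes these mechanical checks. Whether the intents are well separated is still a judgment call.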
Writing example sentences
Once you've decided on a list of intents, you should provide the bot with at least 30 example sentences for each intent. It is important to use language that is as varied as possible in these example sentences. For example, the terms "Getting the goods delivered", "freight" and "shipping" are all different ways of saying the same thing. The more of the possible expressions you cover, the better the NLU engine will perform.
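If you collect your example sentences in a spreadsheet or script before entering them, you can check the minimum of 30 automatically. The Python sketch below reports how many sentences each intent still lacks. The function name `missing_examples` is hypothetical. Counting sentences that differ only in capitalization or surrounding whitespace as one is our own assumption, made because such duplicates add no variety.

```python
MIN_EXAMPLES = 30  # minimum number of example sentences per intent


def missing_examples(examples_by_intent, minimum=MIN_EXAMPLES):
    """Return {intent: number of sentences still needed} for intents below the minimum."""
    shortfalls = {}
    for intent, sentences in examples_by_intent.items():
        # Assumption: duplicates differing only in case or whitespace count once.
        unique = {s.strip().lower() for s in sentences if s.strip()}
        if len(unique) < minimum:
            shortfalls[intent] = minimum - len(unique)
    return shortfalls


# Hypothetical draft data; the third sentence duplicates the second.
examples = {
    "shipping price": [
        "How much do I have to pay for shipping?",
        "Do you provide free freight?",
        "do you provide free freight?",
        "Tell me about your shipping prices.",
    ],
}

for intent, needed in missing_examples(examples).items():
    print(f"{intent}: add at least {needed} more sentences")
```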
Working in the NLU Training Dashboard
Now that you have all of your data prepared, you can start working in the NLU Training Dashboard. When you first enter the dashboard, you will go through a short tutorial. You can always redo this tutorial by clicking the question mark in the right-hand corner of the dashboard.
You will then be asked to choose the language you want your bot to use. After choosing this, you will be taken to your dashboard.
Adding a domain
Remember that a domain is made up of intents and should be the overall theme of those intents. To add a domain, press “Add domain” at the bottom of the dashboard.
You will then be taken to an overview, where you choose a language for your domain and then press the box that says “Start from scratch”.
A folder will be added to your dashboard. This folder is your domain. You can name the folder by double-clicking on the text. Please remember to create easily identifiable names for your domains.
Using public domains
You might have noticed that when you entered the dashboard, there were already quite a few folders with stars on them. These are public domains. The public domains have been created by Certainly and contain pre-built intents and pre-built text sentences. They can therefore be used as a foundation for your work, and we recommend that you use these domains and intents as much as possible, so you don’t have to start from scratch.
You can always add example sentences on top of these intents. Simply hover your mouse next to an intent, and the option to add a text sentence will appear.
Adding intents
After you have created your domain(s), you can add intents to them. You have to add at least two intents for the NLU to work. If you hover your mouse over the domain folder, you will see a small plus sign. Click it and an intent will be added. You can name and rename the intent as you wish.
Creating example sentences
For each intent, you will have to provide a number of example sentences. To create an example sentence, hover over your intent and you will see a small plus sign. Click it and insert your sentence. Please remember that you should have at least 30 example sentences for each intent.
Testing your NLU data
After providing the chatbot with an initial list of example sentences, it is time to run the first user test of the bot. On the right side of the NLU Training Dashboard, you will see a test area. If it looks inactive, you haven’t activated the state of your domain yet. To activate the state of a domain, turn it on. A domain that has already been tested has its state marked green, and a domain that has not been tested has its state marked yellow. Then you are ready to test your bot!
We recommend that you run 10 independent tests with 10 colleagues who have not been involved in the development of the bot. Ask each of them to write a sentence that relates to your domains and intents. The NLU Training Dashboard will then calculate the likelihood of the sentence being an expression of your intents. In this way, you can quickly get a sense of how to improve the bot. For each test, notice if:
- There are any intents that you've failed to cover with your initial intent definitions.
- There are any user expressions that the bot fails to recognize.
When your initial tests are over, add any missing intents and example sentences to your data and retrain your bot. If you find the performance of the NLU system unsatisfactory, repeat this step until you reach the desired result. For complex use cases, the NLU system might require hundreds of example sentences for each intent to reach satisfactory performance.
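To keep track of what to add between test rounds, it can help to note each test sentence together with the intent the tester meant and the intent the dashboard ranked highest. The Python sketch below groups the misrecognized sentences by intended intent. The names `TrialResult` and `misrecognized` are our own, and the results are assumed to be written down by hand. The sketch does not read anything from the dashboard.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TrialResult:
    """One manually recorded test sentence and its outcome."""
    sentence: str
    expected_intent: str   # what the tester meant
    predicted_intent: str  # the intent the dashboard ranked highest
    confidence: float      # likelihood shown for the predicted intent


def misrecognized(results):
    """Group sentences the NLU got wrong by the intent the tester meant."""
    misses = defaultdict(list)
    for result in results:
        if result.predicted_intent != result.expected_intent:
            misses[result.expected_intent].append(result.sentence)
    return dict(misses)


# Hypothetical notes from a test round.
results = [
    TrialResult("What does delivery cost?", "shipping price", "shipping price", 0.91),
    TrialResult("Is postage included?", "shipping price", "shipping time", 0.48),
    TrialResult("When will it arrive?", "shipping time", "shipping time", 0.87),
]

for intent, sentences in misrecognized(results).items():
    print(f"Add to '{intent}': {sentences}")
```

The sentences listed for an intent are good candidates for new example sentences before you retrain.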
Inserting your intents in connections
To use your work from the NLU Training Dashboard, you will have to insert the intents into connections. Intents can be inserted both in a module’s local connections and in Global Connections. Please be aware that before you can insert the intents, you have to activate their state and train your bot with the intents in the NLU Training Dashboard.
When inserting intents in a connection, choose “NLU Understands”. You will then get a list of the intents that are activated in your bot, and you need to choose one of them. After that, select the module in which the intent should be recognized. Then choose a confidence level for the intent. This should be based on the number of sentences you have added to the intent and on the results you have gotten from training your bot. You also have the option of enabling the "Only validates if it has the Highest NLU Confidence" function, which ensures that the intent is only recognized if it is the one the NLU is most confident about. This means that if the NLU has a higher confidence in another intent than in the selected one, the connection will not match at all. Lastly, choose an action for the “NLU Understands” condition.
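The interplay between the confidence level and the "Only validates if it has the Highest NLU Confidence" option can be sketched in Python as follows. This is our own simplified model of the behavior described above, not the platform's actual implementation. The function name `connection_matches` is hypothetical, and two details are assumptions: a confidence exactly equal to the chosen level counts as a match, and a tie with another intent still counts as the highest.

```python
def connection_matches(scores, intent, threshold, only_if_highest=False):
    """Decide whether a connection for `intent` matches, given NLU confidences.

    scores: dict mapping intent name to confidence between 0 and 1.
    """
    confidence = scores.get(intent, 0.0)
    # Assumption: a confidence equal to the threshold is enough to match.
    if confidence < threshold:
        return False
    if only_if_highest:
        # No match if any other intent has a strictly higher confidence.
        for other, score in scores.items():
            if other != intent and score > confidence:
                return False
    return True


# Hypothetical confidences for one user message.
scores = {"shipping price": 0.62, "shipping time": 0.71}

print(connection_matches(scores, "shipping price", 0.6))
print(connection_matches(scores, "shipping price", 0.6, only_if_highest=True))
print(connection_matches(scores, "shipping time", 0.6, only_if_highest=True))
```

With the option disabled, "shipping price" matches because it clears the chosen level. With the option enabled, it no longer matches, because "shipping time" has a higher confidence.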