
You have probably yelled, at one point, "Can I please talk to an actual person?" while using a customer service chatbot. You enter a simple question into the chatbox and receive a series of responses that are completely unrelated to your query. The problem isn't you; it's the way many companies design their chatbots: more like a vending machine (you put in a request and receive a canned response) than a helper.
On the other hand, ask a modern conversational AI such as ChatGPT to write a poem about your dog, and it will do so instantly. The difference between these two systems is vast, and it raises a question: why do some completely miss the point of a request while others seem to understand, or "get," it?
These two types of language processing system aren't magic; they're the product of fundamentally different ways of training systems to analyze language. The first strictly adheres to a predefined menu of answers and cannot "think" or respond helpfully to anything outside that menu, which is why such chatbots fail the moment you ask something off-menu. The second is trained on massive libraries of human language to interpret a request and then generate a flexible, spontaneous response. Understanding this difference is key to understanding how each system really works.

The Most Basic Bots: Limited to Matching Keywords
These simple bots are better described as digital vending machines than conversation partners. They use language, but they make no attempt to understand it; they simply match keywords to responses according to rules programmed into them. When you enter an exact term programmed into those rules (e.g., "hours," "password"), the bot returns the pre-written response linked to that term. Ask something phrased differently (e.g., "When are you open?" or "I'm having trouble logging in."), and the bot may fail to respond usefully, because it hasn't been programmed to recognize those exact words.
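The keyword-matching logic described above can be sketched in a few lines. The keywords and canned responses here are invented purely for illustration:

```python
# A minimal sketch of a rule-based chatbot: keyword lookup only.
# The keywords and responses are hypothetical examples.

RULES = {
    "hours": "We are open 9am-5pm, Monday through Friday.",
    "password": "To reset your password, visit the account settings page.",
}

FALLBACK = "Sorry, I didn't understand that. Please rephrase."

def rule_based_reply(message: str) -> str:
    """Return the canned response for the first keyword found, else a fallback."""
    text = message.lower()
    for keyword, response in RULES.items():
        if keyword in text:
            return response
    return FALLBACK
```

Notice the exact failure mode from the text: "What are your hours?" works, but the equivalent "When are you open?" contains no programmed keyword and hits the fallback.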
The reason these chatbots feel so limiting is their rigid, rule-based structure. For simple, highly predictable interactions it works well (fast and efficient), but it breaks down as soon as an interaction needs even a little flexibility. How, then, do more sophisticated chatbots recognize what you're really saying, even when it differs from what you typed? They've learned to identify what you want to accomplish: your goal, or "intent."
What is Your Intent? The Secret to How Modern Chatbots Understand Your Goal
Your "intent" is simply your goal, and it is what advanced chatbots try to determine. Instead of just searching for keywords, they learn to recognize your overall objective. When you walk into a store, you can ask for help in a number of ways ("Can you tell me where the milk is?" or "I'm looking for milk"); even though the wording differs, a good employee immediately knows your intent is to find a product. Modern chatbots work the same way, as opposed to older chatbots built on rigid rules: they try to determine what you intend to do rather than fixating on the exact words you type.
This is not magic; it comes from deliberate training. Developers feed the chatbot's AI hundreds or thousands of example phrases that all point toward the same objective. For an online shopping site's chatbot, the AI learns that "Where is my package?", "Track my order," and "How long will I have to wait for my items to arrive?" all request the same thing: tracking an order. By learning the pattern across these examples, the AI can recognize your goal regardless of the specific wording you use.
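A toy version of this intent recognition can be built by comparing a message's words against example phrasings for each intent. Real systems use trained statistical models and far more data; the training phrases and intent names below are made up for illustration:

```python
# Toy intent classifier: score each intent by word overlap with its
# training examples, then pick the best-scoring one.

TRAINING = {
    "track_order": [
        "where is my package",
        "track my order",
        "how long will my items take to arrive",
    ],
    "cancel_order": [
        "cancel my order",
        "i want to stop my purchase",
    ],
}

# Collect the set of words seen in each intent's training phrases.
VOCAB = {
    intent: {word for phrase in phrases for word in phrase.split()}
    for intent, phrases in TRAINING.items()
}

def classify_intent(message: str) -> str:
    """Pick the intent whose training vocabulary overlaps the message most."""
    words = set(message.lower().split())
    return max(VOCAB, key=lambda intent: len(words & VOCAB[intent]))
```

Even a never-seen phrasing like "please cancel my purchase" lands on the right intent, because it shares more words with the cancellation examples than with the tracking ones.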
Because the chatbot focuses on your goal, you don't need to figure out the "right magic words" to ask for assistance. You can communicate more freely and avoid the frustrating loops typical of simpler bots. Still, even once the chatbot knows you want to book a flight, it must determine the other main elements of that trip, such as the destination and the date and time of travel, before it fully understands you.
Identifying What Matters: How Chatbots Extract Relevant Information
After identifying your objective (your intent), a chatbot's next step is to fill in the blanks. Knowing you want to "book a flight" is an excellent start, but more is needed to actually perform the action. The chatbot needs the details: Where are you traveling to? When would you like to depart? How many passengers are going? These key details separate knowing what you want from doing something about it. In essence, they are the ingredients the chatbot must collect before it can follow the recipe defined by your intent.
To identify these individual pieces of information (the ingredients), an AI is trained to perform named entity recognition: a sophisticated way of picking out specific items of information in a sentence. In chatbots, each piece of information is called an "entity."
For instance, when the user states, "Book a flight to Paris for two people tomorrow":
- the chatbot identifies "Paris" as a Location,
- "two people" as a Number,
- and "tomorrow" as a Date.
The chatbot is trained on various ways to categorize data, which allows it to extract the relevant data from your natural language statement, regardless of how you have phrased it.
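Production entity recognition relies on trained models, but a rough sketch using hand-written patterns shows the idea of labeling pieces of a sentence. The patterns and entity labels below are simplified assumptions, tuned to the flight example above:

```python
import re

# Toy entity extractor for a flight-booking bot.
# Each pattern captures one value and tags it with an entity label.

ENTITY_PATTERNS = {
    "Location": r"\bto ([A-Z][a-z]+)",              # "to Paris"      -> "Paris"
    "Number":   r"\bfor (\w+) (?:person|people)",   # "for two people" -> "two"
    "Date":     r"\b(today|tomorrow|tonight)\b",    # "tomorrow"       -> "tomorrow"
}

def extract_entities(message: str) -> dict:
    """Return {entity_label: matched_value} for each pattern that matches."""
    entities = {}
    for label, pattern in ENTITY_PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            entities[label] = match.group(1)
    return entities
```

Running it on the article's example sentence pulls out all three entities at once, regardless of where they appear in the phrasing.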
The combination of recognizing your intent and extracting entities is the foundation of how most modern chatbots turn your input into actionable commands. The chatbot analyzes your input in two stages: in stage one, it identifies your overall intent; in stage two, it pulls out the entities needed to achieve that goal. Once it has both, it can act, such as by searching for flights. So what is happening inside the chatbot's "brain" that allows it to identify both intent and entities?
The Chatbot’s Brain: How NLP, NLU, and NLG Work Together
The “brain” of this chatbot is built around three key concepts: NLP, NLU & NLG. NLP is an area of Artificial Intelligence research that enables computers to work with human language—reading what you write/type, understanding its meaning, and writing back to you in a manner that makes sense. This is the core technology behind turning a simple program into a conversational partner.
For a chatbot to hold a productive conversation, it must accomplish two primary objectives: comprehend what you said and then develop a helpful response. These are not single events but a sequence of operations: first, the system deconstructs the muddled mass of human language into understandable elements (it "processes" your words); second, it interprets the actual intent or meaning behind them (it "understands"); and last, it crafts its own sentence in reply (it "generates").
This work divides into three distinct but interconnected components, which are often described as parts of a body:
1. Natural Language Processing (NLP): The Ears. The entire cycle starts with this component. NLP converts your raw words into a data format that computers can work with; this is how your input gets taken in correctly.
2. Natural Language Understanding (NLU): The Brain. This component does the thinking about the input you've given. NLU determines what you actually wanted, i.e., your goal, and extracts the key entities from your input.
3. Natural Language Generation (NLG): The Mouth. Once NLU has understood your input, NLG composes a natural-language reply (a human-like sentence).
Collectively, NLP, NLU, and NLG enable a chatbot to hear, think, and speak back. Most customer service chatbots we see today are powered by this framework. Recently, an emerging class of models has taken this to a new level, enabling AI that can appear nearly human.
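The ears/brain/mouth pipeline can be shown end to end in a minimal sketch. Every rule, keyword, and response template here is invented purely to illustrate the flow from raw text to structured meaning and back to a sentence:

```python
# Minimal NLP -> NLU -> NLG pipeline sketch (all rules are hypothetical).

def process(text: str) -> list:
    """NLP (the ears): turn raw text into clean lowercase tokens."""
    return text.lower().replace("?", "").split()

def understand(tokens: list) -> dict:
    """NLU (the brain): guess the intent and pull out a simple entity."""
    intent = "track_order" if "package" in tokens or "order" in tokens else "unknown"
    order_id = next((t for t in tokens if t.isdigit()), None)
    return {"intent": intent, "order_id": order_id}

def generate(meaning: dict) -> str:
    """NLG (the mouth): turn structured meaning back into a sentence."""
    if meaning["intent"] == "track_order" and meaning["order_id"]:
        return f"Order {meaning['order_id']} is on its way!"
    return "Could you tell me a bit more about what you need?"

def chatbot_reply(text: str) -> str:
    # Hear, think, speak: the three stages chained together.
    return generate(understand(process(text)))
```

Asking "Where is my package 12345?" flows through all three stages and comes back as a complete sentence; anything the NLU stage cannot place falls through to a clarifying question.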
The Artificial Intelligence Revolution: How "Super-Powered Autocomplete" Has Changed Everything

Whereas the Natural Language Understanding (NLU) process of identifying intent works well for structured tasks (such as booking a flight), the most sophisticated chatbots we see today, including ChatGPT, operate on an entirely different premise. Rather than being trained manually with rules for every possible objective, these systems learn language by consuming a massive amount of text.
This may sound nothing like the autocomplete feature on your smartphone, but the basic methodology is much the same. Just as your phone suggests the next word as you type, based on the most commonly occurring word combinations, this technique applies the same process to an enormous amount of text from web pages, books, and articles, producing a large language model (LLM) that becomes expert at predicting the next probable word in almost any context.
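The autocomplete analogy can be demonstrated with a tiny bigram model: count which word follows which in a corpus, then predict the most frequent successor. LLMs do conceptually similar next-word prediction with vastly richer models and vastly more text; the miniature corpus below is made up:

```python
from collections import Counter, defaultdict

# "Super-powered autocomplete" in miniature: a bigram next-word predictor.
# The tiny corpus is invented; real LLMs train on enormous text collections.

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count, for every word, which words follow it and how often.
successors = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    successors[current][nxt] += 1

def predict_next(word: str) -> str:
    """Return the statistically most likely next word."""
    return successors[word].most_common(1)[0][0]
```

Chaining `predict_next` word by word generates brand-new sentences from pure statistics, which is exactly why the output can read fluently while having no guarantee of making sense.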
Creating text one word at a time gives these systems a sense of creativity and makes them feel very human. Rather than retrieving an existing, pre-written answer or matching a fixed intent, the LLM composes a brand-new answer by stringing together the statistically most probable words. This generative ability is a major technological advance; however, because it is pattern-based, some responses read as remarkably fluid while others come out strangely off-base or nonsensical.
The Human Aspects of Conversation That Advanced Chatbots Just Don't Understand
Chatbots are amazingly adept at generating text, but they consistently trip up on the most human parts of conversation. Think back to the last time you followed up on a previous question ("What if it's less expensive?") only to find the chatbot had no clue what "it" referred to. This is the difficulty of understanding contextual references. Keeping track of a sequence of messages is one of the biggest challenges for AI, since many systems treat each incoming message independently of the others.
Chatbots attempt to address this with "dialogue management," a method that acts as short-term memory for the main details you've already provided (like the product you're looking at or the city you're traveling to). When dialogue management succeeds, the conversation feels natural; when it fails, you have to repeat yourself because the chatbot forgot the last thing you said. These limitations show the difference between identifying patterns and actual understanding.
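A dialogue manager's short-term memory can be sketched as a dictionary of "slots" that persists across turns, so a follow-up message still has a subject. The slot names here are hypothetical:

```python
# Sketch of dialogue-management memory: slots carried across turns.
# Slot names and values are invented for illustration.

class DialogueState:
    def __init__(self):
        self.slots = {}  # short-term memory, e.g. {"city": "Chicago"}

    def update(self, **details):
        """Remember any details mentioned in the latest message."""
        self.slots.update(details)

    def recall(self, slot: str):
        """Retrieve a previously mentioned detail, if it was kept."""
        return self.slots.get(slot)

state = DialogueState()
state.update(city="Chicago", guests=2)  # turn 1: "a hotel in Chicago for 2"
state.update(max_price=150)             # turn 2: "something under $150"
# A turn-3 follow-up like "is breakfast included?" can still recall
# which city and hotel search the user meant.
```

When the state object is lost or never updated, the bot exhibits exactly the failure described above: it has no idea what "it" refers to and forces you to start over.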
Another area where AI falls short is grasping human nuance. When you say, "Oh, fantastic, my flight has been delayed," the average person recognizes the sarcasm behind the word choice; an AI system, trained to connect "fantastic" with enjoyment, will entirely miss the irony and conclude you're thrilled about the delay. This is because the AI reads the words literally, ignoring the emotional tone, prior interactions, and shared understanding between people that let sarcasm convey a speaker's true intent. The result is customer service chatbots responding to complaints with, "That's great! Can I assist you further?"
The heart of this problem lies in sentiment analysis, the process through which AI attempts to determine the emotional content of what you write. While sentiment analysis has become good at identifying obvious expressions of happiness and anger, it is easily misled by the complexity and sarcasm of everyday writing. The AI does not actually experience your frustration; it simply makes a statistical guess about whether your words express it. Until AI can grasp this layer of human communication, chatbots will remain extremely capable but literal-minded tools that require us to communicate clearly.
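A naive lexicon-based sentiment scorer makes the sarcasm failure concrete: sum per-word weights, and a sarcastic complaint scores as positive because "fantastic" outweighs "delayed" when read literally. The word list and weights below are invented; real sentiment models are far more elaborate but can stumble the same way:

```python
# Naive lexicon-based sentiment: sum hand-assigned word weights.
# Positive total = "happy", negative = "upset" -- by literal words only.

SENTIMENT_LEXICON = {
    "fantastic": +2, "great": +2, "happy": +1,
    "delayed": -1, "terrible": -2, "angry": -2,
}

def sentiment_score(message: str) -> int:
    """Score a message by summing the weights of known words."""
    words = message.lower().replace(",", "").split()
    return sum(SENTIMENT_LEXICON.get(w, 0) for w in words)

# "Oh fantastic, my flight has been delayed" scores +2 - 1 = +1,
# so the scorer reads an annoyed passenger as pleased.
```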
How It Impacts Your World: How to Communicate Effectively with Smarter AI
Knowing how modern AI systems operate lets you recognize the difference between today's AI and yesterday's, whose answers often seemed as mysterious as they were unhelpful. With this insight, you can become a partner in conversations with AI rather than simply a recipient of its output.
You can improve your conversations with chatbots by applying what we've covered. The next time you interact with an AI, try the following:
- Be as clear and specific as possible: Instead of saying "I am having a problem," tell the AI, "My password will not reset for the account at user@email.com."
- Use alternative wording if necessary: If the AI does not understand you, ask your question in another way. You now know that the AI is trying to identify intents and entities.
- Provide context: For more complex tasks, provide all the information you have in one step (e.g., "I want to book a hotel in Chicago for 2 adults from June 10th through June 12th.").
A chatbot is no longer a mechanical vending machine; it is a tool you can steer. Instead of hunting for magic words, you can now actively help human and machine reach a shared understanding, making the interaction feel far more like talking to a person.
Conclusion
As chatbots improve at processing human language, we're finding new ways to use them for online searching, learning, and getting help. Each successful conversation rests on a range of language technologies, from tokenization and embeddings to intent identification and generative models. While today's technology doesn't "think" as humans do, advances in AI language processing continue to close the gap between human and machine communication. Businesses that treat their conversational AI as a continuously trainable system, and keep monitoring it, will build the most valuable expertise and generate the greatest customer value from each day's conversations.
Q&A
Question: In what way do chatbots actually "understand" natural language?
Answer: By converting text into a numerical representation (an embedding), chatbots use machine learning models, especially large language models (LLMs), trained on enormous volumes of written text to identify patterns, likely meanings, and probable next words or responses. This is statistical pattern matching rather than "comprehension" as humans experience it, yet it can mimic comprehension in many situations.
Question: How is Natural Language Processing (NLP) used by chatbots?
Answer: NLP takes a user's input and breaks it down into components: tokens (words), parts of speech (nouns, verbs, etc.), entities (the people, places, and organizations mentioned), intent (what the user wants), and sentiment (how the user feels). The chatbot then uses this structured representation to select an answer from a database or generate a response.
Question: How do chatbots deal with ambiguity, slang, or typos?
Answer: Modern models are trained on "noisy" data from real users, including incomplete or grammatically incorrect text, so they learn to infer what a user means from context even when the message contains spelling mistakes or slang. They settle on the most likely interpretation by comparing the input against the vast number of examples they learned from.
Question: How do the language understanding abilities of AI-powered chatbots differ from those of rule-based chatbots?
Answer: Rule-based ("if-then") chatbots match user input against static keywords and quickly fall short when users ask the same question with different phrasing. Chatbots built on machine learning or deep learning can learn to respond to many variations of the same question and let users communicate in an open-ended, conversational way.
Question: Are today's AI-based chatbots able to understand language as we do?
Answer: No, not yet. Today's chatbots are not conscious, have no intentions of their own, and lack "grounded" experience (a direct experiential connection to the world around them). Instead, they statistically model the relationships within language. While they can converse in ways that resemble human conversation, their "understanding" is fundamentally different from human cognitive understanding.