

Getting to the heart of bot health with BES and BAS


6 key metrics every chatbot team should be measuring 

Virtual agents are never finished. They can always learn something new whether you launched them yesterday or 5 years ago. Knowing where a virtual agent is performing well, and not so well, is the job of a dedicated VA analytics suite. 

Based on a comprehensive survey of enterprise B2C chatbots, Wysdom has compiled the 6 key performance metrics you need to correctly measure the effectiveness of your bot.

1. Bot Experience Score

Measuring a customer’s experience when they’re interacting with a chatbot can be a challenge. Explicit feedback is rare and typically biased toward the negative, while surveys provide a very narrow view of actual engagement. Surveys fall short for a couple of reasons: participation tends to be low, so you’re often working with a small sample size, and those who do participate aren’t always authentic in their responses (if they’re incentivized, for example), which introduces unintended bias.

After many years of testing performance metric models, Wysdom has settled on a standard Bot Experience Score (BES) that can be used on any bot. This takes into account all customer conversations, to produce an unbiased score and a more accurate view of overall customer satisfaction. This score is purely measuring the experience and not the effectiveness of the bot.

The BES is a number that starts with a score of 100 and goes down in points every time there is a negative engagement signal within the bot. The negative signals used in the BES are:

  1. Bot repetition occurs when the bot repeats itself for any reason during a conversation.
  2. Customer paraphrase occurs when the customer uses a similar query twice or more in a conversation.
  3. Abandonment occurs when the customer leaves the conversation mid-journey and does not reach a bot endpoint.
  4. Negative sentiment is detected using an AI-based sentiment model.
  5. Negative explicit feedback is received in the conversation.
  6. Profanity is present in the conversation.
  7. The customer used the word “agent” (or similar) more than once in a conversation. Note that using “agent” once and being directly escalated is not generally a bad experience.

The BES is based on an analysis of all conversations over a given period of time and reduces the score for negative experience signals that are common to all virtual agents. A conversation with no negative signals keeps its starting score of 100. A conversation with 1 negative signal receives a score of 75, with 2 negative signals a score of 50, and with 3 or more signals a score of 0. Every conversation is given a score, and the BES is the average of those scores.

Using this formula across all conversations in a given period of time results in a very clear customer experience score. Breaking the Bot Experience Score down by customer contact reason makes it actionable.
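As a rough illustration, the scoring rules above could be sketched in Python as follows. The signal names, and the choice to count each signal type once per conversation, are assumptions made for the example and are not part of the BES definition.

```python
from typing import Iterable, Set


def conversation_score(signals: Set[str]) -> int:
    """Score one conversation from the set of negative signals seen in it."""
    count = len(signals)  # assumption: each signal type counts once
    if count == 0:
        return 100  # no negative signals: the starting score is kept
    if count == 1:
        return 75
    if count == 2:
        return 50
    return 0  # 3 or more negative signals


def bot_experience_score(conversations: Iterable[Set[str]]) -> float:
    """Average the per-conversation scores over a period."""
    scores = [conversation_score(signals) for signals in conversations]
    if not scores:
        raise ValueError("at least one conversation is required")
    return sum(scores) / len(scores)


# Hypothetical sample: five conversations and the signals detected in each.
sample = [
    set(),
    set(),
    {"bot_repetition"},
    {"abandonment", "profanity"},
    {"customer_paraphrase", "negative_sentiment", "profanity"},
]
print(bot_experience_score(sample))  # prints 65.0
```

Running the same function on the conversations for a single contact reason would give the per-reason breakdown described above.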

2. Bot Automation Score

The next most important metric for any bot program is how often the bot can satisfy the customer’s needs without the need for escalation to a live agent. We call this the Bot Automation Score (BAS).

The BAS is a binary metric that looks at whether the conversation was either fully automated or wasn’t. The BAS is not a measure of the experience itself, but rather how effective the bot is at completing tasks.

Having analyzed the performance of dozens of bots, we have found that the most accurate measure of automation is derived using a formula that looks at negative signals.

The score starts with all conversations in a given period of time and is reduced based on the negative signals. The negative signals used in the BAS are:

  1. The customer did not reach a bot endpoint, which is one of the final steps in a bot journey.
  2. The customer escalated to a live agent for any reason.
  3. The customer submitted any type of explicit negative feedback.
  4. The bot recorded a false positive.
  5. The customer requested an agent using any “agent”-like word, but wasn’t escalated to an agent.
  6. “Bad Containment” occurs when the conversation is not escalated, but the topic is one that we know the bot is not effective at automating.

If a conversation has any negative signal from the list above, it is considered not to be automated. Using this formula across all conversations in a given period of time results in a very clear measure of automation. The score can also be broken down by contact reason to deliver more actionable information.
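The binary rule above could be sketched in Python along these lines. The signal names are hypothetical, and reporting the score as the percentage of conversations with no negative signal is an assumption, since the text does not state the scale.

```python
from typing import Iterable, Set

# Hypothetical names for the six negative signals listed above.
BAS_NEGATIVE_SIGNALS = {
    "no_endpoint",
    "escalated",
    "negative_feedback",
    "false_positive",
    "agent_request_not_escalated",
    "bad_containment",
}


def is_automated(signals: Set[str]) -> bool:
    """A conversation counts as automated only if it has no negative signal."""
    return not (signals & BAS_NEGATIVE_SIGNALS)


def bot_automation_score(conversations: Iterable[Set[str]]) -> float:
    """Percentage of conversations that were fully automated (assumed scale)."""
    convs = list(conversations)
    if not convs:
        raise ValueError("at least one conversation is required")
    automated = sum(1 for signals in convs if is_automated(signals))
    return 100.0 * automated / len(convs)


# Hypothetical sample: four conversations, two with negative signals.
sample_conversations = [
    set(),
    set(),
    {"escalated"},
    {"bad_containment", "no_endpoint"},
]
print(bot_automation_score(sample_conversations))  # prints 50.0
```

Counting which signals appear most often among the non-automated conversations is one simple way to see where automation is being lost.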

Because the score is built from negative signals, it delivers a very conservative view of bot effectiveness, and it quickly becomes obvious what actions can be taken to increase overall automation rates.

3. NLU (Natural-Language Understanding) Rate

The NLU rate is a common metric in the virtual agent industry. It is simply the rate at which a classifier can match an utterance to a known intent at a given confidence level.
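A minimal sketch of that calculation in Python, assuming each utterance comes with the classifier's confidence for its best-matching intent. The 0.7 threshold is a hypothetical value; the right confidence level depends on the classifier in use.

```python
from typing import Sequence


def nlu_rate(confidences: Sequence[float], threshold: float = 0.7) -> float:
    """Share of utterances matched to a known intent at or above the threshold."""
    if not confidences:
        raise ValueError("at least one utterance is required")
    matched = sum(1 for c in confidences if c >= threshold)
    return matched / len(confidences)


# Hypothetical top-intent confidences for four utterances.
print(nlu_rate([0.9, 0.4, 0.75, 0.2], threshold=0.7))  # prints 0.5
```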

4. False Positive Rate

The false positive rate is the rate at which the model classifies an utterance incorrectly even though it assigns that classification a high confidence level. This is a difficult rate to measure and relies on an independent parallel NLP model. However, lower false positive rates typically mean that the natural language understanding (NLU) set up for a chatbot is of good quality.
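One possible way to sketch this in Python is to treat the independent parallel model's intent as the reference and count high-confidence classifications that disagree with it. The field names, the 0.7 threshold, and the choice of high-confidence classifications as the denominator are all assumptions for illustration.

```python
from typing import Iterable, NamedTuple


class Prediction(NamedTuple):
    intent: str            # intent chosen by the bot's classifier
    confidence: float      # the classifier's confidence in that intent
    reference_intent: str  # intent from the independent parallel model (assumed)


def false_positive_rate(predictions: Iterable[Prediction],
                        threshold: float = 0.7) -> float:
    """Share of high-confidence classifications that disagree with the reference."""
    confident = [p for p in predictions if p.confidence >= threshold]
    if not confident:
        return 0.0  # no high-confidence classifications to evaluate
    wrong = sum(1 for p in confident if p.intent != p.reference_intent)
    return wrong / len(confident)


# Hypothetical predictions: four are high confidence, one of those is wrong.
sample_predictions = [
    Prediction("billing", 0.95, "billing"),
    Prediction("billing", 0.90, "roaming"),
    Prediction("outage", 0.80, "outage"),
    Prediction("upgrade", 0.85, "upgrade"),
    Prediction("roaming", 0.30, "billing"),
]
print(false_positive_rate(sample_predictions))  # prints 0.25
```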

5. Bot Repetition Rate

Bot repetition is used in the Bot Experience Score (BES) but is also a good independent measure for all bots. In theory a virtual agent should never repeat itself, but this still happens regularly, and identifying it leads to quick improvements.
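A simple sketch of how repetition might be detected, assuming each conversation is available as the ordered list of messages the bot sent. Treating two messages as a repeat when they match after lowercasing and collapsing whitespace, and reporting the rate as a share of conversations, are assumptions for the example.

```python
from typing import Iterable, List


def has_bot_repetition(bot_messages: List[str]) -> bool:
    """True if the bot sent the same message more than once in a conversation."""
    seen = set()
    for message in bot_messages:
        key = " ".join(message.lower().split())  # normalize case and spacing
        if key in seen:
            return True
        seen.add(key)
    return False


def bot_repetition_rate(conversations: Iterable[List[str]]) -> float:
    """Share of conversations in which the bot repeated itself."""
    convs = list(conversations)
    if not convs:
        raise ValueError("at least one conversation is required")
    repeated = sum(1 for messages in convs if has_bot_repetition(messages))
    return repeated / len(convs)


# Hypothetical bot messages from two conversations.
sample_bot_messages = [
    ["Hi, how can I help?", "Your balance is $20."],
    ["Sorry, I didn't get that.", "sorry, I didn't  get that."],
]
print(bot_repetition_rate(sample_bot_messages))  # prints 0.5
```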

6. Positive Feedback Rate

In almost all situations, negative feedback is given several times more often than positive feedback, so raw counts of positive feedback say little on their own. The positive feedback rate is the amount of positive feedback divided by the total amount of feedback (positive, negative and neutral), which gives a more useful rate.
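The calculation is small enough to show directly. The sketch below assumes feedback has already been counted by type; the counts in the example are made up.

```python
def positive_feedback_rate(positive: int, negative: int, neutral: int = 0) -> float:
    """Positive feedback as a share of all feedback received."""
    total = positive + negative + neutral
    if total == 0:
        raise ValueError("no feedback was received")
    return positive / total


# Hypothetical counts: 20 positive, 60 negative, 20 neutral.
print(positive_feedback_rate(20, 60, 20))  # prints 0.2
```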

One key aspect of bot measurements (on any chatbot platform) is that they need to be universal and must work on any use case that your bot may tackle. Whether you are working in customer service, revenue generation, employee services or any other special use case, in text or voice, the key metrics must be comparable across all bots so you can benchmark many bots against each other.

Getting to the heart of bot performance 

A good VA analytics suite will deliver many KPIs, but BES and BAS are still the 2 most important if you want to understand the overall quality of your bot. Many bot platforms are built around bot design and development but don’t have the level of analytics needed to provide a true measure of bot performance. That’s where bot analytics software comes in. A mature chatbot platform will easily provide all the events (signals) required to produce the scores, and a bot analytics solution will deliver the clearest picture of bot quality.

Curious to know how your bot stacks up? 

If you’re wondering how your bot stacks up, the BES and the BAS are two simple but important metrics that allow you to compare how your bot performs against others. 

Wysdom can measure your bot scores

If you have any questions about measuring your virtual agent quality, please reach out to Wysdom. All we do is make chatbots great and we’d love to help you do the same.

