Wysdom shines in top NLU benchmarking study


At Wysdom, two things never stop – experimentation and benchmarking.

Experimentation ensures that we continually challenge the way we do things. It gives us a playground of trials where we develop, adapt, and adopt cutting-edge techniques that help enterprises deliver engaging interactions to their end customers through intelligent automation.

Benchmarking introduces rigour and discipline. As a team, we push hard to meet the KPIs that an experiment seeks to serve. After all, what gets measured is what gets optimized.

We found an NLU benchmarking test using data that is way out of our wheelhouse

A benchmarking exercise led by Nguyen Trong Canh, which compares the leading NLU engines in the industry, recently caught our attention. The exercise uses data aggregated from open question-answer datasets from Ask Ubuntu, Stack Exchange, and a German public transit chatbot to create 4 distinct corpora for testing. (The summary of the datasets can be found here.)

Because these datasets are distinctly different from the industries Wysdom usually deals with, the team was excited to complete the benchmarking exercise and test our own NLU engine on the data. After all, recent enhancements to our multi-stage NLU pipeline allow us to combine statistical approaches, boosting, and deep learning engines, and give us the ability to automatically detect and discard garbage utterances, identify and respond to small talk, and more.

F1 scores: A measure of accuracy

Intent classification and entity extraction are critical components of the Natural Language Understanding (NLU) system in any bot platform, so accuracy is the most important goal.

An F-score is a measure of a test’s accuracy that considers both precision and recall. Precision is the number of correct positive results divided by the number of all positive results returned by the classifier. Recall is the number of correct positive results divided by the number of all samples that should have been identified as positive. The F1 score is the harmonic mean of precision and recall; it reaches its best value at 1 (perfect precision and recall) and its worst at 0.
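To make the definition concrete, here is a minimal Python sketch that computes an F1 score from raw counts. The function name and the example counts are illustrative assumptions, not figures from the benchmarking study.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall) from raw counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With no predicted or no actual positives, precision or recall is undefined;
    # this sketch treats that case as a score of 0.
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct positives, 2 false alarms, 8 misses.
# Precision is 8 / 10 = 0.8 and recall is 8 / 16 = 0.5.
example = f1_score(true_positives=8, false_positives=2, false_negatives=8)
print(round(example, 3))
```

Because the harmonic mean is pulled toward the lower of the two values, the score in this example is lower than the simple average of 0.8 and 0.5, which is why a classifier cannot earn a high F1 score by doing well on only one of precision or recall.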

Wysdom shines

As you can see from the results comparison, Wysdom offers one of the best NLU classification performances in the industry.

F1-scores for intent classification for each corpus:

While a good F1 score alone does not guarantee an effective bot, a poor F1 score all but guarantees an ineffective one.

We should also note that the Wysdom Exchange, which provides pretrained models and data across specialized enterprise verticals such as telecommunications, banking, insurance, and more, was not at play in this benchmarking exercise, given the nature of the benchmarking data. The researchers compared F1 scores, a well-established metric from machine learning and information retrieval that is commonly used to measure the performance of NLU systems.
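For intent classification, where each utterance receives exactly one intent, an F1 score can be computed per intent and then averaged. The Python sketch below shows one common way to do that (macro-averaging). The averaging scheme, the function names, and the intent labels are assumptions for illustration; the study's exact aggregation method is not stated here.

```python
from collections import Counter


def per_intent_f1(true_intents, predicted_intents):
    """Return a dict mapping each intent to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_intents, predicted_intents):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1   # the predicted intent gains a false positive
            fn[truth] += 1   # the true intent gains a false negative
    scores = {}
    for intent in set(true_intents) | set(predicted_intents):
        # F1 = 2TP / (2TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        denominator = 2 * tp[intent] + fp[intent] + fn[intent]
        scores[intent] = 2 * tp[intent] / denominator if denominator else 0.0
    return scores


def macro_f1(scores):
    """Average the per-intent F1 scores, weighting every intent equally."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical labels, not taken from the benchmark corpora.
truth = ["book", "book", "cancel", "cancel"]
guess = ["book", "cancel", "cancel", "cancel"]
scores = per_intent_f1(truth, guess)
print(scores, macro_f1(scores))
```

Macro-averaging treats rare and frequent intents alike, so a weak score on a small intent is not hidden by strong results on a large one.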

Wysdom’s NLU is among the best in the world

This exercise validated that Wysdom’s NLU is right up there with the best. When combined with prebuilt, industry-specific knowledge from the Wysdom Exchange, it outperforms even the biggest players in AI.

Interested in learning more about the Wysdom Exchange? Request a demo to see our Conversational AI in action.



