1) Objective

The goal of this test is to provide an objective comparative evaluation of the Krisp standalone SDK against its top competitors in the market: Zoom, Teams, Meet, and Webex.

  • For the sake of fairness and neutrality, we collaborated with a third-party company, SigmaConnectivity Lab, to develop the testing environment.
  • The test conforms to the Microsoft Teams specification v4, chapter 4.4.1 Send Path – Send Quality In The Presence of Ambient Noise.

2) About SigmaConnectivity lab

SigmaConnectivity is a Swedish global tech-house that offers a wide range of design, testing, and measuring services for Audio products.

They provide access to their Anechoic and ETSI Rooms, equipped with a Head and Torso Simulator, Acoustic Simulations, and a Field Measurements system, which allow tests to be performed according to global standards.

Their Audio and Acoustics labs can measure sound and vibration and run various acoustic simulations. They can also design complex acoustic and audio systems for all kinds of products, such as mobile phones, loudspeakers, and headsets.

 

3) Test Setup

3-1) Devices under test

The following devices were used for testing:

  • Headset: Jabra Evolve 40
  • Headset: Plantronics Voyager Focus
  • Personal Speakerphone: HP Firefly 15 G8 built-in Mic

3-2) Noises

Per the Microsoft Teams specification, the noise types depend on the device type.

3-2-1) Noises for Headsets

  • Mensa Binaural
  • Work Noise Office Callcenter Binaural
  • Male Single voice Distractor Binaural

3-2-2) Noises for Personal Speakerphone

  • Mensa Binaural
  • Work Noise Office Callcenter Binaural
  • Male Single voice Distractor Binaural
  • Cafeteria noise Binaural
  • Train Station Binaural

3-3) App setup

For all the apps except Krisp, we made a call with the app, captured the send signal from the DUT as recorded on the Reference PC, and used that recording as the processed file for the 3QUEST evaluations. AGC was turned ON for each of the clients.

To obtain audio files with Krisp processing independent of the audio pipeline, we used Krisp Test Noise Cancellation.

4) Test results

As per the specification, there are two quality levels, Standard and Premium, which are defined by the minimum and average 3QUEST values of each use case.

3QUEST results

For reference, note that:

  • N-MOS (Noise Mean Opinion Score) shows the quality of noise removal
  • S-MOS (Speech Mean Opinion Score) shows the quality of preserved voice
  • G-MOS (Global Mean Opinion Score) shows the overall quality including speech and background noise

Note that a difference of 0.2 points in any of the scores is perceptible to listeners and makes a real difference.

Below we draw some conclusions from the results. The processed audio quality depends on the captured signal, i.e. on the DUT, so we analyze the results for each device separately.
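To make the 0.2-point rule concrete, here is a minimal Python sketch of how a lead over the best competing app could be computed and checked against that threshold. The app names and scores are hypothetical placeholders, not measured results, and reading "ahead by" as "Krisp's score minus the best competing score" is our assumption for the sake of illustration.

```python
# Illustrative sketch only: the scores below are hypothetical, NOT measured results.
PERCEPTIBLE_DELTA = 0.2  # perceptibility threshold quoted in the text


def lead_over_best_competitor(scores, app):
    """Return the app's score minus the best score among all other apps."""
    best_other = max(value for name, value in scores.items() if name != app)
    # Round to two decimals to avoid floating-point noise such as 0.30000000000000004.
    return round(scores[app] - best_other, 2)


def is_perceptible(delta, threshold=PERCEPTIBLE_DELTA):
    """True when the absolute difference reaches the perceptibility threshold."""
    return abs(delta) >= threshold


# Hypothetical G-MOS values for three unnamed apps.
g_mos = {"App A": 3.6, "App B": 3.3, "App C": 3.1}
lead = lead_over_best_competitor(g_mos, "App A")
print(lead, is_perceptible(lead))
```

A negative result from `lead_over_best_competitor` would mean the app trails the best competitor rather than leading it.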

4-1) Test results with Jabra Evolve 40

This chart compares all the apps under test at their default settings. With this device, Krisp leads the list in all 3 scores:

  • ahead by 0.15 points for G-MOS
  • ahead by 0.2 points for S-MOS
  • ahead by 0.13 points for N-MOS

krisp NC average results

There is always a tradeoff between the noise-removal and voice-preservation quality of the models. Chat clients provide different options for controlling the NC level, and increasing the noise-removal aggressiveness always decreases speech quality. The chart below gives comparative results for all available modes of the chat clients, where this effect is visible: wherever N-MOS increases, S-MOS decreases.

Nevertheless, Krisp leads the list with the highest G-MOS score of all.

Below is a detailed report of the average and minimum values for all the apps under test, with the following color coding:
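The tradeoff described above can be expressed as a simple check over the modes of one client: for any two modes, the one with the higher N-MOS should have the lower S-MOS. The sketch below is a hypothetical illustration in Python; the mode names and score pairs are invented and are not taken from the measurements.

```python
from itertools import combinations


def tradeoff_holds(modes):
    """Check the N-MOS / S-MOS tradeoff across one client's NC modes.

    modes maps a mode name to an (n_mos, s_mos) pair. Returns True if, for
    every pair of modes with different N-MOS, the mode with the higher N-MOS
    has a strictly lower S-MOS.
    """
    for first, second in combinations(modes.values(), 2):
        if first[0] == second[0]:
            continue  # same N-MOS: the tradeoff says nothing about this pair
        # Order the pair so that `low` is the mode with the lower N-MOS.
        low, high = sorted([first, second])
        if high[1] >= low[1]:
            return False
    return True


# Hypothetical (N-MOS, S-MOS) values for three modes of one client.
example_modes = {"low": (3.0, 4.0), "auto": (3.5, 3.8), "high": (4.0, 3.5)}
print(tradeoff_holds(example_modes))
```

A check like this only describes the pattern within one client; it does not compare clients with each other, which is what the G-MOS ranking is for.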

  • Green → Premium Quality
  • White → Standard Quality
  • Red → Below Microsoft spec standards
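As a rough illustration of how the color coding could be derived, the Python sketch below classifies one use case from its per-noise scores using the average and minimum values. The threshold numbers are hypothetical placeholders (the real limits are defined in the Microsoft specification), and combining the average and minimum into a single level is our simplifying assumption.

```python
from statistics import mean

# Hypothetical thresholds for illustration only; the real limits come from the specification.
THRESHOLDS = {
    "premium": {"avg": 3.5, "min": 3.0},
    "standard": {"avg": 3.0, "min": 2.5},
}

# Color coding as listed in the report.
COLOR = {"premium": "green", "standard": "white", "below spec": "red"}


def classify(scores_per_noise, thresholds=THRESHOLDS):
    """Return the quality level for one use case, given one score per noise type."""
    average, lowest = mean(scores_per_noise), min(scores_per_noise)
    # Try the stricter level first, then fall back to the looser one.
    for level in ("premium", "standard"):
        limit = thresholds[level]
        if average >= limit["avg"] and lowest >= limit["min"]:
            return level
    return "below spec"


# Hypothetical G-MOS values across three noise types.
level = classify([3.6, 3.4, 3.8])
print(level, COLOR[level])
```

Both conditions have to hold for a level to apply, so a single poor noise condition can pull a use case down even when its average looks good.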

4-2) Test results with Plantronics Voyager Focus

The chart comparing all the apps at their default settings indicates that Krisp has a huge advantage in G-MOS. It is:

  • ahead by 0.4 points for G-MOS
  • ahead by 0.13 points for S-MOS
  • ahead by 0.13 points for N-MOS

Again, Krisp leads the list across all available modes of all apps.

Moreover, with the Plantronics headset, not all chat clients meet the minimum voice quality requirements, and Krisp is ahead here as well. You can see the overall distribution of scores in the table below.

4-3) Personal speakerphone: HP Firefly 15 G8 built-in Mic

In this case too, the chart comparing all the apps at their default settings indicates that Krisp has a huge advantage in G-MOS. It is:

  • ahead by 0.42 points for G-MOS
  • ahead by 0.14 points for S-MOS
  • ahead by 1.08 points for N-MOS

The chart below shows that in the cases where the S-MOS value of other apps is close to Krisp’s result, their N-MOS is very low (2.7-2.8), indicating poor noise cancellation by the model.

The built-in mic is the hardest case for all the NC algorithms, but Krisp is again the winner here. Interestingly, Teams Auto here outperforms Webex, which was in second place for both headsets.

5) Conclusion

All the charts clearly show that Krisp HQ is the winner in every case. Even Krisp LP outperforms the competitors: across all 3 devices and all test cases, there is only one case (the minimum value with the built-in mic) where Krisp LP cedes second place to Teams Auto, by just 0.1 points in G-MOS, while still outperforming most of the others in both N-MOS and S-MOS.

The built-in mic is the hardest case for all the apps, as it involves much more reverberation and noise than the headsets. Krisp has a huge advantage in noise removal quality here while keeping good speech quality.

In general, the performance of all the apps depends heavily on the signal captured by the device, which explains the difference in results between Jabra and Plantronics. With Jabra, the results are flatter, which indicates good quality of the captured audio. Krisp’s advantage is that it provides a similar NC experience across different devices.

ALL THE REFERENCE FILES AND RECORDINGS ARE AVAILABLE AND CAN BE SHARED UPON REQUEST – Krisp team