
What’s behind great Accent Conversion technology

 

This document is intended to help contact center operators compare the quality and performance of Krisp AI Accent Conversion (also known as Accent Translation, Accent Neutralization, or Accent Smoothing) with Sanas’s offering.

 

Enhanced voice quality in agent-customer interactions, driven by accent conversion, generates ROI through lower average handle time (AHT), higher first-call resolution (FCR), and improved customer and employee satisfaction (CSAT and ESAT) scores.

 

Krisp and Sanas applications are deployed on the agents’ desktops and function as virtual microphones and speakers, working as companion applications with calling platforms. Delivering on the promise of smoothing difficult accents while maintaining clear voice quality in real time is challenging and takes years to perfect.

 

There are a few technical challenges that make the task difficult:

 

  • Removing background noises and voices
  • Synthesizing natural-sounding speech
  • Ensuring accurate pronunciation of different words
  • Conveying emotions
  • Doing all of the above in various real-life situations (fast speech, varied acoustic conditions, different speakers, etc.)

 

Krisp launched its first commercial application in contact centers in 2019 and has processed over 1 trillion minutes of voice calls. Today, Krisp is deployed across many BPOs and top-tier enterprise call centers, and is also integrated into voice applications with more than 200 million users across desktop and mobile devices.

 

The table below highlights the key performance and management requirements for delivering tier-1 voice fidelity that scales globally within contact centers.

 

Krisp vs Sanas

Krisp
Sanas
Current deployments
Krisp:
  • Over 200 million desktop and mobile devices
  • Over 200K contact center agents
  • Over 1 trillion minutes of Krisp-processed voice
  • Embedded into world-class services such as Vonage, RingCentral, Zoho, Aircall, Discord, others
Sanas:
  • Over 30K agents

Accent Conversion robustness

Supported accent packs
Krisp:
  • Indian English
  • Filipino English
Sanas:
  • Indian English
  • Filipino English

Modes of operation
Krisp:
  • Voice Preservation mode – fully preserves the user’s voice
  • Voice Profiles mode – allows the user to choose a natural-sounding output voice
Sanas:
  • Voice Preservation mode – somewhat preserves the user’s voice

Scalable range of output voices
Krisp: Yes. Can generate new voices in Voice Profiles mode.
Sanas: No. Limited to the user’s voice.

Accent leakage
Krisp:
  • Some leakage in Voice Preservation mode
  • No leakage in Voice Profiles mode
Sanas:
  • Consistently observed leakage

Background noise and voice cancellation robustness
Krisp: Highly robust, automatically included in the Accent Conversion models
Sanas: Very limited

Agent and customer-side noise cancellation
Krisp: Bidirectional, automatically included in the Accent Conversion models
Sanas: Customer-side only

Headset robustness
Krisp: Highly robust
Sanas: Requires specific headsets

Built-in microphone compatibility
Krisp: Compatible
Sanas: Not compatible, gives an error message

Speech naturalness
Krisp: Natural
Sanas: Natural

Wrong pronunciations
Krisp: Some
Sanas: Noticeably more frequent

Preserves user’s voice
Krisp: Yes
Sanas: Limited

User enrollment needed
Krisp: No
Sanas: Unknown

Dynamic adaptation to new speakers
Krisp: Yes, within the same or a different call, regardless of gender
Sanas: Unknown. Requires an output voice gender selection.

Voice quality
Krisp: 16kHz (wide-band, VoIP, industry-leading voice quality)
Sanas: 8kHz only

Noise Cancellation robustness

Voice quality and noise cancellation
Krisp: World’s best, based on objective and subjective tests (see and hear)
Sanas: New entrant, tests show noise leakage and voice quality degradation (see and hear)

Agent-side Background Voice Cancellation
Krisp: World’s best (see test measurements)
Sanas: Other voices and background chatter leak through in a typical loud call center

Agent-side Noise Cancellation
Krisp: World’s best (see test measurements)
Sanas: Adequate performance for low-volume noises (a fan, for example). Noise leakage and voice degradation in contact center environments (other voices, loud chatter).

Customer-side Noise Cancellation
Krisp: Included. Optimized for inbound voice from mobile or landline. Pass-through of ringtones, dial tones, etc.
Sanas: Not available

Acoustic Echo Cancellation
Krisp: Included. Optimized for call center use cases.
Sanas: Not available

Voice quality
Krisp:
  • 8kHz (narrow-band, standard telephony, good voice quality)
  • 16kHz (wide-band, VoIP, industry-leading voice quality)
  • 32kHz (full-band, best voice quality – near studio-grade)
Sanas:
  • 8kHz only
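
The sampling rates in the Voice quality rows determine how much of the speech spectrum a stream can carry: a signal sampled at a given rate can represent frequencies only up to half that rate (the Nyquist limit). The short Python sketch below is illustrative only and is not taken from either vendor's software. It computes that limit, plus the number of samples in one processing frame, for the three rates listed above. The 10 ms frame length is an assumption chosen for illustration.

```python
# Illustrative only: relates the sampling rates named in the table to
# audio bandwidth. The 10 ms default frame length is an assumption.

SAMPLE_RATES_HZ = (8000, 16000, 32000)


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


def samples_per_frame(sample_rate_hz: int, frame_ms: int = 10) -> int:
    """Number of samples in one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000


if __name__ == "__main__":
    for rate in SAMPLE_RATES_HZ:
        print(f"{rate} Hz: bandwidth up to {nyquist_hz(rate):.0f} Hz, "
              f"{samples_per_frame(rate)} samples per 10 ms frame")
```

Doubling the sampling rate doubles both the representable bandwidth and the number of samples a real-time model must process per frame, which is one reason higher-fidelity modes tend to cost more CPU.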

Application and audio drivers robustness

CPU utilization
Krisp:
  • Supports the range of CPUs typically found in agent desktops
  • Supports older, lower-end CPUs through smaller models
  • Has auto-switching between models based on CPU load
Sanas:
  • Single model uses 2x more CPU than Krisp on an i5 8th-gen CPU
  • Error message in Sanas app with older CPUs
  • Slightly higher CPU utilization for CPUs beyond i5 12th gen

Audio drivers
Krisp: Highly reliable and tested for 7+ years
Sanas: Users often need to restart the drivers to avoid breakdown of mic and speaker audio streams

Headset and application compatibility
Krisp: Compatible and tested with most headsets and voice applications used in call centers
Sanas: New entrant, minimal deployments and testing
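
The CPU utilization row mentions auto-switching between models based on CPU load. The source does not describe how this works, so the Python sketch below is a hypothetical illustration of the general idea, not Krisp's implementation. The model names, the load thresholds, and the `ModelSwitcher` class are all assumptions. It steps down to a smaller model when load is high and steps back up only when load falls well below that point, so the selection does not flip back and forth around a single threshold.

```python
from dataclasses import dataclass


@dataclass
class ModelSwitcher:
    """Hypothetical selector that picks a model size from CPU load."""

    # Ordered from most to least CPU-hungry; the names are assumptions.
    models: tuple = ("large", "medium", "small")
    high_load: float = 0.85  # step down above this load (assumed value)
    low_load: float = 0.50   # step back up below this load (assumed value)
    index: int = 0           # start with the largest model

    def update(self, cpu_load: float) -> str:
        """Take a CPU load in [0, 1] and return the model to use next."""
        if cpu_load > self.high_load and self.index < len(self.models) - 1:
            self.index += 1  # under pressure: switch to a smaller model
        elif cpu_load < self.low_load and self.index > 0:
            self.index -= 1  # headroom available: switch to a larger model
        return self.models[self.index]


if __name__ == "__main__":
    switcher = ModelSwitcher()
    for load in (0.40, 0.90, 0.95, 0.70, 0.30):
        print(f"load={load:.2f} -> {switcher.update(load)}")
```

The gap between the two thresholds is a deliberate design choice in this sketch: a load of 0.70 changes nothing, which avoids audible model switches during brief load fluctuations.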

Management and deployment at scale

Supported platforms
Krisp: Win, Mac, Linux, Chrome, VDI
Sanas: Win

SSO authentication
Krisp:
  • Available for agents, per the enterprise customers’ requirements
  • SSO/SCIM for automated provisioning and deprovisioning, saving admins’ time
Sanas:
  • Not available for users (agents)
  • Only available for admins

Remote deployment and settings for admins
Krisp: Available
Sanas: Limited

App version management and auto-update
Krisp: Available
Sanas: Limited

Analytics for Accent Conversion, Noise Cancellation and platform usage
Krisp: Available
Sanas: Not available

Enterprise-Grade Support
Krisp: 24/7. Application and IT infrastructure expertise during pilots and post-launch, including VDI.
Sanas: 24/7. Limited.


Krisp is a trusted vendor on G2

With over 500 reviews on G2, Krisp consistently excels in enhancing customer interactions for service teams. G2, a trusted platform for software reviews and assessments, showcases Krisp’s exceptional 4.7 rating—earned through the trust and endorsement of hundreds of verified professionals across diverse industries.

 

Check Krisp’s page on G2 here.

 

Krisp Voice AI Platform for call centers

Krisp is the only real-time Voice AI platform that covers every stage of the agent experience—before, during, and after the call—within a single, lightweight application. It eliminates the need to juggle multiple tools and services by delivering core capabilities like Noise Cancellation, Accent Conversion, Live Interpretation, real-time agent assist, and post-call summaries in one seamless interface.

 

Agents with pronounced English accents can benefit from Accent Conversion, which enhances comprehension in calls without altering their original voice. The same agents can handle international calls using Live Interpreter, enabling real-time multilingual conversations across 80+ languages with one click, directly in the Krisp app. This flexibility removes hiring constraints and the need for standard language line services, and allows teams to scale globally without friction.

 

During the call, Krisp Agent Copilot provides real-time transcripts, key moment capture, and access to company and industry-specific knowledge via AI Chat, boosting confidence and precision. After the call, automatic summaries and reports help streamline follow-ups and coaching. All of this is centrally managed, with analytics and policy controls available in a unified Admin Portal.

 

The Krisp platform integrates easily with the agent’s desktop and works with all CCaaS and calling applications, delivering call quality that translates into better CSAT and related contact center KPIs.

 

Conclusion

Both Krisp and Sanas are pioneers in Accent Conversion technology. 

 

Krisp’s solution is more robust, given its superiority in core noise and voice cancellation and its ability to synthesize highly natural speech with minimal accent leakage. Krisp’s platform also offers unmatched scalability, ease of deployment and management in call centers, and multiple features within the same application.
