Every contact center agent who picks up a call today faces a threat they were never trained to handle.
The caller sounds like a real person. They know account details. They apply exactly the right amount of pressure. And in a growing number of cases, the voice on the other end is not human at all — or it belongs to someone other than the person it claims to be.
Voice fraud has evolved faster than the tools designed to address it.
Today, Krisp is introducing Voice Security, now available in early access. It provides real-time protection at the agent level, covering the three threat vectors that now define risk in the contact center voice channel: synthetic caller fraud, social engineering, and agent identity fraud.
Why the voice channel became a target
Contact centers are financial control points. Customer identity is verified here. Transactions are authorized here. Sensitive data is accessed here. That concentration of high-value actions in a single channel, handled at scale by agents who must make trust decisions in seconds, creates an obvious attack surface.
Several threat vectors have converged to make that surface more exposed than ever.
Voice cloning is cheap and accessible. Today it takes little more than a $5 monthly subscription. Human agents correctly identify AI-generated voices approximately 60% of the time, approaching chance, with no meaningful improvement from training.¹ Deepfakes grew from 0.1% to 6.5% of all detected fraud attempts in three years, a 22x increase.²
Social engineering at scale. Not every attack uses synthetic voices. Real human callers manufacture urgency, exploit agent empathy, and manipulate agents into bypassing standard protocols. These attacks are harder to train against because they exploit human behavior, not technical vulnerabilities.
Agent substitution. In distributed operations, there is no reliable way to confirm the person handling calls under an agent’s credentials is actually that agent. Unauthorized substitution creates compliance and fraud exposure that grows with remote workforce scale.
The financial picture. AI-enabled fraud contributed to more than $30 million in documented business email compromise losses in the US in 2025 alone.³ Gen-AI-enabled fraud losses are projected to reach $40 billion by 2027.⁴ These figures reflect reported cases only.
How Krisp Voice Security works
Voice Security runs within the existing Krisp Call Center AI deployment on the agent’s device. It covers all three threat vectors described above.
Deepfake Detection
Deepfake Detection analyzes inbound caller audio from the start of every call. Within seconds, it produces a verdict on whether the voice is synthetic or real. On a high-confidence detection, the agent sees an alert in the Krisp widget, prompting them to follow their organization’s fraud escalation protocol, before any account access, transaction, or credential reset takes place.
Every detection event is logged automatically, whether or not an alert fires. Fraud teams get a dashboard showing detection volume, trends, and individual flagged calls, making every Krisp deployment fraud-aware from day one.
Deepfake Detection is available as an add-on to existing Krisp Call Center AI seats.
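The alert-and-log behavior described above can be sketched in a few lines. This is an illustration only, not Krisp's implementation: the `DetectionLog` class, the 0.0 to 1.0 score scale, and the `ALERT_THRESHOLD` value are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical cut-off for what counts as a "high-confidence" detection.
ALERT_THRESHOLD = 0.9


@dataclass
class DetectionEvent:
    call_id: str
    synthetic_score: float  # assumed scale: 0.0 (real) to 1.0 (synthetic)
    alerted: bool


@dataclass
class DetectionLog:
    events: List[DetectionEvent] = field(default_factory=list)

    def record(self, call_id: str, synthetic_score: float) -> DetectionEvent:
        """Log every verdict; mark it as an agent alert only on high confidence."""
        event = DetectionEvent(
            call_id=call_id,
            synthetic_score=synthetic_score,
            alerted=synthetic_score >= ALERT_THRESHOLD,
        )
        self.events.append(event)  # logged whether or not an alert fires
        return event

    def flagged(self) -> List[DetectionEvent]:
        """Events a fraud team would review as flagged calls."""
        return [e for e in self.events if e.alerted]


log = DetectionLog()
log.record("call-001", 0.12)
log.record("call-002", 0.97)
print(len(log.events), [e.call_id for e in log.flagged()])
```

Keeping the log separate from the alert decision is what lets a dashboard report total detection volume and trends, not just the calls that triggered an agent-facing alert.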
Fraud Detection
Fraud Detection addresses the attack that voice analysis cannot catch: social engineering by real human callers.
It works on the live audio. From the moment a call begins, Krisp analyzes the conversation for 18 behavioral and linguistic signals associated with manipulation, including urgency pressure, refusal to verify identity, unusual knowledge of account details, and requests to skip standard protocols. A risk score updates continuously and maps to one of four bands. At elevated risk, the agent receives an advisory prompt to follow company-approved escalation steps.
The detection is advisory. Agents remain in control. After the call, a full fraud risk summary with the marker timeline is available for supervisor review.
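A continuously updated score that maps to four bands might look like the sketch below. Everything specific here is hypothetical: the signal weights, the band names and boundaries, and the `RiskTracker` class are invented for illustration, and only four of the 18 signals are represented.

```python
from bisect import bisect_right

# Hypothetical weights (in points) for four of the signals named in the text.
SIGNAL_WEIGHTS = {
    "urgency_pressure": 20,
    "refuses_identity_verification": 35,
    "unusual_account_knowledge": 15,
    "asks_to_skip_protocol": 30,
}

# Hypothetical band boundaries and names; the text states only that there are four bands.
BAND_EDGES = [25, 50, 75]
BAND_NAMES = ["low", "guarded", "elevated", "high"]


class RiskTracker:
    def __init__(self) -> None:
        self.seen = set()
        self.timeline = []  # (seconds into the call, signal) for supervisor review

    def observe(self, second: int, signal: str) -> str:
        """Record a newly detected signal and return the current band."""
        if signal in SIGNAL_WEIGHTS and signal not in self.seen:
            self.seen.add(signal)
            self.timeline.append((second, signal))
        return self.band()

    def score(self) -> int:
        """Sum of the weights of all signals seen so far, capped at 100."""
        return min(100, sum(SIGNAL_WEIGHTS[s] for s in self.seen))

    def band(self) -> str:
        return BAND_NAMES[bisect_right(BAND_EDGES, self.score())]

    def should_prompt_agent(self) -> bool:
        """Advisory only: the agent decides what to do with the prompt."""
        return self.band() in ("elevated", "high")


tracker = RiskTracker()
tracker.observe(5, "urgency_pressure")
tracker.observe(40, "asks_to_skip_protocol")
print(tracker.score(), tracker.band(), tracker.should_prompt_agent())
```

The `timeline` list stands in for the marker timeline in the post-call summary: each entry records when a signal first appeared in the conversation.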
Agent Voice Verification
Agent Voice Verification addresses the threat from inside the operation.
In distributed and remote-first contact centers, there is no reliable way to confirm that the person handling calls under an agent’s credentials is actually that agent. Unauthorized substitution, where a hired agent allows someone else to use their workstation and account, is a compliance and fraud risk that grows with remote workforce scale.
Agent Voice Verification builds a voice profile for each agent from production calls and continuously compares each session against that baseline. Meaningful deviations surface as alerts in the supervisor dashboard for review. For high-security roles or new hires, a supervised onboarding session can establish a verified baseline before monitoring begins.
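One common way to compare a session against a stored voice baseline is cosine similarity between speaker embeddings. The sketch below assumes that approach purely for illustration; the text does not say which method Krisp uses, and the three-dimensional vectors, the `SIMILARITY_FLOOR` threshold, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

# Hypothetical threshold below which a session is surfaced for supervisor review.
SIMILARITY_FLOOR = 0.75


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average several enrollment embeddings into one baseline profile."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def needs_review(baseline: Sequence[float], session: Sequence[float]) -> bool:
    """Flag the session when it deviates meaningfully from the baseline."""
    return cosine_similarity(baseline, session) < SIMILARITY_FLOOR


# Toy embeddings; real speaker embeddings have far more dimensions.
baseline = mean_vector([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]])
print(needs_review(baseline, [0.92, 0.08, 0.03]))  # session close to the baseline
print(needs_review(baseline, [0.1, 0.9, 0.2]))     # session far from the baseline
```

Averaging several enrollment samples into the baseline reflects the idea of building a profile from multiple calls, or from a supervised onboarding session, before monitoring begins.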
Why Krisp is positioned to build this
Already in the call. Krisp runs on the agent’s device, processing audio as a layer between the headset and any softphone. This is the same layer where noise cancellation and accent conversion run, and where both the agent’s voice and the caller’s voice are accessible. Building voice security here is architecturally different from building it on top of recordings or at the IVR.
Every CCaaS platform. Krisp runs across all CCaaS environments with no telephony integration required. Voice Security is available wherever Krisp is deployed, enabled by configuration rather than implementation.
All threat vectors, one deployment. Deepfake Detection, Fraud Detection, and Agent Voice Verification run within the same platform, on the same call, without separate contracts, integrations, or infrastructure.
How this compares to what’s in the market today
Some detection tools exist at the IVR layer: they analyze inbound calls before they reach an agent and catch some synthetic voices at the front door. They have significant limitations.
| | IVR-layer detection tools | Krisp Voice Security |
|---|---|---|
| Where it runs | Before the call reaches the agent | On the agent’s device, during the call |
| CCaaS coverage | 3–5 specific platforms | All CCaaS platforms |
| Synthetic voice detection | Yes | Yes |
| Social engineering detection | Varies by vendor and platform | Yes, at the agent level |
| Agent identity verification | No | Yes |
| All three threats in one deployment | No | Yes |
| Integration required | Telephony integration, weeks to months | No |
| Pricing | Enterprise deal floors, excludes most of market | Add-on to existing Krisp seats |
Voice Security is available in early access
We are working with a select group of early partners to deploy, validate, and refine Voice Security in production environments.
If your operation carries financial, compliance, or reputational risk from voice fraud, we want to hear from you.
What is voice fraud in the contact center?
Any attempt to exploit the voice call to gain unauthorized access, trigger transactions, or obtain sensitive data. It takes several forms: AI-generated voices impersonating legitimate callers, real human callers using psychological manipulation, and agents being replaced by unauthorized individuals using stolen credentials.
How common are AI voice cloning scams?
Growing fast. Deepfakes grew from 0.1% to 6.5% of all detected fraud attempts in three years — a 22x increase. AI and voice cloning contributed to more than $30 million in documented BEC losses in the US in 2025 alone.
Can humans tell if a caller's voice is AI-generated?
Barely. Human agents correctly identify AI-generated voices approximately 60% of the time — approaching chance. Training doesn’t meaningfully improve that. Modern voice cloning tools produce output that is indistinguishable to the human ear, which is why detection tooling exists.
What's the difference between deepfake fraud and social engineering?
Deepfake fraud is a technical attack: an AI-generated or cloned voice impersonates a caller. Social engineering is a behavioral attack: a real human voice is used to pressure, manipulate, or deceive an agent into bypassing standard protocols. Both target the contact center voice channel but through different mechanisms.
Where can voice fraud be caught — IVR vs the agent level?
Both layers catch different things. IVR-layer detection catches some synthetic voices before a call reaches an agent — but stops there. It doesn’t detect social engineering, agent substitution, or synthetic voices on platforms it doesn’t integrate with. Agent-level detection covers the full conversation across all CCaaS platforms, including threats the IVR never sees.
¹ Barrington, Cooper & Farid, UC Berkeley. Nature Scientific Reports, 2025.
² Signicat, The Battle Against AI-driven Identity Fraud, 2024.
³ FBI Internet Crime Complaint Center (IC3), 2025 Internet Crime Report, 2025.
⁴ Deloitte Center for Financial Services, 2024.