A finance employee at Arup joined a video conference, saw his CFO’s face, heard his voice, and wired $25.6 million across fifteen transactions before realising every person on that call was AI-generated.
Your instinct is to think your team would catch it. But roughly half of companies were hit by deepfake fraud attacks in the past year, at an average cost of around $450,000 per incident. And 80% had no established protocol for handling one when it landed.
Most published deepfake defence guidance assumes you have a CISO, a security operations centre, and dedicated budget lines for specialised tooling. If security sits alongside every other operational responsibility rather than in a dedicated team, that guidance does not apply to you.
This roadmap does. Phase 1 (this week, near-zero cost) covers controls you can put in place before your next standup. Phase 2 (one to three months) adds vendor governance and training. Phase 3 (three to six months) handles regulatory and insurance questions that need external engagement. Start with the baseline assessment — you need to know where you stand before committing to a sequence. For context on why the threat landscape escalated this quickly, the full picture is in why deepfake fraud is outpacing institutional defences.
How Do You Assess Your Current Deepfake Exposure Before Building a Defence Plan?
Run a structured self-assessment before committing to any controls. Most teams who do this discover the same thing: their highest-value financial workflows rely on “I recognised the voice” as a verification step, and there is no documented process for what happens next.
The baseline maps four domains. Answer Yes / Partial / No for each:
Authentication Architecture: Is there a documented process for verifying identity during wire transfers above $5,000? Does any step rely on face or voice recognition? Are dual-approval requirements enforced above $20,000?
Incident Response Readiness: Does a written incident response plan exist? Does it explicitly address AI-generated audio or deepfaked video? Is there a named contact who can assess whether media is synthetic? Have you identified legal counsel for deepfake takedown requests?
Employee Awareness: When was the last security session covering voice or video impersonation? Can three random employees describe what they would do if they suspected a cloned voice call?
Vendor AI Tool Inventory: Does a list exist of every third-party tool that generates, manipulates, or processes audio, video, or images using AI? Do vendor contracts include any restrictions on synthetic media creation?
Any “No” in authentication or incident response is a Phase 1 priority. Partial answers in vendor and awareness sections feed Phase 2. This takes two to four hours. No external consultants required.
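The triage rules above are mechanical enough to sketch in code. This is a hypothetical illustration, not a tool the article prescribes: the domain names, questions, and sample answers are examples, and the questions are phrased so that "yes" is always the safe answer. The logic encodes the two rules just stated: any "No" in authentication or incident response is a Phase 1 priority, and anything short of "Yes" in awareness or vendor inventory feeds Phase 2.

```python
# Hypothetical sketch of the four-domain baseline assessment.
# Questions are phrased so "yes" is always the safe answer.
ASSESSMENT = {
    "authentication": [
        ("Documented verification for wires above $5,000?", "no"),
        ("Dual approval enforced above $20,000?", "partial"),
    ],
    "incident_response": [
        ("Written IR plan exists?", "partial"),
        ("Plan addresses synthetic audio or video?", "no"),
    ],
    "awareness": [
        ("Impersonation training run in the last quarter?", "partial"),
    ],
    "vendor_inventory": [
        ("Inventory of AI media tools exists?", "no"),
    ],
}

def triage(assessment):
    """Sort findings into Phase 1 priorities and Phase 2 follow-ups."""
    phase1, phase2 = [], []
    for domain, answers in assessment.items():
        for question, answer in answers:
            # Rule 1: any "no" in authentication or IR is a Phase 1 priority.
            if domain in ("authentication", "incident_response") and answer == "no":
                phase1.append((domain, question))
            # Rule 2: partial/no in awareness or vendor sections feeds Phase 2.
            elif domain in ("awareness", "vendor_inventory") and answer != "yes":
                phase2.append((domain, question))
    return phase1, phase2
```

With the sample answers above, `triage` surfaces the missing wire-verification process and the IR plan's synthetic-media gap as Phase 1 work, and routes the training and inventory gaps to Phase 2.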
What Are the Immediate Deepfake Defences You Can Implement This Week?
Phase 1 requires no procurement, no vendor sales cycle, and no specialised security knowledge. Three deliverables, all implementable within one to four weeks:
- A deepfake-specific incident response plan — two to five pages any team member can follow without security expertise
- MFA architecture change — replace video or voice call verification with passkeys or FIDO2 for high-value decisions
- Safe word system — pre-agreed verbal codes at the start of any voice call involving financial instructions, rotated weekly for high-value operations
A single, non-negotiable rule — out-of-band voice confirmation on a pre-registered number for any fund transfer over $10,000 — would have stopped the Arup attack cold. The weakness was a missing process step, not a detection failure.
“A little process friction in the right spots kills most of the risk,” as Avani Desai, CEO of Schellman, puts it. All three Phase 1 controls work together: the IR plan documents what to do when an attack occurs, MFA redesign prevents a whole category of attacks entirely, and safe words provide the human backstop between them.
How Do You Build a Deepfake-Specific Incident Response Plan Without a Security Team?
A deepfake-specific IR plan adds three things standard IR plans omit: media authentication procedures, legal takedown triggers for synthetic content on external platforms, and a communication strategy for scenarios involving fabricated audio or video of company personnel.
Keep it to two to five pages any team member can follow without security expertise. Here is the structure:
Trigger Criteria: An unexpected video or voice request for financial action above your threshold. Synthetic media featuring company personnel circulating externally. A caller who cannot answer the pre-agreed safe word. Unusual urgency combined with a request to bypass standard approval.
Containment — immediate, before anything else: Freeze the transaction. Do not delete anything. Preserve all evidence — recordings, emails, chat logs, metadata. Brief the response team without using the compromised channel.
Media Authentication: Identify in advance who will assess whether media is synthetic — internal staff with metadata analysis tools, or a pre-identified external service. Basic audio checks: unnatural pauses, pitch inconsistencies, breathing patterns. Basic video checks: lighting mismatches, facial blurring during movement. Do not rely on human judgement alone — detection accuracy in operational conditions sits at around 55–60%.
Communication Protocol: Establish a “do not confirm or deny” default for media inquiries involving synthetic content claims. Pre-draft internal notification templates now — you will not be calm enough to write them in the moment.
Legal Takedown: Identify legal counsel with deepfake takedown experience before you need them. Platform response times range from hours to weeks.
Recovery and Review: Analyse what was exploited and update the IR plan. Run a tabletop exercise within 30 days.
Speed is the governing constraint. The IR plan’s value is measured in minutes, not pages.
Why Should You Redesign MFA to Eliminate Video Verification — and How Do Passkeys Replace It?
Video verification is now a vulnerability, not a control. Attackers can generate real-time synthetic video convincing enough to pass human judgement. Human detection accuracy in operational conditions sits at around 55–60% — marginally above chance. Asking employees to visually identify fakes is not a security control.
FIDO2 and passkeys work because authentication is device-bound and cryptographic. A deepfake cannot generate a valid signature from a hardware key or device secure enclave, regardless of video quality. Deploy passkeys for internal high-value authorisation workflows first — wire transfers, access grants, contract approvals — using Microsoft Authenticator or YubiKey as your reference implementations.
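The reason a deepfake cannot pass this kind of authentication is worth making concrete. Real FIDO2 uses public-key signatures from a secure enclave; the simplified sketch below substitutes an HMAC shared secret purely to illustrate the challenge-response shape, which is an assumption of this example, not how passkeys are implemented. The point survives the simplification: the verifier issues a fresh challenge, only a party holding the device secret can answer it, and a perfect synthetic video stream carries no such secret.

```python
import hashlib
import hmac
import secrets

# Simplified challenge-response sketch. Real FIDO2/passkeys use
# asymmetric signatures from device hardware; HMAC stands in here
# only to show why video quality is irrelevant to the outcome.

device_secret = secrets.token_bytes(32)  # lives only on the enrolled device

def server_challenge():
    # Fresh random challenge per attempt; never reused, so a
    # recorded response cannot be replayed.
    return secrets.token_bytes(16)

def device_respond(secret, challenge):
    # Only the enrolled device can compute this.
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def server_verify(secret, challenge, response):
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

An attacker on a video call, however convincing, is in the position of the second assertion below: without the device secret, no response verifies.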
While deploying passkeys, implement safe word systems as an immediate backstop for voice verification scenarios you cannot migrate yet. Pre-agree a verbal code with anyone who may call you to request sensitive actions. Exchange it at the start of any voice call involving financial instructions. An attacker cloning a voice cannot know the current code.
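Weekly rotation is the fiddly part of a safe word system, and one minimal way to handle it, assuming both parties hold the same pre-agreed word list, is to index that list by ISO week number so each side can compute the current code offline without ever transmitting it. The word list here is invented for illustration.

```python
import datetime

# Pre-agreed offline between both parties; never sent over any channel.
SAFE_WORDS = ["cobalt-heron", "amber-lattice", "juniper-tide", "onyx-meridian"]

def current_safe_word(today=None, words=SAFE_WORDS):
    """Select this week's code by ISO week number."""
    today = today or datetime.date.today()
    week = today.isocalendar()[1]  # ISO week, 1..53
    return words[week % len(words)]

def verify_caller(spoken_code, today=None):
    # The check must be exact: a caller who is "close" fails.
    return spoken_code == current_safe_word(today)
```

A cloned voice can reproduce tone and cadence perfectly and still fail `verify_caller`, because the code is knowledge the attacker's source audio never contained.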
Add a rule that no single person can authorise high-value transactions based on a single communication, and that transactions above $20,000 require two approvers plus out-of-band confirmation. No technology required — it is a process rule.
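The threshold rules above can be written down as a lookup any approver can apply. This sketch uses the article's example figures ($10,000 for out-of-band confirmation, $20,000 for dual approval); tune both thresholds to your own risk appetite.

```python
# Example thresholds from the roadmap; adjust to your exposure.
OOB_THRESHOLD = 10_000           # out-of-band confirmation above this
DUAL_APPROVAL_THRESHOLD = 20_000  # two approvers above this

def required_controls(amount_usd):
    """Return the controls a transfer of this size must clear."""
    controls = ["standard approval"]
    if amount_usd > OOB_THRESHOLD:
        controls.append("out-of-band confirmation on a pre-registered number")
    if amount_usd > DUAL_APPROVAL_THRESHOLD:
        controls.append("second approver")
    return controls
```

Encoding the rule this plainly is the point: the control holds because it is unconditional, not because anyone spotted a fake.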
For the architecture decision on whether to add active deepfake detection to authentication workflows, that analysis is in which detection and provenance architecture to choose.
What Should Phase 2 Cover — Vendor Governance, Employee Training, and Architecture Decisions?
Phase 2 runs one to three months after Phase 1 controls are in place. Three workstreams, running in parallel where possible.
Vendor Due Diligence for AI Tools
Start with an inventory you almost certainly do not have: every third-party tool across your organisation that generates, manipulates, or processes audio, video, or images using AI. Marketing teams routinely adopt tools like HeyGen or ElevenLabs without any security review.
For each tool on that list, require five commitments before renewal: a prohibited-use policy for synthetic media of real individuals without consent; C2PA-compliant watermarking; audit rights over how your data and likenesses are used; takedown cooperation within a defined SLA; and contractual indemnification for damages from synthetic content misuse. Add these to standard vendor onboarding for any AI tool going forward.
Employee Training That Actually Changes Behaviour
Annual compliance training does not work. AI-enhanced phishing achieves a 54% click-through rate compared to 12% for human-written content. What works: quarterly, scenario-based modules of 15–20 minutes including at least one deepfake simulation — a realistic synthetic voice note or video call your employees have to respond to correctly.
Training needs to cover how deepfake technology works, the specific red flags (audio artefacts, visual inconsistencies, behavioural pressure), verification procedures, and clear reporting pathways. Platforms like KnowBe4 and Jericho Security offer deepfake-specific simulation modules.
Architecture Decisions
This is a Phase 2 evaluation, not a Phase 1 purchase. Read which detection and provenance architecture to choose before committing to any detection tooling. Process controls from Phase 1 address the same attack vectors as most detection tools at near-zero cost.
When Should an SMB Invest in Detection Tools — and Is Building In-House Ever the Right Call?
Later than most vendors will tell you, and only under specific conditions.
Commercial deepfake detection tools claim 95–98% accuracy in lab settings. In real-world environments, accuracy drops to 50–65%. CSIRO research found leading tools collapsed below 50% when confronted with deepfakes produced by tools they had not been trained on. Under targeted attacks — where an attacker tests their deepfake against the detection system before launching — accuracy can fall to near zero.
The arms race is structural: detection tools learn to identify artefacts in current-generation synthetic media. When generation models improve, those signatures become obsolete. Process controls — out-of-band verification, safe words, dual-approval workflows — do not degrade with model improvements.
If Phase 1 and Phase 2 controls are in place, residual risk is high, and you have budget: evaluate detection tools. If those controls are not yet in place: build process first. Building in-house almost never makes sense at SMB scale. Reality Defender is the most referenced option for SMBs if you reach the evaluation stage.
What Does Phase 3 Look Like — Compliance, Insurance, and Residual Risk?
Phase 3 is strategic rather than operational. It requires external engagement — legal counsel and your insurance broker — and longer decision cycles. Timeline: three to six months from roadmap start. For the full policy and threat context that informs these decisions, the pillar article covers the complete landscape.
Compliance Matrix
Map applicable regulations before a regulatory inquiry forces the exercise. The EU AI Act's deepfake labelling requirements are now in force. The US TAKE IT DOWN Act (May 2025) mandates 48-hour platform removal of non-consensual synthetic content. The UK Online Safety Act holds platforms legally responsible for illegal deepfake content. HealthTech operators face HIPAA breach notification exposure; SaaS and FinTech operators need jurisdictional mapping. The full compliance matrix is in the compliance matrix your roadmap needs to address — use it as your external legal consultation briefing document.
Insurance Review
Seek a Social Engineering Fraud Endorsement on your existing cyber policy. Without it, standard policies frequently exclude deepfake-enabled losses under the “voluntary parting” exclusion. Negotiate override of that exclusion for deepfake fraud scenarios, adequate sublimits against your highest single-transaction exposure, and clear incident reporting requirements. Some policies require notification within 24–72 hours of a suspected incident — your Phase 1 IR plan must accommodate that. The full insurance process is in adding deepfake-specific insurance coverage.
Residual Risk Acceptance
After all three phases, document what risk remains and why the organisation accepts it. Include remaining attack vectors, business justification for not addressing them, and the conditions that would trigger you to revisit the decision. Get it signed off by leadership. Build in a review cadence from the start: quarterly IR plan review, annual reassessment of the baseline checklist. The roadmap is not a project with an end date — it is an ongoing practice.
Frequently Asked Questions
Where do I start with deepfake defence if I have no security team?
Run the baseline assessment above — two to four hours, no external consultants. Then Phase 1: draft a deepfake-specific IR plan, replace video verification with passkeys for high-value authorisations, and deploy safe word systems for voice calls. All three are implementable within four weeks.
What is a deepfake incident response plan and how does it differ from a standard IR plan?
A deepfake-specific IR plan adds three elements standard IR plans omit: media authentication procedures, legal takedown triggers for synthetic media on external platforms, and a communication strategy for scenarios where fabricated audio or video of company personnel is circulating. If your existing IR plan does not address these, it is not deepfake-ready.
Are safe word systems genuinely effective against voice clone fraud?
Yes, in the specific scenarios they address. A pre-agreed verbal code defeats voice cloning because the attacker does not know the current code. The limitation is consistency — safe words only work when both parties follow the protocol every time. They remain effective even as generation quality improves.
How do I know if my MFA setup is vulnerable to deepfake attacks?
Audit every authentication workflow that uses face recognition, voice recognition, or video call verification as an identity factor. If any authorise transactions above your defined threshold, they are vulnerable. Replace biometric factors with cryptographic factors — FIDO2 and passkeys — for high-value workflows.
When should an SMB invest in a deepfake detection tool?
Only after Phase 1 and Phase 2 controls are fully implemented, and only if threat modelling shows residual risk that process controls cannot address. Real-world detection accuracy sits 30–50% below vendor laboratory claims. Process controls do not degrade as generation models improve.
What should I do immediately if my company is targeted by a deepfake fraud attempt?
Follow your IR plan: freeze the transaction, isolate the communication channel, preserve all evidence, and notify your pre-identified media authentication contact. If you do not yet have an IR plan, freeze the transaction first. The assessment of whether it was a deepfake comes after containment.
What does a realistic deepfake defence budget look like for a 200-person company?
Phase 1 costs are near zero — process documentation and configuration changes. Phase 2 includes training platform licensing (KnowBe4 SMB pricing typically runs $15–25 per user annually) and vendor due diligence time. Phase 3 depends on insurance adjustments and legal consultation. Detection tools, if warranted, run mid-four to low-five figures annually. Against a $450,000 average incident cost, Phase 1 is measured in days of engineering time.
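The arithmetic behind that comparison is worth showing. The back-of-envelope model below uses the figures quoted above; the $30,000 detection ceiling is an assumed reading of "low five figures", not a vendor quote.

```python
# Back-of-envelope budget for a 200-person company, using the
# article's figures. All numbers illustrative, not quotes.
HEADCOUNT = 200
per_user_low, per_user_high = 15, 25        # KnowBe4-style SMB training pricing
training_low = per_user_low * HEADCOUNT
training_high = per_user_high * HEADCOUNT
detection_high = 30_000                      # assumed "low five figures" ceiling
avg_incident = 450_000                       # average cost per incident

worst_case = training_high + detection_high  # Phase 1 itself is near zero
print(f"Training: ${training_low:,}-${training_high:,} per year")
print(f"Worst-case annual spend: ${worst_case:,}")
print(f"One average incident costs {avg_incident // worst_case}x that")
```

Even at the worst-case spend, a single average incident costs roughly an order of magnitude more than a full year of defence.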
How do I adapt enterprise deepfake defence recommendations to an organisation without a CISO?
Assign an explicit security owner — a realistic acknowledgement of how most 50–500 person organisations operate. Phase 1 controls do not require a CISO. Phase 2 training can be managed by engineering leadership. Phase 3 compliance and insurance decisions are executive-level tasks you can drive with legal and finance support.