May 15, 2026

Voice Agent Compliance: TCPA, HIPAA, PCI and What Comes Next

AUTHOR

James A. Wondrasek

Voice AI compliance is not a lighter version of chatbot compliance. It spans three independent frameworks: two federal regimes (TCPA and HIPAA) and the card industry's PCI DSS standard, plus a growing layer of state biometric laws, each with its own documentation, vendor, and architecture requirements.

Here is the fact that matters most: under the FCC’s February 2024 declaratory ruling, AI voice callers face stricter consent requirements than human agents. Not equivalent. Strictly higher. TCPA violations run $500–$1,500 per call. PCI non-compliance fines reach $5,000–$100,000 per month. Illinois BIPA violations carry $1,000–$5,000 per incident. These are board-level risk numbers, not engineering checkboxes. This guide is part of our broader coverage of enterprise voice AI compliance and deployment; the landscape is bigger than any single compliance layer.

Why Is Voice AI Compliance More Complex Than Chatbot Compliance?

There are three reasons voice AI compliance is structurally harder than text chatbot compliance, and none of them has a chatbot analogue.

First, voice AI generates biometric data. Every call produces audio a speech-to-text system can process for speaker identification, which means voiceprints under Illinois BIPA. Text chatbots simply do not do this. Second, TCPA classifies AI-generated speech as “artificial voice” under Section 227(b), which subjects outbound AI calling to a consent standard a website chatbot never faces. Third, voice agents operate on live telephony infrastructure. A disclosure failure cannot be recalled the way a text response can be edited — the error happened, it was recorded, and the per-call exposure is already locked in.

What Did the FCC’s 2024 Ruling Change for Outbound AI Voice Calling?

The FCC’s February 2024 declaratory ruling classifies AI-generated speech as “artificial voice” under Section 227(b), closing the legal grey area that had let some operators argue AI calls fell outside TCPA’s strictest consent requirements.

💡 PEWC (Prior Express Written Consent) is the highest consent tier under TCPA: a written record showing the consumer agreed, with timestamp, IP address, and the exact language of the consent they reviewed. Verbal opt-ins do not satisfy it.

Human sales agents need only express consent, oral or written. AI voice callers need PEWC: a written, timestamped record that includes the IP address and the exact consent language, retained for 4–7 years. A human telemarketer makes roughly 80 calls per day. An AI voice agent can make 50,000+. Violations run $500 per non-wilful call and $1,500 per wilful call, and settlements in the $20M–$60M range have already been reached.
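To make the scale difference concrete, the per-call figures above can simply be multiplied out. This is a rough sketch: the function name and the one-day horizon are illustrative assumptions, and it models the worst case in which every call placed is found to be a violation.

```python
# Statutory damages per call, as quoted above.
NON_WILFUL_PER_CALL = 500
WILFUL_PER_CALL = 1_500


def tcpa_exposure(calls: int, wilful: bool = False) -> int:
    """Worst-case statutory exposure in dollars if every call is a violation."""
    per_call = WILFUL_PER_CALL if wilful else NON_WILFUL_PER_CALL
    return calls * per_call


# One day of calling at the daily volumes quoted above (hypothetical scenario).
human_day = tcpa_exposure(80)
ai_day = tcpa_exposure(50_000)
ai_day_wilful = tcpa_exposure(50_000, wilful=True)

print(f"human: ${human_day:,}  AI: ${ai_day:,}  AI (wilful): ${ai_day_wilful:,}")
```

Under these assumptions, a single day of non-compliant AI calling carries more exposure than a human agent would accumulate in well over a year.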

For an account of what happens when these requirements are not met, see compliance failures as a production failure mode.

How Do You Build a TCPA-Compliant Outbound AI Calling Programme? The 90-Day Roadmap

The 90-day roadmap from Caller Digital sequences TCPA compliance work so that litigation exposure is blocked before a single call goes out. Skipping straight to national rollout without the foundation in place is the most common TCPA failure pattern.

Days 1–14: Audit your contact lists for PEWC records. If you cannot produce a timestamped, IP-logged, exact-language consent record for a contact, that contact cannot be called. Build per-call audit-trail infrastructure: recording, transcript, consent record, timezone log, and DNC scrub log. Get DNC Registry scrubbing integrated before any dial goes out.

Days 15–30: Single-state pilot — and not California, Florida, or Illinois, which each add state-level requirements on top of the federal TCPA baseline. Confirm the AI disclosure fires within the first ten seconds of every call.

Days 31–60: Expand to 5–10 states, adding state-specific consent requirements on top of the federal baseline.

Days 61–90: National rollout with automated compliance monitoring and a quarterly consent audit cadence confirmed.
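Two of the roadmap checks lend themselves to automation: the pre-dial gate from days 1–14 (PEWC record plus DNC scrub) and the ten-second disclosure check from days 15–30. The sketch below is a minimal illustration, not a reference implementation. All class, field, and function names are hypothetical, the transcript is assumed to be a list of `(start_seconds, speaker, text)` tuples, and timezone and state-specific rules are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    """The PEWC evidence described above; field names are illustrative."""
    consented_at: Optional[datetime]  # timestamp of the written consent
    ip_address: Optional[str]         # IP address the consent was submitted from
    consent_text: Optional[str]       # exact language the consumer reviewed


def has_valid_pewc(record: Optional[ConsentRecord]) -> bool:
    """A contact is callable only if every element of the record can be produced."""
    if record is None:
        return False
    return bool(record.consented_at and record.ip_address and record.consent_text)


def may_dial(phone: str, consents: dict, dnc_numbers: set) -> bool:
    """Block the dial if the number is on the DNC list or lacks a complete PEWC record."""
    if phone in dnc_numbers:
        return False
    return has_valid_pewc(consents.get(phone))


def disclosure_in_time(segments: list, phrase: str, window_seconds: float = 10.0) -> bool:
    """True if the agent's first utterance containing `phrase` starts within the window."""
    needle = phrase.lower()
    for start, speaker, text in sorted(segments):
        if speaker == "agent" and needle in text.lower():
            return start <= window_seconds
    return False


consents = {
    "+12025550101": ConsentRecord(
        datetime(2026, 1, 5, tzinfo=timezone.utc),
        "203.0.113.7",
        "I agree to receive AI-generated calls from Example Co.",
    ),
    # Missing IP address: the record is incomplete, so the contact cannot be called.
    "+12025550102": ConsentRecord(
        datetime(2026, 1, 6, tzinfo=timezone.utc),
        None,
        "I agree to receive AI-generated calls from Example Co.",
    ),
}
dnc_numbers = {"+12025550103"}

good_call = [
    (0.8, "agent", "Hi, this is an automated AI assistant calling from Example Clinic."),
    (6.2, "caller", "Okay."),
]
late_call = [
    (0.5, "agent", "Hi, is this a good time to talk?"),
    (14.0, "agent", "By the way, I am an automated AI assistant."),
]
```

Here `may_dial` fails closed: a contact with no record, or with a record missing any element, is treated the same as a number on the DNC list.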

What Does a HIPAA BAA Actually Cover — and What Does It Cost?

Any voice AI vendor that processes audio containing Protected Health Information must execute a Business Associate Agreement before going live. The BAA requirement is triggered by PHI processing, not PHI storage — a cloud STT service that transcribes a patient scheduling an appointment is processing PHI even if the transcript is deleted immediately.

💡 PHI (Protected Health Information) includes any individually identifiable health information — spoken patient data in a voice call counts. Patient name, appointment date, and the healthcare service scheduled are all PHI.

Here is what the major vendors actually cost once you factor in compliance.

Retell AI ($0.07/min): BAA via self-service portal, no enterprise contract required. SOC 2 Type II, HIPAA, GDPR, PCI. Lowest friction for teams that need to move quickly.

Vapi ($0.05/min): HIPAA is a $1,000/month add-on; true healthcare cost runs $0.25–$0.33/min. SOC 2 Type II, HIPAA.

PolyAI ($150K+/year): BAA included in enterprise contract. SOC 2 Type II, HIPAA, GDPR, PCI DSS, ISO 27001.

Deepgram (STT layer): BAA available, zero-retention mode (audio deleted after transcription). HIPAA, SOC 2, GDPR, PCI.
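The effect of a flat monthly add-on on per-minute cost depends on call volume, which a short calculation makes visible. The monthly volumes below are assumptions picked because they reproduce the 0.25–0.33 per-minute range quoted above; the source does not state which volumes it used.

```python
def effective_cost_per_min(base_per_min: float, monthly_addon: float,
                           minutes_per_month: float) -> float:
    """Per-minute price once a flat monthly fee is spread across usage."""
    return base_per_min + monthly_addon / minutes_per_month


VAPI_BASE = 0.05       # $/min, as quoted above
VAPI_HIPAA = 1_000.0   # $/month add-on, as quoted above
RETELL_BASE = 0.07     # $/min, as quoted above

# Assumed monthly volumes (not from the source).
low_volume = effective_cost_per_min(VAPI_BASE, VAPI_HIPAA, 3_600)
mid_volume = effective_cost_per_min(VAPI_BASE, VAPI_HIPAA, 5_000)

# Volume at which the add-on is fully offset by the lower base rate.
break_even_minutes = VAPI_HIPAA / (RETELL_BASE - VAPI_BASE)

print(f"{low_volume:.2f} {mid_volume:.2f} {break_even_minutes:,.0f}")
```

On these list prices alone, the add-on model only becomes cheaper than the $0.07/min option above roughly 50,000 minutes per month.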

Verify TLS 1.3 in transit and AES-256 at rest against actual vendor configuration, not certification claims. PHI access logging is a HIPAA technical safeguard requirement that exists independently of the BAA.
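The in-transit half of that check can be run against a live endpoint with Python's standard `ssl` module. This is a minimal sketch: the host you pass in is whatever API endpoint your vendor documents (none is assumed here), and it says nothing about encryption at rest, which cannot be observed from outside the vendor's infrastructure.

```python
import socket
import ssl


def tls13_only_context() -> ssl.SSLContext:
    """Client context that refuses to negotiate anything below TLS 1.3."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def negotiated_tls_version(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and return the negotiated protocol name.

    Raises ssl.SSLError if the endpoint cannot negotiate TLS 1.3.
    """
    ctx = tls13_only_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_tls_version` with a vendor hostname either returns the protocol string or raises, which makes it easy to wire into a periodic configuration check.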

💡 SOC 2 Type II validates a vendor’s security controls over a 6–12 month audit period. It does not guarantee HIPAA compliance — treat it as the baseline enterprise security qualification and require BAA availability separately.

When Do You Need Air-Gapped STT — and What Does It Mean in Practice?

Air-gapped STT means the speech-to-text model runs entirely on infrastructure you control, with no audio leaving your network.

💡 Air-gapped STT is the compliance ceiling for use cases where even a BAA-covered cloud vendor introduces unacceptable PHI or payment data exposure — the transcription model runs on your own servers, and audio never crosses a network boundary you do not own.

There are two scenarios that make it a requirement rather than a preference. HIPAA PHI: if your organisation or its counsel has determined that no cloud processing of PHI audio is acceptable, cloud STT with a BAA is not sufficient — air-gapped STT eliminates the third party entirely. PCI DSS payment card data: when a caller speaks their card number and cloud STT transcribes it, the Primary Account Number enters the cloud provider’s infrastructure and PCI scope expands. The alternative is DTMF masking at the network level.

💡 DTMF masking suppresses the touch-tone signals generated when a caller enters digits on a keypad, preventing card numbers from appearing in recordings, transcripts, or AI agent logs — keeping PAN out of PCI scope entirely.

DTMF masking must be at the network layer, not just the recording layer. IVR segregation carries this further: route payment capture to a dedicated DTMF IVR path the AI agent never touches, keeping PAN out of the Cardholder Data Environment entirely.
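The segregation described above comes down to an interface rule: the AI agent hands the call off, and the only thing it ever gets back is an outcome. The sketch below illustrates that contract in application code. Every name in it is hypothetical, and the actual DTMF masking happens in the telephony layer, which this example does not model.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class PaymentStatus(Enum):
    APPROVED = "approved"
    DECLINED = "declined"


@dataclass(frozen=True)
class IvrResult:
    """All the agent receives from the payment IVR: an outcome, never the PAN."""
    status: PaymentStatus
    reference: str  # opaque reference issued by the IVR (hypothetical field)


@dataclass
class AgentSession:
    call_id: str
    log: list = field(default_factory=list)

    def hand_off_to_ivr(self) -> None:
        # The agent leaves the audio path here; digits are keyed into the IVR only.
        self.log.append(f"{self.call_id}: transferred to payment IVR")

    def resume(self, result: IvrResult) -> str:
        self.log.append(
            f"{self.call_id}: IVR returned {result.status.value} ref={result.reference}"
        )
        if result.status is PaymentStatus.APPROVED:
            return "Thanks, your payment went through."
        return "That payment did not go through. Would you like to try again?"


session = AgentSession(call_id="call-001")
session.hand_off_to_ivr()
reply = session.resume(IvrResult(PaymentStatus.APPROVED, reference="ref-7f3a"))

# Sanity check: nothing resembling a 13-19 digit card number reached the agent log.
pan_like = re.compile(r"\b\d{13,19}\b")
pan_in_log = any(pan_like.search(line) for line in session.log)
```

A log scan like the last two lines is only a backstop; the design goal is that the agent's types have no field a card number could occupy.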

Hybrid STT routing is the practical answer for most organisations: on-premise for PHI and PAN audio, cloud API for everything else. For a detailed look at how compliance shapes architecture decisions, see compliance as a foundational architecture constraint.
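As a rough illustration, the routing rule itself is small. The sketch below assumes some upstream step has already flagged whether a call segment may contain PHI or PAN; how that flag is produced is outside this example, and the enum and function names are hypothetical.

```python
from enum import Enum


class SttBackend(Enum):
    ON_PREMISE = "on_premise"
    CLOUD = "cloud"


def route_stt(may_contain_phi: bool, may_contain_pan: bool) -> SttBackend:
    """Send any segment that might carry PHI or PAN to the on-premise model."""
    if may_contain_phi or may_contain_pan:
        return SttBackend.ON_PREMISE
    return SttBackend.CLOUD


general_enquiry = route_stt(may_contain_phi=False, may_contain_pan=False)
appointment_booking = route_stt(may_contain_phi=True, may_contain_pan=False)
```

The flags are phrased as "may contain" on purpose: when classification is uncertain, routing to the on-premise path is the safer default.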

What Do Illinois BIPA, California CIPA, and Florida FTSA Require for Voice AI?

State biometric and telephone solicitation laws are additive, not alternatives to federal TCPA. Each adds requirements on top of the federal baseline.

Illinois BIPA: Any voice AI performing speaker identification or diarisation on audio involving an Illinois-based participant collects a voiceprint, a biometric identifier under BIPA. The geographic trigger is the participant’s physical location, not the employer’s or platform’s. You need written notice before biometric collection, explicit written consent, a published retention and destruction policy, and no sale of voiceprint data. Per-violation damages run $1,000–$5,000 and accrue each time biometric data is unlawfully collected.

The Cruz v. Fireflies.AI Corp. class action (December 2025) establishes the key liability point: an organisation that selects or encourages an AI tool that records Illinois participants may be held liable, not just the vendor. The Seventh Circuit heard oral arguments February 12, 2026 on whether the 2024 BIPA amendment limiting per-scan liability applies retroactively. Until the ruling issues, engage BIPA-specialised counsel rather than relying on published settlement figures.

Enterprise action items: Audit all STT pipelines for speaker identification and diarisation. A BIPA consent workflow must be operational before the first call involving an Illinois participant.
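One way to make the consent workflow enforceable is to gate the diarisation feature itself. The sketch below is an assumption-laden illustration: the names are hypothetical, a participant whose location is unknown is conservatively treated as if they were in Illinois, and whether a given consent flow satisfies BIPA is a question for counsel, not for this function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Participant:
    participant_id: str
    state: Optional[str]        # physical location as a US state code; None if unknown
    bipa_written_consent: bool  # written notice given and written consent recorded


def diarisation_allowed(participants: list) -> bool:
    """Allow speaker identification only if no Illinois participant lacks consent."""
    for p in participants:
        possibly_in_illinois = p.state is None or p.state == "IL"
        if possibly_in_illinois and not p.bipa_written_consent:
            return False
    return True


call_a = [Participant("p1", "TX", False), Participant("p2", "NY", False)]
call_b = [Participant("p1", "TX", False), Participant("p3", "IL", False)]
call_c = [Participant("p3", "IL", True)]
call_d = [Participant("p4", None, False)]
```

When the check fails, the pipeline can fall back to transcription without speaker attribution instead of blocking the call.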

California CIPA: A two-party consent state. The TCPA AI disclosure does not substitute for a separate recording-specific disclosure — they are two distinct obligations.

Florida FTSA: Mirrors TCPA PEWC language but applies independently at state level. A plaintiff can sue under both TCPA and FTSA for the same call, effectively doubling per-call exposure.

What Is Zero-Trust Architecture and Why Does Voice AI Need It?

Zero-trust requires continuous verification of every user, device, and service regardless of network location — no implicit trust based on perimeter authentication or prior session state. For voice agents executing financial or healthcare actions, it satisfies both HIPAA’s technical safeguard requirements and PCI DSS’s access control requirements in one architecture.

“Authenticate once and trust” fails when sessions escalate. Zero-trust requires verification before every sensitive action: identity verification before every system action; role-based access control scoped to the minimum PHI or financial data required; immutable audit logs; automatic session termination on anomaly detection. SOC 2 Type II covers vendor infrastructure, not your deployment of it. For more on building an architecture that addresses these requirements from the ground up, see compliance as a foundational architecture constraint.
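A minimal sketch of two of those controls, per-action authorisation and an audit trail, follows. The roles, actions, and class names are hypothetical. The hash-chained log shown here is tamper-evident rather than truly immutable; a production system would typically pair it with write-once storage.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical least-privilege mapping: each role gets only the actions it needs.
ROLE_PERMISSIONS = {
    "scheduler": {"read_appointment", "write_appointment"},
    "billing": {"read_balance"},
}


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, event: dict) -> None:
        """Chain each entry to the previous one so later edits are detectable."""
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        self.entries.append(
            {"event": event, "prev": prev_hash, "hash": self._digest(prev_hash, event)}
        )

    def verify(self) -> bool:
        """Recompute the chain; any altered entry breaks it."""
        prev_hash = ""
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


def authorise(role: str, action: str, identity_verified: bool, log: AuditLog) -> bool:
    """Check identity and role on every action, and record the decision either way."""
    allowed = identity_verified and action in ROLE_PERMISSIONS.get(role, set())
    log.append({"role": role, "action": action, "allowed": allowed})
    return allowed


log = AuditLog()
ok = authorise("scheduler", "read_appointment", identity_verified=True, log=log)
denied_role = authorise("billing", "write_appointment", identity_verified=True, log=log)
denied_identity = authorise("scheduler", "read_appointment", identity_verified=False, log=log)
```

Note that `authorise` takes `identity_verified` on every call: nothing is cached from an earlier step in the session, which is the point of the pattern.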

What Do You Need to Have in Place Before Day One of Production Deployment?

This is a minimum-viable gate, not an aspirational list. Every item must be in place before the first production call. Items are organised by regulatory trigger so you can identify which gates apply to your deployment.

TCPA Gate (outbound AI voice)

HIPAA Gate (healthcare deployments)

PCI Gate (payment-handling deployments)

State Biometric Gate (deployments using STT speaker identification or diarisation)
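Because the gates are keyed to regulatory triggers, selecting the ones that apply to a deployment is mechanical. The sketch below encodes only the four triggers named in the gate headings above; the function and parameter names are hypothetical, and it does not attempt to list the items inside each gate.

```python
def applicable_gates(*, outbound_ai_voice: bool, healthcare: bool,
                     handles_payments: bool, speaker_identification: bool) -> list:
    """Return the gates a deployment must clear, in the order listed above."""
    triggers = [
        (outbound_ai_voice, "TCPA Gate"),
        (healthcare, "HIPAA Gate"),
        (handles_payments, "PCI Gate"),
        (speaker_identification, "State Biometric Gate"),
    ]
    return [gate for applies, gate in triggers if applies]


# Example: an inbound clinic scheduling agent that uses diarisation (assumed scenario).
clinic_gates = applicable_gates(
    outbound_ai_voice=False,
    healthcare=True,
    handles_payments=False,
    speaker_identification=True,
)
```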

For detailed guidance on the architecture behind these gates, see compliance as a foundational architecture constraint. For the full landscape of enterprise voice AI deployment, see the full picture on voice agents in production.

Frequently Asked Questions

Each call is a separate violation: $500 per non-wilful call, $1,500 per wilful call. At AI call volume, a non-compliant campaign generates class-action exposure fast; settlements in the $20M–$60M range have been reached. Stop the campaign, preserve all call records and consent documentation, and engage TCPA litigation counsel before the 30-day window on any demand letter closes.

Do I need a BAA if my voice agent only schedules appointments and never stores patient records?

Yes. The BAA requirement is triggered by PHI processing, not PHI storage. A cloud STT service transcribing a patient scheduling an appointment is processing PHI — patient name, provider, and the nature of the service are all PHI regardless of whether the transcript is retained.

What is DTMF masking and why does it matter for PCI compliance?

DTMF masking suppresses touch-tone signals when a caller enters digits, preventing the Primary Account Number from entering any system that would fall under PCI scope. It must happen at the telephony and carrier layer — apply it only at the recording layer and the AI agent has already received the signal.

Which voice AI platform has the best HIPAA compliance for a healthcare startup?

Retell AI offers the lowest-friction path: a self-service BAA portal, SOC 2 Type II, HIPAA, GDPR, and PCI at $0.07/min, with no enterprise contract required. Vapi is cheaper at $0.05/min, but the $1,000/month HIPAA add-on raises total healthcare cost significantly. PolyAI at $150K+/year suits deployments requiring ISO 27001 and a managed service model. For the STT layer, Deepgram offers BAA, zero-retention processing, and air-gapped deployment support.

Can I use speaker identification (diarisation) in my voice agent without triggering BIPA?

Speaker identification triggers BIPA if any participant is physically in Illinois at the time of the interaction. Disabling it removes the voiceprint collection trigger; transcription without speaker attribution does not create a voiceprint under current BIPA interpretation.

Is air-gapped STT required for all HIPAA voice deployments?

No. Cloud STT with a valid BAA is compliant for most HIPAA voice deployments. Air-gapped STT is required only when the organisation or its counsel determines no third-party cloud exposure of PHI audio is acceptable — typically government health contexts or where client contracts prohibit it. Hybrid STT routing is the practical middle path.

How does PCI IVR segregation reduce compliance scope?

Route payment capture to a dedicated IVR path the AI agent never touches — the caller enters their card number via DTMF in the IVR, and the AI agent resumes after the IVR confirms payment. When the AI agent never processes PAN, it sits outside the Cardholder Data Environment and is not subject to PCI DSS assessment. That reduces the applicable SAQ category and cuts audit complexity considerably.
