Voice & Conversation
agents that don't sound like 2018.

realtime · turn-taking · barge-in · escalating

Inbound and outbound voice agents with sub-400ms latency, native turn-taking, and escalation paths to humans in three rings or fewer. We handle the SIP, the eval, the prompts, and the transcripts.

// 01 intent

Voice is a different problem than chat.

Latency dominates. Turn-taking is brittle. Barge-in breaks every naive pipeline. We've shipped enough of these to have opinions: realtime APIs first, streaming TTS where possible, and an escalation path that's actually rehearsed — not bolted on at the end.

// 02 capabilities

What we actually build.

Realtime pipeline

Speech-to-speech via realtime APIs when accent, latency, and naturalness matter. Token-streamed TTS with mid-sentence interruption.

openai-rt · cartesia

Turn-taking

Native VAD with prosody awareness; the agent learns when 'um' is a hold and when it's a release. Tunable per persona.

livekit · silero
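
To make the hold-versus-release idea concrete, here is a minimal sketch, not the production logic. It assumes the only signals are the last transcribed word and the silence measured since it; prosody is not modeled. The `Persona` fields, the filler list, and both thresholds are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical filler words that usually mean "I'm still talking".
FILLERS = {"um", "uh", "er", "hmm"}


@dataclass
class Persona:
    release_ms: int = 600   # assumed: silence that ends a normal turn
    hold_ms: int = 2000     # assumed: longer wait after a filler word


def turn_ended(last_word: str, silence_ms: int, persona: Persona) -> bool:
    """Return True when the agent may start speaking."""
    # Drop surrounding punctuation so "um," and "um…" both match the filler list.
    word = last_word.strip(" ,.…-").lower()
    threshold = persona.hold_ms if word in FILLERS else persona.release_ms
    return silence_ms >= threshold
```

With these assumed defaults, 800 ms of silence after "um," keeps the agent waiting, while the same silence after "here." releases the turn. Tuning per persona would amount to changing the two thresholds.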

Barge-in

Fast user override. The agent shuts up within 80ms of detecting fresh speech. We've measured every alternative; this one wins.

livekit · custom-vad
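
One way to picture the barge-in behavior is as a playback loop that checks a "user is speaking" flag between audio frames, so the stop latency is bounded by one frame plus detection time. The sketch below assumes a VAD sets an `asyncio.Event` and that `play` is an async callable that sends one frame to the caller; both names and the 20 ms frame size are hypothetical, not taken from any specific SDK.

```python
import asyncio

CHUNK_MS = 20  # assumed TTS frame size; playback can stop at any frame boundary


async def speak(chunks, user_speaking: asyncio.Event, play) -> int:
    """Play TTS chunks until done or until the user starts talking.

    Returns the number of chunks actually played.
    """
    played = 0
    for chunk in chunks:
        if user_speaking.is_set():
            # Barge-in: drop the rest of the utterance immediately.
            break
        await play(chunk)
        played += 1
    return played
```

A real pipeline would also flush audio already buffered downstream, which this sketch leaves out.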

Tool calls

The agent can read your CRM, your booking system, your knowledge base — mid-call. Latency budgets are explicit per tool.

mcp · fastapi
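
An explicit per-tool latency budget can be sketched with `asyncio.wait_for`, which cancels the call once its timeout expires. The tool names and budget values below are hypothetical placeholders; real budgets would be set per engagement.

```python
import asyncio

# Hypothetical per-tool budgets in seconds.
BUDGETS = {"order.lookup": 0.3, "kb.search": 0.8}


async def call_tool(name, fn, *args, default_budget=0.5):
    """Run a tool coroutine within its budget and return (ok, result)."""
    budget = BUDGETS.get(name, default_budget)
    try:
        return True, await asyncio.wait_for(fn(*args), timeout=budget)
    except asyncio.TimeoutError:
        # Budget exceeded: the caller can speak a filler line or fall back.
        return False, None
```

Returning a flag instead of raising keeps the conversation loop simple: the agent decides what to say when a tool misses its budget rather than stalling the call.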

Escalation

Warm transfers to humans, with the transcript and intent summary already in their headset. Three rings, max.

twilio · sip

Eval & QA

Recorded calls, redacted, scored on completion, satisfaction proxies, and policy adherence. Drift surfaces in days, not quarters.

braintrust · whisper

// 03 artifact

A peek at real output.

transcript · inbound 800-line · acme support · 02:41

[00:02] acme-bot "Hi, this is Acme support. How can I help?"
[00:06] caller "Yeah, my package — uh — said delivered but it's not here."
[00:11] acme-bot "Sorry to hear it. Can I get your order number?"
[00:14] caller "It's um, eight, two — wait, hold on…"
[00:19] acme-bot "Take your time." // hold-detected, no barge
[00:24] caller "Okay, A-7-2-2-5-1-9."
[00:27] → tool order.lookup(A722519) // 142ms
[00:28] acme-bot "Got it — that's a UPS shipment marked delivered Tuesday at the front door. Did you check with neighbors or a porch?"
[00:36] caller "Yeah, I — actually no, I want to talk to a person."
[00:39] → escalate tier=missed-delivery, ctx=A722519
[00:39] acme-bot "Of course — connecting you to Maya. One moment."
[00:41] → ring human-agent · 2.4 rings · accept

// 04 deliverables

What lands in your repo.

Voice agent

Persona, scripts, tools, guardrails, and the SIP plumbing. Drop-in to your existing carrier.

Escalation rig

Warm transfers with full context. We rehearse this with your humans before launch.

Eval suite

Synthetic call generation plus red-team prompts. Latency, completion, and policy on every PR.

Recording & QA

Compliant recording, redaction, and a review tool tuned to your QA rubric.

Dashboard

Live dashboards: call volume, completion rate, escalation rate, sentiment proxy, drift alerts.

// 05 questions

Things people actually ask.

Q-01 Will callers know it's a bot?

Yes, by design — the opening line discloses. Callers tolerate it surprisingly well when the agent is fast and the escalation path is real. Hiding it backfires.

Q-02 What languages?

English and Brazilian Portuguese in production today; Spanish, Japanese, and German have shipped on past engagements. Accent handling is a per-engagement eval.

Q-03 Can we use our existing call center / Twilio / Genesys?

Yes. We've integrated with Twilio Flex, Genesys Cloud, NICE CXone, and a half-dozen smaller PBXes. The agent slots in as a queue member.

Q-04 What about regulated calls (HIPAA, PCI, recording disclosure)?

Recording disclosure is automated per state. PCI is handled by routing payment moments to DTMF and never letting the model see the digits. HIPAA-compliant deployments are available.
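
The DTMF routing for PCI can be illustrated with a small sketch. It assumes the telephony layer delivers keypad digits as events and that a payment processor returns an opaque token; the `CallSession` class, its method names, and the `charge` callable are all hypothetical, not a real carrier or processor API.

```python
class CallSession:
    """Keeps card digits out of anything the model can read."""

    def __init__(self):
        self.collecting_card = False
        self._card_digits = []   # held only until the charge is made
        self.model_events = []   # the only stream the model sees

    def start_payment(self):
        self.collecting_card = True
        self._card_digits.clear()

    def on_dtmf(self, digit: str):
        if self.collecting_card:
            # Payment moment: buffer the digit and tell the model nothing.
            self._card_digits.append(digit)
            return
        self.model_events.append(f"dtmf:{digit}")

    def finish_payment(self, charge):
        # `charge` is a hypothetical callable that returns an opaque token.
        token = charge("".join(self._card_digits))
        self._card_digits.clear()
        self.collecting_card = False
        self.model_events.append(f"payment:{token}")
        return token
```

The model learns only that a payment happened and which token identifies it. Audio recording would need to pause over the same window, which this sketch does not cover.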

Q-05 How do you size for surge?

Realtime sessions autoscale; SIP trunking has burst headroom. We load-test to 10× normal volume before launch and document the cliff.

Tell us the work. We'll tell you the agent.
