Operator Agents
that drive real software.

browser-native · multimodal · audited

Agents that pilot your CRM, your portal, your archaic vendor dashboard the same way humans do — clicking, scrolling, typing, reading. They escalate to humans only when policy, ambiguity, or low confidence demands it.

// 01 intent

Tell us the workflow you want retired.

An operator agent is the right tool when the work lives behind a UI we can't replace, when the volume is high enough to hurt, and when the consequence of getting it wrong is small enough to recover from. We refuse the engagement when any of those three is missing.

// 02 capabilities

What we actually build.

Plan & decompose

A planner LLM compiles intent into a step graph: each step is a tool call with preconditions and a recovery branch. Replanning is cheap; mid-run pivots are normal.

claude-planner · step-graph · retry-policy
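
The step graph described above can be pictured with a minimal Python sketch. Everything named here (`Step`, `run_step`, `run_plan`, the `State` dict) is a hypothetical illustration, not the planner's actual format. It assumes each step carries a precondition, an action that raises `RuntimeError` on failure, and an optional recovery branch.

```python
from dataclasses import dataclass
from typing import Callable, Optional

State = dict  # shared run state; a plain dict keeps the sketch small


@dataclass
class Step:
    """One node in a hypothetical step graph."""
    name: str
    precondition: Callable[[State], bool]  # must hold before the action runs
    action: Callable[[State], None]        # the tool call; raises RuntimeError on failure
    recovery: Optional["Step"] = None      # branch taken when the step cannot complete


def run_step(step: Step, state: State, log: list, retries: int = 3) -> bool:
    """Try one step; fall through to its recovery branch if it cannot complete."""
    if step.precondition(state):
        for attempt in range(1, retries + 1):
            try:
                step.action(state)
                log.append(f"{step.name}: ok on attempt {attempt}")
                return True
            except RuntimeError as err:
                log.append(f"{step.name}: attempt {attempt} failed ({err})")
    else:
        log.append(f"{step.name}: precondition not met")
    if step.recovery is not None:
        return run_step(step.recovery, state, log, retries)
    log.append(f"{step.name}: no recovery branch")
    return False


def run_plan(steps: list, state: State, retries: int = 3) -> tuple:
    """Run steps in order; stop at the first step that cannot be recovered."""
    log: list = []
    for step in steps:
        if not run_step(step, state, log, retries):
            return False, log
    return True, log


def open_orders(state: State) -> None:
    state["page"] = "/orders/new"


def login_then_open(state: State) -> None:
    state["logged_in"] = True
    open_orders(state)


plan = [
    Step(
        name="open_orders",
        precondition=lambda s: s.get("logged_in", False),
        action=open_orders,
        recovery=Step("login_then_open", lambda s: True, login_then_open),
    )
]

ok, log = run_plan(plan, {})
print(ok, log)
```

In this sketch the first step's precondition fails because nobody has logged in yet, so the run takes the recovery branch instead of aborting.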

Drive a browser

Sandboxed, headed Chromium with proxy rotation, persistent profiles, and CAPTCHA delegation. We model the page as an accessibility tree first and a screenshot second.

browserbase · playwright · a11y-tree

Read the screen

Multimodal vision for forms, dashboards, and PDFs the page renders inline. Layout-aware OCR with field-level confidence.

gpt-4o-v · claude-vision

Operate tools

Beyond the browser: shell, SQL, internal HTTP APIs, and human-in-the-loop slots. Each tool is sandboxed and rate-limited.

mcp · temporal · slack-handoff

Verify the work

Every action is recorded with its prompt, model, inputs, screenshot, and result. We can replay any run, and so can you.

langsmith · s3-audit

Hand off cleanly

When the agent stalls, hesitates, or hits a policy boundary, it hands off over Slack, email, or pager. Humans resume mid-task without re-explaining context.

slack · pager · handoff-card

// 03 artifact

A peek at real output.

step-plan · scout-07 · purchase-order flow

// agent compiles intent into 84 steps before the first click
PLAN flow="vendor-mfg → submit PO"
PLAN budget=$0.42 deadline=12m retries=3

// phase 01: credentials
01 navigate https://portal.vendor-mfg.example/login
02 read vault/credentials/vendor-mfg
03 type [name=email] "ops@acmecorp.com"
04 type [name=password] {redacted}
05 wait_for [data-totp=true]
06 read vault/totp/vendor-mfg // → 419 882
07 click [type=submit]

// phase 02: read inventory · 18 rows
08 navigate /orders/new
09 vision capture {viewport}
10 extract table {schema: line_item} // 18 rows

// phase 03: classify ambiguous SKUs · escalate if <0.85
22 classify item[7] → "freight, expedited" conf=0.62
23 ESCALATE human="@acme-ops" timeout=15m

// phase 04: fill, verify, submit · steps 24–84
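
The phase 03 rule in the plan above (escalate when confidence is below 0.85) can be sketched in a few lines of Python. `Decision` and `route_classification` are hypothetical names invented for this illustration. The threshold, the `@acme-ops` handle, and the 15-minute timeout are taken from the sample plan.

```python
from dataclasses import dataclass
from typing import Optional

ESCALATION_THRESHOLD = 0.85  # from the sample plan: "escalate if <0.85"


@dataclass(frozen=True)
class Decision:
    """Outcome of routing one classification (hypothetical structure)."""
    action: str                            # "proceed" or "escalate"
    label: str
    confidence: float
    human: Optional[str] = None            # who gets the handoff, if any
    timeout_minutes: Optional[int] = None  # how long to wait for the human


def route_classification(
    label: str,
    confidence: float,
    human: str = "@acme-ops",
    timeout_minutes: int = 15,
) -> Decision:
    """Escalate to a human when confidence falls below the threshold."""
    if confidence < ESCALATION_THRESHOLD:
        return Decision("escalate", label, confidence, human, timeout_minutes)
    return Decision("proceed", label, confidence)


# Step 22 of the sample plan: item[7] classified with conf=0.62.
decision = route_classification("freight, expedited", 0.62)
print(decision.action, decision.human, decision.timeout_minutes)
```

With the sample values, the 0.62 confidence falls below the threshold, which matches the ESCALATE at step 23.
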
// 04 deliverables

What lands in your repo.

01
Working agent
Code in your repo, deployed to your cloud, behind your auth. We don't ship SaaS-only.
02
Eval suite
60–200 golden cases, adversarial probes, drift checks. Runs on every commit. Blocks deploy on regression.
03
Runbook
Ten-page operations doc — every observed failure mode, every recovery path, every on-call escalation.
04
Observability
Per-step traces, screenshots, costs. LangSmith or your stack. Auditor-friendly out of the box.
05
Two trained engineers
Two of your engineers, paired with us through the build, ready to operate and extend on day one.
// 05 questions

Things people actually ask.

Q-01 · What's the floor on volume that justifies an operator agent?
Roughly: if a human spends more than 30 minutes per repeated transaction and you do the transaction more than 100 times a month, the math usually works. We say no when either of those numbers is much smaller.
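
The arithmetic behind that rule of thumb, as a small Python sketch. The function names are hypothetical, and the 30-minute and 100-per-month figures are the rough floors from the answer above, not hard limits.

```python
def monthly_human_hours(minutes_per_transaction: float, transactions_per_month: int) -> float:
    """Human time spent on the workflow each month, in hours."""
    return minutes_per_transaction * transactions_per_month / 60


def clears_volume_floor(
    minutes_per_transaction: float,
    transactions_per_month: int,
    min_minutes: float = 30,       # rough floor from the answer above
    min_transactions: int = 100,   # rough floor from the answer above
) -> bool:
    """True when both numbers exceed the rule-of-thumb floors."""
    return (
        minutes_per_transaction > min_minutes
        and transactions_per_month > min_transactions
    )


print(monthly_human_hours(30, 100))   # 50.0 hours a month at the floor itself
print(clears_volume_floor(45, 250))   # True
print(clears_volume_floor(5, 250))    # False
```

At the floor itself the workflow costs about 50 human hours a month, which is why smaller numbers rarely justify the build.
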
Q-02 · How do you handle CAPTCHAs?
Three layers, in order: residential proxy + clean profile (often enough), browser-native solving, and last-resort human delegation through a vendor with sub-30-second SLAs. We log which path was taken.
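
The ordered fallback can be sketched as a chain of attempts that records which layer cleared the challenge. This is a minimal illustration: `solve_captcha` and the three stub functions are hypothetical stand-ins, and a real layer would call a proxy pool, a solver, or a delegation vendor.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("captcha")


def solve_captcha(layers: list) -> Optional[str]:
    """Try each (name, attempt) layer in order; return the name of the first that succeeds."""
    for name, attempt in layers:
        if attempt():
            logger.info("captcha cleared via %s", name)
            return name
        logger.info("layer %s did not clear the challenge", name)
    logger.warning("all captcha layers exhausted")
    return None


calls: list = []  # records which stubs ran, so the ordering is visible


def clean_profile() -> bool:
    calls.append("proxy+profile")
    return False  # stub: pretend the challenge still appeared


def native_solver() -> bool:
    calls.append("browser-native")
    return True   # stub: pretend the solver succeeded


def human_delegation() -> bool:
    calls.append("human-delegation")
    return True   # stub: last resort, never reached in this run


path = solve_captcha([
    ("proxy+profile", clean_profile),
    ("browser-native", native_solver),
    ("human-delegation", human_delegation),
])
print(path, calls)
```

Because the chain stops at the first success, the human layer is only reached when the two cheaper layers fail.
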
Q-03 · What happens when the portal changes?
The agent runs an aliveness probe before every batch; on layout drift, it falls back to vision and flags maintainers. We've found this beats brittle selectors and beats blind retries.
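
A rough Python sketch of that pre-batch check, assuming the probe simply compares the selectors a plan depends on against what the page currently exposes. `ProbeResult` and `aliveness_probe` are hypothetical names. In a real run the set of present selectors would come from querying the live page.

```python
from dataclasses import dataclass, field


@dataclass
class ProbeResult:
    """Outcome of one pre-batch probe (hypothetical structure)."""
    mode: str                                    # "selectors" or "vision"
    missing: list = field(default_factory=list)  # selectors the page no longer exposes
    flag_maintainers: bool = False


def aliveness_probe(expected_selectors: list, present_selectors: set) -> ProbeResult:
    """Fall back to vision and flag maintainers when any expected selector is gone."""
    missing = [s for s in expected_selectors if s not in present_selectors]
    if missing:
        return ProbeResult(mode="vision", missing=missing, flag_maintainers=True)
    return ProbeResult(mode="selectors")


# Selectors taken from the sample purchase-order plan.
expected = ["[name=email]", "[name=password]", "[type=submit]"]

healthy = aliveness_probe(expected, {"[name=email]", "[name=password]", "[type=submit]"})
drifted = aliveness_probe(expected, {"[name=email]", "[type=submit]"})
print(healthy.mode, drifted.mode, drifted.missing)
```
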
Q-04 · Can the agent take destructive actions?
Only if the runbook explicitly authorizes it. By default, anything irreversible — submit, send, pay — sits behind a typed confirmation or a human approval. Auditors prefer this. So do we.
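
One reading of that policy, sketched in Python. This is an interpretation, not a description of the actual gate: it assumes an irreversible action needs both runbook authorization and either a typed confirmation or a human approval. `may_execute` and the convention of typing the action name to confirm are hypothetical.

```python
from typing import Optional

# The irreversible actions named in the answer above.
IRREVERSIBLE_ACTIONS = frozenset({"submit", "send", "pay"})


def may_execute(
    action: str,
    runbook_authorized: frozenset,
    typed_confirmation: Optional[str] = None,
    human_approved: bool = False,
) -> bool:
    """Gate irreversible actions; let everything else through."""
    if action not in IRREVERSIBLE_ACTIONS:
        return True   # reversible actions are not gated in this sketch
    if action not in runbook_authorized:
        return False  # assumption: the runbook must explicitly allow it
    # Assumption: the operator confirms by typing the action name.
    return typed_confirmation == action or human_approved


runbook = frozenset({"submit"})
print(may_execute("click", runbook))                               # True
print(may_execute("submit", runbook))                              # False
print(may_execute("submit", runbook, typed_confirmation="submit")) # True
print(may_execute("pay", runbook, human_approved=True))            # False
```
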
Q-05 · Who owns the code?
You do, from the first commit. We work in your repo, your cloud, your CI. Steward-tier engagements are about keeping the system sharp; you can take the keys whenever.

Tell us the work. We'll tell you the agent.
