Five phases
from signal to silence.
We're allergic to discovery decks and 12-week scoping exercises. Below is the actual rhythm of an engagement, start to handoff.
Listen
One-hour call. We listen for the seam — the place where humans are doing pattern-matching at scale. We say "no" if there isn't one.
Specify
72 hours later: a written agent specification. Inputs, outputs, tools, eval criteria, failure modes, escalation paths. No mockups.
Prototype
Two engineers, two weeks. We build a working agent against real data, deliver it with its eval suite, and write a candid memo on what we learned.
Harden
Production deployment, observability, runbooks, on-call rotation, and a 30-day shadow period before traffic flips over.
Tend
Ongoing: drift monitoring, model migrations, prompt regressions. We hand off the keys when you ask, not before.
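To make the Specify phase concrete, here is a minimal sketch of how the six sections of a written specification (inputs, outputs, tools, eval criteria, failure modes, escalation paths) could be captured as a typed record. The `AgentSpec` class, its field names, and the invoice example are hypothetical illustrations, not a format we prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical container mirroring the six sections of a written spec."""
    name: str
    inputs: dict[str, str]            # field name -> description
    outputs: dict[str, str]           # field name -> description
    tools: list[str]
    eval_criteria: list[str]
    failure_modes: list[str]
    escalation_paths: dict[str, str]  # failure mode -> who or what handles it

    def unescalated_failures(self) -> list[str]:
        """Return failure modes that have no escalation path yet."""
        return [m for m in self.failure_modes if m not in self.escalation_paths]


# Illustrative values only; a real spec comes out of the engagement.
spec = AgentSpec(
    name="invoice-triage",
    inputs={"invoice_pdf": "scanned supplier invoice"},
    outputs={"category": "one of: approve, reject, review"},
    tools=["ocr", "ledger_lookup"],
    eval_criteria=["category matches human label on a held-out set"],
    failure_modes=["unreadable scan", "unknown supplier"],
    escalation_paths={"unreadable scan": "route to accounts-payable queue"},
)

print(spec.unescalated_failures())
```

A check like `unescalated_failures()` is one way to catch a spec that names a failure mode without saying who handles it.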
A working system. Source. Eval. Runbook.
- Source code in your repo, your CI, your cloud. We don't gate on a SaaS.
- An eval suite that runs on every commit and gates the deploy.
- A 12-page runbook covering every observed failure mode.
- Two of your engineers trained to operate and extend it.
- Optional ongoing stewardship. You own the system either way.
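An eval suite that gates the deploy can be as small as a script that scores the agent on labeled cases and returns a non-zero exit code when the score falls below a threshold. The sketch below assumes exact-match scoring, a 0.95 threshold, and a keyword-based `stub_agent` standing in for a real agent call; all three are illustrative choices.

```python
from typing import Callable


def pass_rate(agent: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """Fraction of cases where the agent's output equals the expected label."""
    if not cases:
        raise ValueError("eval suite has no cases")
    passed = sum(1 for prompt, expected in cases if agent(prompt) == expected)
    return passed / len(cases)


def gate(rate: float, threshold: float = 0.95) -> int:
    """Exit code for CI: 0 lets the deploy proceed, 1 blocks it."""
    return 0 if rate >= threshold else 1


def stub_agent(text: str) -> str:
    # Stand-in for a real agent call: labels a ticket by keyword.
    return "refund" if "refund" in text.lower() else "other"


# Hypothetical labeled cases; the last one is a paraphrase the stub misses.
CASES = [
    ("I want a refund", "refund"),
    ("Refund my order please", "refund"),
    ("Where is my package?", "other"),
    ("Money back, now", "refund"),
]

rate = pass_rate(stub_agent, CASES)
exit_code = gate(rate)
print(f"pass rate {rate:.2f}, exit code {exit_code}")
```

In a CI job the last step would typically be `sys.exit(exit_code)`, so a failing eval stops the pipeline before the deploy stage.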
How we actually work.
Eval first, prompt second.
We write the eval before we write the prompt. If we can't measure good, we can't ship good.
Boring before clever.
If a SQL query, a regex, or a Zapier zap will do the job, that's what we ship. The agent shows up where reasoning is required.
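As a sketch of what "boring before clever" can look like in code, the function below tries a regex first and calls the agent only when the regex finds nothing. The order-ID format, the function name, and the `agent` callable are hypothetical stand-ins.

```python
import re
from typing import Callable, Optional

# Hypothetical order-ID format: two capital letters, a dash, six digits.
ORDER_ID = re.compile(r"\b[A-Z]{2}-\d{6}\b")


def extract_order_id(
    text: str, agent: Callable[[str], Optional[str]]
) -> tuple[Optional[str], str]:
    """Return (order_id, source), where source is "regex" or "agent"."""
    match = ORDER_ID.search(text)
    if match:
        # The cheap deterministic path handled it; no model call needed.
        return match.group(0), "regex"
    # Only messages the regex cannot handle reach the agent.
    return agent(text), "agent"


def fake_agent(text: str) -> Optional[str]:
    # Stand-in for a model call that resolves a vague reference.
    return "CD-000001"


print(extract_order_id("Order AB-123456 is late", fake_agent))
print(extract_order_id("my order from last Tuesday", fake_agent))
```

Returning the source alongside the result makes it easy to measure how much traffic the deterministic path absorbs.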
No mystery boxes.
Every decision the agent makes is logged with the prompt, the model, the inputs, and the rationale. Your auditor will love us.
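One plausible shape for such a log is a JSON line per decision carrying the prompt, the model, the inputs, and the rationale. The field names, the added `timestamp` and `decision` fields, and the example values below are assumptions for illustration.

```python
import json
from datetime import datetime, timezone


def decision_record(
    prompt: str, model: str, inputs: dict, decision: str, rationale: str
) -> str:
    """Serialize one agent decision as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "model": model,
        "inputs": inputs,
        "decision": decision,
        "rationale": rationale,
    }
    # sort_keys keeps the field order stable, which helps when diffing logs.
    return json.dumps(record, sort_keys=True)


# Illustrative values only.
line = decision_record(
    prompt="Classify this invoice as approve, reject, or review.",
    model="example-model-v1",
    inputs={"invoice_id": "INV-0001", "amount": 1250.0},
    decision="review",
    rationale="Supplier not found in ledger.",
)
print(line)
```

Appending these lines to durable storage gives an auditor a replayable trail of every decision.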