Overview
Models change behavior. Source documents go stale. User queries shift. Costs creep. The team that built your AI hands you a runbook and moves on. We don't — this is the retainer that keeps your production AI working, year after year.
Honest about fit
A fit if…
- You shipped a production AI system — built by us or someone else — and need ongoing quality discipline
- Your CEO or board wants a monthly observability report on AI performance and cost
- You don't want to staff a full internal AI quality team but still need the discipline of one
Not a fit if…
- You need 24/7 NOC service or sub-1-hour SLAs — we're not an MSP
- Your system has no eval harness or gold set — we'll build those first (separate engagement)
- You want a feature factory under the banner of "retainer hours" — features are scoped separately
What you get
Concrete deliverables. No hand-waving.
- Daily automated eval runs against your gold set, with regression alerting
- Weekly retrieval-quality review (Standard tier and above)
- Monthly observability report: cost, latency, accuracy, adoption, incidents, recommendations
- Quarterly business review with your sponsor — what's working, what to evolve
- Model-update management: every new Claude, GPT, or Gemini release is eval-gated before it reaches production
- Gold-set evolution — ~20 new questions per quarter, sourced from real failure cases
- Incident response within the agreed SLA, on the eval/observability platform you already run
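
To make "eval-gated" and "regression alerting" concrete, here is a minimal sketch of the kind of check a daily run or a model update could pass through. It is illustrative only: the names EvalResult, pass_rate, and gate, and the 2-point max_drop threshold, are assumptions made for this example, not a description of any specific platform or of thresholds agreed in an engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Outcome of one gold-set question for one model run (hypothetical shape)."""
    question_id: str
    passed: bool


def pass_rate(results):
    """Fraction of gold-set questions that passed."""
    if not results:
        raise ValueError("gold set produced no results")
    return sum(r.passed for r in results) / len(results)


def gate(baseline, candidate, max_drop=0.02):
    """Compare a candidate run against the current production baseline.

    Returns (approved, regressions):
      approved    -- True if the pass rate fell by no more than max_drop
      regressions -- sorted IDs of questions the baseline passed but the
                     candidate failed; these are what an alert would list
    max_drop is an assumed, illustrative threshold.
    """
    baseline_passed = {r.question_id for r in baseline if r.passed}
    regressions = sorted(
        r.question_id
        for r in candidate
        if not r.passed and r.question_id in baseline_passed
    )
    approved = (pass_rate(baseline) - pass_rate(candidate)) <= max_drop
    return approved, regressions


# Toy data: four gold-set questions, production model vs. a new model version.
baseline = [
    EvalResult("q1", True),
    EvalResult("q2", True),
    EvalResult("q3", False),
    EvalResult("q4", True),
]
candidate = [
    EvalResult("q1", True),
    EvalResult("q2", False),
    EvalResult("q3", True),
    EvalResult("q4", True),
]

approved, regressions = gate(baseline, candidate)
print(approved, regressions)
```

In this toy run the overall pass rate is unchanged, so the candidate clears the gate, but q2 still surfaces as a per-question regression. Reporting both the aggregate and the individual regressions is a deliberate choice in the sketch: an aggregate score alone can hide a newly broken question behind a newly fixed one.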