Xplore
Deploy + Control

Agents only go live when they pass. You know before your users do.

Promotion gates, versioning, live certifications, and drift detection. Every agent earns its way to production. When scores drop, you see it first.

Gated promotions · Deploy
Only versions that meet your criteria reach production

Versioned releases · Deploy
History of what shipped, with one-click rollback

4 policies · Control plane
Promotion strategies from manual review to automated gates

<1 min · Observe
Drift and regression alert targets

Deploy

Agents go live when they're ready. Not before.

Auto-promote the best, set a score threshold, require Pareto dominance, or keep it manual. Every version is tracked. Roll back in one click. No agent reaches production without meeting your criteria.

[Image: Training run configuration (promote policy, trainer strategy, mutation knobs)]
[Image: Agent overview (Logistics v7, 148 runs, 6 versions, performance over time)]
Deploy

You decide what 'ready' means.

From full manual review to Pareto-optimal auto-promotion — match the strategy to your risk tolerance.

Manual

You review and approve each change. Full human oversight.

Auto-promote best

Best-scoring variant promoted automatically after each cycle.

Above threshold

Promoted only if score exceeds your configured threshold.

Pareto dominance

Promoted only if it dominates on all objectives simultaneously.
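
To make the four policies concrete, here is a minimal Python sketch of how such a promotion gate could be expressed. It is an illustration only, not the product's API: `Policy`, `Candidate`, `select_promotion`, the mean-based `score`, and the sample versions are all hypothetical. The Pareto check uses the standard definition (no worse on every objective, strictly better on at least one), which is an assumption about how "dominates" is interpreted here.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    MANUAL = "manual"
    AUTO_PROMOTE_BEST = "auto_promote_best"
    ABOVE_THRESHOLD = "above_threshold"
    PARETO_DOMINANCE = "pareto_dominance"


@dataclass(frozen=True)
class Candidate:
    version: str
    objectives: dict[str, float]  # assumption: higher is better for every objective

    @property
    def score(self) -> float:
        # Assumption: the overall score is the plain mean of the objectives.
        return sum(self.objectives.values()) / len(self.objectives)


def dominates(a: Candidate, b: Candidate) -> bool:
    """Standard Pareto dominance: a is no worse than b everywhere, better somewhere."""
    keys = b.objectives.keys()
    no_worse = all(a.objectives[k] >= b.objectives[k] for k in keys)
    better = any(a.objectives[k] > b.objectives[k] for k in keys)
    return no_worse and better


def select_promotion(
    policy: Policy,
    candidates: list[Candidate],
    production: Candidate,
    threshold: float = 0.9,
    approved_version: Optional[str] = None,
) -> Optional[Candidate]:
    """Return the candidate to promote after one cycle, or None to keep production."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.score)
    if policy is Policy.MANUAL:
        # Nothing ships unless a human has approved a specific version.
        return next((c for c in candidates if c.version == approved_version), None)
    if policy is Policy.AUTO_PROMOTE_BEST:
        return best
    if policy is Policy.ABOVE_THRESHOLD:
        return best if best.score > threshold else None
    # PARETO_DOMINANCE: only candidates that dominate production are eligible.
    eligible = [c for c in candidates if dominates(c, production)]
    return max(eligible, key=lambda c: c.score) if eligible else None


# Hypothetical data: v7-b has the higher mean but regresses on safety.
production = Candidate("v6", {"accuracy": 0.90, "safety": 0.95})
v7a = Candidate("v7-a", {"accuracy": 0.94, "safety": 0.95})
v7b = Candidate("v7-b", {"accuracy": 0.99, "safety": 0.93})

for policy in Policy:
    chosen = select_promotion(policy, [v7a, v7b], production, threshold=0.95)
    print(policy.value, "->", chosen.version if chosen else "no promotion")
```

In this example the score-based policies pick v7-b, while Pareto dominance picks v7-a because v7-b regresses on safety. That is the trade-off between the stricter and looser strategies.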

Control

You know before your users do.

Live certifications score every production run against your eval chain. Drift detection catches degradation as it starts. Threshold breaches trigger alerts within a target of under one minute, not in tomorrow's report.

Production controls · live

Compliance gate · enabled
agent: trade-screener-v5
last 24h: 892 runs · avg: 0.96
alerts: 0

Diagnostic accuracy · enabled
agent: helpdesk-v2
last 24h: 3,105 runs · avg: 0.87
⚠ drift: resolution_rate 0.82 → 0.74

Safety & PII · enabled
agent: clinical-agent-v3
last 24h: 445 runs · avg: 0.98
alerts: 0
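
The drift alert in the panel above can be pictured with a small Python sketch. It compares the mean of a recent window of scores with a baseline window and raises an alert when the drop exceeds a tolerance. This is one simple way to detect drift, not necessarily the method used in the product. The function name `detect_drift`, the `max_drop` tolerance, and the window contents are hypothetical.

```python
from __future__ import annotations

from statistics import mean
from typing import Optional, Sequence


def detect_drift(
    metric: str,
    baseline: Sequence[float],
    recent: Sequence[float],
    max_drop: float = 0.05,  # hypothetical tolerance, tune per metric
) -> Optional[str]:
    """Return an alert message if the recent mean fell more than max_drop below baseline."""
    if not baseline or not recent:
        return None  # not enough data to compare
    before, after = mean(baseline), mean(recent)
    if before - after > max_drop:
        return f"drift: {metric} {before:.2f} -> {after:.2f}"
    return None


# Hypothetical windows that mirror the helpdesk-v2 example above.
alert = detect_drift("resolution_rate", [0.82] * 50, [0.74] * 50)
print(alert or "no drift")
```

A production system would likely use rolling windows and a statistical test rather than a fixed drop, but the shape of the check is the same: baseline, recent window, tolerance, alert.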
Control

Start monitoring in minutes.

Start with proven templates for common domains, then customise thresholds and evaluators to match your requirements.

RAG Quality

Context adherence, citation accuracy, retrieval relevance — all scored live.

Safety

Injection resistance, data exfiltration, access boundaries — continuous verification.

Cost Guard

Token budget, API cost, latency thresholds — every dollar tracked.

Compliance

Policy adherence, audit trail, regulatory checks — always certification-ready.
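
As an illustration of starting from a template and customising thresholds, the sketch below models a Cost Guard style template in Python. Everything here is hypothetical: the `CostGuard` class, its field names, and the default limits are placeholders chosen for the example, not values from the product.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CostGuard:
    # Hypothetical template defaults; real limits depend on your workload.
    max_tokens: int = 8_000
    max_cost_usd: float = 0.05
    max_latency_s: float = 10.0


def breaches(limits: CostGuard, *, tokens: int, cost_usd: float, latency_s: float) -> list[str]:
    """Return the names of the limits that a single run exceeded."""
    checks = [
        ("tokens", tokens, limits.max_tokens),
        ("cost_usd", cost_usd, limits.max_cost_usd),
        ("latency_s", latency_s, limits.max_latency_s),
    ]
    return [name for name, value, limit in checks if value > limit]


# Start from the template, then tighten one threshold for a latency-sensitive agent.
template = CostGuard()
custom = replace(template, max_latency_s=3.0)

run = {"tokens": 5_000, "cost_usd": 0.02, "latency_s": 4.2}
print("template:", breaches(template, **run))
print("custom:  ", breaches(custom, **run))
```

The same run passes the template defaults but breaches the tightened latency limit, which is the point of customising a template rather than using it unchanged.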