Trust-Based Review Model
Human code review is the single largest bottleneck in AI-accelerated delivery. This model defines how to progressively reduce human review overhead without sacrificing safety — by letting agents earn trust through demonstrated reliability.
The Core Principle
Trust is earned per agent, per stage, per project. An agent that demonstrates consistent quality at one stage does not automatically earn trust at another. Trust levels are not organizational — they are scoped.
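The scoping rule above can be sketched as a registry keyed by (agent, stage, project), where any scope that has not yet earned trust defaults to Level 0. This is an illustrative sketch only; the names `TrustScope` and `TrustRegistry` are hypothetical and not part of the standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrustScope:
    """Trust is keyed per agent, per stage, per project -- never globally."""
    agent: str
    stage: str
    project: str

class TrustRegistry:
    """Stores a trust level (0-3) per scope; unseen scopes default to Level 0."""
    def __init__(self):
        self._levels = {}

    def level(self, scope: TrustScope) -> int:
        # Level 0 (full supervision) is the default for any new scope.
        return self._levels.get(scope, 0)

    def set_level(self, scope: TrustScope, level: int) -> None:
        if not 0 <= level <= 3:
            raise ValueError("trust level must be 0-3")
        self._levels[scope] = level

registry = TrustRegistry()
impl = TrustScope("developer-agent", "implementation", "billing")
registry.set_level(impl, 2)

# The same agent at a different stage starts back at Level 0.
design = TrustScope("developer-agent", "design", "billing")
```

Because trust is looked up per scope, promoting `developer-agent` at the implementation stage has no effect on its standing at the design stage.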
Four Trust Levels
Trust Level 0: Full Supervision (Default)
Every new agent starts here. All outputs require human review before proceeding.
| Aspect | Requirement |
|---|---|
| Human review | 100% of agent outputs reviewed by qualified human |
| Approval | Explicit human approval required at every gate |
| Monitoring | Every agent action logged and reviewed weekly |
| Duration | Minimum 2 weeks or 20 agent runs (whichever is longer) |
| Promotion criteria | ≥90% first-pass acceptance rate, 0 critical defects escaped, complete handoff artifacts |
What this looks like in practice:
- `developer-agent` produces code → Senior Developer reviews every line
- `qa-agent` produces test matrix → QA Lead validates every test case
- `security-agent` produces findings → Security Engineer verifies every finding
- Every handoff artifact is manually checked for completeness
Trust Level 1: Guided Autonomy
Agent handles routine work autonomously. Human reviews exceptions and high-risk outputs.
| Aspect | Requirement |
|---|---|
| Human review | ~60% of outputs — focus on Tier 2+ risk, new patterns, and flagged items |
| Approval | Auto-approve for Tier 1 work within established patterns; human approval for Tier 2+ |
| Monitoring | Weekly sampling of 30% of auto-approved outputs |
| Duration | Minimum 4 weeks or 50 agent runs at Level 1 |
| Promotion criteria | ≥95% first-pass acceptance on sampled outputs, 0 critical defects escaped, ≤2 medium defects per 50 runs |
What this looks like in practice:
- `developer-agent` produces a standard CRUD endpoint → auto-reviewed by CI gates only
- `developer-agent` produces authentication logic → routed to Senior Developer for review
- `qa-agent` produces test matrix for known patterns → auto-accepted
- `qa-agent` produces test matrix for new integration → QA Lead reviews
Risk-based routing rules at Level 1:
| Condition | Action |
|---|---|
| Tier 1 work + established pattern | Auto-approve via CI gates |
| Tier 1 work + new pattern | Route to human reviewer |
| Tier 2+ work | Always route to human reviewer |
| Agent flags uncertainty | Always route to human reviewer |
| Architecture-impacting change | Always route to Solution Architect |
| Auth/crypto/PII handling | Always route to Security Engineer |
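The routing table above can be expressed as a small decision function. This is a sketch under one assumption: specialist routes (security, architecture) take precedence when multiple conditions match, which the table does not state explicitly. The reviewer names are shorthand for the roles in the table.

```python
def route_level1(tier: int, established_pattern: bool,
                 agent_uncertain: bool = False,
                 architecture_impact: bool = False,
                 sensitive: bool = False) -> str:
    """Risk-based routing at Trust Level 1.

    `sensitive` covers auth/crypto/PII handling; specialist routes are
    checked first (an assumed precedence, not stated in the table).
    """
    if sensitive:
        return "security-engineer"
    if architecture_impact:
        return "solution-architect"
    if agent_uncertain or tier >= 2:
        return "human-reviewer"
    if established_pattern:
        return "auto-approve"          # Tier 1 within an established pattern
    return "human-reviewer"            # Tier 1 but a new pattern
```

For example, Tier 1 work on a known pattern auto-approves via CI gates, while the same work touching PII routes straight to the Security Engineer.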
Trust Level 2: Supervised Autonomy
Agent operates autonomously for most work. Human oversight shifts to exception-based review and statistical sampling.
| Aspect | Requirement |
|---|---|
| Human review | ~25% of outputs — only Tier 3+, flagged items, and random sampling |
| Approval | Auto-approve for Tier 1-2 work; human approval for Tier 3+ |
| Monitoring | Bi-weekly sampling of 15% of auto-approved outputs; automated anomaly detection |
| Duration | Minimum 8 weeks or 100 agent runs at Level 2 |
| Promotion criteria | ≥98% first-pass acceptance on sampled outputs, 0 critical/high defects escaped, automated anomaly detection in place |
What this looks like in practice:
- `developer-agent` handles most feature implementation autonomously
- `qa-agent` generates and validates test matrices without human review for Tier 1-2
- `security-agent` auto-classifies and routes findings; human reviews only critical/high
- Statistical sampling catches drift before it becomes systemic
Trust Level 3: Autonomous with Guardrails
Agent operates with full autonomy within its contract boundaries. Human involvement is limited to non-negotiable checkpoints and anomaly response.
| Aspect | Requirement |
|---|---|
| Human review | ~10% — only non-negotiable checkpoints (see below) + anomaly triggers |
| Approval | Auto-approve for Tier 1-3; human approval for Tier 4 only |
| Monitoring | Continuous automated monitoring; monthly human audit of 5% of outputs |
| Duration | No minimum — maintained as long as metrics hold |
| Demotion criteria | Any critical defect escaped, ≥3 high defects in 30-day window, anomaly detection triggers |
Non-negotiable human checkpoints (even at Level 3):
| Checkpoint | Reason |
|---|---|
| Requirements approval (Gate 1) | Business intent must be human-owned |
| Architecture approval for Tier 4 | Irreversible at enterprise scale |
| Critical security findings | False negatives have outsized impact |
| Production deployment approval | Last line of defense |
| Incident response decisions | Human accountability required |
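A guard for these checkpoints can be sketched as a simple membership check that no trust level can bypass. The checkpoint strings are shorthand for the table rows above, not identifiers defined by the standard.

```python
# Checkpoints that stay human-owned at every trust level, including Level 3.
NON_NEGOTIABLE_CHECKPOINTS = {
    "requirements-approval",          # Gate 1: business intent is human-owned
    "tier4-architecture-approval",    # irreversible at enterprise scale
    "critical-security-finding",      # false negatives have outsized impact
    "production-deployment",          # last line of defense
    "incident-response",              # human accountability required
}

def requires_human(checkpoint: str) -> bool:
    """Return True when a checkpoint can never be auto-approved."""
    return checkpoint in NON_NEGOTIABLE_CHECKPOINTS
```

Evaluating this guard before any trust-level routing guarantees that, for example, a production deployment still reaches a human even when the agent holds Level 3.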
Trust Promotion Process
┌──────────────────────────────────────────────────────────────────┐
│ TRUST PROMOTION WORKFLOW │
│ │
│ Level 0 Level 1 Level 2 Level 3 │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ FULL │─────>│ GUIDED │───────>│SUPERV. │─────>│ AUTO │ │
│ │ SUPER │ │ AUTON. │ │ AUTON. │ │ WITH │ │
│ │ │ │ │ │ │ │ GUARD │ │
│ └────────┘ └────────┘ └────────┘ └────────┘ │
│ │ │ │ │ │
│ │ 20 runs │ 50 runs │ 100 runs │ │
│ │ ≥90% pass │ ≥95% pass │ ≥98% pass │ │
│ │ 0 critical │ 0 critical │ 0 crit/high │ │
│ │ │ ≤2 medium │ anomaly det. │ │
│ │ │ │ │ │
│ └──── DEMOTION (any critical defect escaped) ──────┘ │
│ │
│ Demotion is IMMEDIATE and drops agent to Level 0. │
│ Re-promotion follows the full progression from Level 0. │
└──────────────────────────────────────────────────────────────────┘
Promotion Requirements
| From → To | Minimum Runs | Min Duration | Pass Rate | Defect Threshold | Additional |
|---|---|---|---|---|---|
| 0 → 1 | 20 | 2 weeks | ≥90% | 0 critical | Complete handoffs |
| 1 → 2 | 50 | 4 weeks | ≥95% | 0 critical, ≤2 medium | — |
| 2 → 3 | 100 | 8 weeks | ≥98% | 0 critical, 0 high | Anomaly detection active |
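The promotion table can be encoded as data plus one eligibility check. This is a minimal sketch: the "complete handoffs" requirement for 0 → 1 is not modeled here, and `RunStats` is a hypothetical structure for the counters an organization would track.

```python
from dataclasses import dataclass

@dataclass
class RunStats:
    runs: int
    weeks: int
    pass_rate: float        # first-pass acceptance, 0.0-1.0
    critical_defects: int
    high_defects: int
    medium_defects: int
    anomaly_detection: bool = False

# (min_runs, min_weeks, min_pass_rate, max_medium_defects, needs_anomaly_detection)
PROMOTION_RULES = {
    (0, 1): (20, 2, 0.90, None, False),
    (1, 2): (50, 4, 0.95, 2, False),
    (2, 3): (100, 8, 0.98, None, True),
}

def eligible(frm: int, to: int, s: RunStats) -> bool:
    """Check the promotion requirements table; critical defects always disqualify."""
    runs, weeks, rate, max_medium, needs_ad = PROMOTION_RULES[(frm, to)]
    if s.runs < runs or s.weeks < weeks or s.pass_rate < rate:
        return False
    if s.critical_defects > 0:
        return False
    if to == 3 and s.high_defects > 0:       # 2 -> 3 also requires zero high defects
        return False
    if max_medium is not None and s.medium_defects > max_medium:
        return False
    if needs_ad and not s.anomaly_detection:
        return False
    return True
```

Note that both minimums apply together: 20 runs in under 2 weeks is not enough, and 2 weeks with under 20 runs is not either.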
Demotion Triggers
| Trigger | Action | Recovery |
|---|---|---|
| Critical defect escaped to production | Immediate demotion to Level 0 | Full re-promotion from Level 0 |
| 3+ high defects in 30-day window | Demotion to Level 0 | Full re-promotion from Level 0 |
| Anomaly detection fires | Pause agent, investigate | Resume at current level if false alarm; demote if real |
| Contract violation (forbidden action attempted) | Immediate suspension | Root cause analysis required before reactivation |
| Handoff completeness drops below 90% | Demotion by 1 level | Fix handoff generation, re-promote |
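The trigger table maps naturally to a function returning an action and the resulting trust level. The trigger strings are shorthand labels for the table rows, not identifiers from the standard; "pause" and "suspend" defer the final level to the investigation described in the Recovery column.

```python
def apply_trigger(trigger: str, level: int) -> tuple[str, int]:
    """Map a demotion trigger to (action, resulting trust level)."""
    if trigger in ("critical-defect-escaped", "3-high-defects-in-30d"):
        return ("demote", 0)                    # full re-promotion from Level 0
    if trigger == "handoff-completeness-below-90pct":
        return ("demote", max(0, level - 1))    # drop exactly one level
    if trigger == "anomaly-detected":
        return ("pause", level)                 # resume if false alarm, demote if real
    if trigger == "contract-violation":
        return ("suspend", level)               # root cause analysis before reactivation
    raise ValueError(f"unknown trigger: {trigger}")
```

The asymmetry is deliberate: escaped defects reset trust entirely, while a handoff-quality regression costs only one level.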
Trust Metrics Dashboard
Track these metrics per agent to manage trust levels:
| Metric | Formula | Target (Level 3) |
|---|---|---|
| First-pass acceptance rate | Approved outputs / Total outputs | ≥98% |
| Defect escape rate | Defects found after gate / Total outputs | <0.5% |
| Handoff completeness | Complete handoffs / Total handoffs | 100% |
| Mean time to gate pass | Avg time from output to gate approval | Decreasing trend |
| Anomaly trigger rate | Anomaly detections / Total runs | <2% |
| Human intervention rate | Outputs requiring human review / Total outputs | <10% at Level 3 |
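The ratio metrics above can be computed from raw counters as sketched below. Mean time to gate pass is a trend rather than a ratio and is omitted; the function and parameter names are illustrative, not prescribed by the standard.

```python
def trust_metrics(total_outputs: int, approved: int, escaped_defects: int,
                  complete_handoffs: int, total_handoffs: int,
                  total_runs: int, anomalies: int, human_reviews: int) -> dict:
    """Compute the dashboard ratios from raw counts, per the formulas above."""
    return {
        "first_pass_acceptance": approved / total_outputs,
        "defect_escape_rate": escaped_defects / total_outputs,
        "handoff_completeness": complete_handoffs / total_handoffs,
        "anomaly_trigger_rate": anomalies / total_runs,
        "human_intervention_rate": human_reviews / total_outputs,
    }

def meets_level3_targets(m: dict) -> bool:
    """Check the Level 3 target column for every ratio metric."""
    return (m["first_pass_acceptance"] >= 0.98
            and m["defect_escape_rate"] < 0.005
            and m["handoff_completeness"] == 1.0
            and m["anomaly_trigger_rate"] < 0.02
            and m["human_intervention_rate"] < 0.10)
```

For example, 197 approvals out of 200 outputs gives a 98.5% first-pass acceptance rate, just above the Level 3 target.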
Per-Stage Trust Configuration
Not all stages reach the same trust level at the same pace. Typical progression:
| Stage | Typical Time to Level 1 | Typical Time to Level 2 | Typical Time to Level 3 | Notes |
|---|---|---|---|---|
| Requirements (Stage 1) | 2 weeks | 6 weeks | 12 weeks | Slower — business intent requires careful validation |
| Design (Stage 2) | 3 weeks | 8 weeks | 16 weeks | Slowest — architecture decisions are high-impact |
| Implementation (Stage 3) | 2 weeks | 4 weeks | 10 weeks | Fastest for routine code patterns |
| Testing (Stage 4) | 2 weeks | 5 weeks | 10 weeks | Moderate — test quality has compounding effect |
| Security (Stage 5) | 3 weeks | 8 weeks | Never for critical findings | Critical findings always require human review |
| Deployment (Stage 6) | N/A | N/A | N/A | Production deployment always requires human approval |
| Operations (Stage 7) | 2 weeks | 4 weeks | 8 weeks | Fast for monitoring; slow for incident response |
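One way to enforce the stage-specific ceilings above is to clamp an agent's earned trust to a per-stage cap. This is a simplified sketch: it models security as capped at Level 2 (since critical findings never leave human review) and deployment at Level 0 (production deploys always need human approval), which collapses the table's finer per-finding and per-task distinctions.

```python
# Illustrative per-stage trust caps derived from the table above.
STAGE_TRUST_CAP = {
    "requirements": 3,
    "design": 3,
    "implementation": 3,
    "testing": 3,
    "security": 2,      # critical findings always require human review
    "deployment": 0,    # production deployment always requires human approval
    "operations": 3,
}

def effective_level(stage: str, earned_level: int) -> int:
    """Clamp an agent's earned trust to the cap for its stage.

    Unknown stages default to Level 0 (full supervision).
    """
    return min(earned_level, STAGE_TRUST_CAP.get(stage, 0))
```

Under this model, even an agent that has earned Level 3 elsewhere operates at Level 0 for deployment work.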
Relationship to PRD-STD-002 (Code Review)
PRD-STD-002 mandates human review for production-bound code. The trust model does not override this standard; instead, it defines how the scope and depth of human review change with trust level:
| Trust Level | PRD-STD-002 Compliance | Review Scope |
|---|---|---|
| Level 0 | Full manual review of all code | Line-by-line review |
| Level 1 | Full review for Tier 2+; sampling for Tier 1 | Logic and architecture focus |
| Level 2 | Full review for Tier 3+; statistical sampling for Tier 1-2 | Exception and anomaly focus |
| Level 3 | Full review for Tier 4; automated gates + sampling for Tier 1-3 | Non-negotiable checkpoints only |
At every level, the human reviewer retains the authority to reject any agent output and trigger re-work.
Trust Level 3 requires a formal variance approval from the CTO or equivalent authority, documented in the governance record. This acknowledges that the organization accepts reduced human review coverage in exchange for demonstrated agent reliability.