Trust-Based Review Model

Human code review is the single largest bottleneck in AI-accelerated delivery. This model defines how to progressively reduce human review overhead without sacrificing safety — by letting agents earn trust through demonstrated reliability.

The Core Principle

Trust is earned per agent, per stage, per project. An agent that demonstrates consistent quality at one stage does not automatically earn trust at another. Trust levels are not organizational — they are scoped.

Four Trust Levels

Trust Level 0: Full Supervision (Default)

Every new agent starts here. All outputs require human review before proceeding.

| Aspect | Requirement |
| --- | --- |
| Human review | 100% of agent outputs reviewed by qualified human |
| Approval | Explicit human approval required at every gate |
| Monitoring | Every agent action logged and reviewed weekly |
| Duration | Minimum 2 weeks or 20 agent runs (whichever is longer) |
| Promotion criteria | ≥90% first-pass acceptance rate, 0 critical defects escaped, complete handoff artifacts |

What this looks like in practice:

  • developer-agent produces code → Senior Developer reviews every line
  • qa-agent produces test matrix → QA Lead validates every test case
  • security-agent produces findings → Security Engineer verifies every finding
  • Every handoff artifact is manually checked for completeness

Trust Level 1: Guided Autonomy

Agent handles routine work autonomously. Human reviews exceptions and high-risk outputs.

| Aspect | Requirement |
| --- | --- |
| Human review | ~60% of outputs — focus on Tier 2+ risk, new patterns, and flagged items |
| Approval | Auto-approve for Tier 1 work within established patterns; human approval for Tier 2+ |
| Monitoring | Weekly sampling of 30% of auto-approved outputs |
| Duration | Minimum 4 weeks or 50 agent runs at Level 1 |
| Promotion criteria | ≥95% first-pass acceptance on sampled outputs, 0 critical defects escaped, ≤2 medium defects per 50 runs |

What this looks like in practice:

  • developer-agent produces a standard CRUD endpoint → auto-reviewed by CI gates only
  • developer-agent produces authentication logic → routed to Senior Developer for review
  • qa-agent produces test matrix for known patterns → auto-accepted
  • qa-agent produces test matrix for new integration → QA Lead reviews

Risk-based routing rules at Level 1:

| Condition | Action |
| --- | --- |
| Tier 1 work + established pattern | Auto-approve via CI gates |
| Tier 1 work + new pattern | Route to human reviewer |
| Tier 2+ work | Always route to human reviewer |
| Agent flags uncertainty | Always route to human reviewer |
| Architecture-impacting change | Always route to Solution Architect |
| Auth/crypto/PII handling | Always route to Security Engineer |
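These routing rules reduce to a small decision function evaluated in priority order. The sketch below is illustrative only — the `ReviewRequest` fields and reviewer identifiers are assumptions, not part of any mandated tooling:

```python
from dataclasses import dataclass

@dataclass
class ReviewRequest:
    tier: int                  # work risk tier (1 = lowest)
    established_pattern: bool  # matches a known, approved pattern
    agent_flagged: bool        # agent self-reported uncertainty
    architecture_impact: bool  # change affects architecture
    touches_sensitive: bool    # auth / crypto / PII handling

def route_level1(req: ReviewRequest) -> str:
    """Apply the Level 1 risk-based routing rules, most specific first."""
    if req.touches_sensitive:
        return "security-engineer"
    if req.architecture_impact:
        return "solution-architect"
    if req.agent_flagged or req.tier >= 2 or not req.established_pattern:
        return "human-reviewer"
    # Tier 1 work within an established pattern: CI gates only
    return "auto-approve"
```

The security and architecture rules are checked first so a flagged CRUD endpoint that also touches PII still reaches the Security Engineer.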

Trust Level 2: Supervised Autonomy

Agent operates autonomously for most work. Human oversight shifts to exception-based and statistical sampling.

| Aspect | Requirement |
| --- | --- |
| Human review | ~25% of outputs — only Tier 3+, flagged items, and random sampling |
| Approval | Auto-approve for Tier 1-2 work; human approval for Tier 3+ |
| Monitoring | Bi-weekly sampling of 15% of auto-approved outputs; automated anomaly detection |
| Duration | Minimum 8 weeks or 100 agent runs at Level 2 |
| Promotion criteria | ≥98% first-pass acceptance on sampled outputs, 0 critical/high defects escaped, automated anomaly detection in place |

What this looks like in practice:

  • developer-agent handles most feature implementation autonomously
  • qa-agent generates and validates test matrices without human review for Tier 1-2
  • security-agent auto-classifies and routes findings; human reviews only critical/high
  • Statistical sampling catches drift before it becomes systemic

Trust Level 3: Autonomous with Guardrails

Agent operates with full autonomy within its contract boundaries. Human involvement is limited to non-negotiable checkpoints and anomaly response.

| Aspect | Requirement |
| --- | --- |
| Human review | ~10% — only non-negotiable checkpoints (see below) + anomaly triggers |
| Approval | Auto-approve for Tier 1-3; human approval for Tier 4 only |
| Monitoring | Continuous automated monitoring; monthly human audit of 5% of outputs |
| Duration | No minimum — maintained as long as metrics hold |
| Demotion criteria | Any critical defect escaped, ≥3 high defects in 30-day window, anomaly detection triggers |

Non-negotiable human checkpoints (even at Level 3):

| Checkpoint | Reason |
| --- | --- |
| Requirements approval (Gate 1) | Business intent must be human-owned |
| Architecture approval for Tier 4 | Irreversible at enterprise scale |
| Critical security findings | False negatives have outsized impact |
| Production deployment approval | Last line of defense |
| Incident response decisions | Human accountability required |
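The review and approval parameters of all four levels can be collected into a single configuration. This is a minimal sketch assuming illustrative field names; the percentages come from the level tables above:

```python
# Per-level review parameters, taken from the trust level tables.
# review_share: approximate fraction of outputs a human reviews
# auto_approve_max_tier: highest tier that may bypass human approval
# sample_rate: fraction of auto-approved outputs audited by humans
TRUST_LEVELS = {
    0: {"review_share": 1.00, "auto_approve_max_tier": 0, "sample_rate": 1.00},
    1: {"review_share": 0.60, "auto_approve_max_tier": 1, "sample_rate": 0.30},
    2: {"review_share": 0.25, "auto_approve_max_tier": 2, "sample_rate": 0.15},
    3: {"review_share": 0.10, "auto_approve_max_tier": 3, "sample_rate": 0.05},
}

def needs_human_approval(level: int, tier: int) -> bool:
    """Work above the level's auto-approve ceiling always goes to a human."""
    return tier > TRUST_LEVELS[level]["auto_approve_max_tier"]
```

A ceiling of 0 at Level 0 encodes that nothing is auto-approved; Tier 4 work requires human approval even at Level 3.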

Trust Promotion Process

```
┌────────────────────────────────────────────────────────────────────┐
│                      TRUST PROMOTION WORKFLOW                      │
│                                                                    │
│   Level 0        Level 1        Level 2        Level 3             │
│  ┌────────┐     ┌────────┐     ┌────────┐     ┌────────┐           │
│  │  FULL  │────>│ GUIDED │────>│SUPERV. │────>│  AUTO  │           │
│  │ SUPER  │     │ AUTON. │     │ AUTON. │     │  WITH  │           │
│  │        │     │        │     │        │     │ GUARD  │           │
│  └────────┘     └────────┘     └────────┘     └────────┘           │
│      │ 20 runs      │ 50 runs      │ 100 runs    │                 │
│      │ ≥90% pass    │ ≥95% pass    │ ≥98% pass   │                 │
│      │ 0 critical   │ 0 critical   │ 0 crit/high │                 │
│      │              │ ≤2 medium    │ anomaly det.│                 │
│      │                                           │                 │
│      └── DEMOTION (any critical defect escaped) ─┘                 │
│                                                                    │
│  Demotion is IMMEDIATE and drops agent to Level 0.                 │
│  Re-promotion follows the full progression from Level 0.           │
└────────────────────────────────────────────────────────────────────┘
```

Promotion Requirements

| From → To | Min Runs | Min Duration | Pass Rate | Defect Threshold | Additional |
| --- | --- | --- | --- | --- | --- |
| 0 → 1 | 20 | 2 weeks | ≥90% | 0 critical | Complete handoffs |
| 1 → 2 | 50 | 4 weeks | ≥95% | 0 critical, ≤2 medium | |
| 2 → 3 | 100 | 8 weeks | ≥98% | 0 critical, 0 high | Anomaly detection active |
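A promotion-eligibility check following this table might be sketched as below. The `AgentStats` fields are hypothetical names for metrics an organization would already be tracking; the gate values mirror the table:

```python
from dataclasses import dataclass

@dataclass
class AgentStats:
    runs: int
    weeks_at_level: int
    pass_rate: float          # first-pass acceptance on sampled outputs
    critical_defects: int
    high_defects: int
    medium_defects: int
    handoffs_complete: bool
    anomaly_detection_active: bool

# (min_runs, min_weeks, min_pass_rate) per transition
PROMOTION_GATES = {
    (0, 1): (20, 2, 0.90),
    (1, 2): (50, 4, 0.95),
    (2, 3): (100, 8, 0.98),
}

def eligible_for_promotion(level: int, s: AgentStats) -> bool:
    gate = PROMOTION_GATES.get((level, level + 1))
    if gate is None:
        return False  # Level 3 is the ceiling
    min_runs, min_weeks, min_rate = gate
    if s.runs < min_runs or s.weeks_at_level < min_weeks or s.pass_rate < min_rate:
        return False
    if s.critical_defects > 0:
        return False  # zero critical defects at every transition
    if level == 0 and not s.handoffs_complete:
        return False  # 0 -> 1 additionally requires complete handoffs
    if level == 1 and s.medium_defects > 2:
        return False  # 1 -> 2 tolerates at most 2 medium defects
    if level == 2 and (s.high_defects > 0 or not s.anomaly_detection_active):
        return False  # 2 -> 3 requires 0 high defects and active anomaly detection
    return True
```

Both run count and duration minimums must hold, matching the "whichever is longer" wording at Level 0.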

Demotion Triggers

| Trigger | Action | Recovery |
| --- | --- | --- |
| Critical defect escaped to production | Immediate demotion to Level 0 | Full re-promotion from Level 0 |
| 3+ high defects in 30-day window | Demotion to Level 0 | Full re-promotion from Level 0 |
| Anomaly detection fires | Pause agent, investigate | Resume at current level if false alarm; demote if real |
| Contract violation (forbidden action attempted) | Immediate suspension | Root cause analysis required before reactivation |
| Handoff completeness drops below 90% | Demotion by 1 level | Fix handoff generation, re-promote |
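The trigger table maps naturally to a small dispatch routine. A minimal sketch, assuming hypothetical trigger identifiers:

```python
# Trigger -> (action, target level; None means the level is decided later)
DEMOTION_RULES = {
    "critical_defect_escaped":        ("demote", 0),
    "high_defects_30d_window":        ("demote", 0),   # fires at 3+ high defects
    "anomaly_detected":               ("pause", None),  # investigate first
    "contract_violation":             ("suspend", None),  # root cause analysis required
    "handoff_completeness_below_90":  ("demote_one_level", None),
}

def apply_trigger(current_level: int, trigger: str) -> tuple[str, int]:
    """Return the action and the agent's resulting trust level."""
    action, target = DEMOTION_RULES[trigger]
    if action == "demote":
        return action, target
    if action == "demote_one_level":
        return action, max(current_level - 1, 0)
    # paused/suspended agents hold their level pending human investigation
    return action, current_level
```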

Trust Metrics Dashboard

Track these metrics per agent to manage trust levels:

| Metric | Formula | Target (Level 3) |
| --- | --- | --- |
| First-pass acceptance rate | Approved outputs / Total outputs | ≥98% |
| Defect escape rate | Defects found after gate / Total outputs | <0.5% |
| Handoff completeness | Complete handoffs / Total handoffs | 100% |
| Mean time to gate pass | Avg time from output to gate approval | Decreasing trend |
| Anomaly trigger rate | Anomaly detections / Total runs | <2% |
| Human intervention rate | Outputs requiring human review / Total outputs | <10% |
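The rate metrics can be computed directly from per-run records. A sketch assuming hypothetical boolean record fields; the one time-based metric, mean time to gate pass, is omitted here:

```python
def trust_metrics(records: list[dict]) -> dict[str, float]:
    """Compute the dashboard rate metrics from per-run records.

    Each record is assumed to carry boolean fields:
    approved, defect_escaped, handoff_complete, anomaly, human_reviewed.
    """
    n = len(records)
    return {
        "first_pass_acceptance":   sum(r["approved"] for r in records) / n,
        "defect_escape_rate":      sum(r["defect_escaped"] for r in records) / n,
        "handoff_completeness":    sum(r["handoff_complete"] for r in records) / n,
        "anomaly_trigger_rate":    sum(r["anomaly"] for r in records) / n,
        "human_intervention_rate": sum(r["human_reviewed"] for r in records) / n,
    }
```

Computing these per agent, per stage keeps the scoped-trust principle measurable rather than anecdotal.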

Per-Stage Trust Configuration

Not all stages reach the same trust level at the same pace. Typical progression:

| Stage | Typical Time to Level 1 | Typical Time to Level 2 | Typical Time to Level 3 | Notes |
| --- | --- | --- | --- | --- |
| Requirements (Stage 1) | 2 weeks | 6 weeks | 12 weeks | Slower — business intent requires careful validation |
| Design (Stage 2) | 3 weeks | 8 weeks | 16 weeks | Slowest — architecture decisions are high-impact |
| Implementation (Stage 3) | 2 weeks | 4 weeks | 10 weeks | Fastest for routine code patterns |
| Testing (Stage 4) | 2 weeks | 5 weeks | 10 weeks | Moderate — test quality has compounding effect |
| Security (Stage 5) | 3 weeks | 8 weeks | Never for critical findings | Critical findings always require human review |
| Deployment (Stage 6) | N/A | N/A | N/A | Production deployment always requires human approval |
| Operations (Stage 7) | 2 weeks | 4 weeks | 8 weeks | Fast for monitoring; slow for incident response |
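One way to encode these stage differences is a per-stage trust ceiling that caps an agent's effective level. This is a sketch of one possible encoding, not a mandated configuration; in particular, the ceilings for security and deployment are one interpretation of the "never for critical findings" and "always requires human approval" notes above:

```python
# Per-stage trust ceilings: some stages never reach full autonomy.
STAGE_TRUST_CEILING = {
    "requirements":   3,
    "design":         3,
    "implementation": 3,
    "testing":        3,
    "security":       2,  # critical findings always need human review
    "deployment":     0,  # production deployment always needs human approval
    "operations":     3,
}

def effective_level(stage: str, agent_level: int) -> int:
    """An agent's trust never exceeds the ceiling of the stage it works in."""
    return min(agent_level, STAGE_TRUST_CEILING[stage])
```

This keeps trust scoped per stage: a Level 3 implementation agent still operates under full supervision when its output feeds a deployment gate.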

Relationship to PRD-STD-002 (Code Review)

PRD-STD-002 mandates human review for production-bound code. The trust model does not override this standard. Instead, it defines how the scope and depth of human review change:

| Trust Level | PRD-STD-002 Compliance | Review Scope |
| --- | --- | --- |
| Level 0 | Full manual review of all code | Line-by-line review |
| Level 1 | Full review for Tier 2+; sampling for Tier 1 | Logic and architecture focus |
| Level 2 | Full review for Tier 3+; statistical sampling for Tier 1-2 | Exception and anomaly focus |
| Level 3 | Full review for Tier 4; automated gates + sampling for Tier 1-3 | Non-negotiable checkpoints only |

At every level, the human reviewer retains the authority to reject any agent output and trigger re-work.

**Warning:** Trust Level 3 requires a formal variance approval from the CTO or equivalent authority, documented in the governance record. This acknowledges that the organization accepts reduced human review coverage in exchange for demonstrated agent reliability.