Fairness & Bias Assessment
AI models can encode and amplify biases from training data, producing unfair outcomes across demographic groups, languages, geographies, or user segments. Fairness assessment must be proactive, systematic, and continuous — not a one-time checkbox.
Fairness Definitions & Metrics
| Metric | Definition | When to Use |
|---|---|---|
| Demographic Parity | Equal positive prediction rates across groups | Allocation decisions (hiring, lending) |
| Equalized Odds | Equal true positive and false positive rates across groups | Classification where errors have differential impact |
| Calibration | Predicted scores match observed outcome rates equally well within each group | Risk scoring, probability estimation |
| Individual Fairness | Similar individuals receive similar predictions | Personalization, recommendations |
| Counterfactual Fairness | Predictions unchanged when protected attributes are counterfactually changed | Causal reasoning about attribute influence |
Important: Some fairness metrics are mathematically incompatible. When base rates differ across groups, it is generally impossible to satisfy Demographic Parity and Calibration simultaneously. Organizations SHOULD document which metrics are prioritized, the rationale, and the trade-offs accepted.
For multilingual products, fairness assessment SHOULD include cross-language performance parity. See PRD-STD-015 for multilingual quality requirements.
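The group metrics in the table above can be computed directly from labeled predictions. The sketch below shows Demographic Parity difference and Equalized Odds gap for two groups; function names are illustrative, and it assumes each group contains both positive and negative ground-truth labels.

```python
def rates(y_true, y_pred):
    """Return (positive prediction rate, TPR, FPR) for one group.
    Assumes the group has at least one positive and one negative label."""
    n = len(y_true)
    pos_rate = sum(y_pred) / n
    actual_pos = sum(y_true)
    actual_neg = n - actual_pos
    tpr = sum(p for t, p in zip(y_true, y_pred) if t == 1) / actual_pos
    fpr = sum(p for t, p in zip(y_true, y_pred) if t == 0) / actual_neg
    return pos_rate, tpr, fpr

def demographic_parity_diff(group_a, group_b):
    # Gap in positive prediction rates between the two groups.
    return abs(rates(*group_a)[0] - rates(*group_b)[0])

def equalized_odds_gap(group_a, group_b):
    # Worst-case gap across TPR and FPR.
    ra, rb = rates(*group_a), rates(*group_b)
    return max(abs(ra[1] - rb[1]), abs(ra[2] - rb[2]))
```

Each `group_*` argument is a `(y_true, y_pred)` pair of binary label lists for that segment.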
Pre-Release Bias Assessment Process
Every model at Tier 2 or Tier 3 risk SHOULD undergo structured bias assessment before production release.
Step-by-step process:
1. Identify protected attributes relevant to the use case and jurisdiction (e.g., age, gender, ethnicity, language, geography, disability status).
2. Define evaluation segments — demographic groups, language groups, geographic regions, or user cohorts.
3. Run segmented model evaluation — standard evaluation suite with results broken down by segment.
4. Calculate fairness metrics — compute selected metrics per segment.
5. Compare against fairness thresholds — assess whether disparities exceed acceptable thresholds.
6. Document findings — record all results in a fairness assessment report.
7. Mitigation or risk acceptance — if thresholds are exceeded, apply mitigation or document risk acceptance with business justification and stakeholder approval.
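The segmentation, metric, and threshold-comparison steps above can be sketched as follows. The record schema, `positive_rate` metric, and 0.1 default threshold are assumptions for illustration, not values mandated by this standard.

```python
def positive_rate(rs):
    # Share of records with a positive prediction in one segment.
    return sum(r["y_pred"] for r in rs) / len(rs)

def assess_segments(records, metric, threshold=0.1):
    """records: list of dicts with 'segment', 'y_true', 'y_pred' keys.
    Returns per-segment scores, the max-min disparity, and a breach flag."""
    by_segment = {}
    for r in records:
        by_segment.setdefault(r["segment"], []).append(r)
    scores = {seg: metric(rs) for seg, rs in by_segment.items()}
    disparity = max(scores.values()) - min(scores.values())
    return {
        "per_segment": scores,
        "disparity": disparity,
        "threshold_exceeded": disparity > threshold,
    }
```

The returned dict maps directly onto the Results and Threshold Exceeded fields of the assessment summary template.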
Fairness Assessment Summary Template:
Fairness Assessment Summary
Model: <model name and version>
Date: <ISO 8601>
Assessor: <name and role>
Use Case: <description>
Risk Tier: <1|2|3>
Protected Attributes Evaluated: <list>
Fairness Metrics Used: <list with rationale>
Results:
- <Segment A>: <metric> = <value>
- <Segment B>: <metric> = <value>
- Disparity: <value> (threshold: <value>)
Threshold Exceeded: Yes / No
Mitigation Applied: <description or "None">
Decision: PASS / CONDITIONAL PASS / FAIL
Approver: <name and role>
Bias Mitigation Strategies
Pre-processing (data-level)
- Resampling — Oversample underrepresented groups or undersample overrepresented groups.
- Reweighting — Assign higher training weights to underrepresented groups.
- Data augmentation — Generate synthetic examples for underrepresented segments.
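As one concrete instance of reweighting, the sketch below assigns inverse-frequency sample weights so every group contributes equal total weight to training. This normalization (each group's total weight becomes n/k) is a common heuristic, shown as an assumption rather than a prescribed method.

```python
from collections import Counter

def group_weights(groups):
    """Per-example training weights that equalize each group's
    total contribution. groups: list of group labels, one per example."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    # Each group's summed weight becomes n/k regardless of its size,
    # so the weights still sum to n overall.
    return [n / (k * counts[g]) for g in groups]
```

Underrepresented groups receive proportionally larger per-example weights, which most training frameworks accept as `sample_weight`.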
In-processing (model-level)
- Fairness constraints — Add fairness penalty terms to the loss function.
- Adversarial debiasing — Train an adversary to predict protected attributes; penalize leakage.
- Fair representation learning — Learn latent representations invariant to protected attributes.
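A minimal sketch of the fairness-constraint approach: the training objective becomes the base loss plus a penalty on the gap in mean predicted score between groups. The squared-gap form and the `lambda_fair` weight are illustrative choices; real implementations would compute this inside the training loop of the chosen framework.

```python
def fair_loss(base_loss, scores, groups, lambda_fair=1.0):
    """Base loss plus a demographic-parity-style penalty on the
    difference in mean predicted score between groups 'a' and 'b'."""
    def group_mean(g):
        members = [s for s, gr in zip(scores, groups) if gr == g]
        return sum(members) / len(members)
    gap = group_mean("a") - group_mean("b")
    # Larger lambda_fair trades accuracy for smaller between-group gaps.
    return base_loss + lambda_fair * gap ** 2
```

Tuning `lambda_fair` is exactly the accuracy/fairness trade-off that the documentation requirement below asks teams to record.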
Post-processing (output-level)
- Threshold adjustment — Apply group-specific decision thresholds to equalize outcomes.
- Calibration — Recalibrate model scores per group.
- Rejection option — Route uncertain predictions to human review.
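Threshold adjustment can be as simple as choosing, per group, the score cutoff that yields a target positive prediction rate. The quantile search below is one straightforward way to do this, shown under the assumption that higher scores mean more likely positive.

```python
def group_threshold(scores, target_rate):
    """Smallest score cutoff such that roughly target_rate of this
    group's scores fall at or above it."""
    ranked = sorted(scores, reverse=True)
    k = max(1, round(target_rate * len(scores)))
    return ranked[k - 1]
```

Applying `group_threshold` separately to each group's score distribution equalizes positive prediction rates, i.e. it targets Demographic Parity at decision time.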
For each mitigation applied, document: strategy used, effect on fairness metrics (before/after), accuracy trade-offs, and whether the trade-off was accepted.
Ongoing Fairness Monitoring
Production fairness monitoring SHOULD be implemented for all Tier 2 and Tier 3 models.
- Fairness dashboards — Track fairness metrics by segment over time.
- Drift detection for fairness — Monitor per-segment metrics separately from aggregate metrics; a model can maintain aggregate accuracy while fairness degrades for specific groups.
- Alert thresholds — Critical fairness threshold breaches SHOULD trigger incident response.
- Scheduled re-assessment — Quarterly, or after major model changes.
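The alerting bullet above can be sketched as a simple two-level check: breach of the absolute fairness threshold raises a critical alert, while drift well beyond the disparity recorded at release raises a warning. The severity labels and the 1.5x degradation factor are assumptions for the sketch.

```python
def fairness_alert(live_disparity, baseline_disparity, hard_threshold,
                   degradation_factor=1.5):
    """Classify a monitored disparity value against the release
    baseline and the absolute fairness threshold."""
    if live_disparity > hard_threshold:
        return "CRITICAL"  # absolute threshold breached: trigger incident response
    if live_disparity > degradation_factor * baseline_disparity:
        return "WARN"      # drifted well beyond the release-time baseline
    return "OK"
```

The `baseline_disparity` would come from the pre-release fairness assessment report, and `hard_threshold` from the documented fairness thresholds.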
Fairness Documentation
Every model SHOULD have an attached Fairness Card:
| Field | Value |
|---|---|
| Model name | |
| Model version | |
| Use case | |
| Protected attributes evaluated | |
| Fairness metrics calculated | |
| Results per segment | (attach detailed table) |
| Mitigation applied | |
| Residual bias (known) | |
| Assessment date | |
| Assessor | |
Fairness Cards SHOULD be attached to model registry entries (see Model Registry & Versioning) and referenced in release gate evidence.