# Review Protocol
## Verdicts
Every reviewer produces one of three verdicts:
| Verdict | Meaning | Action |
|---|---|---|
| APPROVE | Work is sound. Proceed to next step. | Move forward |
| REVISE | Fixable issues found. Developer addresses findings. | Loop back |
| REJECT | Fundamental flaw. Needs rethinking. | Escalate to user |
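The verdict-to-action mapping above can be modeled as a small enum with a routing helper. This is an illustrative sketch; the names `Verdict`, `NEXT_ACTION`, and `route` are assumptions, not part of the protocol.

```python
from enum import Enum

class Verdict(Enum):
    APPROVE = "approve"
    REVISE = "revise"
    REJECT = "reject"

# Hypothetical routing table: each verdict maps to the next workflow action.
NEXT_ACTION = {
    Verdict.APPROVE: "proceed",            # work is sound: move to next step
    Verdict.REVISE: "loop_to_developer",   # fixable issues: developer addresses findings
    Verdict.REJECT: "escalate_to_user",    # fundamental flaw: needs rethinking
}

def route(verdict: Verdict) -> str:
    """Return the next action for a given review verdict."""
    return NEXT_ACTION[verdict]
```

Keeping the mapping in data rather than branching logic makes the three-verdict contract easy to audit at a glance.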
## Review Report Format
```markdown
# {Domain} Review --- YYYY-MM-DD

## Artifact: {what was reviewed}

## Verdict: APPROVE / REVISE / REJECT

## Findings

| # | Dimension | Severity | Finding |
|---|-----------|----------|---------|
| 1 | Correctness | High | {description} |
| 2 | Completeness | Medium | {description} |

## Recommendations

1. {specific, actionable fix for each finding}

## Questions

{Any questions that need clarification before proceeding}
```
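A minimal sketch of rendering the report skeleton above programmatically. The function name and parameter shapes (e.g. findings as `(dimension, severity, description)` tuples) are assumptions for illustration, not a mandated API.

```python
from datetime import date

def render_report(domain, artifact, verdict, findings, recommendations, questions):
    """Render a review report in the protocol's markdown format.
    `findings` is a list of (dimension, severity, description) tuples."""
    lines = [
        f"# {domain} Review --- {date.today().isoformat()}",
        f"## Artifact: {artifact}",
        f"## Verdict: {verdict}",
        "## Findings",
        "| # | Dimension | Severity | Finding |",
        "|---|-----------|----------|---------|",
    ]
    for i, (dim, sev, desc) in enumerate(findings, 1):
        lines.append(f"| {i} | {dim} | {sev} | {desc} |")
    lines.append("## Recommendations")
    lines += [f"{i}. {rec}" for i, rec in enumerate(recommendations, 1)]
    lines.append("## Questions")
    lines += questions or ["None."]
    return "\n".join(lines)
```

Generating the report from structured findings keeps the table numbering and the one-recommendation-per-finding rule consistent by construction.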
## Severity Levels
| Level | Meaning | Blocks approval? |
|---|---|---|
| High | Correctness, safety, or fundamental design issue | Yes |
| Medium | Completeness, performance, or design quality issue | No (but should fix) |
| Low | Style, naming, minor improvements | No |
> **Important:** APPROVE requires zero High findings remaining.
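The approval gate implied by the severity table is a one-line check: High findings block, Medium and Low do not. The helper below is a hypothetical sketch using the severity strings defined above.

```python
def can_approve(findings):
    """APPROVE requires zero High findings remaining.
    `findings` is a list of (dimension, severity, description) tuples."""
    return not any(sev == "High" for _dim, sev, _desc in findings)
```

A REVISE verdict with only Medium and Low findings is therefore a judgment call; a High finding forces the loop.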
## Review Dimensions by Domain
### Research Review
| Dimension | What to check |
|---|---|
| Correctness | Is the proof valid? Are bounds tight? |
| Completeness | Are all edge cases covered? |
| Novelty | Is this actually new or already known? |
| Applicability | Can this be implemented? |
### Architecture Review
| Dimension | What to check |
|---|---|
| Correctness | Does design faithfully translate research? |
| Completeness | Are all affected files/functions identified? |
| Feasibility | Can code-developer implement without ambiguity? |
| Minimality | Is the abstraction the simplest that works? |
| Safety | Does refactoring preserve correctness at each step? |
### Code Review
| Dimension | What to check |
|---|---|
| Correctness | Does code match the architecture spec? |
| Testing | Are tests sufficient? Do they pass? |
| Performance | No unnecessary allocations, copies, or complexity? |
| Security | No buffer overflows, undefined behavior, or injection? |
| Style | Follows project conventions? |
### KB Review

See the Scoring System for the five dimensions used to score KB entries.
### Shadow Review
| Dimension | What to check |
|---|---|
| Test pass rate | Does shadow code pass the original test suite? |
| API fidelity | Do function signatures match? |
| Algorithm correctness | Is the reimplemented logic equivalent? |
| Gap classification | Is each divergence a KB gap, valid alternative, or shadow error? |
## Adversarial Checks
In adversarial environments, reviewers must specifically look for:
| Risk | Check |
|---|---|
| Subtly wrong proof | Verify key steps independently, check boundary conditions |
| Correct-looking buggy code | Write targeted test cases, test edge cases |
| Misleading documentation | Cross-check every claim against source code |
| Missing files in design | Grep for all callers/callees of modified functions |
| Silent numerical degradation | Run benchmark before and after |
| Over-abstraction | Count concrete variants; an abstraction serving a single variant is unjustified |
| Incomplete migration | Check for code paths that bypass new abstractions |
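As one concrete instance of "write targeted test cases, test edge cases": correct-looking code hides bugs at boundaries, so the reviewer probes the edges directly rather than trusting interior behavior. The `clamp` function below is a hypothetical artifact under review, not from this protocol.

```python
def clamp(x, lo, hi):
    """Artifact under review: claims to clamp x into the range [lo, hi]."""
    return max(lo, min(x, hi))

# Reviewer's targeted edge cases: interior, both out-of-range sides,
# and values exactly on the boundaries.
cases = [
    (5, 0, 10, 5),    # interior point is unchanged
    (-1, 0, 10, 0),   # below the lower bound -> lo
    (11, 0, 10, 10),  # above the upper bound -> hi
    (0, 0, 10, 0),    # exactly at the lower boundary
    (10, 0, 10, 10),  # exactly at the upper boundary
]
results = [clamp(x, lo, hi) == expected for x, lo, hi, expected in cases]
```

A subtly buggy variant (e.g. `min(lo, max(x, hi))`) passes the interior case but fails the boundary probes, which is exactly what targeted edge-case review is for.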
## Rule

> **Warning:** Every developer output gets reviewed. No exceptions. The reviewer step is not optional; it is the primary defense against errors.
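The rule above, expressed as a control-flow sketch: every developer output passes through review, REVISE loops back with feedback, REJECT escalates, and a bounded revision budget prevents infinite loops. `develop` and `review` stand in for the actual agents and the `max_rounds` limit is an assumption of this illustration.

```python
def review_loop(develop, review, max_rounds=3):
    """Run the develop -> review cycle. Returns the approved output,
    or raises to escalate to the user on REJECT or exhausted rounds."""
    feedback = None
    for _ in range(max_rounds):
        output = develop(feedback)          # every output gets reviewed
        verdict, feedback = review(output)  # no exceptions
        if verdict == "APPROVE":
            return output
        if verdict == "REJECT":
            raise RuntimeError("Fundamental flaw: escalate to user")
        # verdict == "REVISE": loop back with the reviewer's findings
    raise RuntimeError("Revision budget exhausted: escalate to user")
```

Note that there is no path from `develop` to the caller that bypasses `review`, which is the structural form of the rule.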