Shadow Validation

Purpose

The shadow team reimplements modules from KB artifacts alone. Its code and KB cards live in the same files as the originals, guarded by a single shared macro. Comparing the two sides exposes gaps in the KB and possible improvements to the code.

How It Works

Manager assigns module (e.g., S01)
    ↓
Shadow-developer reads KB only
    ↓
Writes shadow code inside #ifdef SHADOW_S01 blocks (same source file)
Writes shadow KB sections inside <!-- #ifdef SHADOW_S01 --> blocks (same card)
    ↓
Shadow-reviewer compares both sides (has full access)
    ↓
Produces: merge recommendation + KB gap report + code improvement report
    ↓
Manager routes findings:
    KB gaps      → kb-developer
    Code fixes   → code-developer
    Merge decision → user

The SHADOW Macro

A single macro name SHADOW_{CARD_ID} controls both code and KB.

In source code (C/C++)

int solve(sparse_matrix_t *A, double *b, double *x)
{
#ifdef SHADOW_S01
    /* Shadow reimplementation based on KB card S01 */
    ...shadow implementation...
#else
    /* Original implementation */
    ...original implementation...
#endif
}

Compile with -DSHADOW_S01 to activate shadow for that module.
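For a version that compiles as written, here is a minimal sketch of the same pattern. The function sum_squares and both of its bodies are hypothetical stand-ins, not part of any real module; only the SHADOW_S01 guard and the #ifdef/#else layout come from the convention above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a module covered by KB card S01. */
double sum_squares(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
#ifdef SHADOW_S01
    /* Shadow side: compiled only when -DSHADOW_S01 is passed. */
    double acc = 0.0;
    while (n-- > 0)
        acc += v[n] * v[n];
    return acc;
#else
    /* Original side: what a build without flags compiles. */
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += v[i] * v[i];
    return acc;
#endif
}
```

Assuming the function lives in solve.c, `cc -c solve.c` compiles the original side and `cc -DSHADOW_S01 -c solve.c` compiles the shadow side, so the same test suite can be run against either build.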

In KB cards (markdown)

## Algorithm

<!-- #ifdef SHADOW_S01 -->
**[SHADOW]** Shadow's understanding of the algorithm from KB alone:
1. ...shadow's interpretation...
<!-- #else -->
1. ...original description...
<!-- #endif -->

In math documents (LaTeX)

% #ifdef SHADOW_S01
% [SHADOW] Shadow derivation from KB
\begin{theorem}...shadow version...\end{theorem}
% #else
\begin{theorem}...original version...\end{theorem}
% #endif

Naming Convention

| Card ID | Macro | Code flag | Meaning |
|---|---|---|---|
| S01 | SHADOW_S01 | -DSHADOW_S01 | Shadow for symbolic factorization |
| N02 | SHADOW_N02 | -DSHADOW_N02 | Shadow for numeric factorization |
| V03 | SHADOW_V03 | -DSHADOW_V03 | Shadow for triangular solve |
| (all) | SHADOW_ALL | -DSHADOW_ALL | Shadow for everything |

The macro name always matches the KB card ID. This makes it trivial to trace between code and documentation.
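The table does not say how -DSHADOW_ALL reaches blocks that test a per-card macro such as SHADOW_S01. One possible wiring, shown here only as a sketch, is a small header included before any shadow block. The header name shadow_flags.h and the helper shadow_s01_active are hypothetical.

```c
/* shadow_flags.h (hypothetical): include before any shadow block. */

/* Assumption: SHADOW_ALL simply switches on every per-card macro. */
#if defined(SHADOW_ALL) && !defined(SHADOW_S01)
#define SHADOW_S01
#endif
#if defined(SHADOW_ALL) && !defined(SHADOW_N02)
#define SHADOW_N02
#endif
#if defined(SHADOW_ALL) && !defined(SHADOW_V03)
#define SHADOW_V03
#endif

/* Reports which side of the S01 block this build compiled: 1 = shadow. */
int shadow_s01_active(void)
{
#ifdef SHADOW_S01
    return 1;
#else
    return 0;
#endif
}
```

With this header, the shadow blocks themselves keep testing a single per-card macro, so rule 1 below still holds inside the source files.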

Rules

  1. One macro per module. No nesting, no combining.
  2. #else is mandatory. Both sides must be present until merge.
  3. Original goes in #else. Shadow goes in #ifdef. This way, the default build (no flags) always produces the original.
  4. Same macro in code and card. If SHADOW_S01 appears in solve.c, it must also appear in the S01 card.
  5. Function-level granularity. Wrap whole functions or significant blocks, not individual lines.

What Shadow Produces

Shadow improves both KB and code:

KB Improvements

| Finding | Action |
|---|---|
| Shadow couldn't reimplement → KB gap | kb-developer fixes the card |
| Shadow card is clearer than original | kb-developer adopts shadow wording |
| Shadow card reveals implicit assumptions | kb-developer makes them explicit |

Code Improvements

| Finding | Action |
|---|---|
| Shadow code is cleaner/simpler | code-developer considers adopting |
| Shadow code passes more edge cases | code-developer investigates why |
| Shadow code is faster (benchmark) | code-developer evaluates trade-offs |
| Shadow reveals original has dead code | code-developer cleans up |

The shadow team is more than a KB validator: it also provides a second opinion on the implementation.

Merge Protocol

Shadow blocks are temporary. Every SHADOW_{ID} block must eventually be resolved:

| Merge decision | Action |
|---|---|
| Keep original | Remove #ifdef/#else/#endif, keep #else content |
| Adopt shadow | Remove #ifdef/#else/#endif, keep #ifdef content |
| Hybrid | Combine best parts, remove markers |

Warning

Merge decisions require user approval. The shadow-reviewer recommends; the user decides.

After merge:

  1. Remove all SHADOW_{ID} markers from code and KB
  2. Update the card to reflect the merged state
  3. Log the merge in SYNCLOG
  4. Re-run benchmark to confirm no regression
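The two mechanical merge decisions (keep original, adopt shadow) can be sketched as a line filter. This is an illustration, not an existing tool: resolve_shadow and merge_choice_t are hypothetical names. Markers are matched by plain substring search, so the same routine covers the C, markdown, and LaTeX marker styles, and it assumes a shadow block contains no other #else or #endif lines. A hybrid merge still has to be done by hand.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { KEEP_ORIGINAL, ADOPT_SHADOW } merge_choice_t;

/* 1 if the line contains "#ifdef <macro>" with <macro> as a whole token. */
static int is_ifdef_marker(const char *line, size_t len, const char *macro)
{
    static const char key[] = "#ifdef ";
    size_t klen = sizeof key - 1, mlen = strlen(macro);
    for (size_t i = 0; i + klen + mlen <= len; ++i) {
        if (memcmp(line + i, key, klen) == 0 &&
            memcmp(line + i + klen, macro, mlen) == 0) {
            size_t end = i + klen + mlen;
            if (end == len)
                return 1;
            unsigned char c = (unsigned char)line[end];
            if (!(isalnum(c) || c == '_'))
                return 1;   /* SHADOW_S01 must not match SHADOW_S010 */
        }
    }
    return 0;
}

/* 1 if the line contains `word` anywhere. */
static int contains(const char *line, size_t len, const char *word)
{
    size_t wlen = strlen(word);
    for (size_t i = 0; i + wlen <= len; ++i)
        if (memcmp(line + i, word, wlen) == 0)
            return 1;
    return 0;
}

/* Copies `in` to `out` with every block guarded by `macro` resolved to one
 * side and all marker lines dropped. Returns 0 on success, -1 if a block is
 * unterminated, lacks #else, or the output buffer is too small. */
int resolve_shadow(const char *in, const char *macro, merge_choice_t choice,
                   char *out, size_t out_size)
{
    enum { OUTSIDE, IN_SHADOW, IN_ORIGINAL } state = OUTSIDE;
    size_t used = 0;

    assert(in != NULL && macro != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    while (*in != '\0') {
        const char *nl = strchr(in, '\n');
        size_t len = nl ? (size_t)(nl - in) + 1 : strlen(in); /* keeps '\n' */
        int keep = 0;

        if (state == OUTSIDE) {
            if (is_ifdef_marker(in, len, macro))
                state = IN_SHADOW;
            else
                keep = 1;
        } else if (state == IN_SHADOW && contains(in, len, "#else")) {
            state = IN_ORIGINAL;
        } else if (contains(in, len, "#endif")) {
            if (state != IN_ORIGINAL)
                return -1;              /* #else is mandatory */
            state = OUTSIDE;
        } else {
            keep = (state == IN_SHADOW) == (choice == ADOPT_SHADOW);
        }

        if (keep) {
            if (used + len >= out_size)
                return -1;
            memcpy(out + used, in, len);
            used += len;
        }
        in += len;
    }
    if (state != OUTSIDE)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Called once per file with the card's macro name, the filter leaves text outside the block untouched, which keeps the resulting diff limited to the resolved block.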

Teams

Manager (coordinates both teams)
│
├── dev
│   ├── research dev/rev
│   ├── architect dev/rev
│   ├── code dev/rev
│   ├── kb dev/rev
│   └── benchmark
│
└── shadow
    ├── shadow-developer   (reimplements from KB, writes shadow code + shadow KB)
    └── shadow-reviewer    (compares both sides, recommends merge)

Access Constraints

| Agent | Can see | Cannot see |
|---|---|---|
| shadow-developer | KB cards, math docs, Doxygen headers, test suite | Source code (the #else blocks) |
| shadow-reviewer | Everything (both sides) | --- |

Transfer Score

Per-module scoring

\[T_\text{test} = \frac{\text{shadow tests passed}}{\text{total tests}}, \quad T_\text{api} = \frac{\text{matching signatures}}{\text{total signatures}}, \quad T_\text{algo} = \frac{\text{correct algorithms}}{\text{total algorithms}}\]
\[T_\text{module} = 0.50 \times T_\text{test} + 0.25 \times T_\text{api} + 0.25 \times T_\text{algo}\]

Aggregate

\[T_\text{total} = \overline{T_\text{module}}\]
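The formulas above translate directly into code. The sketch below is one way to compute them; the struct and function names are hypothetical, and scoring an empty denominator (or an empty module list) as 0 is an assumption the formulas themselves do not cover.

```c
#include <assert.h>
#include <stddef.h>

/* Raw counts for one module; field names are illustrative. */
typedef struct {
    int tests_passed, tests_total;
    int sigs_matching, sigs_total;
    int algos_correct, algos_total;
} shadow_counts_t;

/* Assumption: a ratio with nothing to count scores 0. */
static double ratio(int num, int den)
{
    return den > 0 ? (double)num / (double)den : 0.0;
}

/* T_module = 0.50 * T_test + 0.25 * T_api + 0.25 * T_algo */
double transfer_module(const shadow_counts_t *c)
{
    assert(c != NULL);
    return 0.50 * ratio(c->tests_passed, c->tests_total)
         + 0.25 * ratio(c->sigs_matching, c->sigs_total)
         + 0.25 * ratio(c->algos_correct, c->algos_total);
}

/* T_total = mean of T_module over the n scored modules. */
double transfer_total(const shadow_counts_t *modules, size_t n)
{
    double sum = 0.0;
    if (n == 0)
        return 0.0;  /* assumption: no scored modules yields 0 */
    for (size_t i = 0; i < n; ++i)
        sum += transfer_module(&modules[i]);
    return sum / (double)n;
}
```

For a module with 18 of 20 tests passing, 4 of 4 matching signatures, and 1 of 2 correct algorithms, this gives 0.50 × 0.90 + 0.25 × 1.00 + 0.25 × 0.50 = 0.825.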

Gap Classification

| Gap type | Meaning | Fix target |
|---|---|---|
| Missing | KB has no information about this | KB |
| Ambiguous | KB describes it but shadow misinterpreted | KB |
| Incorrect | KB describes it wrong | KB |
| Implicit | KB assumes knowledge not documented | KB |
| Code-improvement | Shadow found a better approach | Code |
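Combined with the manager's routing (KB gaps go to kb-developer, code fixes to code-developer), the classification amounts to a small lookup. The enum and function names below are hypothetical and only illustrate the mapping.

```c
#include <assert.h>

typedef enum {
    GAP_MISSING,
    GAP_AMBIGUOUS,
    GAP_INCORRECT,
    GAP_IMPLICIT,
    GAP_CODE_IMPROVEMENT
} gap_type_t;

typedef enum { OWNER_KB_DEVELOPER, OWNER_CODE_DEVELOPER } owner_t;

/* Maps a gap type to the agent that receives the finding. */
owner_t gap_owner(gap_type_t gap)
{
    switch (gap) {
    case GAP_MISSING:
    case GAP_AMBIGUOUS:
    case GAP_INCORRECT:
    case GAP_IMPLICIT:
        return OWNER_KB_DEVELOPER;      /* fix target: KB */
    case GAP_CODE_IMPROVEMENT:
        return OWNER_CODE_DEVELOPER;    /* fix target: Code */
    }
    assert(!"unknown gap type");
    return OWNER_KB_DEVELOPER;
}
```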

When to Run

| Trigger | Scope |
|---|---|
| After major KB update | Weakest-scoring modules |
| Before release | Critical-path modules |
| On demand | User-selected modules |
| After new algorithm | The new algorithm's module |

The manager assigns modules incrementally: one non-trivial module per shadow cycle, prioritized by KB score and criticality.

Integration with Scoring

Shadow validation adds a fifth dimension, T, to the KB scoring system:

\[\text{Composite} = 0.25 Q + 0.25 C + 0.15 K + 0.15 F + 0.20 T\]

Here T is the transfer score defined above. See Scoring System for the other four dimensions and the full system.
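As a worked example, assume T is the transfer score from the previous section and take made-up values Q = 0.80, C = 0.60, K = 1.00, F = 0.50, T = 0.825 (illustrative numbers, not measurements):

\[\text{Composite} = 0.25(0.80) + 0.25(0.60) + 0.15(1.00) + 0.15(0.50) + 0.20(0.825) = 0.20 + 0.15 + 0.15 + 0.075 + 0.165 = 0.74\]

Because the five weights sum to 1, the composite stays in the same range as its inputs.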

Feedback Loop

Shadow reviewer produces gap report + code improvement report
    ↓
Manager routes:
    KB gaps           → kb-developer
    Code improvements → code-developer
    Merge decisions   → user
    ↓
After fixes, kb-reviewer re-scores
    ↓
(optional) Shadow re-validates same module