What is the verification boundary in backtesting?

A conventional backtest engine returns a Sharpe ratio and calls it a result. Pancake's verification boundary is a structured alternative: a 3-tuple that makes explicit what the engine verified itself, what it accepted from the agent on trust, and what it does not model.

The verified layer covers two sub-categories. Structural: schema_match, lookahead prevention, monotonicity, value range checks, and required column presence — hard invariants that abort the run if violated. Runner math: the engine re-derives the P&L ledger, fee application, slippage, and all statistics from the declared inputs — no number is accepted from the agent at this layer.
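The two sub-categories can be illustrated with a minimal sketch. It is not Pancake's actual code: the names `StructuralViolation`, `check_structure`, and `rederive_net_pnl`, the column names, the [0, 1] price range, the reading of monotonicity as non-decreasing timestamps, and the fee and slippage conventions are all assumptions made for illustration. It covers only a subset of the structural checks listed above.

```python
class StructuralViolation(Exception):
    """Raised to abort the run when a hard invariant is violated."""


# Hypothetical schema: the column names and the price range are assumptions.
REQUIRED_COLUMNS = ("timestamp", "price")
PRICE_RANGE = (0.0, 1.0)


def check_structure(rows, decision_time):
    """Run structural checks row by row; raise on the first failure."""
    previous_ts = None
    for i, row in enumerate(rows):
        # Required column presence.
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            raise StructuralViolation(f"row {i}: missing columns {missing}")
        ts, price = row["timestamp"], row["price"]
        # Lookahead prevention: no data from after the decision time.
        if ts > decision_time:
            raise StructuralViolation(f"row {i}: lookahead, {ts} > {decision_time}")
        # Monotonicity (assumed here to mean non-decreasing timestamps).
        if previous_ts is not None and ts < previous_ts:
            raise StructuralViolation(f"row {i}: timestamps not monotonic")
        # Value range check.
        if not PRICE_RANGE[0] <= price <= PRICE_RANGE[1]:
            raise StructuralViolation(f"row {i}: price {price} out of range")
        previous_ts = ts


def rederive_net_pnl(trades, fee_rate, slippage):
    """Recompute net P&L per long trade from declared inputs only."""
    ledger = []
    for t in trades:
        fill = t["entry_price"] + slippage            # assumed: slippage worsens the buy fill
        gross = (t["exit_price"] - fill) * t["size"]
        fee = fee_rate * fill * t["size"]             # assumed: fee charged on entry notional
        ledger.append(gross - fee)
    return ledger
```

The point of the sketch is the shape of the contract: structural failures raise instead of returning a flag, and the ledger function takes only declared inputs, so there is no parameter through which an agent-reported P&L figure could enter.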

The agent-supplied evidence layer names what the agent provided that the engine cannot re-derive independently: feature column values, entry price source (observed / agent_estimate / last_trade / mid / vwap), and liquidity source. These appear verbatim in the receipt so a reader can assess them directly.
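As a rough sketch, the evidence layer could be modeled as a record that is copied into the receipt without transformation. The class and field names below (`EntryPriceSource`, `AgentEvidence`, `to_receipt`) are hypothetical. Only the five entry price source values come from the text above, and the liquidity source is left as a free-form string because its allowed values are not listed here.

```python
from dataclasses import dataclass
from enum import Enum


class EntryPriceSource(str, Enum):
    """The five entry price sources named in the text."""
    OBSERVED = "observed"
    AGENT_ESTIMATE = "agent_estimate"
    LAST_TRADE = "last_trade"
    MID = "mid"
    VWAP = "vwap"


@dataclass(frozen=True)
class AgentEvidence:
    """Hypothetical container for inputs the engine cannot re-derive."""
    feature_columns: dict            # column name -> list of agent-provided values
    entry_price_source: EntryPriceSource
    liquidity_source: str            # free-form: allowed values are an open assumption

    def to_receipt(self) -> dict:
        """Return the evidence unchanged, ready to embed in a receipt."""
        return {
            "feature_columns": self.feature_columns,
            "entry_price_source": self.entry_price_source.value,
            "liquidity_source": self.liquidity_source,
        }
```

Using an enum for the price source means an unrecognized value fails at construction time (`EntryPriceSource("guess")` raises `ValueError`) instead of silently reaching the receipt.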

The unmodeled risks layer is a fixed list of what the current engine version does not model: market_impact (the strategy's own order flow moving prices), resolution_lag (final resolution diverging from price at resolution_time), resolver_risk (the venue resolving differently than implied), and small_sample (statistical noise when a run has fewer than 10 trades). Because the list is fixed, it appears in full in every receipt regardless of trade count.
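Putting the three layers together, the boundary might be represented as a named 3-tuple whose third element is a constant. This is a sketch under assumptions: `VerificationBoundary`, `build_boundary`, and the dictionary shapes are hypothetical, and only the four risk names are taken from the text.

```python
from typing import NamedTuple

# The four risk names come from the text; the constant itself is illustrative.
UNMODELED_RISKS = (
    "market_impact",
    "resolution_lag",
    "resolver_risk",
    "small_sample",
)


class VerificationBoundary(NamedTuple):
    """Hypothetical 3-tuple: verified, agent-supplied, unmodeled."""
    verified: dict
    agent_supplied: dict
    unmodeled_risks: tuple


def build_boundary(verified: dict, agent_supplied: dict) -> VerificationBoundary:
    """Attach the fixed risk list; callers cannot shorten or omit it."""
    return VerificationBoundary(verified, agent_supplied, UNMODELED_RISKS)
```

Because `build_boundary` takes no risk argument, the list cannot vary per run: a 500-trade backtest still carries `small_sample`, matching the rule that the list appears regardless of trade count.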