Permutation Test

A permutation test assesses statistical significance by repeatedly shuffling the labels (here, trade outcomes) and measuring how often the Sharpe ratio of the shuffled sequence equals or exceeds the observed Sharpe ratio.

The math

p = #{shuffles with Sharpe* ≥ Sharpe_obs} / K

Here K is the number of shuffles (Pancake uses 10,000), Sharpe* is the Sharpe ratio of a shuffled trade sequence, and Sharpe_obs is the Sharpe ratio of the actual sequence.
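
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not Pancake's implementation: the source does not say how the shuffled Sharpe is computed, so the sketch assumes one common setup in which the labels are per-period positions (+1 long, -1 short) that are randomly reassigned to a fixed series of market returns. The function names, the sample data, and the unannualized, zero-risk-free-rate Sharpe are all assumptions made for illustration.

```python
import math
import random


def sharpe(returns):
    """Per-period Sharpe ratio: mean / sample standard deviation.

    Assumes a zero risk-free rate and no annualization; needs at least
    two returns. Returns 0.0 when the returns have no variance.
    """
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) if var > 0 else 0.0


def permutation_p_value(positions, market_returns, k=10_000, seed=0):
    """Return (Sharpe_obs, p) with p = #{Sharpe* >= Sharpe_obs} / K."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = sharpe([p * r for p, r in zip(positions, market_returns)])

    shuffled = list(positions)  # copy so the caller's list is not mutated
    hits = 0
    for _ in range(k):
        rng.shuffle(shuffled)  # reassign labels to periods at random
        s = sharpe([p * r for p, r in zip(shuffled, market_returns)])
        if s >= observed:
            hits += 1
    return observed, hits / k


# Hypothetical data: twelve periods, long (+1) or short (-1) in each.
market_returns = [0.010, -0.020, 0.015, -0.005, 0.020, -0.010,
                  0.005, -0.015, 0.010, -0.020, 0.030, -0.010]
positions = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1]

sharpe_obs, p = permutation_p_value(positions, market_returns)
print(f"Sharpe_obs = {sharpe_obs:.3f}, p = {p:.4f}")
```

The sketch pairs labels with returns because a Sharpe computed as mean over standard deviation of the same set of returns does not change when those returns are only reordered. For the shuffle to produce a distribution of Sharpe* values, it has to change which return each label is matched with, or the statistic itself has to depend on order.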

Why it matters

A low p-value (e.g. p < 0.05) means that fewer than 5% of random shuffles matched or beat your Sharpe, so the observed performance is unlikely to be due to luck in the order of trades. Unlike a t-test, the permutation test does not assume normally distributed returns; it assumes only that the trade outcomes are exchangeable under the null hypothesis. Good (2005) is a standard reference for permutation inference.

Permutation tests treat trade outcomes as exchangeable, so shuffling destroys any serial correlation present in the original sequence. A strategy whose edge is purely a streak of early wins (not reproducible) may still show a low p-value if its Sharpe happens to be high. The p-value also depends heavily on the number of trades N: with fewer than 20 trades, even p = 0.05 is close to meaningless.
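
The loss of serial correlation can be seen in a small sketch. It assumes a hypothetical sequence of ten early wins followed by ten losses and compares its lag-1 autocorrelation with the average over many shuffles. The data and the helper name `lag1_autocorr` are illustrative assumptions, not part of Pancake.

```python
import random


def lag1_autocorr(xs):
    """Lag-1 sample autocorrelation; 0.0 if the series is constant."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return num / denom


# Hypothetical trade outcomes: ten early wins followed by ten losses.
outcomes = [0.02] * 10 + [-0.01] * 10

rng = random.Random(0)  # fixed seed so the sketch is reproducible
shuffled = list(outcomes)
shuffled_autocorrs = []
for _ in range(1_000):
    rng.shuffle(shuffled)
    shuffled_autocorrs.append(lag1_autocorr(shuffled))

original_autocorr = lag1_autocorr(outcomes)
mean_shuffled_autocorr = sum(shuffled_autocorrs) / len(shuffled_autocorrs)
print(f"original: {original_autocorr:.2f}, "
      f"shuffled mean: {mean_shuffled_autocorr:.2f}")
```

The streaky sequence is strongly autocorrelated, while the shuffled sequences average close to zero. The streak itself is therefore invisible to the test, which only sees the set of outcomes.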

Published source

Good, P. (2005). Permutation, Parametric and Bootstrap Tests of Hypotheses (3rd ed.). Springer.

See it in a real receipt

Open receipt /r/MupOp1tS