Detecting Crashes with Fat-Tail Statistics

February 23, 2026 11 min read Keywords: fat tails, crash detection, LPPLS, extreme value theory, Taleb, Sornette, bitcoin, finance, rust

A pastoral landscape with a ploughman, shepherd, and ships while Icarus drowns unnoticed in the sea — *Landscape with the Fall of Icarus*, Pieter Bruegel the Elder, c. 1560

Financial markets don’t follow normal distributions. This isn’t a theoretical curiosity. It’s a statement about how often catastrophic events happen. Under a Gaussian model, the 2008 financial crisis was a 25-sigma event, something that should happen roughly once every $10^{135}$ years of daily trading. It happened on a Tuesday.

The problem is that we keep using tools designed for thin-tailed worlds. VaR models that assume normality. Risk metrics that treat the 2008 crash as an “outlier” rather than a regular feature of financial returns.

I built fatcrash, a Rust+Python toolkit, to test whether fat-tail statistical methods can detect crashes before they happen. The performance-critical math (fitting, simulation) runs in Rust via PyO3; everything else (data, viz, CLI) is Python.

#The four methods

#Hill estimator: measuring tail heaviness

The Hill estimator answers: how fat are the tails?

It fits a power law to the extreme values of a distribution: $P(X > x) \sim x^{-\alpha}$. The exponent $\alpha$ (the “tail index”) tells you how heavy the tail is:

- $\alpha < 2$: Infinite variance. The distribution is so fat-tailed that the variance doesn’t converge with more data. This is Cauchy territory.
- $\alpha$ between 2 and 4: Finite variance but infinite kurtosis. This is where most financial assets live.
- $\alpha > 4$: Relatively thin tails. Still fatter than Gaussian, but manageable.

The estimator works by sorting the data, taking the $k$ largest observations, and computing:

$$\hat{\alpha} = k \left( \sum_{i=1}^{k} \ln \frac{X_{(i)}}{X_{(k)}} \right)^{-1}$$

where $X_{(1)} \geq X_{(2)} \geq \ldots$ are the order statistics.

The choice of $k$ matters. Too small and the estimate is noisy. Too large and you’re including observations from the body of the distribution, not the tail. A Hill plot (alpha vs. k) helps find the plateau.
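
A minimal sketch of the estimator as written above, in plain Python. The names `hill_alpha` and `hill_plot_points` are illustrative and are not taken from fatcrash. For return data you would pass losses or absolute returns so that the inputs are positive.

```python
import math
import random


def hill_alpha(data, k):
    """Hill tail-index estimate using the k largest positive observations."""
    if k < 2:
        raise ValueError("k must be at least 2")
    top = sorted((x for x in data if x > 0), reverse=True)[:k]
    if len(top) < k:
        raise ValueError("need at least k positive observations")
    threshold = top[-1]  # X_(k), the smallest of the k largest values
    log_sum = sum(math.log(x / threshold) for x in top)
    if log_sum == 0.0:
        raise ValueError("top-k observations are all equal")
    return k / log_sum


def hill_plot_points(data, ks):
    """(k, alpha) pairs for a Hill plot; look for a plateau across k."""
    return [(k, hill_alpha(data, k)) for k in ks]


# Synthetic check: Pareto samples with a known tail index of 3.
rng = random.Random(0)
sample = [rng.paretovariate(3.0) for _ in range(20000)]
points = hill_plot_points(sample, [200, 500, 1000, 2000])
```

On exact Pareto data the estimates should sit close to the true index for a wide range of $k$. On real returns the plateau is usually much narrower, which is why the choice of $k$ needs a plot rather than a fixed rule.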

In practice: Hill alpha is useful for characterization, since it tells you what kind of distribution you’re dealing with. But as a standalone crash predictor, it’s noisy. In our tests, it detected only 28% of drawdowns.

#Kappa metric: how far from Gaussian?

Taleb’s kappa metric takes a different approach. Instead of estimating a tail parameter, it asks: does this distribution behave like a Gaussian?

Split your data into $n$ subsamples. Find the maximum of each subsample. Compare the mean of those maxima to the overall maximum:

$$\kappa = \frac{\text{mean of subsample maxima}}{\text{overall maximum}}$$

For a Gaussian distribution, this ratio converges to a known benchmark (approximately $1/\sqrt{n}$ for large samples). For fat-tailed distributions, $\kappa$ falls below the benchmark because the single largest observation dwarfs the maxima of the other subsamples.

The ratio $\kappa / \text{benchmark}$ is the signal:

- Near 1.0: behaves Gaussian
- Below 0.8: significantly fat-tailed
- Below 0.5: extremely fat-tailed, crisis regime

In practice: Kappa was the best tail-based method in our tests (49% overall detection rate). It’s more robust than Hill because it doesn’t depend on choosing $k$, and it directly benchmarks against Gaussian via Monte Carlo simulation.
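
The ratio and its Monte Carlo benchmark can be sketched as follows. This follows the definition given in this post (mean of subsample maxima over the overall maximum), not necessarily the fatcrash implementation. The function names, the use of absolute values, the consecutive equal-sized subsamples, and the default of 20 subsamples and 100 simulations are all assumptions made for illustration.

```python
import random


def max_ratio(data, n_sub):
    """Mean of subsample maxima divided by the overall maximum, on |x|."""
    values = [abs(x) for x in data]
    size = len(values) // n_sub
    if size < 1:
        raise ValueError("not enough data for n_sub subsamples")
    # Consecutive, equally sized subsamples; any leftover tail is dropped.
    maxima = [max(values[i * size:(i + 1) * size]) for i in range(n_sub)]
    return (sum(maxima) / n_sub) / max(maxima)


def gaussian_benchmark(n_obs, n_sub, n_sims=100, seed=0):
    """Monte Carlo average of max_ratio for standard normal samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        sim = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        total += max_ratio(sim, n_sub)
    return total / n_sims


def kappa_ratio(data, n_sub=20, n_sims=100, seed=0):
    """kappa / benchmark: near 1 looks Gaussian, well below 1 is fat-tailed."""
    benchmark = gaussian_benchmark(len(data), n_sub, n_sims, seed)
    return max_ratio(data, n_sub) / benchmark


# Synthetic comparison: symmetric Pareto-tailed "returns" with tail index 0.8.
rng = random.Random(1)
fat = [rng.choice((-1, 1)) * rng.paretovariate(0.8) for _ in range(4000)]
fat_ratio = kappa_ratio(fat)
```

Simulating the benchmark at the same sample size and subsample count as the data avoids relying on an asymptotic formula, at the cost of some runtime.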

#EVT (Extreme Value Theory): quantifying worst-case scenarios

EVT is the standard framework for modeling tail risk. Instead of fitting a distribution to all the data (where the bulk dominates), EVT focuses only on the extremes.

Two complementary approaches:

GPD (Generalized Pareto Distribution): Pick a high threshold $u$. Fit the GPD to losses that exceed $u$. The Pickands-Balkema-de Haan theorem guarantees that for sufficiently high $u$, the exceedances follow a GPD regardless of the underlying distribution. The GPD has two parameters: scale ($\sigma$) and shape ($\xi$). From these you get:

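$$\text{VaR}_p = u + \frac{\sigma}{\xi}\left[\left(\frac{N_u}{n(1-p)}\right)^\xi - 1\right]$$

where $n$ is the total number of observations and $N_u$ is the number of observations that exceed $u$.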

$$\text{ES}_p = \frac{\text{VaR}_p + \sigma - \xi u}{1 - \xi}$$

VaR tells you the loss you won’t exceed with probability $p$. ES tells you the average loss when you do exceed VaR.
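
As a worked example, the sketch below evaluates both quantities from already-fitted GPD parameters. It uses the standard peaks-over-threshold quantile, in which the bracketed term is $\left(N_u / (n(1-p))\right)^\xi$, so that $\text{VaR}_p = u$ when $1 - p = N_u / n$. The function names and the numbers in the usage lines are made up for illustration, and fitting $\sigma$ and $\xi$ is left out.

```python
import math


def gpd_var(u, sigma, xi, n, n_exceed, p):
    """VaR at confidence p from a GPD fitted to exceedances over u."""
    ratio = n_exceed / (n * (1.0 - p))
    if abs(xi) < 1e-12:
        # Limit xi -> 0: the GPD reduces to an exponential tail.
        return u + sigma * math.log(ratio)
    return u + (sigma / xi) * (ratio ** xi - 1.0)


def gpd_es(var_p, u, sigma, xi):
    """Expected shortfall beyond var_p; only finite when xi < 1."""
    if xi >= 1.0:
        raise ValueError("expected shortfall is infinite for xi >= 1")
    return (var_p + sigma - xi * u) / (1.0 - xi)


# Illustrative inputs: 1000 daily losses, 100 of them above a 2% threshold.
var_99 = gpd_var(u=0.02, sigma=0.01, xi=0.25, n=1000, n_exceed=100, p=0.99)
es_99 = gpd_es(var_99, u=0.02, sigma=0.01, xi=0.25)
```

With these inputs the 99% VaR comes out at about 5.1% and the ES at about 7.5%. The gap between the two widens quickly as $\xi$ approaches 1.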

GEV (Generalized Extreme Value): Instead of exceedances, fit to block maxima (e.g., the worst loss each month). The shape parameter $\xi$ tells you the tail type:

- $\xi > 0$: Frechet (fat tail, power-law decay), typical for finance
- $\xi \approx 0$: Gumbel (exponential tail)
- $\xi < 0$: Weibull (bounded tail)
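
The block-maxima step and the reading of $\xi$ can be sketched like this. The helper names and the tolerance used to call a shape "approximately zero" are assumptions for illustration, and the GEV fit itself is not shown. If you fit with SciPy's `genextreme`, note that its shape parameter `c` has the opposite sign to $\xi$.

```python
def block_maxima(losses, block_size):
    """Largest loss in each full block; a trailing partial block is dropped."""
    last_start = len(losses) - block_size
    return [
        max(losses[i:i + block_size])
        for i in range(0, last_start + 1, block_size)
    ]


def tail_type(xi, tol=0.05):
    """Name the GEV family implied by the shape parameter xi."""
    if xi > tol:
        return "Frechet"
    if xi < -tol:
        return "Weibull"
    return "Gumbel"


# Toy example: seven losses grouped into blocks of three.
maxima = block_maxima([1, 5, 2, 7, 3, 4, 9], 3)
```

Block size trades bias against variance: larger blocks are closer to the asymptotic GEV regime but leave fewer maxima to fit.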

In practice: GPD VaR detected 42% of drawdowns. It works well for medium corrections but struggles with major crashes because the pre-crash period is itself volatile, making the baseline VaR already elevated.

#LPPLS: detecting bubbles before they burst

The Log-Periodic Power Law Singularity model takes a different approach. Instead of measuring statistical properties of returns, it detects a specific pattern in prices: the bubble signature.

The theory (Didier Sornette, ETH Zurich): during a bubble, prices follow super-exponential growth decorated with accelerating oscillations that converge toward a critical time $t_c$, the most likely crash date.

$$\ln p(t) = A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \ln(t_c - t) + \phi)$$

The seven parameters encode specific physics:

- $t_c$: critical time (when the bubble is most likely to end)
- $m$: power law exponent (must be 0.1-0.9 for a valid bubble)
- $\omega$: log-periodic frequency (must be 2-25)
- $B < 0$: required, indicates super-exponential growth
- $A, C, \phi$: amplitude and phase parameters

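Fitting this is computationally expensive. Writing $C\cos(\omega \ln(t_c - t) + \phi)$ as $C_1\cos(\omega \ln(t_c - t)) + C_2\sin(\omega \ln(t_c - t))$, with $C_1 = C\cos\phi$ and $C_2 = -C\sin\phi$, makes the model linear in $(A, B, C_1, C_2)$. For each candidate $(t_c, m, \omega)$, these four linear parameters are solved analytically via OLS. The nonlinear search uses a population-based stochastic optimizer over the three-dimensional $(t_c, m, \omega)$ space. The Sornette filter rejects fits that don’t satisfy the physical constraints.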
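
The linear step can be sketched in plain Python. For fixed $(t_c, m, \omega)$ the regressors are $1$, $(t_c - t)^m$, $(t_c - t)^m \cos(\omega \ln(t_c - t))$ and $(t_c - t)^m \sin(\omega \ln(t_c - t))$, and the coefficients come from the normal equations. This is an illustration, not the fatcrash code: the function names are made up, and a real implementation would more likely call a linear algebra library than the hand-written solver used here to keep the example dependency-free.

```python
import math


def lppls_design_row(t, tc, m, omega):
    """Regressors [1, f, g, h] for one observation time t < tc."""
    dt = tc - t
    power = dt ** m
    phase = omega * math.log(dt)
    return [1.0, power, power * math.cos(phase), power * math.sin(phase)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear_params(times, log_prices, tc, m, omega):
    """OLS estimate of (A, B, C1, C2) and the sum of squared residuals."""
    rows = [lppls_design_row(t, tc, m, omega) for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_prices)) for i in range(4)]
    coef = solve_linear(xtx, xty)
    sse = sum(
        (y - sum(c * x for c, x in zip(coef, r))) ** 2
        for r, y in zip(rows, log_prices)
    )
    return coef[0], coef[1], coef[2], coef[3], sse


# Synthetic check: noise-free log-prices generated from known parameters.
tc, m, omega = 1.05, 0.5, 8.0
times = [i / 200 for i in range(200)]
true_params = (2.0, -0.8, 0.05, -0.03)
log_prices = [
    sum(c * x for c, x in zip(true_params, lppls_design_row(t, tc, m, omega)))
    for t in times
]
a, b, c1, c2, sse = fit_linear_params(times, log_prices, tc, m, omega)
```

The outer optimizer then only has to minimize `sse` over three nonlinear parameters instead of seven, which is what makes the search tractable.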

The DS LPPLS confidence indicator fits this model across many overlapping time windows. If a high fraction of windows produce valid bubble fits, confidence is high.
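
A sketch of the indicator as a simple fraction of qualifying windows. The filter uses only the three constraints listed above ($m$, $\omega$, and the sign of $B$). The dictionary layout of a fit result and the function names are assumptions for illustration, and production filters typically add further conditions.

```python
def is_valid_bubble(m, omega, b):
    """Constraints from the parameter list: m in [0.1, 0.9], omega in [2, 25], B < 0."""
    return 0.1 <= m <= 0.9 and 2.0 <= omega <= 25.0 and b < 0.0


def bubble_confidence(window_fits):
    """Fraction of fitted windows that pass the filter (0.0 if there are none)."""
    if not window_fits:
        return 0.0
    valid = sum(
        1 for fit in window_fits
        if is_valid_bubble(fit["m"], fit["omega"], fit["b"])
    )
    return valid / len(window_fits)


# Three hypothetical window fits: the second fails on m, the third on B.
fits = [
    {"m": 0.45, "omega": 7.9, "b": -0.6},
    {"m": 0.95, "omega": 7.9, "b": -0.6},
    {"m": 0.45, "omega": 7.9, "b": 0.2},
]
confidence = bubble_confidence(fits)
```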

In practice: LPPLS had a 97% detection rate across all drawdown sizes. It detects the bubble regime itself (super-exponential growth + oscillations), which precedes both small corrections and major crashes.

#Results

We tested all four methods on 35 drawdowns across three assets:

| Method | Small (<15%) | Medium (15-30%) | Major (>30%) | Overall |
|---|---|---|---|---|
| LPPLS | 100% | 100% | 88% | 97% |
| Kappa | 57% | 47% | 38% | 49% |
| GPD VaR | 40% | 55% | n/a | 42% |
| Hill | 29% | 29% | 25% | 28% |

#Major known crashes (detailed)

Testing on four major crashes with pre-crash vs. calm period comparison:

| Crash | Kappa | GPD VaR | LPPLS | Hill |
|---|---|---|---|---|
| 2017 BTC Bubble | detected | detected | detected | missed |
| 2021 BTC Crash | detected | detected | detected | missed |
| 2008 Financial Crisis | detected | detected | detected | detected |
| COVID Crash 2020 | detected | n/a | detected | missed |

Kappa and LPPLS detected all four crashes. GPD VaR detected the three for which it has a result (no value is reported for the COVID crash). Hill detected one of four (25%).

#Why Hill underperforms

Hill measures the tail index of the return distribution, but this property changes slowly. A 6-month pre-crash window doesn’t necessarily have fatter tails than a 6-month calm window, because the calm window might include its own mini-shocks. The Hill estimator is useful for long-term characterization (this asset has alpha=3, that one has alpha=4) but not for short-term prediction.

#Why LPPLS dominates

LPPLS detects structure, not statistics. It’s looking for a specific pattern: accelerating growth with log-periodic oscillations. This pattern appears before both 10% corrections and 80% crashes. The tail-based methods need to see the tail thickening, which requires the crash to already be underway. LPPLS sees the bubble building.

The downside: LPPLS has a high false positive rate. It frequently detects “bubble signatures” during normal bull markets. The solution is combining it with the tail-based methods. If LPPLS says “bubble” and kappa says “tails thickening” and VaR is elevated, the signal is more reliable.
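
One way to express that combination is a simple vote across the three signals. Everything specific here is an assumption for illustration: the function name, the 0.5 confidence cutoff, and the rule that VaR counts as elevated at 1.5 times a baseline. Only the 0.8 kappa cutoff echoes the "significantly fat-tailed" level mentioned earlier.

```python
def combined_signal(lppls_confidence, kappa_ratio, var_now, var_baseline,
                    conf_threshold=0.5, kappa_threshold=0.8, var_multiple=1.5):
    """Count how many of the three signals agree; thresholds are illustrative."""
    votes = {
        "lppls": lppls_confidence >= conf_threshold,
        "kappa": kappa_ratio < kappa_threshold,
        "var": var_now > var_multiple * var_baseline,
    }
    return sum(votes.values()), votes


# LPPLS alone fires in the first case; all three agree in the second.
weak_count, weak_votes = combined_signal(0.7, 0.95, 0.04, 0.04)
strong_count, strong_votes = combined_signal(0.7, 0.60, 0.09, 0.04)
```

Requiring agreement trades some detection rate for fewer false alarms, so the cutoffs would need to be tuned on data that was not used to pick them.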

#Long timescales: 54 years of forex

We tested on GBP/USD daily data from 1971 to 2025 (13,791 trading days):

| Decade | Hill alpha | Kappa/benchmark | Notable |
|---|---|---|---|
| 1970s | 2.92 | 0.78 | Oil crises, IMF bailout |
| 1980s | 4.36 | 0.68 | Plaza Accord |
| 1990s | 4.51 | 0.83 | Black Wednesday |
| 2000s | 2.90 | 0.57 | 2008 crisis |
| 2010s | 3.86 | 0.35 | Brexit |
| 2020s | 3.39 | 0.71 | Truss mini-budget |

Every decade shows fat tails. Every known GBP/USD crisis was detected (6/6):

- 1976 IMF Crisis
- 1985 Plaza Accord
- 1992 Black Wednesday
- 2008 Financial Crisis
- 2016 Brexit Vote
- 2022 Truss Mini-Budget

#All methods on daily forex (1971-2025)

We ran Hill, Kappa, GPD, GEV, and LPPLS on 25 currency pairs from FRED daily data. Ten of them are shown here, sorted by Hill $\alpha$:

| Pair | N days | Hill $\alpha$ | $\kappa$/bench | GPD $\xi$ | VaR 99% | GEV $\xi$ | LPPLS |
|---|---|---|---|---|---|---|---|
| VEF/USD | 6,511 | 1.20 | 0.28 | -0.46 | 253.7% | 1.00 | bubble |
| HKD/USD | 11,290 | 1.73 | 0.18 | 0.56 | 26.3% | 1.00 | bubble |
| KRW/USD | 11,176 | 1.90 | 0.29 | -0.37 | 4.3% | 0.37 | bubble |
| MXN/USD | 8,058 | 2.04 | 0.36 | -0.34 | 4.6% | 0.28 | bubble |
| AUD/USD | 13,783 | 2.58 | 0.46 | -0.37 | 4.2% | 0.15 | bubble |
| INR/USD | 13,282 | 2.62 | 0.40 | 0.22 | 10.4% | 0.28 | bubble |
| BRL/USD | 7,773 | 2.80 | 0.63 | 0.39 | 30.2% | 0.11 | bubble |
| CAD/USD | 13,796 | 3.84 | 0.52 | 0.20 | 6.0% | 0.16 | bubble |
| GBP/USD | 13,790 | 4.13 | 0.53 | 0.17 | 7.8% | 0.03 | bubble |
| EUR/USD | 6,769 | 4.88 | 0.62 | -0.05 | 3.4% | 0.02 | bubble |

All 25 pairs show fat tails. Every kappa ratio is below 1 (below the Gaussian benchmark). GEV shape parameter $\xi > 0$ for nearly all pairs, confirming Frechet-type fat tails. LPPLS detected bubble signatures in 14 of 25 pairs.

Venezuela (VEF/USD) is the extreme case: $\alpha = 1.20$ with a 99% VaR of 253.7%. On the worst 1% of days, you’d lose more than 2.5x your position.

#500 years, 138 countries

Using the Clio Infra exchange rate dataset (1500-2013), we ran Hill, Kappa, and GEV on every country with 50+ years of data.

Results:

| Tail regime | Countries | Percentage |
|---|---|---|
| $\alpha < 2$ (infinite variance) | 98 | 71% |
| $\alpha$ 2-4 (fat tails, finite variance) | 37 | 27% |
| $\alpha > 4$ (moderate tails) | 3 | 2% |

71% of countries have exchange rate distributions with infinite variance. The median $\alpha$ across all 138 countries is 1.57. GEV confirms this: 81% of countries show Frechet-type (fat) tails with median $\xi = 0.76$.
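
The bucketing behind a regime table like this is a simple threshold rule on $\alpha$. The sketch below applies it to the ten Hill estimates from the daily forex table rather than to the Clio Infra data, which is not reproduced here. The function names are illustrative, and treating the boundaries as $\alpha < 2$, $2 \le \alpha \le 4$ and $\alpha > 4$ is an assumption.

```python
from collections import Counter


def tail_regime(alpha):
    """Map a tail index to the three regimes used in the table."""
    if alpha < 2.0:
        return "infinite variance"
    if alpha <= 4.0:
        return "fat tails, finite variance"
    return "moderate tails"


def regime_shares(alphas):
    """Share of estimates falling in each regime."""
    counts = Counter(tail_regime(a) for a in alphas)
    total = len(alphas)
    return {regime: count / total for regime, count in counts.items()}


# Hill alphas of the ten currency pairs listed in the daily forex table.
daily_alphas = [1.20, 1.73, 1.90, 2.04, 2.58, 2.62, 2.80, 3.84, 4.13, 4.88]
shares = regime_shares(daily_alphas)
```

On those ten daily series, three fall in the infinite-variance bucket, compared with 98 of 138 in the annual long-run data.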

The most extreme cases:

| Country | Years of data | Hill $\alpha$ | Worst year | Best year |
|---|---|---|---|---|
| Syria | 61 | 0.32 | -2% | +105% |
| Iraq | 61 | 0.40 | -38% | +883% |
| Germany | 153 | 0.52 | -2,748% | +2,104% |
| Nicaragua | 76 | 0.52 | -2,159% | +787% |
| Zimbabwe | 56 | 0.55 | -12% | +1,345% |
| Hungary | 66 | 0.56 | -944% | +247% |
| Peru | 64 | 0.72 | -1,660% | +426% |
| Bolivia | 63 | 0.76 | -1,794% | +494% |
| Brazil | 129 | 1.03 | -3,536% | +318% |
| Argentina | 102 | 1.28 | -2,748% | +388% |

Germany’s -2,748% in a single year is the Weimar hyperinflation. Brazil’s -3,536% reflects the cruzeiro collapse. These aren’t outliers. They’re exactly what a distribution with $\alpha$ near or below 1 predicts.

#Century-by-century: United Kingdom (1789-2013)

The UK has 224 years of continuous exchange rate data:

| Century | Hill $\alpha$ | Regime |
|---|---|---|
| 1800s | 1.19 | Infinite variance (Napoleonic wars, banking crises) |
| 1900s | 3.17 | Fat but finite (Bretton Woods stability) |
| 2000s | 2.04 | Back to borderline infinite variance |

Even within a single country, tail regimes shift across centuries.

#Inflation: 500 years, 82 countries

Inflation data from Clio Infra (1500-2010):

| Statistic | Value |
|---|---|
| Countries analyzed | 82 |
| Countries with hyperinflation (>100%/yr) | 32 (39%) |
| Countries with $\alpha < 2$ | 36 (44%) |
| Median $\alpha$ | 2.14 |

The most extreme inflation tails:

| Country | Years | Hill $\alpha$ | Max inflation |
|---|---|---|---|
| Nicaragua | 71 | 0.30 | 13,110%/yr |
| Zimbabwe | 83 | 0.44 | 24,411%/yr |
| Germany | 494 | 0.57 | 211,427,400,000%/yr |
| Brazil | 226 | 0.63 | 2,948%/yr |
| Peru | 363 | 0.80 | 7,482%/yr |
| Argentina | 274 | 0.85 | 3,079%/yr |
| China | 336 | 0.86 | 1,579%/yr |
| Poland | 414 | 0.97 | 4,738%/yr |

Germany has 494 years of inflation data with $\alpha = 0.57$. Its maximum annual inflation was 211 billion percent (Weimar 1923). With $\alpha < 1$, neither the mean nor the variance of this distribution converges. You cannot compute a confidence interval. You cannot build a VaR model. The standard toolkit breaks down.
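
A small simulation makes the non-convergence concrete. It draws from a pure Pareto distribution with $\alpha = 0.57$ and compares the running sample mean with that of a Pareto with $\alpha = 3$, whose true mean is $\alpha / (\alpha - 1) = 1.5$. The Pareto form, the seed, and the checkpoints are assumptions chosen for illustration. This is not the inflation data itself.

```python
import random


def running_means(values, checkpoints):
    """Sample mean after each checkpoint count of observations."""
    targets = set(checkpoints)
    out, total = [], 0.0
    for i, v in enumerate(values, start=1):
        total += v
        if i in targets:
            out.append((i, total / i))
    return out


rng = random.Random(42)
checkpoints = [1_000, 10_000, 100_000]

# alpha = 0.57: the theoretical mean is infinite.
heavy = [rng.paretovariate(0.57) for _ in range(100_000)]
# alpha = 3: the theoretical mean is 1.5 and the variance is finite.
light = [rng.paretovariate(3.0) for _ in range(100_000)]

heavy_means = running_means(heavy, checkpoints)
light_means = running_means(light, checkpoints)
```

The $\alpha = 3$ mean settles near 1.5, while the $\alpha = 0.57$ mean is dominated by whichever single draw happens to be largest so far and keeps jumping as the sample grows.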

#Conclusions

- Fat tails are universal. Every asset, timescale, and country we tested shows them, from daily BTC returns to 500 years of exchange rates. 71% of countries have exchange rate distributions with infinite variance. This isn’t an anomaly. It’s the default state of financial markets.
- LPPLS is the best single crash detector (97%) because it detects bubble structure, not tail statistics. But it has false positives. It flags bubble signatures during normal bull markets too.
- Kappa is the best tail-based method (49%). More robust than Hill because it benchmarks against Gaussian rather than estimating a single parameter.
- The multi-method approach works. LPPLS catches bubbles, kappa catches tail regime changes, EVT quantifies worst-case losses, Hill tracks long-term tail behavior. No single method is sufficient.
- Standard risk models are wrong for most countries. Modern Portfolio Theory, CAPM, Black-Scholes: all assume finite variance. For 71% of the countries in the exchange rate dataset, this assumption is empirically false. The math doesn’t just give wrong answers; it gives answers to the wrong question.
- Hyperinflation isn’t rare. 39% of countries experienced >100% annual inflation at some point. Germany’s 211 billion percent is extreme, but dozens of countries experienced four- and five-digit inflation. Any model that treats these as “outliers” is a model that doesn’t understand the data it’s modeling.

The code is open source: github.com/unbalancedparentheses/fatcrash. The forex data comes from forex-centuries.
