Detecting Crashes with Fat-Tail Statistics
Financial markets don’t follow normal distributions. This isn’t a theoretical curiosity. It’s a statement about how often catastrophic events happen. Under a Gaussian model, the 2008 financial crisis was a 25-sigma event. Something that should happen once every $10^{135}$ years. It happened on a Tuesday.
The problem is that we keep using tools designed for thin-tailed worlds. Value at Risk (VaR) models that assume normality. Risk metrics that treat the 2008 crash as an “outlier” rather than a regular feature of financial returns.
I built fatcrash, a Rust+Python toolkit with 13 methods, to test whether fat-tail statistical methods can detect crashes before they happen. The performance-critical math (fitting, simulation, all rolling estimators) runs in Rust via PyO3; everything else (data, viz, CLI) is Python.
#What are fat tails?
A fat-tailed distribution is one where extreme events happen far more often than a bell curve (Gaussian distribution) would predict. In a normal distribution, an event five standard deviations from the mean is essentially impossible, roughly a one-in-3.5-million chance. In a fat-tailed distribution, such events are uncommon but far from impossible. They show up regularly in financial data.
The technical way to describe this is through the tail index, usually written $\alpha$. A fat-tailed distribution follows a power law in the extremes: the probability of a loss larger than $x$ decays as $P(X > x) \sim x^{-\alpha}$. The smaller the $\alpha$, the fatter the tail and the more likely extreme events are.
Here is a rough guide to what different values of $\alpha$ mean:
- $\alpha < 2$: Infinite variance. The distribution is so fat-tailed that the variance doesn’t converge with more data. Standard statistics like standard deviation and correlation become unreliable. This is Cauchy distribution territory.
- $\alpha$ between 2 and 4: Finite variance but infinite kurtosis. Kurtosis measures how “peaked” a distribution is and how heavy its tails are. When kurtosis is infinite, sample estimates of it are unstable and misleading. This is where most financial assets live.
- $\alpha > 4$: Relatively thin tails. Still fatter than Gaussian, but manageable with conventional tools.
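To make the gap concrete, here is a quick closed-form comparison (a sketch using SciPy; the Student-t with 3 degrees of freedom is an assumed stand-in for a fat-tailed asset with $\alpha \approx 3$, not a fatcrash model):

```python
# How often is a 5-sigma day under thin vs. fat tails?
# Student-t with 3 degrees of freedom stands in for a fat-tailed
# return distribution (tail index alpha ~ 3, typical for equities).
from scipy import stats

p_normal = stats.norm.sf(5)     # P(X > 5 sigma) under a Gaussian
p_t3 = stats.t.sf(5, df=3)      # same threshold under Student-t(3)

print(f"Gaussian:     1 in {1 / p_normal:,.0f} days")
print(f"Student-t(3): 1 in {1 / p_t3:,.0f} days")
```

The Gaussian puts a 5-sigma day at roughly one in 3.5 million; the Student-t(3) makes it tens of thousands of times more likely.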
The 13 methods in fatcrash fall into three categories: bubble detection (finding the specific price pattern that precedes a crash), regime detection (spotting shifts in how the market behaves over time), and tail estimation (measuring how fat the tails actually are). Let’s walk through each group.
#Bubble detection
These methods look for structural patterns in prices, not statistics of returns. A bubble is a regime of super-exponential growth (prices rising faster and faster) that eventually becomes unsustainable.
#LPPLS: detecting bubbles before they burst
The Log-Periodic Power Law Singularity model takes a fundamentally different approach from statistical methods. Instead of measuring properties of returns, it detects a specific pattern in prices: the bubble signature.
The theory, developed by Didier Sornette at ETH Zurich and described in his book Why Stock Markets Crash, says that during a bubble, prices follow super-exponential growth decorated with accelerating oscillations that converge toward a critical time $t_c$, the most likely crash date. Think of it like a wine glass vibrating at increasing frequency before it shatters.
$$\ln p(t) = A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \ln(t_c - t) + \phi)$$
In plain language: the logarithm of price is the sum of a smooth power-law growth (the $B$ term) and an oscillation whose frequency accelerates as you approach $t_c$ (the cosine term). The seven parameters encode specific dynamics:
- $t_c$: critical time (when the bubble is most likely to end)
- $m$: power law exponent (must be 0.1-0.9 for a valid bubble)
- $\omega$: log-periodic frequency (must be 6-13)
- $B < 0$: required, indicates super-exponential growth
- $A, C, \phi$: amplitude and phase parameters
Fitting this is computationally expensive. Rewriting the oscillatory term as $C_1 \cos(\omega \ln(t_c - t)) + C_2 \sin(\omega \ln(t_c - t))$ makes the model linear in four parameters, so for each candidate $(t_c, m, \omega)$ the linear parameters $(A, B, C_1, C_2)$ can be solved analytically via OLS. The nonlinear search uses a population-based stochastic optimizer over the remaining 3D space. The Sornette filter rejects fits that don’t satisfy the physical constraints.
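The inner OLS step can be sketched in a few lines. This is an illustration of the reparametrized linear solve under assumed candidate values of $(t_c, m, \omega)$, not the fatcrash implementation; the full fit wraps this inside a stochastic search and the Sornette filter:

```python
import numpy as np

def lppls_linear_fit(t, log_price, tc, m, omega):
    """Given candidate nonlinear parameters (tc, m, omega), solve the
    linear parameters (A, B, C1, C2) by ordinary least squares, using
    the reparametrization
      C*cos(w*ln(tc-t) + phi) = C1*cos(w*ln(tc-t)) + C2*sin(w*ln(tc-t))."""
    dt = tc - t                          # time to critical point, must be > 0
    f = dt ** m
    g = f * np.cos(omega * np.log(dt))
    h = f * np.sin(omega * np.log(dt))
    X = np.column_stack([np.ones_like(t), f, g, h])
    coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
    resid = log_price - X @ coef
    return coef, float(resid @ resid)    # (A, B, C1, C2), sum of squared errors

# Noiseless synthetic bubble with known parameters: A=1, B=-0.8, C1=0.05
t = np.linspace(0.0, 9.0, 500)
dt = 10.0 - t
y = 1.0 - 0.8 * dt**0.5 + 0.05 * dt**0.5 * np.cos(8.0 * np.log(dt))
(A, B, C1, C2), sse = lppls_linear_fit(t, y, tc=10.0, m=0.5, omega=8.0)
```

On real prices, the SSE returned by this inner solve is what the outer search over $(t_c, m, \omega)$ minimizes.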
The DS LPPLS confidence indicator fits this model across many overlapping time windows. If a high fraction of windows produce valid bubble fits, confidence is high.
In practice: LPPLS had a 100% detection rate across all drawdown sizes. It detects the bubble regime itself (super-exponential growth + oscillations), which precedes both small corrections and major crashes. The downside is a high false positive rate: it frequently detects “bubble signatures” during normal bull markets.
#GSADF: explosive unit roots
The Generalized Sup ADF (Augmented Dickey-Fuller) test, introduced by Phillips, Shi, and Yu (2015), detects explosive behavior in prices. To understand it, you need one concept: a unit root. A time series has a unit root if it follows a random walk, meaning today’s value is yesterday’s value plus random noise, with no tendency to return to a long-run average. An explosive series goes further: each day’s value is a multiple (greater than one) of yesterday’s, like compound interest gone wild, so it grows faster than any random walk.
GSADF runs backward-expanding ADF unit root tests across all possible start and end dates, taking the supremum (the largest test statistic). If this supremum exceeds Monte Carlo critical values, the series is explosive.
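A minimal sketch of that recursion, with a simplified ADF regression (no lag augmentation) and without the Monte Carlo critical values the real test requires:

```python
import numpy as np

def adf_tstat(y):
    """ADF t-statistic with a constant and no lag augmentation:
    regress dy_t on [1, y_{t-1}] and return the t-stat on y_{t-1}."""
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    s2 = resid @ resid / (len(dy) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se

def gsadf(y, min_window=40):
    """Supremum of ADF statistics over all windows of length >= min_window."""
    best = -np.inf
    for end in range(min_window, len(y) + 1):
        for start in range(end - min_window + 1):
            best = max(best, adf_tstat(y[start:end]))
    return best

rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=200))        # unit root: not explosive
expl = [0.1]
for _ in range(199):                          # rho = 1.03 > 1: explosive
    expl.append(1.03 * expl[-1] + rng.normal(scale=0.05))
stat_walk, stat_expl = gsadf(walk), gsadf(np.array(expl))
```

The explosive series produces a large positive sup statistic; the random walk does not. In production, the sup is compared against simulated critical values rather than a fixed cutoff.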
GSADF is complementary to LPPLS. LPPLS detects the specific log-periodic oscillation pattern. GSADF detects any form of explosive growth, regardless of the oscillation structure.
In practice: GSADF detected 38% of drawdowns overall, but 59% of medium-sized drawdowns (15-30%). It is more conservative than LPPLS and has fewer false positives.
#Regime detection
These methods look at how the temporal structure of returns changes over time. Before a crash, markets often shift from noisy, mean-reverting behavior to strong, persistent trending, a regime change that tail estimators can’t see because they only measure distributional shape, not temporal dependence.
#DFA: detrended fluctuation analysis
Detrended Fluctuation Analysis, introduced by Peng et al. (1994), measures long-range dependence in non-stationary time series: whether today’s returns are correlated with returns from days or weeks ago. The method divides the integrated series (the cumulative sum of returns) into windows, fits a local polynomial trend in each window, computes the root-mean-square residual (how much the data deviates from the local trend), and checks how that residual scales with window size:
$$F(n) \sim n^{\alpha_{\text{DFA}}}$$
This formula says: the fluctuation $F$ at scale $n$ (window size) grows as a power of $n$. The scaling exponent $\alpha_{\text{DFA}}$ classifies the dynamics:
- $\alpha_{\text{DFA}} = 0.5$: uncorrelated (random walk), no memory in the series
- $\alpha_{\text{DFA}} > 0.5$: persistent (trends tend to continue), an up day makes another up day more likely
- $\alpha_{\text{DFA}} < 0.5$: anti-persistent (mean-reverting), an up day makes a down day more likely
Before a crash, markets often transition from mean-reverting to persistent dynamics. DFA picks up this regime shift. The key advantage over simpler methods is the detrending step: by removing local polynomial trends before measuring fluctuations, DFA separates genuine long-range dependence from spurious correlations caused by local trends.
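The procedure is short enough to sketch directly. This is a minimal DFA with linear detrending, not the fatcrash rolling implementation (which runs in Rust):

```python
import numpy as np

def dfa_exponent(returns, order=1):
    """DFA scaling exponent: slope of log F(n) against log n."""
    y = np.cumsum(returns - np.mean(returns))     # integrated profile
    scales = np.unique(np.logspace(1, np.log10(len(y) // 4), 12).astype(int))
    F = []
    for n in scales:
        segs = len(y) // n
        x = np.arange(n)
        ms = []
        for i in range(segs):
            seg = y[i * n:(i + 1) * n]
            trend = np.polyval(np.polyfit(x, seg, order), x)  # local detrend
            ms.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(ms)))            # RMS fluctuation at scale n
    slope, _ = np.polyfit(np.log(scales), np.log(F), 1)
    return slope

rng = np.random.default_rng(1)
a_noise = dfa_exponent(rng.normal(size=4000))            # ~0.5: no memory
a_walk = dfa_exponent(np.cumsum(rng.normal(size=4000)))  # ~1.5: strong persistence
```

White noise lands near 0.5, while feeding in a random walk as if it were a return series yields a much larger exponent, the kind of separation the crash signal relies on.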
In practice: DFA was the best non-bubble crash detector in our tests (82% overall detection rate). It handles non-stationarity better than Hurst’s R/S analysis because the detrending step removes local polynomial trends before measuring fluctuations.
#Hurst exponent: persistence detection
The Hurst exponent, introduced by Harold Edwin Hurst in 1951 while studying Nile river flooding patterns, measures long-range dependence via rescaled range (R/S) analysis. For a time series of length $n$, compute the range of cumulative deviations from the mean, rescale by the standard deviation, and measure how $R/S$ scales with $n$:
$$\frac{R}{S} \sim n^H$$
This formula says: the rescaled range grows as a power of the sample size. $H = 0.5$ is a random walk. $H > 0.5$ is persistent. $H < 0.5$ is mean-reverting. Financial assets typically show $H$ between 0.55 and 0.85. A shift toward higher $H$ before a crash means the market is trending more strongly, which often accompanies bubble formation.
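A minimal R/S sketch follows. Note that the naive estimator has a known small-sample upward bias for uncorrelated data; production implementations apply corrections such as Anis-Lloyd, which this sketch omits:

```python
import numpy as np

def hurst_rs(x, min_chunk=16):
    """Hurst exponent via rescaled range: slope of log(R/S) vs log(n).
    Naive version (no small-sample bias correction)."""
    n = len(x)
    sizes = np.unique(np.logspace(np.log10(min_chunk),
                                  np.log10(n // 2), 10).astype(int))
    rs = []
    for size in sizes:
        vals = []
        for start in range(0, n - size + 1, size):
            seg = x[start:start + size]
            dev = np.cumsum(seg - seg.mean())    # cumulative deviations
            r = dev.max() - dev.min()            # range
            s = seg.std(ddof=1)                  # rescale by std dev
            if s > 0:
                vals.append(r / s)
        rs.append(np.mean(vals))
    H, _ = np.polyfit(np.log(sizes), np.log(rs), 1)
    return H

rng = np.random.default_rng(2)
H_white = hurst_rs(rng.normal(size=4000))   # near 0.5 (slightly above, due to bias)
```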
In practice: Hurst detected 59% of drawdowns. It is simpler than DFA but less robust to non-stationarity, because it doesn’t remove local trends before measuring the range.
#Spectral exponent: frequency domain
The GPH log-periodogram regression, introduced by Geweke and Porter-Hudak (1983), estimates the long-memory parameter $d$ from the frequency domain. Instead of looking at how correlations decay over time (as DFA and Hurst do), it looks at how much power the signal has at different frequencies.
The relationship to Hurst: $d = H - 0.5$. Positive $d$ indicates long memory (persistence). Think of it as measuring the same phenomenon (long-range dependence) but through a different lens: time domain vs. frequency domain.
In practice: It detected 28% of drawdowns, comparable to Hill. The frequency-domain approach is theoretically elegant but doesn’t add much beyond what DFA already captures for crash detection.
#Tail estimation
These methods directly measure the fatness of the tails, how extreme the extremes really are. They answer questions like: “How often should we expect a 10% daily loss?” and “Is the variance of this distribution even finite?”
#Hill estimator: measuring tail heaviness
The Hill estimator (Hill, 1975) is the most widely used tail index estimator. It fits a power law to the extreme values of a distribution and estimates the exponent $\alpha$.
The estimator works by sorting the data from largest to smallest, taking the $k$ largest observations (called order statistics, which is just a fancy name for sorted values), and computing:
$$\hat{\alpha} = k \left( \sum_{i=1}^{k} \ln \frac{X_{(i)}}{X_{(k)}} \right)^{-1}$$
where $X_{(1)} \geq X_{(2)} \geq \ldots$ are the order statistics. In words: take the $k$ biggest values, compute how far each is from the $k$-th largest (in log scale), average those distances, and invert. A small average log-gap means the extreme values are tightly packed (thin tail), and a large gap means they’re spread out (fat tail).
The choice of $k$ matters enormously. Too small and the estimate is noisy (not enough data points). Too large and you’re including observations from the body of the distribution, not the tail. A Hill plot ($\alpha$ vs. $k$) helps find the plateau where the estimate stabilizes.
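The estimator itself is a few lines; here is a sketch checked on a synthetic Pareto sample with known $\alpha = 3$:

```python
import numpy as np

def hill_alpha(x, k):
    """Hill estimate of the tail index from the k largest observations."""
    xs = np.sort(np.abs(x))[::-1]               # descending order statistics
    return k / np.sum(np.log(xs[:k] / xs[k]))   # inverse mean log-gap

# Exact Pareto with P(X > x) = x^{-3}: Hill should recover alpha ~ 3
rng = np.random.default_rng(3)
sample = rng.pareto(3.0, size=20000) + 1.0      # numpy's pareto is shifted by 1
est = hill_alpha(sample, k=500)
```

In practice you would sweep $k$ and read the estimate off the stable plateau of the Hill plot rather than fixing $k = 500$.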
In practice: Hill alpha is useful for characterization: it tells you what kind of distribution you’re dealing with. But as a standalone crash predictor, it’s noisy. In our tests, it only detected 28% of drawdowns.
#Kappa metrics: how far from Gaussian?
Two metrics answer this question from different angles.
Taleb’s kappa, introduced in Statistical Consequences of Fat Tails by Nassim Nicholas Taleb, measures how fast the mean absolute deviation (MAD), the average distance of observations from their mean, converges as you add more data. For well-behaved distributions, the MAD stabilizes quickly. For fat-tailed ones, it doesn’t, because new extreme observations keep pulling the average around. The formula compares the MAD at two sample sizes $n_0$ and $n$:
$$\kappa = 2 - \frac{\log n - \log n_0}{\log M(n) - \log M(n_0)}$$
where $M(n)$ is the MAD for $n$ summands. For a Gaussian, $\kappa = 0$ (fast convergence). For a Cauchy distribution, $\kappa = 1$ (no convergence at all). Values between 0 and 1 measure the degree of fat-tailedness.
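A Monte Carlo sketch of the definition (the sampling scheme and repetition counts here are illustrative choices, not fatcrash’s):

```python
import numpy as np

def mad_of_sums(draw, n, reps=20000, seed=0):
    """Monte Carlo estimate of M(n): the mean absolute deviation of
    a sum of n iid draws around its own mean."""
    rng = np.random.default_rng(seed)
    sums = draw(rng, (reps, n)).sum(axis=1)
    return np.mean(np.abs(sums - sums.mean()))

def taleb_kappa(draw, n0=1, n=30):
    """kappa = 2 - log(n/n0) / log(M(n)/M(n0))."""
    m0, m1 = mad_of_sums(draw, n0), mad_of_sums(draw, n)
    return 2 - (np.log(n) - np.log(n0)) / (np.log(m1) - np.log(m0))

gauss = lambda rng, shape: rng.normal(size=shape)
heavy = lambda rng, shape: rng.standard_t(2, size=shape)   # infinite variance

kappa_gauss = taleb_kappa(gauss)   # ~0: MAD of sums grows like sqrt(n)
kappa_heavy = taleb_kappa(heavy)   # clearly positive: slow convergence
```

For the Gaussian, $M(n)$ grows exactly like $\sqrt{n}$, so the ratio of logs cancels to 2 and $\kappa$ comes out near zero; the heavy-tailed draw converges more slowly and pushes $\kappa$ up.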
Max-stability kappa takes a different approach rooted in extreme value theory. The intuition: in a fat-tailed distribution, the single most extreme observation dominates everything. If you split your data into subsamples and find the maximum of each subsample, those subsample maxima will be much smaller than the overall maximum, because the one truly extreme value ended up in just one subsample. In a Gaussian distribution, the subsample maxima would be closer to the overall maximum.
Formally: split your data into $n$ subsamples. Find the maximum of each subsample. Compare the mean of those maxima to the overall maximum:
$$\kappa_{\text{max}} = \frac{\text{mean of subsample maxima}}{\text{overall maximum}}$$
For a Gaussian distribution, this ratio converges to a known benchmark, which fatcrash estimates by Monte Carlo simulation. For fat-tailed distributions, $\kappa_{\text{max}}$ falls below the benchmark because extreme observations are much more extreme than what you’d see in any subsample.
The ratio $\kappa_{\text{max}} / \text{benchmark}$ is the signal:
- Near 1.0: behaves Gaussian
- Below 0.8: significantly fat-tailed
- Below 0.5: extremely fat-tailed, crisis regime
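A sketch of the statistic and its Gaussian benchmark (the subsample count, sample size, and repetition count are illustrative, not fatcrash’s settings):

```python
import numpy as np

def max_stability_ratio(x, n_sub=20):
    """Mean of subsample maxima divided by the overall maximum (on |x|)."""
    parts = np.array_split(np.abs(np.asarray(x)), n_sub)
    return np.mean([p.max() for p in parts]) / np.abs(x).max()

def gaussian_benchmark(size, n_sub=20, reps=500, seed=0):
    """Monte Carlo benchmark: the same statistic on Gaussian samples."""
    rng = np.random.default_rng(seed)
    return np.mean([max_stability_ratio(rng.normal(size=size), n_sub)
                    for _ in range(reps)])

size = 5000
bench = gaussian_benchmark(size)
rng = np.random.default_rng(4)
r_gauss = max_stability_ratio(rng.normal(size=size)) / bench        # near 1
r_cauchy = max_stability_ratio(rng.standard_cauchy(size=size)) / bench  # well below 1
```

On Cauchy data the single dominant extreme lands in one subsample, dragging the mean of subsample maxima far below the overall maximum and the ratio far below the Gaussian benchmark.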
fatcrash implements both variants.
In practice: Max-stability kappa was the best tail-based method in our tests (49% overall detection rate). It’s more robust than Hill because it doesn’t depend on choosing $k$, and it directly benchmarks against Gaussian via Monte Carlo simulation. Taleb’s kappa detected 33% of drawdowns but is more useful for long-term characterization than short-term prediction.
#Pickands estimator: domain-agnostic tail index
The Pickands estimator (Pickands, 1975) estimates the extreme value index $\gamma$ using just three order statistics, making it valid for all three domains of attraction (Frechet, Gumbel, Weibull, explained below in the EVT section):
$$\hat{\gamma} = \frac{1}{\ln 2} \ln \frac{X_{(k)} - X_{(2k)}}{X_{(2k)} - X_{(4k)}}$$
In words: look at the gaps between the $k$-th, $2k$-th, and $4k$-th largest values. If the gap between the top values is much larger than the gap further down, the tail is fat ($\gamma > 0$). If the gaps are similar, the tail is exponential ($\gamma \approx 0$). If the top gap is smaller, the tail is bounded ($\gamma < 0$).
Unlike Hill, which assumes the tail is Pareto (Frechet domain only), Pickands works regardless of the tail type.
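The estimator is one expression over three order statistics. Because single-$k$ Pickands estimates are volatile, this sketch averages over a range of $k$ values (an illustrative smoothing choice):

```python
import numpy as np

def pickands_gamma(x, k):
    """Pickands estimator from order statistics X_(k), X_(2k), X_(4k)
    (descending, 1-indexed). Positive gamma means a fat tail."""
    xs = np.sort(x)[::-1]
    return np.log((xs[k - 1] - xs[2 * k - 1]) /
                  (xs[2 * k - 1] - xs[4 * k - 1])) / np.log(2)

rng = np.random.default_rng(5)
ks = range(200, 400, 20)
# Exponential tail: gamma ~ 0. Pareto with alpha = 2: gamma = 1/alpha = 0.5.
exp_sample = rng.exponential(size=50000)
par_sample = rng.pareto(2.0, size=50000) + 1.0
g_exp = np.mean([pickands_gamma(exp_sample, k) for k in ks])
g_par = np.mean([pickands_gamma(par_sample, k) for k in ks])
```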
In practice: Pickands detected 49% of drawdowns, matching max-stability kappa. Its domain-agnostic nature makes it a useful cross-check on Hill.
#DEH moment estimator
The Dekkers-Einmahl-de Haan moment estimator (Dekkers, Einmahl, and de Haan, 1989) uses first and second moments (averages and averages of squares) of log-spacings between order statistics. Like Pickands, it is valid for all domains of attraction, but it uses more data points from the tail, which makes it less volatile.
In practice: It detected 46% of drawdowns.
#QQ estimator
The QQ estimator computes the tail index from the slope of a log-log QQ plot (quantile-quantile plot) against exponential quantiles. A QQ plot compares the observed distribution against a theoretical one; if the points fall on a straight line, the distributions match. The slope of that line in log-log space gives you the tail index.
In practice: It detected 38% of drawdowns.
#Maximum-to-Sum ratio
The Maximum-to-Sum ratio is a direct diagnostic for missing moments. For $n$ observations and moment order $p$, compute:
$$R_n(p) = \frac{\max(|X_i|^p)}{\sum |X_i|^p}$$
In words: what fraction of the total comes from the single largest observation? The ratio converges to zero as $n$ grows if and only if the $p$-th moment is finite. Setting $p = 2$ tests the variance: if $R_n(2)$ stays bounded away from zero as $n$ grows, one observation keeps dominating the sum of squares and the variance is infinite.
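A sketch of the running ratio, using $p = 2$ so that it probes the variance:

```python
import numpy as np

def max_to_sum(x, p=2):
    """Running R_n(p): share of the p-th-power sum contributed by the
    single largest observation seen so far. p=2 probes the variance."""
    v = np.abs(np.asarray(x)) ** p
    return np.maximum.accumulate(v) / np.cumsum(v)

rng = np.random.default_rng(6)
r_gauss = max_to_sum(rng.normal(size=100_000))[-1]            # tends to 0
r_cauchy = max_to_sum(rng.standard_cauchy(size=100_000))[-1]  # stays large
```

Plotting the whole running curve is the usual diagnostic: for finite-variance data it decays toward zero; for infinite-variance data it keeps jumping back up with each new record.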
In practice: It detected 31% of drawdowns.
#EVT: quantifying worst-case scenarios
Extreme Value Theory (EVT) is the standard mathematical framework for modeling tail risk. Instead of fitting a distribution to all the data (where the bulk dominates and extreme events are treated as noise), EVT focuses only on the extremes.
Two complementary approaches:
GPD (Generalized Pareto Distribution): Pick a high threshold $u$ (say, losses worse than the 95th percentile). Fit the GPD to losses that exceed $u$. The Pickands-Balkema-de Haan theorem guarantees that for sufficiently high $u$, the exceedances follow a GPD regardless of the underlying distribution. The GPD has two parameters: scale ($\sigma$, how spread out the exceedances are) and shape ($\xi$, how fat the tail is). From these you get:
$$\text{VaR}_p = u + \frac{\sigma}{\xi}\left[\left(\frac{n}{N_u}(1-p)\right)^{-\xi} - 1\right]$$
$$\text{ES}_p = \frac{\text{VaR}_p + \sigma - \xi u}{1 - \xi}$$
VaR (Value at Risk) tells you the loss you won’t exceed with probability $p$. For example, a 99% VaR of 5% means that on 99% of days, you’ll lose less than 5%. ES (Expected Shortfall, also called Conditional VaR) tells you the average loss when you do exceed VaR, answering the question “when things go badly, how bad do they get on average?”
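The peaks-over-threshold recipe can be sketched with SciPy’s GPD fit and the formulas above (the threshold choice and the synthetic Student-t losses are illustrative assumptions):

```python
import numpy as np
from scipy.stats import genpareto

def gpd_var_es(losses, u_quantile=0.95, p=0.99):
    """Fit a GPD to losses exceeding a high threshold u, then compute
    VaR and ES at level p via the standard peaks-over-threshold formulas."""
    losses = np.asarray(losses)
    u = np.quantile(losses, u_quantile)
    exceed = losses[losses > u] - u
    xi, _, sigma = genpareto.fit(exceed, floc=0)   # shape, loc (fixed), scale
    n, n_u = len(losses), len(exceed)
    var_p = u + sigma / xi * ((n / n_u * (1 - p)) ** (-xi) - 1)
    es_p = (var_p + sigma - xi * u) / (1 - xi)
    return var_p, es_p

# Synthetic fat-tailed daily losses: Student-t(3), ~1% scale
rng = np.random.default_rng(7)
losses = np.abs(rng.standard_t(3, size=10000)) * 0.01
var99, es99 = gpd_var_es(losses)
```

On fat-tailed inputs the fitted shape $\xi$ comes out positive and ES sits above VaR, matching the “when things go badly, how bad on average” interpretation.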
GEV (Generalized Extreme Value): Instead of exceedances over a threshold, fit to block maxima (e.g., the worst loss each month). The Fisher-Tippett-Gnedenko theorem guarantees that block maxima converge to a GEV distribution. The shape parameter $\xi$ tells you the tail type:
- $\xi > 0$: Frechet (fat tail, power-law decay), typical for finance
- $\xi \approx 0$: Gumbel (exponential tail)
- $\xi < 0$: Weibull (bounded tail, there’s a maximum possible value)
In practice: GPD VaR detected 42% of drawdowns. It works well for medium corrections but struggles with major crashes because the pre-crash period is itself volatile, making the baseline VaR already elevated.
#Results on 39 drawdowns
We tested all 13 methods on 39 drawdowns across three assets (BTC, SPY, Gold). A drawdown is defined as a peak-to-trough decline in daily close; the pre-crash window is the 6 months before the peak, and the calm window is a 6-month period ending 12 months before the peak. A method “detects” a crash if its signal during the pre-crash window is significantly elevated compared to the calm window.
These percentages are recall-style detection rates on labeled events, not full confusion-matrix metrics over all days. In particular, LPPLS can have high false-positive rates in bull markets, so the 100% figure should not be read as standalone trading precision.
| Method | Small (<15%) | Medium (15-30%) | Major (>30%) | Overall |
|---|---|---|---|---|
| LPPLS | 100% | 100% | 100% | 100% |
| DFA | 86% | 88% | 62% | 82% |
| Hurst | 57% | 65% | 50% | 59% |
| Max-Stability Kappa | 57% | 47% | 38% | 49% |
| Pickands | 43% | 53% | 50% | 49% |
| DEH | 43% | 41% | 62% | 46% |
| GPD VaR | 40% | 55% | 0% | 42% |
| GSADF | 14% | 59% | 38% | 38% |
| QQ | 36% | 35% | 50% | 38% |
| Taleb Kappa | 21% | 35% | 50% | 33% |
| Max-to-Sum | 36% | 29% | 25% | 31% |
| Spectral | 21% | 29% | 38% | 28% |
| Hill | 29% | 29% | 25% | 28% |
#Major known crashes
Testing on four major crashes with pre-crash vs. calm period comparison:
| Crash | Max-Stability Kappa | GPD VaR | LPPLS | Hill |
|---|---|---|---|---|
| 2017 BTC Bubble | detected | detected | detected | missed |
| 2021 BTC Crash | detected | detected | detected | missed |
| 2008 Financial Crisis | detected | detected | detected | detected |
| COVID Crash 2020 | detected | — | detected | missed |
Max-stability kappa and LPPLS each scored 100% on known major crashes. GPD VaR detected 3 of 4. Hill scored 25%.
#Why Hill underperforms
Hill measures the tail index of the return distribution, but this property changes slowly. A 6-month pre-crash window doesn’t necessarily have thinner tails than a 6-month calm window because the calm window might include its own mini-shocks. The Hill estimator is useful for long-term characterization (this asset has $\alpha=3$, that one has $\alpha=4$) but not for short-term prediction.
#Why LPPLS dominates
LPPLS detects structure, not statistics. It’s looking for a specific pattern: accelerating growth with log-periodic oscillations. This pattern appears before both 10% corrections and 80% crashes. The tail-based methods need to see the tail thickening, which requires the crash to already be underway. LPPLS sees the bubble building.
The downside: LPPLS has a high false positive rate. It frequently detects “bubble signatures” during normal bull markets. The solution is combining it with the tail-based methods. If LPPLS says “bubble” and kappa says “tails thickening” and VaR is elevated, the signal is more reliable.
#Why DFA is the best non-bubble method
DFA detects regime shifts in the correlation structure of returns. Before a crash, markets transition from noisy mean-reverting behavior to strongly persistent trending. This transition is invisible to tail estimators like Hill or kappa, which measure distributional shape. DFA measures temporal dependence. The detrending step gives DFA an edge over Hurst’s R/S analysis (82% vs 59%) because raw R/S conflates local trends with long-range dependence. DFA strips out the local trends and measures the residual scaling.
The 82% detection rate puts DFA between LPPLS (100%, but with false positives) and the tail-based methods (28-49%). DFA is particularly strong on small and medium drawdowns (86% and 88%), where tail-based methods struggle because the distributional shift is subtle.
#Combined detector
The 13 methods fall into four independent categories: bubble detection (LPPLS, GSADF), tail estimation (Hill, Pickands, DEH, QQ, Max-to-Sum, Taleb Kappa, Max-Stability Kappa, GPD VaR), regime detection (DFA, Hurst, Spectral), and structural (multiscale agreement). When three or more categories independently signal elevated risk, the combined detector applies a bonus to the crash probability.
| Method | Small (<15%) | Medium (15-30%) | Major (>30%) | Overall |
|---|---|---|---|---|
| Combined (agreement bonus) | 64% | 94% | 75% | 79% |
The combined detector reaches 79% overall, with 94% on medium-sized drawdowns. The agreement requirement filters out most of LPPLS’s false positives while retaining most of its true positives. The gap between small (64%) and medium/major (94%/75%) drawdowns reflects the fact that small corrections often happen without prior tail thickening or regime change. They are genuine surprises, and no method should be expected to predict all of them. A stricter production evaluation should report precision, false-positive rate per year, and detection lead time under fixed thresholds.
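The agreement logic behind these numbers can be sketched as follows. The threshold, bonus size, and score normalization here are hypothetical placeholders, not fatcrash’s actual values:

```python
# Illustrative category-agreement rule; thresholds and bonus are
# hypothetical placeholders, not fatcrash's actual parameters.
def combined_signal(category_scores, base_prob, threshold=0.5, bonus=0.25):
    """category_scores: category name -> normalized risk score in [0, 1].
    If three or more of the four categories are elevated, boost the
    crash probability by a fixed bonus (capped at 1)."""
    elevated = sum(score > threshold for score in category_scores.values())
    prob = base_prob + (bonus if elevated >= 3 else 0.0)
    return min(prob, 1.0), elevated

prob, n_elevated = combined_signal(
    {"bubble": 0.8, "tail": 0.7, "regime": 0.6, "structural": 0.2},
    base_prob=0.4)
# three of four categories elevated -> the bonus is applied
```

Requiring agreement across independent categories is what filters single-method false positives: an LPPLS bubble flag alone never reaches the bonus.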
#Long timescales
#54 years of forex (GBP/USD)
We tested on GBP/USD daily data from 1971 to 2025 (13,791 trading days):
| Decade | Hill alpha | Kappa/benchmark | Notable |
|---|---|---|---|
| 1970s | 2.92 | 0.78 | Oil crises, IMF bailout |
| 1980s | 4.36 | 0.68 | Plaza Accord |
| 1990s | 4.51 | 0.83 | Black Wednesday |
| 2000s | 2.90 | 0.57 | 2008 crisis |
| 2010s | 3.86 | 0.35 | Brexit |
| 2020s | 3.39 | 0.71 | Truss mini-budget |
Every decade shows fat tails. In our labeling, all six GBP/USD crisis events were detected (6/6):
- 1976 IMF Crisis
- 1985 Plaza Accord
- 1992 Black Wednesday
- 2008 Financial Crisis
- 2016 Brexit Vote
- 2022 Truss Mini-Budget
#All methods on daily forex (1971-2025)
We ran Hill, Kappa, Taleb Kappa, GPD, GEV, and LPPLS on 25 currency pairs from FRED daily data:
| Pair | N days | Hill $\alpha$ | $\kappa$/bench | Taleb $\kappa$ | GPD $\xi$ | VaR 99% | GEV $\xi$ | LPPLS |
|---|---|---|---|---|---|---|---|---|
| VEF/USD | 6,511 | 1.20 | 0.28 | — | -0.46 | 253.7% | 1.00 | bubble |
| HKD/USD | 11,290 | 1.73 | 0.18 | — | 0.56 | 26.3% | 1.00 | bubble |
| KRW/USD | 11,176 | 1.90 | 0.29 | 0.28 | -0.37 | 4.3% | 0.37 | bubble |
| MXN/USD | 8,058 | 2.04 | 0.36 | 0.26 | -0.34 | 4.6% | 0.28 | bubble |
| AUD/USD | 13,783 | 2.58 | 0.46 | 0.37 | -0.37 | 4.2% | 0.15 | bubble |
| INR/USD | 13,282 | 2.62 | 0.40 | 0.38 | 0.22 | 10.4% | 0.28 | bubble |
| BRL/USD | 7,773 | 2.80 | 0.63 | 0.44 | 0.39 | 30.2% | 0.11 | bubble |
| CAD/USD | 13,796 | 3.84 | 0.52 | 0.00 | 0.20 | 6.0% | 0.16 | bubble |
| GBP/USD | 13,790 | 4.13 | 0.53 | 0.00 | 0.17 | 7.8% | 0.03 | bubble |
| EUR/USD | 6,769 | 4.88 | 0.62 | 0.29 | -0.05 | 3.4% | 0.02 | bubble |
All 25 pairs show fat tails. Every kappa ratio is below 1 (below the Gaussian benchmark). GEV shape parameter $\xi > 0$ for nearly all pairs, confirming Frechet-type fat tails. LPPLS detected bubble signatures in 14 of 25 pairs. VaR 99% is reported as a one-day loss as a percentage of price.
Venezuela (VEF/USD) is the extreme case: $\alpha = 1.20$ with a 99% VaR of 253.7%. On the worst 1% of days, you’d lose more than 2.5x your position.
#500 years, 138 countries
Using the Clio Infra exchange rate dataset (1500-2013), we ran Hill, Kappa, Taleb Kappa, and GEV on every country with 50+ years of data.
Results:
| Tail regime | Countries | Percentage |
|---|---|---|
| $\alpha < 2$ (infinite variance) | 98 | 71% |
| $\alpha$ 2-4 (fat tails, finite variance) | 37 | 27% |
| $\alpha > 4$ (moderate tails) | 3 | 2% |
71% of countries have exchange rate distributions with infinite variance. The median $\alpha$ across all 138 countries is 1.57. GEV confirms this: 81% of countries show Frechet-type (fat) tails with median $\xi = 0.76$.
The most extreme cases:
| Country | Years of data | Hill $\alpha$ | Taleb $\kappa$ | Worst year | Best year |
|---|---|---|---|---|---|
| Syria | 61 | 0.32 | — | -2% | +105% |
| Iraq | 61 | 0.40 | — | -38% | +883% |
| Germany | 153 | 0.52 | 1.00 | -2,748% | +2,104% |
| Nicaragua | 76 | 0.52 | — | -2,159% | +787% |
| Zimbabwe | 56 | 0.55 | — | -12% | +1,345% |
| Hungary | 66 | 0.56 | — | -944% | +247% |
| Peru | 64 | 0.72 | — | -1,660% | +426% |
| Bolivia | 63 | 0.76 | — | -1,794% | +494% |
| Brazil | 129 | 1.03 | — | -3,536% | +318% |
| Argentina | 102 | 1.28 | 1.00 | -2,748% | +388% |
Germany’s -2,748% in a single year is the Weimar hyperinflation. Brazil’s -3,536% reflects the cruzeiro collapse. These aren’t outliers. They’re exactly what a distribution with $\alpha$ near or below 1 predicts.
#Century-by-century: United Kingdom (1789-2013)
The UK has 224 years of continuous exchange rate data:
| Century | Hill $\alpha$ | Regime |
|---|---|---|
| 1800s | 1.19 | Infinite variance (Napoleonic wars, banking crises) |
| 1900s | 3.17 | Fat but finite (Bretton Woods stability) |
| 2000s | 2.04 | Back to borderline infinite variance |
Even within a single country, tail regimes shift across centuries.
#Inflation: 500 years, 82 countries
Inflation data from Clio Infra (1500-2010):
| Statistic | Value |
|---|---|
| Countries analyzed | 82 |
| Countries with hyperinflation (>100%/yr) | 32 (39%) |
| Countries with $\alpha < 2$ | 36 (44%) |
| Median $\alpha$ | 2.14 |
The most extreme inflation tails:
| Country | Years | Hill $\alpha$ | Max inflation |
|---|---|---|---|
| Nicaragua | 71 | 0.30 | 13,110%/yr |
| Zimbabwe | 83 | 0.44 | 24,411%/yr |
| Germany | 494 | 0.57 | 211,427,400,000%/yr |
| Brazil | 226 | 0.63 | 2,948%/yr |
| Peru | 363 | 0.80 | 7,482%/yr |
| Argentina | 274 | 0.85 | 3,079%/yr |
| China | 336 | 0.86 | 1,579%/yr |
| Poland | 414 | 0.97 | 4,738%/yr |
Germany has 494 years of inflation data with $\alpha = 0.57$. Its maximum annual inflation was 211 billion percent (Weimar 1923). With $\alpha < 1$, neither the mean nor the variance of this distribution converges. You cannot compute a confidence interval. You cannot build a VaR model. The standard toolkit breaks down.
#Conclusions
- Fat tails are universal. Every asset, every timescale, every country. From daily BTC returns to 500 years of exchange rates. 71% of countries have exchange rate distributions with infinite variance. This isn’t an anomaly. It’s the default state of financial markets.
- LPPLS is the best single crash detector (100%) because it detects bubble structure, not tail statistics. But it has false positives. It flags bubble signatures during normal bull markets too.
- DFA is the best non-bubble method (82%). It detects regime shifts in temporal dependence, not distributional shape. The detrending step makes it robust to non-stationarity where simpler methods like Hurst (59%) are confused by local trends.
- Max-stability kappa is the best tail-based method (49%). More robust than Hill because it benchmarks against Gaussian rather than estimating a single parameter. Taleb’s kappa (33%) is weaker for short-term crash detection but excels at characterizing long-term tail regimes: it saturates at 1.0 for countries with extreme tails like Germany, Argentina, and Austria.
- The combined detector reaches 79%. When bubble, tail, and regime methods independently agree, the signal is reliable. The agreement requirement filters most of LPPLS’s false positives while preserving 94% detection on medium-sized drawdowns. No single method is sufficient, but 13 methods organized into four independent categories provide practical crash detection.
- Standard risk models are wrong for most countries. Modern Portfolio Theory, CAPM, Black-Scholes: all assume finite variance. For 71% of the world’s currencies, this assumption is empirically false. The math doesn’t just give wrong answers; it gives answers to the wrong question.
- Hyperinflation isn’t rare. 39% of countries experienced >100% annual inflation at some point. Germany’s 211 billion percent is extreme, but dozens of countries experienced four- and five-digit inflation. Any model that treats these as “outliers” is a model that doesn’t understand the data it’s modeling.
The code is open source: github.com/unbalancedparentheses/fatcrash. The forex data comes from forex-centuries.