The Limits of Knowing

A muscular nude figure of Newton crouched on a rock at the sea floor, bent over a scroll, drawing with a compass, ignoring the dark rocks behind him
Newton, William Blake, 1795 (reworked c. 1805). Tate, London (public domain).

This series has spent five essays building tools. Bouchaud’s branching ratio, Sornette’s critical time, Scheffer’s slowing-down, all of them measure the same thing in different clothes: how close a system is to its edge. This last essay is about the catch that has been waiting the whole time. That number, the distance to the edge, is the single hardest thing to measure on exactly the kind of system where it matters. Nassim Taleb built a career on this objection, and it is the strongest argument against everything that came before.

#I. Where counting stops working

Most of statistics rests on a quiet promise: collect enough data and the average settles down to the truth. Flip a coin a thousand times and the fraction of heads sits near one half. Measure a thousand people and the average height barely moves when you add the next. This is the law of large numbers, and in a thin-tailed world it works beautifully.

In a fat-tailed world it barely works at all.

Two panels, each a running average against the number of observations. Left, thin-tailed data: the line wobbles at first and quickly flattens to a steady value. Right, fat-tailed data: the line keeps lurching upward in steps, jumping each time a new extreme arrives, never settling.

Left, with thin tails the average settles fast, so a modest sample tells you the truth. Right, with fat tails the average keeps jumping with each new extreme and never settles, so no sample you will ever collect pins it down.

Look at the right panel. The running average does not settle. It sits flat for a while, then a single huge observation arrives and yanks it upward, and this happens again and again for as long as you keep sampling. The sample is dominated by its largest member, and the largest member keeps being beaten. For a fat-tailed quantity, the thing you most want to know, the size of the typical large event, is the thing your data refuses to reveal, because the data is always one big surprise away from rewriting it.

Taleb’s word for this is preasymptotics. The textbook results are about what happens with infinite data. With the finite data you actually have, a fat-tailed world is a place where the worst event so far is never a good estimate of the worst possible, where the mean may be unknowable, and where the next observation can dominate everything before it. The tail is where all the risk lives, and the tail is precisely the region you can never sample enough of.
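The two panels are easy to reproduce for yourself. What follows is a minimal sketch, not the code behind the figure: it assumes Gaussian draws for the thin-tailed case and Pareto draws with tail exponent 1.1 for the fat-tailed one, and the function names, sample size, and seed are all illustrative choices. That Pareto has a true mean of 11, so the law of large numbers does apply in principle; the point is how little help it is at any sample size you can afford.

```python
import random


def running_mean(samples):
    """Return the running average after each observation."""
    means, total = [], 0.0
    for count, value in enumerate(samples, start=1):
        total += value
        means.append(total / count)
    return means


def simulate(n=100_000, alpha=1.1, seed=0):
    """Draw n thin-tailed and n fat-tailed observations (assumed models)."""
    rng = random.Random(seed)
    thin = [rng.gauss(1.0, 1.0) for _ in range(n)]      # Gaussian, mean 1
    fat = [rng.paretovariate(alpha) for _ in range(n)]  # Pareto, minimum 1
    return thin, fat


if __name__ == "__main__":
    thin, fat = simulate()
    thin_avg, fat_avg = running_mean(thin), running_mean(fat)
    for n in (100, 1_000, 10_000, 100_000):
        print(f"n={n:>7}  thin={thin_avg[n - 1]:.3f}  fat={fat_avg[n - 1]:.3f}")
    # Share of the whole fat-tailed total contributed by its single largest draw.
    print(f"largest draw / total = {max(fat) / sum(fat):.1%}")
```

Change the seed and the thin column should barely move, while the fat column tends to land somewhere new each time. That is the figure's argument in numerical form.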

#II. Honesty makes it worse

Now suppose you set that aside and insist the world is thin-tailed after all. You fit a nice bell curve. Even then there is a trap, and it is subtle.

You never know the bell curve’s width exactly. You estimate it from data, and that estimate has error. And the error has its own error, because your method of estimating is itself uncertain, and so on. Taleb and Pasquale Cirillo showed what taking that seriously does. It does not just widen your uncertainty politely. It fattens the tail.

A log-scale plot of two curves. One, sure of its spread, is a bell curve that drops away steeply. The other, unsure of its spread, sits above it far out in the tails, so the rare extremes are much more likely.

Start with a bell curve and admit you are unsure of its width. Average over that uncertainty and the result has heavy tails: the rare extremes become far more likely. Being honest about not knowing the parameter turns a thin tail into a fat one.

The mechanism is clean once you see it. Some of the time the true spread is larger than your best guess, and some of the time it is smaller. The wide cases add far more extremes than the narrow cases remove, because far out in the tail the probability of an extreme is convex in the spread: widening the curve by some amount raises that probability by more than narrowing it by the same amount lowers it. Mix over your uncertainty and the blend is heavy-tailed even though every ingredient was a bell curve. Taleb and Cirillo call the consequence the forecasting paradox: the distribution you should use to predict is heavier-tailed than the one you fit to the past, so the future is structurally more extreme than the data you trained on. Thin-tailed certainty is not just hard to find. It is unreachable from the inside, because the act of being honestly uncertain manufactures fat tails on its own.
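The effect can be checked with arithmetic alone, no simulation needed. The sketch below is an illustration under stated assumptions rather than Taleb and Cirillo's own construction: the width is taken to be either (1 + a) or (1 - a) times the best guess with equal odds, that doubt is optionally stacked over several independent layers, and the names `gaussian_tail`, `mixed_tail`, `a`, and `levels` are invented here.

```python
from math import comb, erfc, sqrt


def gaussian_tail(k, sigma=1.0):
    """P(|X| > k) for X drawn from a bell curve with mean 0 and width sigma."""
    return erfc(k / (sigma * sqrt(2.0)))


def mixed_tail(k, a, levels=1, sigma=1.0):
    """P(|X| > k) when the width itself is uncertain.

    Assumption for illustration: at each of `levels` layers of doubt the
    width is multiplied by (1 + a) or (1 - a), each with probability 1/2.
    """
    total = 0.0
    for ups in range(levels + 1):
        width = sigma * (1 + a) ** ups * (1 - a) ** (levels - ups)
        # comb(levels, ups) counts the paths with this many upward revisions.
        total += comb(levels, ups) * gaussian_tail(k, width)
    return total / 2 ** levels


if __name__ == "__main__":
    k, a = 4.0, 0.2
    print(f"sure of the width:   {gaussian_tail(k):.2e}")
    for levels in (1, 2, 3):
        print(f"{levels} layer(s) of doubt: {mixed_tail(k, a, levels):.2e}")
```

With a = 0.2 and a single layer of doubt, the chance of landing beyond four widths comes out several times the plain bell curve's, even though the average width has not moved. In this setup each extra layer pushes it higher still.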

#III. What this does to the whole series

Hold this up against the previous five essays and it cuts deep.

Sornette’s critical time is estimated by fitting a shape to a price run. Bouchaud’s branching ratio is estimated by fitting a self-exciting process to trades, and we already saw two careful groups disagree on whether it rises or holds steady. Scheffer’s warning signs are estimated from the variance and autocorrelation of a noisy series. Every one of them is an estimate of how close a system sits to its critical point, and the critical point is exactly where estimates become unstable and sample-hungry, on data that is fat-tailed in the first place. The tools tell you the system is near the edge, and being near the edge is what makes the tools’ own readings shaky. Taleb’s claim, put plainly, is that the number these methods chase is the number their own setting forbids you from trusting.

There is a worthy opponent here, and it is not Sornette. It is Paul Embrechts, the dean of extreme value theory, whose whole discipline is the rigorous estimation of tails: fit the right curve to the exceedances, estimate the tail index, extrapolate past your largest data point in a principled way. To Embrechts, the tail is hard but tractable, and Taleb’s pessimism throws away a working toolkit. Taleb’s reply is that extreme value theory relocates the problem rather than solving it: now you must estimate the tail index from the few points in the tail, and that estimate is itself unstable and threshold-dependent, which is the errors-on-errors problem wearing a lab coat. That argument is unresolved, and probably unresolvable, because settling it would take exactly the abundant tail data whose absence is the whole point.
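Taleb's side of this exchange is easy to see numerically. A standard tail-index estimator in extreme value theory is the Hill estimator, which uses only the k largest observations, and k is a choice the analyst has to make. The sketch below is a minimal version under assumptions chosen purely for illustration: synthetic data drawn from a Pareto with true tail index 2 and then shifted by a constant so that it is only Pareto-like far out, plus an arbitrary seed and sample size. The helper names are hypothetical.

```python
import math
import random


def hill_tail_index(data, k):
    """Hill estimate of the tail index from the k largest observations."""
    ordered = sorted(data, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(data)")
    threshold = ordered[k]  # the (k + 1)-th largest value acts as the cutoff
    if threshold <= 0:
        raise ValueError("the Hill estimator needs a positive threshold")
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def shifted_pareto_sample(n=20_000, alpha=2.0, shift=5.0, seed=0):
    """Synthetic data (an assumption): Pareto(alpha) draws plus a constant."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) + shift for _ in range(n)]


if __name__ == "__main__":
    sample = shifted_pareto_sample()
    for k in (20, 100, 500, 2_000, 10_000):
        print(f"k={k:>6}  estimated tail index = {hill_tail_index(sample, k):.2f}")
```

The true index is 2 throughout. A small k leaves only a handful of points and a noisy answer; a large k reaches down into the body of the data, where the shift distorts the Pareto shape and biases the estimate. Nothing in the data tells you which k is right.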

#IV. So what do you actually do

The wrong lesson is despair. Taleb is not a nihilist, and neither is this series. The right lesson is that you do not need to measure the edge precisely to act well near it.

If you cannot know the probability of the tail, stop trying to forecast it and change your exposure instead. Carry redundancy and slack, the very things the efficiency essay said get optimized away. Cut leverage and tight coupling, because they are what turn a local shock into a cascade. Hold cheap insurance against the moves you cannot rule out, and buy it when it is cheap, which is during the calm. Prefer positions that lose a little if you are wrong and gain a lot if the rare event lands, rather than the reverse. None of this requires knowing when, or how close to the edge, or which grain triggers the slide. It only requires believing that the edge is real and probably nearer than it looks.

That is how the series resolves. The diagnostics from the earlier essays are not crystal balls and should not be traded as if they were. They are regime detectors, and their honest job is qualitative: to tell you when a market has stopped being an ordinary thin-tailed place and has loaded itself near its critical edge. Sornette’s accelerating climb, Bouchaud’s branching ratio near one, Scheffer’s slowing recovery, all of them are ways of noticing the same condition. And once you have noticed it, the move is not to predict the crash. It is to stop standing where it will land.

Blake painted Newton at the bottom of the sea, so absorbed in the diagram under his compass that he never turns to see the dark rock behind him. The figures in this series, Bouchaud, Sornette, Scheffer, are doing real and beautiful work with the compass. Taleb is the one tapping the shoulder, pointing at the dark. You want both. Measure what you can, and build for what you cannot.

#Further reading

The statistics of fat tails:

Errors on errors and the forecasting paradox: