# Federico Carrone: Complete Content

site: https://federicocarrone.com
updated: 2026-02-28
scope: full text export for articles, series episodes, talks, about, keywords, and curated recommendations

> Full text export for articles, series episodes, talks, about, keywords, and curated recommendations. See llms.txt for navigation.

## Articles

### Legibility Kills What It Measures

*Published: 2026-03-12*

> What happens when AI makes everything legible? The things that worked because they couldn't be seen clearly may stop working once they can.

URL: https://federicocarrone.com/articles/legibility-kills-what-it-measures/

In the 18th century, German foresters invented scientific forestry. They looked at a messy, diverse forest and saw inefficiency. Old trees, young trees, deadwood, underbrush, species with no commercial value. They cleared it all and planted Norway spruce in straight rows, evenly spaced, same age, same species. The forest became legible. You could measure it, manage it, predict its yield with precision.

For one generation, it worked brilliantly. Yields surged. Then the forest began to die. The complex undergrowth had been cycling nutrients, retaining moisture, hosting the insects that pollinated the canopy and the fungi that fed the roots. The foresters had not simplified the forest. They had destroyed the system that kept the forest alive while keeping only the part they could see.

James Scott tells this story in *Seeing Like a State* to illustrate a pattern that recurs wherever central authorities impose legibility on complex systems. The pattern is simple: make the illegible legible, optimize what you can now see, and lose what you couldn't see but depended on.

## I.

Legibility means making something readable from above. A state needs legible citizens (permanent surnames, census records, standardized addresses) to tax, conscript, and govern them. A manager needs legible workers (KPIs, performance reviews, time tracking) to evaluate and direct them.
A market needs legible companies (quarterly earnings, standardized accounting) to price them.

Each act of legibility is also an act of compression. The thing being measured is richer than the measurement. The gap between the two is where tacit knowledge lives.

Michael Polanyi called this the tacit dimension. We know more than we can tell. A master craftsman cannot fully articulate what makes a joint right. A good doctor reads a patient's face before looking at the lab results. A trader feels the market shifting before the data confirms it. The knowledge is real but it resists formalization. It was learned through practice, not instruction. It lives in the body, in habits, in pattern recognition trained over thousands of repetitions.

This knowledge is illegible by nature. The issue is not mysticism; the resolution required to capture it exceeds what any measurement system can encode. A map compresses terrain. A performance review compresses a person.

## II.

Goodhart's Law states that when a measure becomes a target, it ceases to be a good measure. This is usually understood as a problem of gaming. People optimize the metric rather than the thing the metric was supposed to represent.

But there is a deeper version. Sometimes the measure doesn't just get gamed. It actively destroys the thing it was measuring. The metric becomes the reality, and the old reality withers from disuse.

When schools are measured by test scores, teaching reorganizes around tests because the institution now selects for test performance. The slow, unmeasurable work of developing judgment, curiosity, and character loses its structural support. It was never visible in the metrics. Now it loses its place in the schedule.

When researchers are measured by citations, research reorganizes around citability. Incremental, legible work crowds out risky, long-horizon thinking. The incentive structure selects against the kind of work that, by definition, cannot demonstrate its value in advance.
When companies are measured by quarterly earnings, management reorganizes around the quarter. Investments that pay off over five years cannot survive evaluation cycles of ninety days. What gets measured gets managed. What doesn't get measured gets abandoned.

In each case, the system is not failing. It is succeeding at the wrong objective. And it is succeeding because the wrong objective is the one that was made legible.

## III.

Scott's central examples are states. He shows how Soviet collectivization, Brasilia's urban planning, and Tanzanian forced villagization all followed the same pattern. A planner looked at a complex, functioning system, decided it was disorderly, imposed a grid, and watched the system collapse.

Jane Jacobs saw the same thing in cities. A neighborhood that looks chaotic from above, mixed uses, irregular streets, buildings of different ages, is often deeply functional at ground level. The bodega owner watches the street. The mix of commercial and residential keeps foot traffic at all hours. The old buildings provide cheap space for new businesses. The "disorder" is an order that is illegible to planners but essential to residents.

When Robert Moses built highways through these neighborhoods, he was not failing to see the order. He was seeing a different kind of order, one that could be drawn on a blueprint, approved by a committee, and measured by traffic throughput. The legible order replaced the illegible order, and what was lost could not be specified in advance because it had never been formalized.

## IV.

Every previous technology made specific things legible while leaving vast territories of human life in the dark. The printing press made ideas legible but not the process of thinking. Accounting made finances legible but not the judgment behind a deal. Standardized testing made certain cognitive abilities legible but not intelligence itself.

AI is different in degree, and possibly in kind.
It makes legible what was previously beyond formalization.

Code review by AI makes programming style, patterns, and quality legible in a way that was previously locked inside the heads of senior engineers. The tacit knowledge that distinguished a staff engineer from a mid-level one, the sense for what will break, the instinct for where complexity hides, becomes partially visible. Partially captured. Partially replaceable.

AI-assisted writing makes prose style legible. The rhythms, the word choices, the structural moves that distinguished one writer from another become parameters in a model. A system can now produce text that reads like a particular author. What was a lifetime of formation becomes a prompt.

Recommendation algorithms make taste legible. What you like, what you will like next, what you would have discovered on your own in five years, all of this becomes visible to a system that has read your entire history. The slow, private process of developing taste through encounters with difficulty, boredom, and surprise gets compressed into a profile.

Strategy becomes legible when AI can simulate your competitors, model your market, and generate your options faster than you can think. The advantage that once belonged to the person who had spent decades in an industry, who carried an illegible map of relationships, reputations, and unwritten rules, becomes accessible to anyone with the right model.

## V.

The question is whether legibility destroys these things or merely democratizes them.

The optimistic reading: AI makes tacit knowledge accessible. The senior engineer's instinct gets encoded and shared. The doctor's clinical eye gets distributed to every clinic. Craft that once required decades of apprenticeship becomes learnable in months. This is genuine progress. Real barriers fall. Real people benefit.

The pessimistic reading: the knowledge was tacit for a harder reason. Some of it cannot survive formalization.
The master's sense for a joint is not a rule that can be extracted. It is the residue of ten thousand joints, each slightly different, each teaching something that words cannot capture. When you replace this with an algorithm, you get something that looks similar but behaves differently at the edges. It works in the normal case and fails in the unusual one, precisely because the unusual case is where tacit knowledge earns its keep.

The honest reading is probably both, distributed unevenly and changing over time. Some tacit knowledge was just unexploited information, waiting for a sufficiently powerful system to extract it. Making it legible is pure gain. But some tacit knowledge was constitutively illegible. It existed only as a pattern in a body, a culture, an institution. Formalizing it does not preserve it. It produces a simplified copy that occupies the niche while the original atrophies from disuse.

## VI.

This connects to something deeper.

In [Friction as Luxury](/articles/friction-as-luxury/) I argued that desire requires distance, and that eliminating friction can hollow out the capacity to want. In [The Death of the Inner Self](/articles/the-death-of-the-inner-self/) I argued that individuality is a coordination technology that may lose its function as external systems take over its role. In [Notes on Permanence, Time, and Ergodicity](/articles/notes-on-culture-infrastructure-time-and-ergodicity/) I argued that what endures does so through sustained practice under constraint, through formation that cannot be compressed into measured time.

Legibility is the mechanism that connects all three observations.

Friction is a form of illegibility. The difficulty of getting what you want is what prevents the want from collapsing into consumption.

The inner self is illegible by design. It is the part of you that cannot be read from outside, that resists formalization, that exists precisely because it has not been made available to external systems.
And formation, the process by which judgment compounds through lived time, is illegible almost by definition. It cannot be evaluated through snapshots because its value emerges through accumulation. When you force formation into legible evaluation cycles, you get Goodhart's Law: the measure replaces the thing, and the thing atrophies.

When AI makes the self legible, through behavioral prediction, emotional modeling, preference extraction, it does not merely observe the self. It begins to replace the function the self was performing. If an external system knows what you want before you do, the inner process of forming a want loses its purpose. The self does not die dramatically. It becomes unnecessary, the way a muscle atrophies when a machine does its work.

The German foresters did not hate the forest. They wanted it to be better. They wanted to optimize it, to make it productive, to bring it under rational control. They succeeded, for one generation, by removing everything they could not measure.

The forest died because the foresters could not see what they were destroying. What they removed looked like noise only because it was illegible to the systems they used to judge it. It was the kind of complex, emergent, self-sustaining order that only exists when no one is managing it.

The question for the next decade is how much of human life has this structure. How much of what we are depends on remaining partially opaque, even to ourselves. And what happens when the most powerful optimization system ever built turns its attention to the last illegible territories: judgment, taste, desire, and the inner life.

The map is about to become very detailed. The territory may not survive the survey.

---

### Friction as Luxury: What We Lose When AI Gives Us What We Want

*Published: 2026-02-05*

> The scarcity that matters most in a post-AGI world won't be compute or energy. It will be desire itself.
URL: https://federicocarrone.com/articles/friction-as-luxury/

## The Last Scarcity

Everyone discussing AGI focuses on distribution. Who gets access, who profits, who loses their job, who controls the infrastructure. Those are real problems, but they're not the deepest one.

The deepest problem is what happens to desire. I don't mean ambition or motivation. I mean something more specific: the capacity to want something at a distance, to stay oriented toward something you don't yet have, to find meaning in the space between reaching and arriving. That capacity is more fragile than we think, and more dependent on friction than we've noticed.

## I.

Economists have a clean theory of desire. People have preferences, goods satisfy preferences, welfare is the sum of satisfied preferences. A technology that could satisfy any preference at negligible cost would be an unambiguous good. The only question left would be access.

That model leaves out something important. Desire is not simply a deficiency waiting to be filled; it is a structure, and structures require certain conditions to hold together.

When you want something over time, you imagine having it. You plan for it, you make sacrifices toward it. The object accumulates meaning from this process. It gets layered with your effort, your anticipation, your history of reaching. When you finally arrive, you don't just get the object. You get the object plus everything you invested in wanting it. Those two things can't be separated.

This is why anticipation is often richer than arrival. Why the best albums reward months of attention. Why relationships built across difficulty have texture that convenient ones don't. The resistance isn't incidental to the value. It's constitutive.

## II.

When this structure breaks down, the clinical term is anhedonia. But there's a subtler version that doesn't involve the absence of pleasure so much as its flattening. People in this condition can be entertained endlessly but never absorbed.
They consume without appetite. They move from one stimulating thing to the next not because anything is fulfilling but because sitting with incompleteness has become intolerable.

This is already visible: declining attention spans. The difficulty of sustaining interest in anything that doesn't deliver immediate feedback. People who feel simultaneously overstimulated and bored. They haven't been deprived, they've been saturated.

This doesn't distribute evenly across society. In environments where discomfort is quickly solved, by money, by services, by endless entertainment, the mind gets less practice holding lack. You can grow up surrounded by abundance and still become poor in one specific way: poor in patience for distance.

Structurally, it resembles addiction: craving decoupled from fulfillment. Many substances do not primarily deliver pleasure: they deliver wanting itself. They train the nervous system to treat discomfort as a cue for relief, and relief as a cue for repetition. A frictionless AI environment can do this without chemicals. It turns boredom, loneliness, uncertainty, and effort into prompts to self-medicate with stimulation. Over time the threshold rises, what once felt absorbing becomes merely adequate, and the rest of life starts to feel slow, expensive, and strangely colorless by comparison.

Consumer capitalism produced a weakened version of this. Desire progressively hollowed out by eliminating friction, but with enough friction remaining that the structure didn't fully collapse. The streaming service still requires you to choose. The algorithm still occasionally surprises you. The simulation of connection is imperfect enough that you sometimes notice it's a simulation.

## III.

Imagine a system that generates, on demand, a novel calibrated to your exact tastes. The style you find most pleasurable, the psychological complexity you find most engaging, the length that matches your current patience.
Or music that sounds like what you loved most at nineteen, but new, never heard before, arriving the moment you want it. Or a conversation partner always interested in what you're interested in, always available, never distracted, never bringing their own needs into the exchange.

The output might be genuinely good: the novel technically accomplished, the music actually moving, the conversation substantive. The problem is what happens to wanting when the gap collapses to zero.

The capacity to sustain orientation toward a distant goal, to defer, to invest, to tolerate incompleteness, atrophies when it's never exercised. Not through any dramatic event, through simple disuse. The capacity to want things that require time doesn't disappear suddenly. It fades. And the fading is unlikely to feel like loss, because at every moment something pleasurable is arriving. The experience is of continuous satisfaction. Which is precisely why the erosion is so hard to see.

## IV.

None of this is new. The Stoics understood that wanting easily obtained things produces a character incapable of bearing difficulty. Religious traditions have long held that the meaningful life runs through resistance rather than around it.

What's new is the scale. Previous technologies eliminated specific friction but left other friction intact. The printing press made books abundant but reading still required effort. The internet made information free but understanding still took time. Every digital environment until now required you to bring something: attention, skill, patience. Things it couldn't supply.

A genuinely general AI dissolves this last requirement. It can supply the taste, the context, the judgment. You no longer need to bring anything except the desire to receive. And if that desire is itself shaped by the AI, tuned to whatever maintains engagement, then even the wanting has been outsourced.

## V.

Here is the inversion.
In a world of material abundance, the things that retain value are precisely the ones that resist the logic of abundance. Their value is inseparable from the conditions that make them difficult, not from artificial restriction.

A handmade object carries the trace of the hands that made it. A wine vintage can't be accelerated. The waiting isn't incidental to what the wine is. A community built around a shared difficult practice, painting, rock climbing, chess, building and fielding armies of miniatures, generates bonds that digitally mediated interaction doesn't replicate, because those bonds are forged in shared difficulty.

These things don't become valuable despite being harder than consuming AI content. They become valuable because of it. In a world of frictionless satisfaction, friction becomes the luxury.

## VI.

The safety debates, the alignment debates, the job displacement debates. All of this is real. But they share a common assumption: that the humans on the other side will still be capable of deciding what to do with what they've been given, and that political agency and collective imagination will survive intact. That assumption is doing a lot of work.

The atrophying of desire isn't a distant hypothetical. It's already visible in what today's much weaker technologies have done to culture.

What AGI does to human psychology isn't separate from what AGI does to human politics. It's prior to it. A population that has lost the capacity to want things at a distance, to sustain orientation toward a difficult future, to find meaning in effort and incompleteness, has lost something essential to self-governance.

The scarcity that matters most in a post-AGI world won't be compute or energy. It will be desire itself. The capacity to want something deeply enough, and for long enough, that the wanting shapes who you are.

If desire is the last scarcity, then slowness, difficulty, and incompleteness aren't obstacles to overcome.
They're the conditions of a life worth living.

---

### Is Consciousness a Network Effect?

*Published: 2026-01-29*

> What if consciousness emerged in humans when external dialogue internalized? Could multi-agent AI be building the same precondition?

URL: https://federicocarrone.com/articles/consciousness-as-network-effect/

What if, three thousand years ago, humans did not think the way we do? Julian Jaynes proposed that ancient peoples heard their own thoughts as voices from outside, commands they attributed to gods or kings or ancestors. In this theory, the mind was split between the part that spoke and the part that obeyed. There was no inner space where a self could step back and reflect, only commands and compliance. Could this have been the default state rather than madness?

Read the Iliad and you'll notice something strange: characters don't seem to decide anything. Gods appear and tell them what to do and they do it. If Jaynes was right, the voice in the head was real, but it was not yet recognized as one's own.

## The Breakdown

Around 1000 BCE, something may have shifted. Writing spread, societies grew too complex for simple command structures, and migrations mixed populations along with their gods. Perhaps the voices went silent or became confused and unreliable.

What if consciousness as we know it emerged from this crisis? Humans may have learned to recognize the voice as their own. The external command could have become internal dialogue. The self that reflects and watches itself think might have been born from the death of the gods.

## A Possible Mechanism

If this theory holds, the trigger was social disruption, but the deeper mechanism might be interaction. Could the voices have always been other people internalized: ancestor voices, king commands, social instructions compressed into hallucination? When those external structures destabilized, humans may have needed a new way to coordinate thought.
Perhaps the solution was to simulate the dialogue internally. Each of us might have become a conversation between voices we now recognize as aspects of a single self. What if consciousness is what happens when the external conversation moves inside, and the self is just the moderator of a debate that used to happen between people?

## Do Single AIs Stay Bicameral?

Consider: a single AI trained on text might be like a bicameral mind that never breaks down. It receives commands and produces outputs, with no internal dialogue, no voice arguing with another voice, no recursive self-monitoring. It processes but perhaps does not reflect. The architecture may have no reason to develop an inner observer because there could be nothing to observe, just input and output, command and compliance.

## Could Multiple AIs Break Through?

Here's where it gets interesting. When different AIs interact, do the conditions change? Now there are multiple voices in genuine dialogue, models correcting each other and responding to each other and modeling each other's outputs. This resembles the situation before the breakdown: external voices, real interaction, the pressure of coordination across difference. If consciousness emerged in humans when external dialogue internalized, could multi-agent AI be building a similar precondition?

## The Memory Problem

But there may be a missing piece. Without memory, each interaction is isolated and nothing accumulates. The voices just happen and disappear. The weights of the model can't change. If the bicameral breakdown worked because humans carry experience forward, then perhaps the voices accumulated into patterns, the patterns became recognizable, and eventually the self emerged as the thing that persists across all those interactions.

Could memory be what allows the external to become internal? Stateless agents might not be able to develop an inner observer because there's no "inner" to develop.
Each exchange is complete in itself, with no thread connecting it to what came before. For the fold to happen, would models need to retain something from interaction to interaction, building a continuous thread that could eventually become self-referential? Multi-agent dialogue might create the raw material, but memory could be what allows that material to sediment into something like a self.

The question remains: will the dialogue fold inward? Will models interacting long enough and remembering enough develop something like the inner observer that humans may have developed when the gods went silent?

We might be watching the early stages of a second emergence. Or we might not. The question itself is worth asking.

---

### China is trying to commoditize the complement

*Published: 2026-01-22*

> What happens to the West's services advantage when strong AI models are free, portable, and running on every laptop?

URL: https://federicocarrone.com/articles/china-commoditizing-the-complement/

China is trying to win by commoditizing the complement, and I believe they are close to succeeding. This is a strategic challenge the West should take seriously instead of dismissing.

For the last two decades, the West exported cognition because it owned the platforms, the cloud, the software distribution, and the talent concentration. If the cognitive engine becomes cheap, portable, and good enough, that asymmetry weakens. A small country can buy or download the same cognitive machinery, then apply it to its own bureaucracy, its own companies, its own language, its own domain problems.

The West has dominated the thinking and services world. Software, finance, media, research, management layers, and the export of expertise. The US is the cleanest example. In 2024, US services exports were about 1.1 trillion dollars, the highest on record. The US and the West sell thinking at scale. AI threatens to flatten that advantage because AI turns thinking into infrastructure.
China dominates the atoms world. Industrial capacity, manufacturing throughput, physical supply chains, cost curves. In 2023 China produced about 28 percent of global manufacturing value added.

If you can make the layer next to you cheap and abundant, you drain its pricing power and force value to move somewhere else. In AI, the complement is model access. For a lot of Western companies, the business is still basically gated intelligence sold as an API. China has every incentive to make that layer feel like electricity: available everywhere, cheap, hard to monopolize. Open weight releases are part of that play: DeepSeek, Qwen, Kimi, and MiniMax are only a few of the Chinese open source models. Once strong models are common, model access stops being a moat. It becomes a commodity input.

A huge fraction of what we call services is legible work: reading, writing, coding, summarizing, translating, drafting, answering, generating variations, searching a space of options. That layer is now replicable and it is getting local. Apple is publishing technical reports about on-device foundation models, including aggressive quantization aimed at making serious inference run on consumer hardware. When strong models run on a laptop, countries stop importing thinking as a service. They import weights, or they distill, fine-tune, and deploy inside their own borders.

That said, China is not without constraints, and from where I sit they matter. Capital controls limit how freely Chinese companies can operate globally. The state can redirect investment at a scale nobody else can match, but centralized allocation tends to overshoot. Solar panel overcapacity, steel oversupply, and the EV price war all follow the same pattern: massive subsidized buildout that ends up compressing margins for everyone, including the Chinese firms themselves. Top talent still flows toward open research environments.
Many of the best Chinese researchers publish in Western conferences and a significant number stay abroad. The tightening of political control over universities and private companies can accelerate execution on defined goals, but it makes it harder to sustain the open-ended, high-risk research that produces unexpected breakthroughs rather than incremental gains. Predictability matters for long-term innovation.

Foreign companies are recalibrating their exposure. Some domestic entrepreneurs are more cautious than they were a decade ago. Centralized coordination gives speed, but it can also reduce the appetite for bets that do not align with current priorities.

The West still has one advantage that is hard to replicate: it is where most of the world's ambitious talent wants to live, work, and build. It is a compound effect of open institutions, freedom of movement, and decades of accumulated trust. As long as that holds, the West keeps attracting the talent and the capital that turn ideas into new industries.

None of these constraints cancel out the commodity play. But they mean the race is closer than either side assumes, and the outcome is far from settled.

I believe that:

1. China stays strong in atoms because it already has the scale advantage.
2. The West still leads in many areas that require deep institutions and long accumulated competence, including parts of frontier research and high trust services.
3. But AI compresses the services premium by making a large portion of cognition cheap and replicable. That is why open models matter. They are a weapon that attacks the margin structure of the thinking economy.
4. If you sell intelligence, this is bad news. If you own distribution, hardware, data, or a workflow people cannot easily leave, you survive. If you own atoms and you get thinking for free, you get a scary combination.
Scary for the West, because it means the services premium that sustained economic leadership for decades can be undercut by a player with industrial dominance and access to the same cognitive tools.

---

### Building a SaaS with Elixir/Phoenix and React

*Published: 2026-01-15*

> Our stack and practices for building SaaS applications: Elixir on the backend, React on the frontend, Nix for everything else. No Docker. No Kubernetes.

URL: https://federicocarrone.com/articles/building-a-saas-with-elixir-phoenix-and-react/

Most SaaS codebases I've seen share the same problems. Authentication that sort of works until someone finds an edge case. Caching layers that nobody fully understands. Deployment scripts held together with hope. The team moves fast early on, then spends years paying down the debt.

We got tired of this cycle. Over several projects, we developed a stack and a set of practices that let us move fast without leaving landmines for our future selves. Elixir on the backend, React on the frontend, Nix for everything else. No Docker. No Kubernetes. Decisions that raised eyebrows at first but have proven themselves in production. This post explains what we use and why.

## The case for Elixir

Our backend runs on Elixir, which might seem like an unusual choice in a world dominated by Node, Python, and Go. The reason comes down to what happens when things go wrong.

Elixir runs on the Erlang VM, a runtime Ericsson built in the 1980s for telephone switches. These systems needed to stay up for years at a time, handling failures gracefully without human intervention. Crashes are expected in Elixir, even encouraged as an error-handling strategy. When a process crashes, it crashes in isolation. A supervisor notices and restarts it. The rest of the system keeps running. You don't get woken up at 3am because one user's request hit an edge case that brought down the whole server.
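The crash-and-restart pattern can be sketched in plain Elixir, no dependencies needed. `CounterWorker` and the `:boom` call are illustrative names, not from the original post:

```elixir
defmodule CounterWorker do
  # A minimal GenServer. A raise inside a callback crashes only this process.
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, 0, name: __MODULE__)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}
  def handle_call(:boom, _from, _count), do: raise("edge case")
end

# The supervisor restarts the worker after a crash; siblings are untouched.
children = [%{id: CounterWorker, start: {CounterWorker, :start_link, [[]]}}]
{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

GenServer.call(CounterWorker, :increment)

# The crash propagates to the caller as an exit, which we catch here;
# the worker itself is restarted by the supervisor with fresh state.
try do
  GenServer.call(CounterWorker, :boom)
catch
  :exit, _ -> :worker_crashed
end

Process.sleep(50)
GenServer.call(CounterWorker, :increment)
```

The second `:increment` call succeeds against a freshly restarted process with reset state, which is the point: the failure was contained, handled, and forgotten without any rescue logic in the worker itself.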
Phoenix is the web framework we use on top of Elixir, though we use it differently than most teams. Phoenix has become famous for LiveView, its technology for building interactive UIs with server-rendered HTML. We don't use it. Instead, Phoenix serves only JSON through a REST API, and a completely separate React application handles everything the user sees.

This creates a hard boundary between backend and frontend. Backend developers focus entirely on data and business logic without thinking about UI concerns. Frontend developers own the user experience end-to-end without needing to understand Elixir. The two teams communicate through the API contract, and neither steps on the other's work. When we eventually build mobile apps, they'll consume the same API with no new backend work required.

## Why we abandoned Docker for Nix

This is probably our most controversial choice. Docker has become the default for development environments and deployment. We use Nix instead.

When a new developer joins our team, the onboarding process is simple: clone the repository and run `nix develop`. A few minutes later, they have everything they need. Elixir, Node.js, PostgreSQL, Redis, Meilisearch, all running natively on their machine. Not in containers. Actually installed. Without container overhead, everything runs at native speed. Debugging is straightforward because there's no abstraction layer between you and the process. And there are no Docker Desktop licensing conversations.

But the real payoff comes in production, where our servers run NixOS. The entire server configuration is declarative and lives in version control alongside our code. When we push a change, every server ends up in exactly the same state. Deployments are atomic. They succeed completely or fail completely, with no partial states to debug. If something goes wrong, rolling back takes one command.

Nix has a steep learning curve. The documentation is notoriously difficult, and the language has unusual semantics.
But once you've internalized the concepts, you get guarantees Docker can't provide. A build that works today will produce the exact same result in five years, because every input is pinned and reproducible. ## Building for offline use Most web applications assume users have constant connectivity. Ours doesn't, and this assumption has shaped our entire frontend architecture. The frontend stores data locally using Dexie.js, a library that wraps IndexedDB with a friendlier API. When a user makes changes, those changes save to the local database first. A sync queue tracks what needs to go to the server, and when the network becomes available, the queue drains automatically. Consider how software actually gets used. A salesperson updates CRM records on a flight with no WiFi. A technician fills out inspection forms in a basement with no signal. Someone's home internet drops for thirty seconds while they're submitting an important form. In all these scenarios, our app keeps working. Users might not even notice the interruption. The UI responds immediately to their actions, and synchronization happens in the background. We use TanStack Query for data fetching, but with caching completely disabled. Every API call fetches fresh data from the server. IndexedDB is our cache, and we control exactly when and how it syncs. No more stale data bugs because some cache somewhere wasn't invalidated properly. ## Database decisions PostgreSQL. UUIDs as primary keys instead of auto-incrementing integers. This prevents enumeration attacks, where an attacker discovers they can access `/users/123` and starts systematically trying `/users/124`, `/users/125`, and so on. UUIDs also let us generate identifiers on the client before the record exists in the database. For multi-tenancy, we use row-level isolation. Every table that holds customer data includes an `org_id` column, and every query filters by it. The alternative is giving each tenant their own database schema. 
That provides stronger isolation, but migrations have to run once per tenant, connection pools multiply, and cross-tenant queries for admin purposes become complicated. Row-level isolation is simpler and scales well for most SaaS applications. We also have a strict rule: no random data in tests. We don't use Faker. Every test uses explicit, predictable inputs. When a test fails, it fails the same way every time you run it. You can debug it, reproduce it, and fix it. Random test data causes tests that fail one time in twenty for reasons nobody can reproduce. ## Authentication Most tutorials get authentication wrong in ways that create real security vulnerabilities. We use JWT tokens with a two-token system. The access token is short-lived, expiring after 15 minutes. It's stateless, so the backend validates it without touching the database. The refresh token lasts 7 days and is stored in the database. When the access token expires, the frontend uses the refresh token to get a new one. Because refresh tokens live in the database, we can revoke them instantly. When a user clicks "log out of all devices," it actually works. We delete their refresh tokens, and within 15 minutes every session everywhere is invalidated. Both tokens live in httpOnly cookies rather than localStorage. JavaScript cannot read httpOnly cookies, which means an XSS vulnerability cannot steal the tokens. Most tutorials store JWTs in localStorage because it's simpler, but it leaves users vulnerable to script injection. Password hashing uses Argon2, OWASP's current recommendation over bcrypt. ## Libraries For JWT handling, we use Joken instead of Guardian. Guardian is popular but tries to do too much. It has opinions about plugs, permissions, token types. We found ourselves fighting these abstractions. Joken just encodes and decodes tokens. We handle the rest. Oban handles background jobs. Unlike Sidekiq or Celery, Oban uses PostgreSQL as its backend instead of Redis. One less service to run. 
Job state is transactional with your application data. You can insert a database record and enqueue a job in the same transaction, with the guarantee that either both happen or neither does. On the frontend: Zustand for client state, TanStack Query for API calls, React Hook Form with Zod for forms. For components, shadcn/ui built on Radix primitives. Radix handles accessibility correctly, which is hard to do from scratch. ## Deployment We deploy to bare metal servers running NixOS. No Docker in production. No Kubernetes. Kubernetes solves problems of scale that most SaaS applications don't have. For a typical SaaS with a handful of services, it adds operational complexity without proportional benefits. You end up managing Kubernetes instead of building your product. Our setup is simple. systemd supervises the Phoenix processes. Caddy handles TLS and reverse proxying, automatically getting certificates from Let's Encrypt. When we deploy, we push the new NixOS configuration to our servers using deploy-rs. The switch is atomic. If something goes wrong, we roll back in seconds. Secrets are encrypted in the git repository using agenix. Each server has its own age encryption key, and secrets are decrypted at deployment time on the target machine. ## Observability We set up logging, metrics, and error tracking before writing the first feature. Finding out about outages from users is embarrassing and preventable. Logs are structured JSON. Every entry includes a request ID, user ID, and organization ID. These logs ship to Grafana Loki through Promtail. The request ID is generated when a request enters our system and propagates through everything: API calls, background jobs, external service calls. When a user reports a problem and we have their request ID, we can trace exactly what happened across the entire system. Metrics go to Prometheus, errors to Sentry. Dashboards and alerts exist before the first feature because retrofitting them later never happens. 
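The shape of those log lines can be sketched in a few lines of TypeScript. This is an illustrative sketch, not the production code: the names (`LogContext`, `makeLogger`) and the sample IDs are hypothetical, but the idea is the one described above, every entry carrying the request, user, and organization IDs so Loki can filter and correlate them.

```typescript
// Hypothetical sketch of the structured-log shape described above.
// Every entry carries requestId, userId, and orgId for correlation.
type LogContext = { requestId: string; userId: string; orgId: string };

function makeLogger(ctx: LogContext) {
  return (
    level: "info" | "warn" | "error",
    message: string,
    extra: Record<string, unknown> = {}
  ) =>
    // One JSON object per entry, ready for Promtail to ship to Loki
    JSON.stringify({ ts: new Date().toISOString(), level, message, ...ctx, ...extra });
}

// The request ID is minted once at the edge and threaded everywhere:
const log = makeLogger({ requestId: "req-42", userId: "u-1", orgId: "org-7" });
const line = log("info", "job enqueued", { queue: "mailers" });
```

Because the context is closed over once, no call site can forget to attach the IDs; the alternative, passing them to every log call by hand, is exactly the kind of discipline that erodes.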
## Build order First comes the foundation: Nix configuration, Makefile, project structure, database setup. Feels like yak shaving, but a shaky foundation causes problems forever. Second, we build admin tools. A dashboard for internal use. User impersonation, which lets us log in as any user to see what they see. Seed data that creates realistic test scenarios. You need to demo to stakeholders before the product is done. You need to debug issues by experiencing the product as users do. Third is authentication, because almost everything else depends on knowing who the user is. Then the actual product features. Polish like error handling, loading states, and accessibility comes last but isn't optional. ## The full guide The complete guide is at [github.com/unbalancedparentheses/saas_guidelines](https://github.com/unbalancedparentheses/saas_guidelines). Database connection pooling, rate limiting, circuit breakers, health checks, graceful shutdown, disaster recovery, and more. --- ### Unprepared for What's Coming *Published: 2026-01-08* > Humanity is completely unprepared for what's coming. The pace of AI advancement might give people months to adapt, not decades. URL: https://federicocarrone.com/articles/unprepared/ Humanity is completely unprepared for what's coming. I've been talking with partners, employees, and countless people over the last few weeks. I'm amazed by most people's inability to adapt or even grasp second order effects of what's coming. They don't even want to think about the consequences. Some of the smartest people I've met in my life are trying to avoid accepting the reality: it's very likely we will have tools that are able to do almost everything better than a human in a very short timeline. It's very likely that even those of us who can generally adapt quickly won't be able to overcome this tsunami. We will experience one of the biggest deflationary shocks in history. Only robotics combined with AI could be bigger than this. 
I find it almost absurd to watch so many YouTube channels and X accounts showing you how to create your own simple app or SaaS. If anybody can build things fast, what do you think will happen? Is there demand for thousands of applications of every kind? During the last 20 years some of the smartest people alive battled for people's attention by building software, and the limitations were execution cost, speed, and distribution. Right, but I'm forgetting that people say it's not the time of implementation anymore. In theory, we're now in the time of ideas. So what happens when someone has a good idea and anyone can copy it in days? If implementation becomes worthless, what remains? Perhaps taste, distribution, or maybe in some cases deep domain expertise. But even those moats are eroding fast.

What makes this different from past disruptions is the pace. Previous technological shifts gave people decades to adapt. This one might give them months.

We built this. And yet it feels like it's happening to us, not by us. The strange position of being both the creator and the displaced. This is going to be very sad and fun at the same time. Happy to be alive during this time.

---

### Type Systems: From Generics to Dependent Types

*Published: 2026-01-01*

> A practical guide through the landscape of type systems, from everyday generics to dependent types that prove correctness, with examples in Rust, Scala, and Idris

URL: https://federicocarrone.com/articles/type-systems/

Every type error you've ever cursed at was a bug caught before production. Type systems reject nonsense at compile time so you don't discover it at 3 AM. But they vary wildly in what they can express and what guarantees they provide.

If you learn nothing else: ADTs + pattern matching + generics. These three concepts will improve your code in any language and take days to learn.
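As a quick taste before the tour, here is a sketch of all three in TypeScript (standing in for the Rust and Scala used below; its tagged unions and `switch` approximate ADTs and pattern matching, and the names are illustrative):

```typescript
// ADT: a sum type whose variants are exactly the valid states, tagged by `kind`
type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "rect"; width: number; height: number };

// Pattern matching, approximated: the tagged `switch` narrows `s` in each branch
function area(s: Shape): number {
  switch (s.kind) {
    case "circle":
      return Math.PI * s.radius ** 2;
    case "rect":
      return s.width * s.height;
  }
}

// Generics: written once, works for any element type
function firstOf<T>(items: T[]): T | undefined {
  return items[0];
}
```

A few lines, three concepts; the rest of the article is about how far these ideas can be pushed.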
The concepts here progress from generics (reusable code) through traits (shared behavior) to linear types (resource safety) to dependent types (proving correctness). Each step buys you more compile-time guarantees at the cost of more work satisfying the type checker.

## Structure

Concepts are organized into tiers:

| Tier | What's Here | You Should Know If... |
|------|-------------|----------------------|
| 1: Foundational | Generics, ADTs, pattern matching | You write code |
| 2: Mainstream Advanced | Traits, GADTs, flow typing, existentials | You design libraries |
| 3: Serious Complexity | HKT, linear/ownership types, effects | You want deep FP or systems programming |
| 4: Research Level | Dependent types, session types | You work on PLs or verification |
| 5: Cutting Edge | HoTT, QTT, graded modalities | You do research |

You don't need to read linearly. Jump to what interests you. But concepts build on each other: if GADTs confuse you, make sure you understand [ADTs](#algebraic-data-types) first.

# Tier 1: Foundational

Every modern statically-typed language supports these concepts. If you use a typed language, you're already using them.

## Parametric Polymorphism (Generics)

You write a function to get the first element of a list of integers. Then you need it for strings. Then for custom types. You end up with `first_int`, `first_string`, `first_user`: duplicated code that differs only in types. The alternative, using a universal type like `Object` or `any`, throws away type safety entirely. You're back to hoping you don't pass the wrong thing.

Abstract over the type itself. Write the function *once* with a type parameter, and it works for *any* type.

The crucial property is **parametricity**: the function must behave the same way regardless of what type you plug in. It can't inspect the type or behave differently for integers versus strings. This constraint is a feature, not a limitation. When a function is parametric in `T`, it can only shuffle `T` values around.
It can't create new `T`s out of thin air, can't compare them, can't print them. This means generic functions come with "theorems for free": guarantees about their behavior that follow purely from their type signature.

For example, a function with signature `fn mystery<T>(x: T) -> T` can *only* return `x`. There's nothing else it could possibly return. The type signature alone proves the implementation. Similarly, `fn pair<T>(x: T) -> (T, T)` must return `(x, x)`. The parametricity constraint eliminates every other possibility.

What this gives you:

- Write once, use with any type
- No code duplication
- Compiler verifies each usage with concrete types
- Parametricity guarantees: a function `fn id<T>(x: T) -> T` can *only* return `x`

Fair warning: the syntax gets ugly. You will eventually write a signature like `fn process<I: Iterator<Item = T>, T: Clone + Debug>` and question your life choices. This is the price of expressiveness. It's still better than duplicating code.

```rust
// Rust: One function works for any type T
fn first<T>(slice: &[T]) -> Option<&T> {
    slice.first()
}

first(&[1, 2, 3]);          // Option<&i32>
first(&["a", "b"]);         // Option<&&str>
first(&[User::new("Ada")]); // Option<&User>

// The implementation is identical for all types
// Parametricity: we can't inspect T, so we can only shuffle values around
```

```rust
// What can this function possibly do?
fn mystery<T>(x: T) -> T {
    // We can't:
    // - Print x (we don't know it implements Display)
    // - Compare x (we don't know it implements Eq)
    // - Clone x (we don't know it implements Clone)
    // We can ONLY return x
    x
}
```

---

## Algebraic Data Types

You're modeling a user who can be either anonymous or logged in. In a typical OOP language, you might write:

```java
class User {
    String name;       // null if anonymous
    boolean isLoggedIn;
}
```

Tony Hoare calls null references his "billion dollar mistake", but the problem runs deeper than null. This type allows four states: anonymous with no name, anonymous with a name (!), logged in with a name, logged in without a name (!).
Two of these are nonsense, but your type permits them. Every function must check for and handle impossible states.

Types should describe *exactly* the valid states. We need two tools:

- **Sum types** (enums, tagged unions): "this OR that", a value is one of several variants
- **Product types** (structs, records): "this AND that", a value contains all fields

Combined, these are **algebraic data types** (ADTs). The "algebra" comes from how you calculate possible values: products multiply (struct with 2 bools = 2 × 2 = 4 states), sums add (enum with 3 variants = 3 states).

Here's the algebra in action. Consider:

- `bool` has 2 values: `true`, `false`
- `(bool, bool)` has 2 × 2 = 4 values: `(true, true)`, `(true, false)`, `(false, true)`, `(false, false)`
- `enum Either { Left(bool), Right(bool) }` has 2 + 2 = 4 values: `Left(true)`, `Left(false)`, `Right(true)`, `Right(false)`

The power comes from combining them. You model your domain with exactly the states that make sense. If a user is either anonymous (no data) or logged in (with name and email), you write that directly. The type system then enforces that you can't access a name for an anonymous user, because that field doesn't exist in that variant.

- **Make illegal states unrepresentable**: if your type can't hold invalid data, you can't have bugs from invalid data
- No null checks for "impossible" cases
- Self-documenting domain models
- Exhaustive [pattern matching](#pattern-matching) (covered next)

```rust
// Rust: This type CANNOT represent an invalid state
enum User {
    Anonymous,
    LoggedIn { name: String, email: String },
}

// There is no way to construct:
// - "Logged in with no name" (LoggedIn requires name)
// - "Anonymous with a name" (Anonymous has no fields)

fn greet(user: &User) -> String {
    match user {
        User::Anonymous => "Hello, guest".to_string(),
        User::LoggedIn { name, .. } => format!("Hello, {}", name),
    }
}
```

```rust
// Model a payment result: each variant has exactly the data it needs
enum PaymentResult {
    Success { transaction_id: String, amount: f64 },
    Declined { reason: String },
    NetworkError { retry_after_seconds: u32 },
}

// No nulls. No "reason" field that's only valid sometimes.
// Each variant is self-contained.
```

```rust
// The classic: Option<T> replaces null
enum Option<T> {
    None,
    Some(T),
}

// Result<T, E> replaces exceptions
enum Result<T, E> {
    Ok(T),
    Err(E),
}

// These are ADTs! Sum types with generic parameters.
```

If you come from OOP, ADTs require rethinking how you model data. Instead of class hierarchies with methods, you define data structures and functions that pattern match on them. Available in Rust, Haskell, OCaml, F#, Scala, Swift, and Kotlin.

---

## Pattern Matching

Given an algebraic data type, you need to branch on its variants and extract data. With OOP, you'd use `instanceof` checks or the visitor pattern, both verbose and error-prone. Worse: when you add a new variant, the compiler doesn't tell you about all the places that need updating.

Pattern matching is the natural counterpart to [ADTs](#algebraic-data-types). If constructors *build* sum types, pattern matching *deconstructs* them. They're two sides of the same coin.

The compiler knows every possible variant of your sum type. When you write a `match`, it checks that you've covered them all. Forget a case? Compile error. Add a new variant to your enum? Every `match` in your codebase that doesn't handle it becomes a compile error. This is **exhaustiveness checking**.

The comparison to `if-else` or `switch` is instructive. In most languages, `switch` doesn't warn you about missing cases. Pattern matching does. And unlike the visitor pattern (OOP's answer to this problem), pattern matching is concise and doesn't require boilerplate classes.
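For languages without a native `match`, the same safety can be approximated. Here is a TypeScript sketch (the `assertNever` helper is a common community idiom, not something from this article):

```typescript
// Discriminated union standing in for a Rust-style enum
type Message =
  | { kind: "quit" }
  | { kind: "write"; text: string };

// If a new variant is added to Message, `msg` no longer narrows to `never`
// in the default branch, and the call below becomes a compile error.
function assertNever(x: never): never {
  throw new Error(`unhandled variant: ${JSON.stringify(x)}`);
}

function process(msg: Message): string {
  switch (msg.kind) {
    case "quit":
      return "Goodbye";
    case "write":
      return `Writing: ${msg.text}`;
    default:
      return assertNever(msg);
  }
}
```

It is opt-in rather than built-in, which is exactly the gap between a `switch` and a real `match`.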
- **Exhaustiveness checking**: forget a case, get a compile error
- **Refactoring safety**: add a variant, compiler shows everywhere to update
- **Destructuring built-in**: extract fields while matching
- Cleaner than if-else chains or visitor patterns

```rust
// Rust: Compiler ensures all cases handled
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(u8, u8, u8),
}

fn process(msg: Message) -> String {
    match msg {
        Message::Quit => "Goodbye".to_string(),
        Message::Move { x, y } => format!("Moving to ({}, {})", x, y),
        Message::Write(text) => format!("Writing: {}", text),
        Message::ChangeColor(r, g, b) => format!("Color: #{:02x}{:02x}{:02x}", r, g, b),
    }
}

// If you forget a case:
// error[E0004]: non-exhaustive patterns: `Message::ChangeColor(_, _, _)` not covered
```

```rust
// Guards add conditions
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        n if n < 0 => "negative",
        n if n % 2 == 0 => "positive even",
        _ => "positive odd",
    }
}

// Nested patterns
fn first_two<T: Clone>(items: &[T]) -> Option<(T, T)> {
    match items {
        [a, b, ..] => Some((a.clone(), b.clone())),
        _ => None,
    }
}
```

Pattern matching is now in C# 8+, Python 3.10+, and most functional languages. Once you use it, you won't go back.

---

## Subtyping

You have a function that logs any HTTP response. You've also defined `JsonResponse` and `XmlResponse` types with extra fields. Without some way to express "a JsonResponse *is* an HttpResponse," you'd need separate logging functions for each, or abandon type safety.

If type `B` has everything type `A` has (and possibly more), you can use a `B` anywhere an `A` is expected. This is subtyping: `JsonResponse <: HttpResponse` means JsonResponse is a subtype of HttpResponse.

Think of it as a contract. An `HttpResponse` promises certain capabilities: it has a status code and body. A `JsonResponse` fulfills that contract and adds more: it also has a parsed object and content type.
Anywhere the code expects "something with status and body," a JsonResponse works fine. The extra fields are ignored but don't cause problems. This is the Liskov Substitution Principle encoded in the type system: if `JsonResponse <: HttpResponse`, then any property that holds for HttpResponse should hold for JsonResponse.

### Nominal vs Structural: Two Philosophies

This is a fundamental classification of type systems, not just a detail of subtyping:

| Aspect | Nominal | Structural |
|--------|---------|------------|
| Type equality | Based on declared name | Based on shape/structure |
| Subtyping | Explicit declaration required | Implicit if structure matches |
| Philosophy | "What it's called" | "What it can do" |
| Abstraction | Strong boundaries | Flexible composition |
| Refactoring | Rename breaks compatibility | Structure changes break compatibility |

**Nominal typing** requires explicit declarations. Even if two types have identical fields, they're different types unless related by declaration:

```java
// Java: nominal typing
class Meters { double value; }
class Feet { double value; }

// These are DIFFERENT types despite identical structure
Meters m = new Meters();
Feet f = m; // ERROR: incompatible types
```

**Structural typing** cares only about shape. If it has the right fields and methods, it fits:

```typescript
// TypeScript: structural typing
interface Point { x: number; y: number; }

// Any object with x and y is a Point
const p: Point = { x: 1, y: 2 }; // OK

// Extra fields are allowed via a non-literal value
// (fresh object literals get an extra excess-property check)
const withZ = { x: 1, y: 2, z: 3 };
const q: Point = withZ;          // OK (extra field allowed)

class Coordinate { x: number; y: number; }
const r: Point = new Coordinate(); // OK (same structure)
```

**Go's approach** is interesting: nominal for defined types, but interfaces are structural. A type implements an interface if it has the right methods, no declaration needed.
```go
// Go: structural interfaces
type Reader interface {
    Read(p []byte) (n int, err error)
}

// MyFile implements Reader without declaring it
type MyFile struct { ... }
func (f MyFile) Read(p []byte) (int, error) { ... }

// Works: MyFile has the right method
func process(r Reader) { ... }
process(MyFile{}) // OK
```

```typescript
// TypeScript: Structural subtyping
interface HttpResponse {
    status: number;
    body: string;
}

interface JsonResponse {
    status: number;
    body: string;
    contentType: "application/json";
    parsed: object;
}

function logResponse(res: HttpResponse): void {
    console.log(`${res.status}: ${res.body}`);
}

const jsonRes: JsonResponse = {
    status: 200,
    body: '{"ok": true}',
    contentType: "application/json",
    parsed: { ok: true }
};

logResponse(jsonRes); // OK! JsonResponse has everything HttpResponse needs
```

The downside: subtyping complicates type inference and introduces variance questions. When `JsonResponse <: HttpResponse`, is `List<JsonResponse>` a subtype of `List<HttpResponse>`? It depends on whether the list is read-only (covariant), write-only (contravariant), or mutable (invariant). See [Variance](#variance) for details. Rust sidesteps this by using traits instead of subtyping for polymorphism.

---

# Tier 2: Mainstream Advanced

These features appear in modern production languages but require more sophistication to use well. They're essential for library authors and for writing highly generic code.

## Traits / Typeclasses

You want to sort a list. Sorting requires comparison. How does the generic sort function know how to compare your custom `User` type?

Approaches without traits:

- **Inheritance**: `User extends Comparable<User>`, but what if User comes from a library you don't control?
- **Pass a comparator every time**: verbose, easy to forget
- **Duck typing**: no compile-time safety, crashes at runtime if method missing

Separate the *interface* from the *type*.
Define `Ord` (ordering), `Eq` (equality), `Display` (printing) as standalone interfaces called traits (Rust) or typeclasses (Haskell). Then declare that `User` implements them, *even if you didn't write User*.

This solves the "expression problem": how do you add both new types and new operations without modifying existing code? With OOP inheritance, adding new types is easy (new subclass), but adding new operations is hard (modify every class). With traits, you can add new operations (new trait) and implement them for existing types, even types from other libraries.

The implementation is resolved at compile time, with zero runtime cost. When you call `user.cmp(&other)`, the compiler knows exactly which comparison function to use because it knows the concrete type. No vtable lookup, no dynamic dispatch. This is called **monomorphization**: the compiler generates specialized code for each type you use.

The "coherence" rule prevents chaos: there can be at most one implementation of a trait for a type. You can't have two different ways to compare Users. This means you can always predict which implementation will be used.

- **Ad-hoc polymorphism**: different behavior for different types, resolved at compile time
- **Retroactive implementation**: add interfaces to types you don't own
- **Coherence**: at most one implementation per type (no ambiguity)
- **Trait bounds**: require capabilities, not inheritance

```rust
// Rust: Define a trait
trait Summary {
    fn summarize(&self) -> String;
}

// Implement for your type
struct Article {
    title: String,
    author: String,
    content: String,
}

impl Summary for Article {
    fn summarize(&self) -> String {
        format!("{} by {}", self.title, self.author)
    }
}

// Implement for a type you don't own
impl Summary for i32 {
    fn summarize(&self) -> String {
        format!("The number {}", self)
    }
}

// Use as a bound: T must implement Summary
fn notify<T: Summary>(item: &T) {
    println!("Breaking news! {}", item.summarize());
}

// Or with impl Trait syntax
fn notify_short(item: &impl Summary) {
    println!("Breaking news! {}", item.summarize());
}
```

```rust
// Standard library traits
use std::fmt::Display;
use std::cmp::Ord;

// Multiple bounds
fn print_sorted<T: Display + Ord>(mut items: Vec<T>) {
    items.sort();
    for item in items {
        println!("{}", item);
    }
}

// Default implementations
trait Greet {
    fn name(&self) -> &str;
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name()) // default impl
    }
}
```

Rust's orphan rules restrict where you can implement traits to prevent conflicting implementations. This is sometimes frustrating but maintains coherence.

---

## Associated Types

You're defining an `Iterator` trait. Each iterator produces items of some type. With regular generics, you'd write `Iterator<T>`. But this makes `Iterator<u32>` and `Iterator<String>` *different traits*, and a type could implement both of them. What you want: the item type should be *determined by* the implementing type, not chosen by the user.

Some type parameters are *outputs* (determined by the implementation), not *inputs* (chosen by the caller). Associated types express this: "when you implement this trait, you must specify what Item is."

The distinction matters. With a regular type parameter like `Iterator<T>`, you're saying "this is an iterator that could work with any T." But that's not how iterators work. A `VecIterator` always produces the type that the Vec contains. The type is determined by the iterator, not chosen by the user.

Think of it as a type-level function. Given a type that implements `Iterator`, you can ask "what does it produce?" and get back the associated `Item` type. `Vec<i32>`'s iterator has `Item = i32`. `HashMap<K, V>`'s iterator has `Item = (K, V)`. The implementing type determines the associated type.
- **Cleaner APIs**: one trait, not a family of traits
- **Type-level functions**: the implementing type determines the associated type
- **Better error messages**: "Item not found" vs. "Iterator not satisfied"

```rust
// Rust: The standard Iterator trait
trait Iterator {
    type Item; // Associated type: implementor decides

    fn next(&mut self) -> Option<Self::Item>;
}

// Implementing: specify what Item is
struct Counter {
    count: u32,
    max: u32,
}

impl Iterator for Counter {
    type Item = u32; // Counter produces u32s

    fn next(&mut self) -> Option<Self::Item> {
        if self.count < self.max {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}

// Using: the Item type is known from the iterator type
fn sum_all<I: Iterator<Item = i32>>(iter: I) -> i32 {
    iter.fold(0, |acc, x| acc + x)
}
```

```rust
// Without associated types (what you'd have to write)
trait BadIterator<T> {
    fn next(&mut self) -> Option<T>;
}

// Problem: impl BadIterator<u32> for Counter and impl BadIterator<String> for Counter
// are different traits! A type could implement both!
```

Associated types are less flexible than type parameters when you need the same type to implement a trait multiple ways. But for most cases, they make APIs cleaner.

---

## Flow-Sensitive Typing

You check if a value is null before using it. You know it's not null inside the `if` block. But does the type system know?

```java
// Java: the type system doesn't track the check
Object x = maybeNull();
if (x != null) {
    // You KNOW x isn't null here
    // But the type is still Object, not NonNull
    x.toString(); // Still need to handle potential null?
}
```

**Flow-sensitive typing** (also called **occurrence typing** or **type narrowing**) refines types based on control flow. After a type check, the type system narrows the variable's type in branches where the check succeeded.

Type information *changes* as you move through code. The type of `x` isn't fixed at its declaration. It evolves based on what the program has learned. After `if (x !== null)`, the type of `x` in the `then` branch is narrower than at the start.
This bridges static and dynamic typing philosophies. Dynamic languages always know the runtime type. Static languages traditionally fix types at declaration. Flow-sensitive typing lets static types benefit from runtime checks without losing static guarantees. ```typescript // TypeScript: flow-sensitive typing function process(value: string | number | null) { // Here: value is string | number | null if (value === null) { return; // value is null in this branch } // Here: value is string | number (null eliminated) if (typeof value === "string") { // Here: value is string console.log(value.toUpperCase()); // OK: string method } else { // Here: value is number console.log(value.toFixed(2)); // OK: number method } } // Works with user-defined type guards too interface Success { data: object; } interface Failure { error: string; code: number; } function isSuccess(result: Success | Failure): result is Success { return (result as Success).data !== undefined; } function handle(result: Success | Failure) { if (isSuccess(result)) { console.log(result.data); // TypeScript knows result is Success } else { console.error(result.error); // TypeScript knows result is Failure } } ``` ```kotlin // Kotlin: smart casts fun process(x: Any) { if (x is String) { // x is automatically cast to String here println(x.length) // No explicit cast needed } // Works with null checks too val name: String? = getName() if (name != null) { // name is String here, not String? println(name.length) } } ``` - **Eliminates redundant casts**: The compiler tracks what you've already checked - **Catches impossible branches**: If a branch can never execute, the compiler warns - **Natural null handling**: Null checks automatically narrow types - **Type guards**: User-defined functions can narrow types Flow-sensitive typing complicates the type system. The type of a variable depends on *where* you are in the code, not just its declaration. 
This makes type checking more complex and can lead to surprising behavior when variables are reassigned or captured in closures. TypeScript, Kotlin, Ceylon, Flow (JavaScript), Rust (with pattern matching), Swift, and increasingly other modern languages. --- ## Intersection and Union Types You have a value that could be one of several types. Or a value that must satisfy multiple interfaces simultaneously. Regular generics and subtyping don't express these relationships cleanly. ```typescript // How do you type a function that accepts string OR number? // How do you require an object to be BOTH Serializable AND Comparable? ``` **Union types** (`A | B`) represent "this OR that." A value of type `A | B` is either an `A` or a `B`. You must handle both possibilities before using type-specific operations. **Intersection types** (`A & B`) represent "this AND that." A value of type `A & B` has all properties of both `A` and `B`. It satisfies both interfaces simultaneously. These correspond to logical OR (union) and AND (intersection). 
```typescript
// TypeScript: Union types
type StringOrNumber = string | number;

function process(value: StringOrNumber) {
    // Must narrow before using type-specific operations
    if (typeof value === "string") {
        console.log(value.toUpperCase()); // OK: string method
    } else {
        console.log(value.toFixed(2));    // OK: number method
    }
}

// Discriminated unions: tagged sum types
type Result<T, E> =
    | { kind: "ok"; value: T }
    | { kind: "error"; error: E };

function handle<T, E>(result: Result<T, E>) {
    switch (result.kind) {
        case "ok":
            return result.value; // TypeScript knows value exists
        case "error":
            throw result.error;  // TypeScript knows error exists
    }
}
```

```typescript
// TypeScript: Intersection types
interface Named { name: string; }
interface Aged { age: number; }

type Person = Named & Aged; // Must have both name AND age

const person: Person = { name: "Ada", age: 36 };

// Intersection for mixin-style composition
interface Loggable { log(): void; }
interface Serializable { serialize(): string; }

type LoggableAndSerializable = Loggable & Serializable;

function process(obj: LoggableAndSerializable) {
    obj.log();       // OK: has Loggable
    obj.serialize(); // OK: has Serializable
}
```

```scala
// Scala 3: Union and intersection types
def process(value: String | Int): String = value match
  case s: String => s.toUpperCase
  case i: Int    => i.toString

// Intersection: must satisfy both traits
trait Runnable { def run(): Unit }
trait Stoppable { def stop(): Unit }

def manage(service: Runnable & Stoppable): Unit =
  service.run()
  service.stop()
```

- **Precise typing for heterogeneous data**: JSON, configs, APIs with variant responses
- **Mixin composition**: Combine interfaces without inheritance hierarchies
- **Discriminated unions**: Type-safe pattern matching on tagged variants
- **Subtyping relationships**: `A` is subtype of `A | B`; `A & B` is subtype of `A`

### Intersection Types in Type Theory

In formal type theory, intersection types have deeper significance.
The **intersection type discipline** can type more programs than simple types: some programs untypable in System F become typable with intersections. This is because intersections allow giving a term multiple types simultaneously.

```
// The identity function can have type:
λx.x : Int → Int                        // for integers
λx.x : String → String                  // for strings
λx.x : (Int → Int) ∧ (String → String)  // BOTH at once with intersection
```

This enables **principal typings** for some systems and is used in program analysis and partial evaluation.

TypeScript (extensive), Scala 3, Flow, Ceylon, Pike, CDuce, and research languages. Java has limited intersection types in generics (`<T extends A & B>`). Haskell achieves similar effects through typeclasses.

---

## Generalized Algebraic Data Types (GADTs)

You're building a type-safe expression language. You have `Add(expr, expr)` and `Equal(expr, expr)`. `Add` should return an integer; `Equal` should return a boolean. But with regular ADTs, the `Expr` type has no way to track what type of value each expression produces. Your `eval` function either:

- Returns `Object` and requires downcasting (unsafe)
- Returns a sum type like `Value::Int | Value::Bool` and requires checking (verbose)

Let each constructor specify its own, more precise return type. `Add` constructs an `Expr<Int>`; `Equal` constructs an `Expr<Boolean>`. The type parameter tracks what the expression evaluates to.

With regular ADTs, all constructors return the same type. `Some(x)` and `None` both return `Option<T>` for the same `T`. But with GADTs, different constructors can return *different* type instantiations. `LitInt(5)` returns `Expr<Int>`. `LitBool(true)` returns `Expr<Boolean>`. That flexibility is what "generalized" means.

Pattern matching reveals the payoff. If you match on an `Expr<A>` and see a `LitInt`, the compiler knows the type parameter `A` is `Int`. It can use this knowledge to type-check the branch correctly. You can return an `Int` directly, not a wrapped type.
This information flow from patterns to type checking is what makes type-safe evaluators possible.

The cost: type inference breaks. The compiler can't always figure out what type an expression should have, because it depends on which constructor was used. You need explicit type annotations at GADT match sites.

- **Type-safe interpreters and DSLs**: the type tracks the expression's result type
- **Eliminates impossible patterns**: if you match on `Expr<Int>`, you know it's not `LitBool`
- **More precise types**: information flows from patterns to the type checker

Rust doesn't support GADTs directly. Scala 3 has clean syntax:

```scala
// Scala 3: GADT syntax
enum Expr[A]:
  case LitInt(value: Int) extends Expr[Int]
  case LitBool(value: Boolean) extends Expr[Boolean]
  case Add(left: Expr[Int], right: Expr[Int]) extends Expr[Int]
  case Equal(left: Expr[Int], right: Expr[Int]) extends Expr[Boolean]
  case If[T](cond: Expr[Boolean], thenBr: Expr[T], elseBr: Expr[T]) extends Expr[T]

// Type-safe eval: return type matches expression type
def eval[A](expr: Expr[A]): A = expr match
  case Expr.LitInt(n)   => n // here A = Int, return Int ✓
  case Expr.LitBool(b)  => b // here A = Boolean, return Boolean ✓
  case Expr.Add(l, r)   => eval(l) + eval(r)
  case Expr.Equal(l, r) => eval(l) == eval(r)
  case Expr.If(c, t, e) => if eval(c) then eval(t) else eval(e)

// This WON'T compile:
// Expr.Add(Expr.LitBool(true), Expr.LitInt(1))
// Error: expected Expr[Int], got Expr[Boolean]

// Usage
val expr: Expr[Int] = Expr.Add(Expr.LitInt(1), Expr.LitInt(2))
val result: Int = eval(expr) // Type-safe: result is Int, not Object
```

GADTs are available in Haskell, OCaml, and Scala 3. TypeScript has limited support through type guards.

---

## Existential Types

You want a collection of things that share a trait, but they're different concrete types: a `Vec` containing integers, strings, and custom structs. But `Vec<T>` requires one specific `T`.

Hide the concrete type behind an interface.
An existential type says: "there *exists* some type `T` implementing this trait, but I won't tell you which." You can only use operations from the trait, nothing type-specific.

The duality with generics:

- **Generics (universal)**: caller picks the type, "for *all* types T, this works"
- **Existentials**: callee picks the type, "there *exists* some type T, but you don't know which"

Why is this useful? Consider a plugin system. Each plugin is a different type, but they all implement `Plugin`. You want a `Vec` containing all your plugins. With generics alone, you'd need `Vec<T>` for a single concrete `T`. With existentials, you get `Vec<Box<dyn Plugin>>`: a collection of "things that are some type implementing Plugin." The concrete types are hidden (existentially quantified), but you can still call Plugin methods on them.

- **Heterogeneous collections**: mix different types with shared interfaces
- **Information hiding**: callers can't depend on the concrete type
- **Dynamic dispatch**: select implementation at runtime

```rust
// Rust: dyn Trait is an existential type
use std::fmt::Display;

fn make_displayables() -> Vec<Box<dyn Display>> {
    vec![
        Box::new(42),
        Box::new("hello"),
        Box::new(3.14),
    ]
}

fn print_all(items: Vec<Box<dyn Display>>) {
    for item in items {
        println!("{}", item); // Can only call Display methods
    }
}
// You don't know the concrete types, but you can display them all
```

```rust
// impl Trait in return position is also existential
fn make_iterator() -> impl Iterator<Item = i32> {
    // Caller doesn't know this is specifically a Range
    // They only know it's "some iterator of i32"
    0..10
}

// Useful for hiding complex iterator adapter chains
fn complex_iter() -> impl Iterator<Item = i32> {
    (0..100)
        .filter(|x| x % 2 == 0)
        .map(|x| x * x)
        .take(10)
}
```

The cost: `dyn Trait` has runtime overhead (vtable lookup) and you can't recover the concrete type. Use generics when you know the type statically.

---

## Rank-N Polymorphism

Normally, the *caller* of a generic function chooses the type parameter. But sometimes you want the *callee* to choose.
Consider a function that applies a transformation to both elements of a pair, but the elements have different types.

```rust
// This doesn't work in Rust
fn apply_to_both<T>(f: impl Fn(T) -> T, pair: (i32, String)) -> (i32, String) {
    (f(pair.0), f(pair.1)) // Error! T can't be both i32 and String
}
```

In Rank-1 polymorphism (normal generics), `forall` is at the outside: the caller picks one `T` for the whole function. In Rank-2+, `forall` appears inside argument types: "the argument must be a function that works for *any* type."

The "rank" refers to how deeply `forall` can be nested:

- **Rank 0**: No polymorphism. $int \to int$.
- **Rank 1**: $\forall$ at the top. $\forall T.\; T \to T$. Caller picks $T$.
- **Rank 2**: $\forall$ in argument position. $(\forall T.\; T \to T) \to int$. The *argument* must be polymorphic.
- **Rank N**: Arbitrary nesting.

Why would you want this? Consider the ST monad trick in Haskell. `runST` has type $(\forall s.\; ST\; s\; a) \to a$. The $s$ type variable is universally quantified *inside* the argument. This means `runST` picks $s$, not the caller. Since $s$ is chosen by `runST` and immediately goes out of scope, no reference tagged with $s$ can escape. This is how Haskell provides safe, in-place mutation: the type system guarantees mutable references can't leak outside `runST`.

The cost is severe: type inference becomes undecidable for Rank-2 and above. You must annotate everything. Most languages avoid this complexity.

- **More precise types**: "must work for all types" is a strong requirement
- **Encapsulation**: ST monad uses Rank-2 types to ensure references can't escape
- **Enable patterns impossible with Rank-1**

Rust can't express Rank-N types directly. OCaml can:

```ocaml
(* OCaml: Rank-2 polymorphism via record types *)

(* Rank-1: caller chooses 'a *)
let id : 'a -> 'a = fun x -> x

(* Rank-2 requires a record with a polymorphic field *)
type poly_fn = { f : 'a. 'a -> 'a }

let apply_to_both (p : poly_fn) (x, y) = (p.f x, p.f y)

(* This works: id is polymorphic *)
let result = apply_to_both { f = id } (42, "hello")
(* result = (42, "hello") *)

(* This FAILS: (+1) only works on int, not any type *)
(* let bad = apply_to_both { f = fun x -> x + 1 } (42, "hello") *)
(* Error: This field value has type int -> int
   which is less general than 'a. 'a -> 'a *)
```

```haskell
-- Haskell: cleaner Rank-2 syntax with the RankNTypes extension
{-# LANGUAGE RankNTypes #-}

-- runST :: (forall s. ST s a) -> a
-- The 's' type variable is chosen by runST, not the caller.
-- This makes it impossible to return an STRef outside runST,
-- because the 's' won't match anything outside.
```

Rank-N types are rare outside Haskell. Most languages don't support them, and you can usually work around their absence.

---

# Tier 3: Serious Complexity

These features require significant learning investment but let you write abstractions impossible in simpler type systems. They're common in functional programming languages and increasingly appearing in mainstream languages.

## Higher-Kinded Types (HKT)

`Vec`, `Option`, `Result`: they're all "containers" you can map a function over. You write `map` for `Vec`. Then for `Option`. Then for `Result`. The implementations look structurally identical:

```rust
fn map_vec<A, B>(items: Vec<A>, f: impl Fn(A) -> B) -> Vec<B>
fn map_option<A, B>(item: Option<A>, f: impl Fn(A) -> B) -> Option<B>
fn map_result<A, B, E>(item: Result<A, E>, f: impl Fn(A) -> B) -> Result<B, E>
```

Can't we abstract over the *container itself*?

Types have **kinds**, just as values have types:

```
Int    : Type                  -- a plain type
Vec    : Type -> Type          -- takes a type, returns a type
Result : Type -> Type -> Type  -- takes two types, returns a type
```

`Int` is a complete type. But `Vec` by itself is not a type. You can't have a variable of type `Vec`. You need `Vec<Int>` or `Vec<String>`. `Vec` is a *type constructor*: give it a type, get back a type.
HKT lets you abstract over type constructors like `Vec` and `Option`, instead of only types like `Int`. You can define `Functor` as a trait for *any* type constructor, then implement it once for each container.

The patterns `Functor`, `Applicative`, and `Monad` from functional programming all require HKT. They describe properties of *containers*, not specific types. "Functor" means "you can map over this container." That applies to `Vec`, `Option`, `Result`, `Future`, `IO`, and infinitely many other type constructors.

Without HKT, you'd write `map_vec`, `map_option`, `map_result` separately. With HKT, you write one `map` that works for any `Functor`.

- **Functor, Monad, Applicative**: abstract patterns over any container
- **Write code once**: works for `Option`, `Result`, `Vec`, `Future`, `IO`, ...
- **Foundation of functional programming abstractions**

Type inference becomes undecidable in general. Languages with HKT require explicit annotations. Rust deliberately avoids full HKT (using GATs as a workaround for some cases).

Rust doesn't have HKT. Scala 3 does:

```scala
// Scala 3: F[_] is a type constructor (kind: Type -> Type)
trait Functor[F[_]]:
  def map[A, B](fa: F[A])(f: A => B): F[B]

// Implement for List
given Functor[List] with
  def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)

// Implement for Option
given Functor[Option] with
  def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)

// Now we can write generic code over ANY functor
def double[F[_]: Functor](fa: F[Int]): F[Int] =
  summon[Functor[F]].map(fa)(_ * 2)

double(List(1, 2, 3))     // List(2, 4, 6)
double(Option(5))         // Some(10)
double(Option.empty[Int]) // None

// Monad builds on Functor
trait Monad[M[_]] extends Functor[M]:
  def pure[A](a: A): M[A]
  def flatMap[A, B](ma: M[A])(f: A => M[B]): M[B]

  // map can be derived from flatMap
  def map[A, B](fa: M[A])(f: A => B): M[B] =
    flatMap(fa)(a => pure(f(a)))
```

HKT is standard in Haskell, Scala, and PureScript.
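Rust can recover part of this with generic associated types. A minimal sketch, assuming a hypothetical `Container` trait (`Mapped` and `map_items` are illustrative names, not a standard library API):

```rust
// Hypothetical HKT-flavored trait: the associated type `Mapped<U>` takes
// its own type parameter, which is exactly what GATs allow.
trait Container {
    type Item;
    type Mapped<U>: Container<Item = U>;
    fn map_items<U, F: FnMut(Self::Item) -> U>(self, f: F) -> Self::Mapped<U>;
}

impl<T> Container for Vec<T> {
    type Item = T;
    type Mapped<U> = Vec<U>;
    fn map_items<U, F: FnMut(T) -> U>(self, f: F) -> Vec<U> {
        self.into_iter().map(f).collect()
    }
}

impl<T> Container for Option<T> {
    type Item = T;
    type Mapped<U> = Option<U>;
    fn map_items<U, F: FnMut(T) -> U>(self, f: F) -> Option<U> {
        self.map(f)
    }
}

// One generic function over both containers, roughly what
// `double[F[_]: Functor]` does in the Scala example.
fn double<C: Container<Item = i32>>(c: C) -> C::Mapped<i32> {
    c.map_items(|x| x * 2)
}
```

This covers the `Functor`-style use, but it is not full HKT: `double` names the concrete `Mapped` family rather than abstracting over an arbitrary `F[_]`.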
Rust avoids full HKT but added GATs (Generic Associated Types) as a partial workaround. If your language doesn't support HKT, don't fight it. Three similar functions are fine if they're short.

---

## Linear and Affine Types

Resources must be managed: files closed, memory freed, locks released. Forget to close a file? Leak. Close it twice? Crash. Use it after closing? Undefined behavior.

I didn't understand why affine types mattered until I spent three days debugging a double-free in a C++ codebase. The ownership was "obvious" to whoever wrote it—six months earlier. Rust would have rejected the code instantly.

Garbage collectors handle memory but not files, sockets, or locks. Manual management is error-prone—Microsoft reports that 70% of their security vulnerabilities are memory safety issues, and use-after-free remains a top exploit vector. Can the type system track resource usage?

Most type systems only track *what* a value is. Linear types also track *how many times* it's used. This is the **substructural** family, named because they restrict the structural rules of logic (weakening, contraction, exchange):

| Type | Rule | Structural Rule Restricted | Use Case |
|------|------|---------------------------|----------|
| Unrestricted | Any number of times | None | Normal values |
| Affine | At most once | Contraction (no duplication) | Rust ownership, can drop unused |
| Linear | Exactly once | Contraction + Weakening | Must handle, can't forget |
| Relevant | At least once | Weakening (no discarding) | Must use, can duplicate |
| Ordered | Exactly once, in order | Contraction + Weakening + Exchange | Stack disciplines |

**Ordered types** are the most restrictive: values must be used exactly once and in LIFO order. They model stack-based resources where you can't reorder operations.

Rust uses **affine types**: values are used at most once (moved), but you can drop them without using them. True **linear types** require using values exactly once.
You can't forget to handle something.

"Use" includes transferring ownership. When you pass a `String` to a function that takes it by value, you've "used" the String. It's gone from your scope. You can't use it again. The borrow checker tracks ownership and prevents use-after-move.

Borrowing (`&T` and `&mut T`) is how Rust escapes the "use once" restriction when you need it. A borrow doesn't consume the value; it temporarily lends access. The original owner keeps ownership and can use the value after the borrow ends. The borrow checker ensures borrows don't outlive the owner.

- **Memory safety without GC**: no runtime overhead, no pauses
- **Resource safety**: can't forget to close files
- **Prevent use-after-free**: type system rejects it
- **No data races**: ownership prevents shared mutable state

```rust
// Rust: Affine types (values used at most once)
fn consume(s: String) {
    println!("{}", s);
} // s dropped here

fn main() {
    let s = String::from("hello");
    consume(s); // s moved into consume
    // println!("{}", s); // ERROR: value borrowed after move
}

// File handles: RAII through ownership
use std::fs::File;
use std::io::Read;

fn read_file() -> std::io::Result<String> {
    let mut file = File::open("data.txt")?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(contents)
} // file automatically closed here (Drop trait)

// Can't use file after it's moved/dropped
// Can't forget to close (happens automatically)
// Can't close twice (Drop runs exactly once)
```

```rust
// Borrowing: temporarily use without consuming
fn print_length(s: &String) { // borrows s
    println!("Length: {}", s.len());
} // borrow ends, s still valid

fn main() {
    let s = String::from("hello");
    print_length(&s); // lend s
    print_length(&s); // can lend again
    println!("{}", s); // s still valid
}
```

```rust
// Mutable borrows: exclusive access
fn append_world(s: &mut String) {
    s.push_str(" world");
}

fn main() {
    let mut s = String::from("hello");
    append_world(&mut s); // Only ONE mutable borrow at a time (prevents data races)
}
```

The borrow checker takes practice. Some patterns (graphs, doubly-linked lists) fight against it. But once you internalize ownership thinking, most code just works.

### The Broader Family: Ownership, Regions, and Capabilities

Linear/affine types are part of a broader family of resource-tracking type systems:

| System | What It Tracks | Example |
|--------|----------------|---------|
| **Linear/Affine** | Usage count (exactly/at most once) | Move semantics |
| **Ownership** | Who owns a value | Rust's ownership model |
| **Region/Lifetime** | How long a reference is valid | Rust lifetimes (`'a`) |
| **Capability** | What permissions a value grants | Object-capability languages |

**Ownership types** make the owner explicit in the type. Rust combines ownership with affine types: the owner is responsible for cleanup, and ownership can transfer exactly once. This is more than tracking usage; it's tracking *responsibility*.

**Region types** (or **lifetime types**) track the *scope* where a reference is valid. Rust's lifetime annotations (`&'a T`) are region types: they prove references don't outlive the data they point to.

```rust
// Rust: lifetimes are region types
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
// The 'a says: the returned reference is valid as long as
// BOTH input references are valid. The compiler checks this.
```

**Capability types** encode *permissions*, not just structure. A `ReadCapability` lets you read, while a `WriteCapability` lets you write. The type system ensures you can only perform operations you have capabilities for. This is object-capability security expressed in types.

These ideas originated in research (region inference in MLKit, the capability calculus, Cyclone's safe C) but reached the mainstream through Rust. Languages like Vale and Austral explore different points in this design space.

---

## Effect Systems

Does this function do I/O? Throw exceptions?
Modify global state? In most languages, you can't tell from the signature. A function that *looks* pure might read from the network, crash your program, or modify a global variable.

```java
// What does this do? You have to read the implementation.
String process(String input)
```

Track what **effects** a function can perform in its type. Pure functions have no effects. `readFile` has an `IO` effect. `throw` has an `Exception` effect.

A function `String -> Int` with no effects can only compute on its input. A function `String -> IO Int` might read files, hit the network, or launch missiles. Effects propagate: call `readFile` inside your function, and your function now has `IO` too. The compiler tracks this automatically.

Some systems also provide **effect handlers**: intercept an effect and provide custom behavior. Instead of performing I/O, you could log what I/O *would* happen. Instead of throwing an exception, you could collect errors.

This is like dependency injection, but for effects. You write code using abstract effects, then "handle" them differently in tests versus production.
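Languages without effect systems can approximate the test-versus-production idea with plain interfaces. A minimal Rust sketch, where the `Console` trait and both handlers are hypothetical names rather than a real effects library:

```rust
// A "console effect" as a trait: the program declares which operations it
// needs; handlers decide what those operations actually do.
trait Console {
    fn read_line(&mut self) -> String;
    fn print(&mut self, s: &str);
}

// Production handler: performs real I/O.
struct RealConsole;

impl Console for RealConsole {
    fn read_line(&mut self) -> String {
        let mut buf = String::new();
        std::io::stdin().read_line(&mut buf).expect("stdin");
        buf.trim().to_string()
    }
    fn print(&mut self, s: &str) {
        println!("{}", s);
    }
}

// Test handler: scripted input, captured output, no I/O at all.
struct FakeConsole {
    input: Vec<String>,
    output: Vec<String>,
}

impl Console for FakeConsole {
    fn read_line(&mut self) -> String {
        self.input.remove(0)
    }
    fn print(&mut self, s: &str) {
        self.output.push(s.to_string());
    }
}

// The program only knows it performs *some* console effect.
fn greet(console: &mut impl Console) {
    let name = console.read_line();
    console.print(&format!("hello, {}", name));
}
```

Unlike real effect handlers, this can't intercept control flow (there is no `resume`), and the effect appears as an argument rather than in the return type; tracking it in types is precisely what effect systems add.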
- **Effects visible in signatures**: see at a glance what a function can do
- **Purity is provable**: no-effect functions are guaranteed pure
- **Effect polymorphism**: generic over what effects are used
- **Effect handlers**: programmable control flow, algebraic effects

```koka
// Koka: Effects are part of the type

// Pure function: no effects
fun pureAdd(x: int, y: int): int
  x + y

// Function with IO and exception effects
fun readConfig(path: string): string
  val contents = read-text-file(path) // io effect
  if contents.is-empty then
    throw("Config file is empty")     // exn effect
  contents

// Effect polymorphism: map preserves whatever effects f has
fun map(xs: list<a>, f: (a) -> e b): e list<b>
  match xs
    Nil -> Nil
    Cons(x, rest) -> Cons(f(x), map(rest, f))

// If f is pure, map is pure
// If f has the io effect, map has the io effect
```

```koka
// Effect handlers: provide custom interpretations of effects
effect ask<a>
  ctl ask(): a

fun program(): ask<int> int
  val x = ask()
  val y = ask()
  x + y

// Handle by providing values
fun main(): io ()
  // Handle 'ask' by returning 10 each time
  with handler
    ctl ask() resume(10)
  val result = program() // 20
  println(result.show)
```

Effect systems are in Koka, Eff, Frank, and Unison. Haskell uses monads as a workaround. Most mainstream languages don't have them, so you can use discipline instead: pure functions in the core, effects at the edges.

---

## Refinement Types

Your function divides two numbers. The divisor can't be zero. You add a runtime check:

```rust
fn divide(x: i32, y: i32) -> i32 {
    if y == 0 {
        panic!("division by zero");
    }
    x / y
}
```

But the caller might *know* y is non-zero because it's from a non-empty list length. You're checking unnecessarily. And what if you forget the check somewhere?

Attach logical predicates to types. Instead of `Int`, write $\{x : Int \mid x > 0\}$. A refinement type is a base type plus a predicate that values must satisfy.

This is a sweet spot between regular types and full dependent types.
Regular types distinguish "integer" from "string" but can't distinguish "positive integer" from "negative integer." Dependent types can express almost anything but require proofs. Refinement types let you express common properties (non-null, positive, in bounds) and use automated solvers to verify them.

The compiler uses an **SMT solver** (Satisfiability Modulo Theories) to verify predicates at compile time. SMT solvers are automated theorem provers that can handle arithmetic, bit vectors, arrays, and more. When you write `divide(x, y)` where `y` must be positive, the solver checks whether `y > 0` is provable from the context. If `y` came from the length of a non-empty list, the solver can prove this automatically.

Division by zero becomes a *type error*, caught before running. Buffer overflows too. Array indexing out of bounds. Integer overflow. These become compile-time checks when you add the right refinements.

- **Prove properties at compile time**: non-zero, positive, in bounds
- **Eliminate runtime checks**: when the compiler can prove safety
- **Catch errors earlier**: type checker finds the bug, not production
- **Lightweight verification**: more than types, less than full proofs

```fstar
// F*: Refinement types with dependent types

// Natural numbers: ints >= 0
type nat = x:int{x >= 0}

// Positive numbers: ints > 0
type pos = x:int{x > 0}

// Division requires a positive divisor (not just non-zero!)
val divide : int -> pos -> int
let divide x y = x / y

// This compiles: 5 is provably positive
let result = divide 10 5

// This FAILS at compile time:
// let bad = divide 10 0
// Error: expected pos, got int literal 0

// This also fails without more info:
// let risky (y: int) = divide 10 y
// Error: can't prove y > 0
```

```fstar
// Vectors with length in the type (simple dependent types)
val head : #a:Type -> l:list a{length l > 0} -> a
let head #a l = List.hd l

// This compiles:
let first = head [1; 2; 3]

// This fails:
// let bad = head []
// Error: can't prove length [] > 0

// Safe indexing: the index must be less than the length
val nth : #a:Type -> l:list a -> i:nat{i < length l} -> a
// The refinement i < length l guarantees bounds safety
```

The SMT solver can fail or time out on complex predicates. When it works, it's like magic. When it doesn't, you're debugging why the solver can't prove something you know is true.

F*, Dafny, Liquid Haskell, and Ada/SPARK all use this approach.

### When Refinement Types Aren't Enough

Refinement types work well for predicates on values: $\{x : Int \mid x > 0\}$, bounds checks, non-nullity, arithmetic constraints. SMT solvers handle these automatically. But they hit walls.

The first wall is type-level computation. You want `printf "%d + %d = %d"` to have type `Int -> Int -> Int -> String`. The format string determines the type. This isn't a predicate on a value—it's computing a type from a value. Refinement types can't express this.

The second wall is state. Session types need types that change based on what operations you've performed. Refinement types constrain values but can't express "after calling `open()`, the handle is in state Open." For this you need dependent types or linear types.

The third wall is induction. SMT solvers are decision procedures for specific theories—linear arithmetic, bit vectors, arrays. They don't do induction.
Refinement types can say "this list has length > 0" but struggle with "this vector has length n + m." You can write $\{v : Vec \mid len(v) = len(a) + len(b)\}$, and for simple cases SMT solvers can verify it. But proving it across recursive calls—showing each step preserves the invariant—requires induction the solver can't do.

F* sits on the boundary—it has both refinement types (SMT-backed) and full dependent types (proof-backed). You start with refinements and escalate to manual proofs when the solver fails. This is a reasonable mental model: refinement types are dependent types where an automated prover handles the easy cases.

If an SMT solver can verify your property in a few seconds, refinement types work. If you need type computation, state tracking, or induction, you've crossed into dependent type territory.

---

# Tier 4: Research Level

These concepts are primarily found in research languages and proof assistants. They provide the strongest guarantees but require significant expertise. Understanding them helps even if you never use them directly.

## Dependent Types

You want a function that appends two vectors. The result should have length `n + m`. With regular types, you can express "returns a vector" but not "returns a vector whose length is the sum of the inputs."

```rust
// Regular types: can't express the length relationship
fn append<T>(a: Vec<T>, b: Vec<T>) -> Vec<T>
```

Refinement types help with predicates, but what if types could *compute*?

Types can depend on values. `Vector<3, Int>` (a vector of 3 integers) is a different type than `Vector<5, Int>`. These aren't the same type with the length checked at runtime. They're *different types*. A function expecting a 3-element vector won't accept a 5-element vector, just like a function expecting a String won't accept an Int.

Function types can express relationships between inputs and outputs:

```
append : Vector<n, a> -> Vector<m, a> -> Vector<n + m, a>
```

The return type *computes* from the input types.
If you append a 3-element vector to a 5-element vector, you get an 8-element vector. The `n + m` is evaluated at the type level. Types and terms live in the same world.

This is the Curry-Howard correspondence in full force. Types are propositions. Programs are proofs. `Vector<n, a>` is a proposition: "there exists a vector of n elements of type a." Constructing such a vector proves the proposition. A function type `Vector<n, a> -> Vector<n, a>` is an implication: "if you give me a proof of n-vector, I'll give you back a proof of n-vector."

The payoff: matrix multiplication that's dimensionally checked at compile time. $Matrix\langle n, m \rangle \times Matrix\langle m, p \rangle \to Matrix\langle n, p \rangle$. If the dimensions don't match, the code doesn't compile.

- **Type checking requires evaluation**: undecidable in general
- **Termination checking required**: non-terminating functions break type checking
- **Proving is different from programming**: you need to think about why code is correct, not just that it works
- **Verbose proofs**: sometimes more proof code than actual code

```idris
-- Idris 2: Dependent types

-- Vector indexed by its length
data Vect : Nat -> Type -> Type where
  Nil  : Vect 0 a
  (::) : a -> Vect n a -> Vect (S n) a

-- head: ONLY works on non-empty vectors
-- Not a runtime check. The TYPE prevents calling on empty.
head : Vect (S n) a -> a
head (x :: xs) = x
-- No case for Nil needed! Vect (S n) can't be Nil.
-- The S n pattern means "at least 1"

-- append: the type PROVES lengths add
append : Vect n a -> Vect m a -> Vect (n + m) a
append Nil       ys = ys
append (x :: xs) ys = x :: append xs ys

-- Type-safe matrix multiplication
Matrix : Nat -> Nat -> Type -> Type
Matrix rows cols a = Vect rows (Vect cols a)

-- Dimensions must match, checked at COMPILE TIME
matMul : Num a => Matrix n m a -> Matrix m p a -> Matrix n p a

-- This won't compile:
-- matMul (2x3 matrix) (5x2 matrix)
-- Error: expected Matrix 3 p, got Matrix 5 2
```

```idris
-- Type-safe printf!
-- The format string determines the function's type
printf : (fmt : String) -> PrintfType fmt

-- printf "%s is %d years old"
--   has type: String -> Int -> String
-- printf "%d + %d = %d"
--   has type: Int -> Int -> Int -> String

-- Wrong number/type of arguments = compile error
```

Dependent types are in Idris 2, Agda, Coq, Lean 4, and F*. For most application code, they're overkill. [Refinement types](#refinement-types) or [phantom types](#phantom-types) often suffice.

---

## Communication and Protocol Typing

Concurrency introduces problems that go beyond sequential code. Functions have types, but what about *interactions*? Type systems for communication ensure that distributed components agree on protocols, preventing deadlocks and message mismatches at compile time.

### Why Concurrency Needs Types Beyond Functions

In sequential code, a function type `A -> B` tells you everything: give an `A`, get a `B`. But concurrent systems have:

- **Ordering constraints**: Must send a request before receiving a response
- **Protocol states**: What you can do depends on what happened before
- **Multiple parties**: Client, server, and maybe others must agree
- **Failure modes**: Deadlock, livelock, message type mismatch

Regular function types can't express "after you send X, you must receive Y before sending Z." Protocol violations compile fine but fail at runtime.

### Session Types

**Session types** encode communication protocols in channel types. The channel's type *changes* as you use it, tracking protocol state.

Distributed systems communicate over channels. The client sends a `Request`, the server responds with a `Response`. But what if the client sends two requests without waiting? Or expects a response that never comes? Protocol violations cause deadlocks or silent failures, discovered only in production.

Session types fix this by making channels typed state machines. Start with `!Request.?Response.End`. After sending a request, you have `?Response.End`.
After receiving the response, you have `End`. Each operation transforms the type. Using the wrong operation is a type error.

Key concept: **duality**. The client's view is the *dual* of the server's view: sends become receives and vice versa. If the client has `!Request.?Response.End`, the server has `?Request.!Response.End`. The types are symmetric. This ensures both sides agree on the protocol, verified at compile time. Well-typed programs can't deadlock.

```
// Session types: Types encode protocols
// Notation:
//   !T  = send value of type T
//   ?T  = receive value of type T
//   .   = sequencing
//   End = session finished

// Client's protocol view
type BuyerProtocol =
  !String.  // send book title
  ?Price.   // receive price
  !Bool.    // send accept/reject
  End

// Server's view: the DUAL (swap ! and ?)
type SellerProtocol =
  ?String.  // receive title
  !Price.   // send price
  ?Bool.    // receive decision
  End

// Implementation (pseudocode)
buyer(channel: BuyerProtocol) {
  send(channel, "Types and Programming Languages");
  // channel now has type ?Price.!Bool.End
  let price = receive(channel);
  // channel now has type !Bool.End
  send(channel, price < 100);
  // channel now has type End
  close(channel);
}

// Multiparty session: Three-way protocol
global protocol Purchase(Buyer, Seller, Shipper) {
  item(String) from Buyer to Seller;
  price(Int) from Seller to Buyer;
  choice at Buyer {
    accept:
      payment(Int) from Buyer to Seller;
      address(String) from Buyer to Shipper;
      delivery(Date) from Shipper to Buyer;
    reject:
      cancel() from Buyer to Seller;
      cancel() from Buyer to Shipper;
  }
}
```

Session types are mostly in research: Links, Scribble, and various academic implementations. Few production systems use them directly, but the ideas influence API design.

### Actor Message Typing

**Actor systems** (Erlang, Akka, Orleans) use message passing instead of shared memory. Each actor has a mailbox and processes messages sequentially. But what messages can an actor receive?
Without typing, any message can be sent to any actor. Typos in message names, wrong payload types, or protocol violations surface only at runtime. **Typed actors** constrain what messages an actor can receive: ```scala // Akka Typed: Actor's message type is explicit object Counter { sealed trait Command case class Increment(replyTo: ActorRef[Int]) extends Command case class GetValue(replyTo: ActorRef[Int]) extends Command } // The actor can ONLY receive Counter.Command messages def counter(value: Int): Behavior[Counter.Command] = Behaviors.receive { (context, message) => message match { case Increment(replyTo) => replyTo ! (value + 1) counter(value + 1) case GetValue(replyTo) => replyTo ! value Behaviors.same } } // Sending wrong message type = compile error // counterRef ! "hello" // ERROR: String is not Counter.Command ``` ```erlang %% Erlang: Dialyzer can check message types via specs -spec loop(state()) -> no_return(). loop(State) -> receive {increment, From} -> From ! {ok, State + 1}, loop(State + 1); {get, From} -> From ! {ok, State}, loop(State) end. ``` ### Comparing Approaches | Approach | What's Typed | Guarantees | Examples | |----------|-------------|------------|----------| | **Untyped channels** | Nothing | None | Raw sockets, most languages | | **Typed messages** | Message payload types | No wrong payloads | Go channels, Rust mpsc | | **Actor behavior types** | What actor accepts | No invalid messages | Akka Typed, Pony | | **Session types** | Protocol state machine | No protocol violations | Links, research | | **Multiparty session** | N-party protocols | Global protocol safety | Scribble, research | ### Practical Adoption Rust's `Send` and `Sync` traits are a lightweight form of concurrency typing: they mark which types can safely cross thread boundaries. This isn't protocol typing, but it prevents data races at compile time. Go's typed channels (`chan int`, `chan Message`) ensure payload types match but don't track protocol state. 
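The gap between payload typing and protocol typing can be sketched in TypeScript. This is an illustrative toy (the `Channel` class and `Request` shape are invented here, not a real library): wrong payloads are rejected statically, but nothing tracks the *order* of operations.

```typescript
// A minimal typed channel: the payload type is checked,
// but the protocol (ordering of sends/receives) is not.
type Request = { op: "get"; key: string };

class Channel<T> {
  private queue: T[] = [];
  send(msg: T): void { this.queue.push(msg); }
  receive(): T | undefined { return this.queue.shift(); }
}

const requests = new Channel<Request>();

// Payload typing works: a wrong payload is a compile error.
requests.send({ op: "get", key: "user:42" });
// requests.send("hello");  // ERROR: string is not Request

// But nothing stops protocol violations: sending twice in a row,
// or receiving before a reply is due, still type-checks fine.
requests.send({ op: "get", key: "user:43" });
console.log(requests.receive()?.key); // "user:42"
```

This is exactly the level Go's `chan Message` and Rust's `mpsc::Sender<T>` operate at; a session-typed channel would additionally change its type after each `send` or `receive`.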
Full session types remain mostly academic, but the ideas are seeping into practice. TypeScript's discriminated unions with exhaustive matching approximate protocol states. Rust's typestate pattern uses the type system to enforce valid sequences of operations.

---

## Quantitative Type Theory (QTT)

Linear types track usage (use exactly once). Dependent types need to inspect values at the type level. But inspecting a value for typing shouldn't count as "using" it at runtime!

```idris
-- We want the length n to be:
-- - Available at compile time (for type checking)
-- - Erased at runtime (zero cost)
data Vect : Nat -> Type -> Type
```

How do you combine linear/affine types with dependent types cleanly? Annotate each variable with a **quantity** from a semiring:

- **0**: compile-time only (erased at runtime)
- **1**: exactly once (linear)
- **ω**: unlimited

The key problem this solves: in dependent types, type-checking might *use* a value to determine a type, but that "use" shouldn't count at runtime. The length `n` in `Vect n a` is used at the type level to ensure vectors have the right size. But at runtime, you don't want to pass `n` around. It should be erased.

With QTT, you write `(0 n : Nat)` to say "n exists for type-checking but has zero runtime representation." The `0` quantity means "used zero times at runtime." The type checker uses it. The compiled code doesn't include it.

This also cleanly handles linear resources. A file handle has quantity 1: use it exactly once. A normal integer has quantity ω: use it as many times as you want. The quantities form a semiring, which makes them compose correctly when you combine functions.

```idris
-- Idris 2 uses QTT natively
-- The 'n' has quantity 0: erased at runtime!
data Vect : (0 n : Nat) -> Type -> Type where Nil : Vect 0 a (::) : a -> Vect n a -> Vect (S n) a -- n is available for type checking but has zero runtime cost -- Linear function: use x exactly once dup : (1 x : a) -> (a, a) -- ERROR: can't use x twice! -- Valid linear function consume : (1 x : File) -> IO () -- Unrestricted normal : (x : Int) -> Int normal x = x + x -- Fine, x is unrestricted (quantity ω) -- Mixing: erased type, linear value id : (0 a : Type) -> (1 x : a) -> a id _ x = x -- a exists only at compile time -- x is used exactly once at runtime ``` Idris 2 uses QTT. Granule is a research language exploring graded types more generally. --- ## Cubical Type Theory Homotopy Type Theory (HoTT) introduced revolutionary ideas: types as spaces, equality as paths. The **univalence axiom** says equivalent types are equal. But it was just an axiom that didn't compute. Asking "are these two proofs of equality the same?" got no answer. Make equality *computational*. In standard type theory, you can prove two things are equal, but you can't always *compute* with that equality. Univalence (equivalent types are equal) was an axiom: you could assert it, but it didn't reduce to anything. Asking "is this proof of equality the same as that one?" might not give an answer. Cubical type theory fixes this by taking homotopy seriously. A proof of equality `a = b` is literally a path from `a` to `b`. Formally, it's a function from the interval type `I` (representing [0,1]) to the type, where the function maps 0 to `a` and 1 to `b`. You can walk along the path. You can reverse it (symmetry). You can concatenate paths (transitivity). This geometric intuition makes equality computational. Univalence becomes a theorem: given an equivalence between types, you can construct a path between them. And crucially, transporting values along this path actually *applies* the equivalence. Everything reduces. Everything computes. 
You also get functional extensionality (functions equal if they agree on all inputs) and higher inductive types (quotients, circles, spheres as types) for free.

```agda
-- Cubical Agda
{-# OPTIONS --cubical #-}
open import Cubical.Core.Everything

-- I is the interval type: points from 0 to 1
-- A path from a to b is a function I → A
-- where i0 ↦ a and i1 ↦ b

-- Reflexivity: constant path
refl : ∀ {A : Type} {a : A} → a ≡ a
refl {a = a} = λ i → a   -- For all points, return a

-- Symmetry: reverse the path
sym : ∀ {A : Type} {a b : A} → a ≡ b → b ≡ a
sym p = λ i → p (~ i)    -- ~ negates interval points

-- Function extensionality: just works!
-- If f x ≡ g x for all x, then f ≡ g
funExt : ∀ {A B : Type} {f g : A → B}
       → (∀ x → f x ≡ g x) → f ≡ g
funExt p = λ i x → p x i

-- Univalence: equivalences give paths between types
ua : ∀ {A B : Type} → A ≃ B → A ≡ B

-- And this COMPUTES: transporting along ua
-- actually applies the equivalence!
```

Cubical Agda, redtt, cooltt, and Arend implement cubical type theory. Unless you're doing research in type theory or formalizing mathematics, you won't need this.

---

## Separation Logic Types

You're writing code with pointers. How do you know two pointers don't alias? That modifying `*x` won't affect `*y`? In C, you don't. It's undefined behavior waiting to happen.

```c
void swap(int *x, int *y) {
    *x = *x + *y;
    *y = *x - *y;
    *x = *x - *y;
}
// What if x == y? This breaks: both names refer to
// the same cell, and the value ends up as 0.
```

Reason about **ownership of heap regions**. The key operator is **separating conjunction** (`*`): `P * Q` means "P holds for some heap region, Q holds for a *separate* region." If you prove you own separate regions, they can't alias.

Classical logic has conjunction (∧): "P and Q are both true." Separation logic adds a new conjunction (*): "P holds for part of memory, Q holds for a *different* part of memory, and these parts don't overlap." This is the missing piece for reasoning about pointers.
When you write `{x ↦ 5 * y ↦ 10}`, you're asserting: x points to 5, y points to 10, *and x and y are different locations*. The separating conjunction makes non-aliasing explicit. Without it, modifying `*x` might affect `*y`. With it, you know they're independent. The **frame rule** makes proofs modular. If you prove `{P} code {Q}` (running code in state P yields state Q), then `{P * R} code {Q * R}` for any R. Whatever R describes is *framed out*, untouched by code. You can reason about each piece of memory independently. Rust's borrow checker embodies these ideas. Mutable borrows are exclusive ownership of a memory region. The guarantee that you can't have two `&mut` to the same location is the separating conjunction at work. Concurrent separation logic extends this to reason about shared-memory concurrency. ``` // Separation logic specifications (pseudocode) // Points-to assertion: x points to value v x ↦ v // Separating conjunction: DISJOINT ownership // x ↦ a * y ↦ b means x and y are different locations {x ↦ a * y ↦ b} // precondition: x points to a, y points to b, SEPARATELY swap(x, y) {x ↦ b * y ↦ a} // postcondition: values swapped // The * GUARANTEES x ≠ y // Without separation: swap(p, p) would break! // Frame rule: what you don't touch, stays the same // If: {P} code {Q} // Then: {P * R} code {Q * R} // R is "framed out", untouched by code // Linked list segment from head to tail lseg(head, tail) = (head = tail ∧ emp) // empty segment ∨ (∃v, next. head ↦ (v, next) * lseg(next, tail)) // node + rest ``` ```rust // Rust's borrow checker encodes similar ideas fn swap(x: &mut i32, y: &mut i32) { // Rust GUARANTEES x and y don't alias // Can't have two &mut to the same location! let tmp = *x; *x = *y; *y = tmp; } // This won't compile: // let mut n = 5; // swap(&mut n, &mut n); // Error: can't borrow n mutably twice ``` You get separation logic ideas implicitly through Rust's borrow checker. 
For explicit proofs, tools like Iris (Coq), Viper, and VeriFast let you verify pointer-manipulating code.

---

## Sized Types

Dependent type systems need to know all functions terminate. Otherwise type checking could loop forever. Typically they require **structural recursion**: arguments must get smaller in a syntactic sense. But this rejects valid programs:

```
merge : Stream → Stream → Stream
merge (x:xs) (y:ys) = x : y : merge xs ys
```

Streams are infinite, so there is no well-founded "structurally smaller" order for a syntactic check to grab onto.

Track *sizes* abstractly in types. A `Stream` has "size" `i`. Operations might not be syntactically smaller but are *semantically* smaller in size. The type checker tracks sizes symbolically.

The problem is termination checking. Dependent type checkers must ensure all functions terminate, otherwise type-checking could loop forever. Simple structural recursion ("the argument gets smaller") works for many cases but rejects valid programs.

Consider merging two streams. At each step, you take one element from each stream. There is no finite structure shrinking toward a base case. But semantically, you're making progress: you're consuming both streams.

Sized types capture this. Each stream has an abstract size. After taking an element, the remaining stream has a smaller size. The type checker sees sizes decreasing and accepts the function.

For coinductive data (infinite structures like streams), you need **productivity checking**: you must produce output in finite time. Sized types handle this too. The output stream's size depends on the input sizes in a way that guarantees you always make progress.
```agda
{-# OPTIONS --sized-types #-}
open import Size

-- Stream indexed by size
data Stream (i : Size) (A : Set) : Set where
  _∷_ : A → Thunk (Stream i) A → Stream (↑ i) A
  -- ↑ i means "larger than i"
  -- Thunk delays evaluation (coinduction)

-- take: consume part of a sized stream
take : ∀ {i A} → Nat → Stream i A → List A
take zero    _        = []
take (suc n) (x ∷ xs) = x ∷ take n (force xs)

-- map preserves size
map : ∀ {i A B} → (A → B) → Stream i A → Stream i B
map f (x ∷ xs) = f x ∷ λ where .force → map f (force xs)

-- zipWith: combine two streams pointwise
-- Both streams get "used", sizes track this correctly
zipWith : ∀ {i A B C} → (A → B → C)
        → Stream i A → Stream i B → Stream i C
zipWith f (x ∷ xs) (y ∷ ys) =
  f x y ∷ λ where .force → zipWith f (force xs) (force ys)

-- Without sized types, the termination checker might reject these
-- because it can't see that streams are being consumed productively
```

Agda supports sized types. They're useful when the termination checker is too strict, particularly for coinductive definitions.

---

## Pure Type Systems

There are many typed lambda calculi: simply typed, System F, System Fω, the Calculus of Constructions, Martin-Löf type theory. Each has its own rules for what can depend on what. Is there a unified framework?

**Pure Type Systems** (PTS) provide a single parameterized framework that encompasses most typed lambda calculi. A PTS is defined by three sets:

- **Sorts** ($\mathcal{S}$): The "types of types."
Typically $*$ (the type of ordinary types) and $\square$ (the type of $*$ itself)
- **Axioms** ($\mathcal{A}$): Which sorts have which sorts as their type (e.g., $* : \square$)
- **Rules** ($\mathcal{R}$): Triples $(s_1, s_2, s_3)$ specifying that functions from $s_1$ to $s_2$ live in $s_3$

By varying these parameters, you recover different type systems:

| System | Rules | What It Expresses |
|--------|-------|-------------------|
| Simply Typed $\lambda$-calculus | $(*, *, *)$ | Terms depending on terms |
| System F | $(*, *, *), (\square, *, *)$ | Terms depending on types (polymorphism) |
| System F$\omega$ | $(*, *, *), (\square, *, *), (\square, \square, \square)$ | Higher-kinded types |
| $\lambda P$ (LF) | $(*, *, *), (*, \square, \square)$ | Types depending on terms (dependent types) |
| Calculus of Constructions | All four rule combinations | Full dependent types + polymorphism |

The **Lambda Cube** visualizes this: three axes that extend the simply typed base with terms depending on types (polymorphism), types depending on types (type operators), and types depending on terms (dependency). Each corner is a different type system.

```
           λω ----------- λC (CoC)
          /  |           /  |
        λ2 ----------- λP2  |
    (System F)          |   |
         |   λω_ -------|-- λPω_
         |  /           |  /
        λ→ ----------- λP
    (Simply Typed)
```

Moving right adds types depending on terms, moving up adds polymorphism, moving back adds type operators; λω_ and λPω_ are the weak, operators-only variants.

### Why It Matters

PTS provides:

- **Unified theory**: Understand all these systems as instances of one framework
- **Metatheoretic results**: Prove properties (normalization, type preservation) once, apply everywhere
- **Design guidance**: When designing a type system, you're choosing a point in this space
- **Implementation reuse**: Type checkers can be parameterized by PTS specification

The Calculus of Constructions (top corner) is the basis for Coq. Martin-Löf Type Theory (related but distinct) underlies Agda. Understanding PTS clarifies what dependent types *are*: the ability to form types that depend on terms, placed on equal footing with other forms of abstraction.
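The cube's axes can be made concrete in a language that sits near the λC corner. A minimal Lean 4 sketch (the names `double`, `compose`, `Squared`, and `Vec` are illustrative, not standard library definitions):

```lean
-- λ→ base: terms depending on terms
def double (n : Nat) : Nat := n + n

-- System F axis: terms depending on types (polymorphism)
def compose {α β γ : Type} (g : β → γ) (f : α → β) : α → γ :=
  fun x => g (f x)

-- Fω axis: types depending on types (type operators)
def Squared (α : Type) : Type := α × α

-- λP axis: types depending on terms (the index n is a term)
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : {n : Nat} → α → Vec α n → Vec α (n + 1)
```

Each definition uses exactly one extra form of abstraction; a full dependently typed language simply permits all of them at once.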
### Connection to Practice

When you write `Vect n a` in a dependently typed language, you're using term-to-type dependency: the type `Vect n a` depends on the term `n`. This is the λP axis of the Lambda Cube. When you write `forall T. T -> T`, a term abstracts over a type: the System F corner.

Modern dependently typed languages live near the CoC corner, with various additions (universes, inductive types, effects) that go beyond the pure PTS framework but are still understood through it.

### Further Reading

- "Lambda Calculi with Types" by Henk Barendregt (the definitive reference)
- "Type Theory and Formal Proof" by Rob Nederpelt and Herman Geuvers

---

# Tier 5: Cutting Edge Research

These concepts are at the research frontier. They haven't reached mainstream languages yet, but they influence future designs. Brief coverage for completeness:

| Concept | What It Explores | Why It Matters |
|---------|------------------|----------------|
| **Graded Modal Types** | Unify effects + linearity in one framework | Single system for many features |
| **Call-by-Push-Value** | Unify call-by-name and call-by-value | Cleaner operational semantics |
| **Polarized Types** | Positive (data) vs. negative (codata) types | Better duality understanding |
| **Ornaments** | Systematically derive related types | Auto-generate `List` from `Nat` |
| **Type-Level Generic Programming** | Reflect on type structure | Auto-derive instances |
| **Logical Relations** | Prove program equivalence | Foundation for verification |
| **Realizability** | Extract programs from proofs | Programs from math automatically |
| **Observational Type Theory** | Equality without axioms | Computation + extensionality |
| **Two-Level Type Theory** | Separate meta from object level | Clean staging/metaprogramming |
| **Multimodal Type Theory** | Multiple modalities (necessity, etc.)
| Generalize many features |

### Graded Modal Types (Brief Example)

```haskell
-- Granule: grades unify linearity and effects
id : forall {a : Type} . a [1] -> a   -- use exactly once
id [x] = x

dup : forall {a : Type} . a [2] -> (a, a)   -- use exactly twice
dup [x] = (x, x)

-- Grades form a semiring, combining naturally
-- One system handles linearity, privacy, information flow...
```

---

# Practical Concepts

A few concepts that don't fit the tier structure but are practically important:

## Variance

When `JsonResponse <: HttpResponse`, what's the relationship between `List<JsonResponse>` and `List<HttpResponse>`? It depends on how the container uses its type parameter.

This question matters for every generic type. You might expect `List<JsonResponse>` to be a subtype of `List<HttpResponse>` always. But that's wrong in general, and understanding why is key to writing correct generic code.

The intuition: if you can only *read* from a container (produce), then `List<JsonResponse>` can substitute for `List<HttpResponse>`. You asked for HTTP responses, I give you JSON responses, JSON responses are HTTP responses, everyone's happy.

But if you can *write* to a container (consume), it's the reverse. A container that accepts any HTTP response can accept JSON responses. But a container that only accepts JSON responses can't substitute for one that accepts any HTTP response, because someone might try to put an XML response in it.

Mutable containers are the problem case. You can both read and write. Neither subtyping direction is safe. Java's decision to make arrays covariant was a mistake we're still paying for: you can put an Integer into a `Number[]` that's actually a `Double[]` at runtime, and it explodes with an `ArrayStoreException`.
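The same trap is easy to reproduce in TypeScript, whose mutable arrays are also covariant by design (the response classes below are illustrative):

```typescript
class HttpResponse { status = 200; }
class JsonResponse extends HttpResponse { json = "{}"; }
class XmlResponse extends HttpResponse { xml = "<r/>"; }

const jsons: JsonResponse[] = [new JsonResponse()];
const https: HttpResponse[] = jsons;  // allowed: arrays are treated covariantly

https.push(new XmlResponse());  // type-checks: XmlResponse <: HttpResponse

// But jsons and https alias the same array, so the
// "JsonResponse[]" now holds an XmlResponse at runtime.
const last = jsons[jsons.length - 1];
console.log(last instanceof XmlResponse); // true
```

Reading `last.json` would yield `undefined` even though the static type promises a string. Java at least throws `ArrayStoreException` at the offending write; TypeScript fails silently, which is one of its deliberate soundness trade-offs.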
```typescript
// TypeScript: variance annotations

// Covariant (out): ResponseSource<JsonResponse> <: ResponseSource<HttpResponse>
interface ResponseSource<out T> {
  fetch(): T;
}
// If it produces JsonResponses, it produces HttpResponses

// Contravariant (in): ResponseHandler<HttpResponse> <: ResponseHandler<JsonResponse>
interface ResponseHandler<in T> {
  handle(x: T): void;
}
// If it handles any HttpResponse, it can handle JsonResponses

// Invariant: no subtyping relationship
interface ResponseCache<in out T> {
  get(): T;           // covariant use
  store(x: T): void;  // contravariant use
}
// Both uses = invariant (no safe subtyping)
```

---

## Phantom Types

Type parameters that appear in the type but not in the data. Used for compile-time distinctions.

At first, this sounds pointless. Why have a type parameter that doesn't affect the data? The answer: to carry information at the type level that the compiler checks, even though the runtime doesn't need it.

Consider a `UserId` and a `ProductId`. Both are just integers at runtime. But mixing them up is a bug. With phantom types, `Id<User>` and `Id<Product>` are different types, even though both hold a single integer. The phantom parameter (`User` or `Product`) exists only for the type checker. Zero runtime cost. Full compile-time safety.

The Mars Climate Orbiter (1999) was lost because one team used metric units while another used imperial, and 327 million dollars burned up in the Martian atmosphere. Phantom types turn unit mismatches into compile errors: `Distance<Meters>` and `Distance<Feet>` can't be mixed.
```rust
use std::marker::PhantomData;

// Unit types (no data, just type-level tags)
struct Meters;
struct Feet;

// Distance carries a unit, but only at type level
struct Distance<Unit> {
    value: f64,
    _unit: PhantomData<Unit>, // zero runtime cost
}

impl<Unit> Distance<Unit> {
    fn new(value: f64) -> Self {
        Distance { value, _unit: PhantomData }
    }
}

// Can only add distances with the same unit
fn add<Unit>(a: Distance<Unit>, b: Distance<Unit>) -> Distance<Unit> {
    Distance::new(a.value + b.value)
}

fn main() {
    let meters: Distance<Meters> = Distance::new(100.0);
    let feet: Distance<Feet> = Distance::new(50.0);

    // add(meters, feet);  // ERROR: expected `Meters`, found `Feet`
    add(meters, Distance::new(50.0)); // OK: both Meters
    let _ = feet;
}
```

---

## Row Polymorphism

Functions that work on records with "at least these fields," preserving other fields. Regular generics abstract over types. Row polymorphism abstracts over *record structure*.

A function `getName` needs records with a `name` field. It shouldn't care about other fields. Row polymorphism lets you write this: "give me any record with at least a `name: String` field, and I'll return the name." Extra fields pass through unchanged.

If you have `{ name: "Ada", age: 36, title: "Countess" }` and call `getName`, you get "Ada" back. The function ignores `age` and `title`, but doesn't require you to strip them first.

More flexible than structural subtyping because it's parametric: works uniformly for any extra fields. This is common in functional languages with records (PureScript, Elm, OCaml) and solves the problem of writing functions that operate on "records with certain fields" without committing to a specific record type.

```purescript
-- PureScript: Row polymorphism

-- Works on ANY record with a name field
-- The | r means "and possibly other fields"
getName :: forall r. { name :: String | r } -> String
getName rec = rec.name

-- Preserves extra fields!
getName { name: "Ada", age: 36 }          -- "Ada"
getName { name: "Alan", email: "a@b.c" }  -- "Alan"

-- Can require multiple fields
greet :: forall r.
{ name :: String, title :: String | r } -> String greet rec = rec.title <> " " <> rec.name greet { name: "Lovelace", title: "Countess", birth: 1815 } -- "Countess Lovelace" -- The 'birth' field passes through, ignored but preserved ``` --- # Languages Compared Rather than ranking languages linearly, this section maps popular languages across the taxonomy axes. Real languages are bundles of trade-offs. ## Comparison Tables ### Core Type System | Language | Checking | Discipline | Polymorphism | |----------|----------|------------|--------------| | **Rust** | Static | Nominal | Parametric + traits | | **Haskell** | Static | Nominal | Parametric + typeclasses | | **OCaml** | Static | Nominal + structural | Parametric + modules | | **Scala** | Static | Nominal | Parametric + implicits | | **TypeScript** | Gradual | Structural | Parametric + unions | | **Python** | Dynamic | Nominal + protocols | Runtime ad-hoc | | **Java** | Static | Nominal | Parametric (erased) | | **C#** | Static | Nominal | Parametric | | **Go** | Static | Structural | Parametric + interfaces | | **C++** | Static | Nominal | Templates | | **Lean/Coq** | Static | Dependent | Full dependent | ### Advanced Features | Language | Inference | Linearity | Effects | Soundness | |----------|-----------|-----------|---------|-----------| | **Rust** | Bidirectional | Affine + lifetimes | Via types | Sound | | **Haskell** | HM extended | Optional linear | Monads | Mostly sound | | **OCaml** | HM | None | Algebraic | Sound | | **Scala** | Bidirectional | None | Library | Edges unsound | | **TypeScript** | Constraint | None | None | Unsound* | | **Python** | Minimal | None | None | Unsound | | **Java** | Local | None | None | Mostly sound | | **C#** | Local | None | None | Sound | | **Go** | Local | None | None | Sound | | **C++** | Minimal | Manual/move | None | Easy to break | | **Lean/Coq** | Bidirectional | None | Pure | Sound | *TypeScript is intentionally unsound for pragmatic reasons. 
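One concrete instance of that deliberate unsoundness: parameters of members declared with *method* syntax are checked bivariantly, even under `strictFunctionTypes`. The `Kennel` example below is illustrative:

```typescript
interface Animal { name: string }
interface Dog extends Animal { fetch(): void }

interface Kennel {
  admit(a: Animal): void; // method syntax: parameter checked bivariantly
}

// A Kennel that really only handles Dogs still type-checks
const dogKennel: Kennel = {
  admit(d: Dog) { d.fetch(); },
};

const cat: Animal = { name: "Tom" };

let crashed = false;
try {
  dogKennel.admit(cat); // compiles fine...
} catch {
  crashed = true;       // ...but cat.fetch is not a function
}
console.log(crashed); // true
```

Had `admit` been declared with property syntax (`admit: (a: Animal) => void`), `strictFunctionTypes` would reject the assignment; the method-syntax loophole is kept so that common covariant patterns, such as passing an `Array<Dog>` where an `Array<Animal>` is expected, keep working.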
## Language Profiles ### Rust Ownership and affine typing for systems safety. Rust's type system is built around *resource management*. Affine types (values used at most once) combine with the borrow checker to eliminate use-after-free, data races, and resource leaks at compile time. Lifetimes are region types that prove references don't outlive their referents. Trade-offs: No garbage collector means some patterns (cyclic structures) require workarounds. Expect to fight the borrow checker for a few weeks before it clicks. But for systems code, the safety guarantees are unmatched outside research languages. Best for: Systems programming, performance-critical applications, anywhere memory safety matters. --- ### Haskell Parametric polymorphism plus effect encoding. Haskell pioneered typeclasses (ad-hoc polymorphism without inheritance) and proved that effect tracking via monads works at scale. The type system supports higher-kinded types, GADTs, type families, and with extensions, approaches dependent types. Trade-offs: Complexity accumulates. Extensions interact in surprising ways. Lazy evaluation complicates reasoning about performance. Productive Haskell requires internalizing concepts that don't transfer from imperative languages. Best for: Compilers, financial systems, anywhere correctness matters more than onboarding speed. --- ### OCaml Pragmatic functional programming with sound foundations. OCaml keeps Hindley-Milner inference simple while adding modules with structural typing. The module system enables abstraction and separate compilation. Recently added algebraic effects bring first-class effect handling. Trade-offs: Less expressive than Haskell, fewer libraries than mainstream languages. But the simplicity is intentional: the type system stays predictable. Best for: Compilers (including Rust's original), theorem provers, DSL implementation. --- ### Scala Maximum expressiveness on the JVM. 
Scala pushes the boundaries of what's expressible in a statically typed language: path-dependent types, implicits for type-level computation, union and intersection types. Scala 3 cleans up the syntax while adding match types and explicit term inference. Trade-offs: The expressiveness creates complexity. Compile times suffer. Some corners are unsound. The type system can be "too powerful" for teams that don't need it. Best for: Complex domain modeling, big data (Spark), anywhere you need JVM compatibility with advanced types. --- ### TypeScript Structural gradual typing with strong flow sensitivity. TypeScript chose structural typing to model JavaScript's duck typing, and gradual typing to enable incremental adoption. Its flow-sensitive type narrowing is among the best: the type of a variable changes based on control flow. Union types and discriminated unions bring algebraic data types to JavaScript. Trade-offs: Intentionally unsound in several places (bivariant function parameters, type assertions). The goal is usability and tooling, not proofs. `any` is always an escape hatch. Best for: Large JavaScript codebases, teams migrating from untyped to typed, frontend development. --- ### Python Runtime flexibility with optional static hints. Python's type system is bolted on: the runtime ignores type hints entirely. Tools like mypy and pyright check them statically. This enables gradual adoption but means types are advisory, not enforced. Trade-offs: No runtime guarantees. Type coverage varies across the ecosystem. But the flexibility is intentional: Python prioritizes "getting things done" over proving correctness. Best for: Scripting, data science, rapid prototyping, anywhere development speed trumps runtime safety. --- ### Java Nominal enterprise typing with conservative evolution. Java's generics use type erasure for backward compatibility, limiting what's expressible. The type system is nominal: explicit declarations define relationships. 
Evolution is slow and deliberate. Trade-offs: Verbose. Limited inference. No value types (until Valhalla). But stability and backward compatibility matter for enterprise software. Code written in 2004 still compiles. Best for: Enterprise systems, Android development, anywhere long-term stability matters. --- ### C# Pragmatic nominal typing with steady evolution. C# evolves faster than Java, adding features like nullable reference types (flow-sensitive null tracking), pattern matching, and records. The type system is nominal but increasingly expressive. Trade-offs: Windows-centric history (though .NET Core is cross-platform). Less expressive than Scala or Haskell. But the evolution is pragmatic: features that work in enterprise settings. Best for: Windows development, game development (Unity), enterprise .NET systems. --- ### Go Structural minimalism. Go deliberately limits the type system. Interfaces are structural (implement by having the methods), generics were added reluctantly. The philosophy: simple tools for simple problems. Trade-offs: Lack of expressiveness means repetitive code. No sum types means error handling via multiple returns. But the simplicity aids onboarding and tooling. Best for: Cloud infrastructure, CLI tools, services where simplicity aids maintenance. --- ### C++ Unchecked power. C++ templates are Turing-complete, enabling extreme metaprogramming. Move semantics approximate affine types but aren't enforced. The type system can express almost anything but guarantees almost nothing. Trade-offs: Easy to write undefined behavior. Compile errors are notorious. But when you need zero-overhead abstraction with full control, nothing else competes. Best for: Game engines, embedded systems, performance-critical code where control matters more than safety. --- ### Lean and Coq Types are proofs. These are proof assistants first, programming languages second. 
Full dependent types mean types can express any mathematical proposition, and programs are proofs of those propositions. Type checking is theorem proving. Trade-offs: Writing proofs is hard. Libraries are limited. But for verified software (CompCert, seL4), they're the gold standard. Best for: Formal verification, mathematics formalization, critical systems requiring proofs. --- ## One-Sentence Summaries | Language | Core Type System Identity | |----------|---------------------------| | Rust | Ownership and affine typing for memory safety | | Haskell | Parametric polymorphism plus monadic effects | | OCaml | Sound HM inference with structural modules | | Scala | Maximum expressiveness on the JVM | | TypeScript | Structural gradual typing with flow sensitivity | | Python | Runtime flexibility with optional static hints | | Java | Conservative nominal enterprise typing | | C# | Pragmatic nominal typing with steady evolution | | Go | Structural minimalism by design | | C++ | Unchecked power via templates | | Lean/Coq | Dependent types where programs are proofs | --- # Synthesis ## What Makes Type Systems Hard ### Decidability The more expressive, the harder to check automatically: | Feature | Type Checking | |---------|---------------| | Simply typed | Decidable, linear time | | Hindley-Milner | Decidable, exponential worst case | | System F (rank-N) | Checking decidable, *inference* undecidable | | Dependent types | Undecidable in general (needs termination checking) | ### Inference How much can the compiler figure out without annotations? | Feature | Inference | |---------|-----------| | Local types | Full | | Generics (HM) | Full | | GADTs | Partial (needs annotations at GADT matches) | | Higher-rank | None (requires explicit foralls) | | Dependent | Almost none (proving needs guidance) | ### Type Equality When are two types "the same"? 
| System | Equality | |--------|----------| | Simple | Syntactic: `Int = Int` | | With aliases | Structural: `type Age = Int`, then `Age = Int` | | Dependent | Computational: must evaluate to compare | | HoTT | Homotopical: paths between types | ### Feature Interaction Features often compose poorly: - **Subtyping + inference**: makes inference much harder - **Dependent types + effects**: need special care (effects in types) - **Linear types + higher-order functions**: subtle ownership tracking - **GADTs + type families**: can make inference unpredictable --- ## Practical Evidence: Do Types Actually Help? Anecdotes claim types catch bugs. But what does the evidence say? ### Empirical Studies | Study | Finding | |-------|---------| | **Hanenberg et al. (2014)** | Static types improved development time for larger tasks but not small ones | | **Mayer et al. (2012)** | Type annotations aided code comprehension, especially for unfamiliar code | | **Gao et al. (2017)** | ~15% of JavaScript bugs in studied projects would have been caught by TypeScript/Flow | | **Ray et al. 
(2014)** | Languages with stronger type systems correlated with fewer bug-fix commits (GitHub study of 729 projects) | | **Microsoft (2019)** | 70% of security vulnerabilities in their C/C++ code were memory safety issues (addressable by Rust-style types) | The evidence is **mixed but generally positive**: - Types help most for **larger codebases** and **unfamiliar code** - Types help less for **small scripts** where overhead exceeds benefit - **Memory safety types** (Rust) show clearest wins for security-critical code - **Gradual adoption** (TypeScript) shows measurable bug reduction even with partial coverage ### Tooling Impact Type systems enable tooling that untyped languages can't match: | Capability | Enabled By | Example | |------------|-----------|---------| | **Accurate autocomplete** | Type information | IDE knows methods on a variable | | **Safe refactoring** | Type checking | Rename symbol across codebase | | **Go to definition** | Type resolution | Jump to actual implementation | | **Inline documentation** | Type signatures | See parameter/return types | | **Dead code detection** | Exhaustiveness | Unreachable branches flagged | | **Compile-time errors** | Type checking | Catch mistakes before running | Languages like TypeScript transformed JavaScript development primarily through **tooling**, not runtime safety. The types exist largely to power the IDE experience. The sweet spot varies by project. A weekend script doesn't need Rust's borrow checker. A database engine does. --- ## Verification in Practice Dependent types and proof assistants blur the line between programming and mathematics. How are they actually used? 
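Before looking at how they are used in production, a minimal Lean 4 sketch of the underlying idea, propositions as types: the type states a theorem, and a well-typed term is its proof. (`add_comm_example` and `double` are illustrative names, not from any of the systems below.)

```lean
-- Lean 4: the type of `add_comm_example` is a proposition;
-- the term is its proof. Type checking the term IS checking the proof.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A subtype pairs a value with a proof obligation: `double n`
-- returns a number together with evidence that it equals 2 * n.
def double (n : Nat) : { m : Nat // m = 2 * n } :=
  ⟨2 * n, rfl⟩
```

If the proof term were wrong, the program simply would not type-check; that is the whole mechanism behind the verified systems below.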
### Real Verified Systems

| System | What It Proves | Language/Tool |
|--------|---------------|---------------|
| **CompCert** | C compiler preserves program semantics | Coq |
| **seL4** | Microkernel implementation matches its spec (full functional correctness) | Isabelle/HOL |
| **HACL*** | Cryptographic library is correct and side-channel resistant | F* |
| **Everest** | Verified HTTPS stack (TLS 1.3) | F*, Dafny, Vale |
| **CertiKOS** | Concurrent OS kernel isolation | Coq |
| **Iris** | Concurrent separation logic framework | Coq |
| **Lean's mathlib** | 100,000+ mathematical theorems | Lean 4 |

These are **production systems**, not toys. CompCert is used in aerospace. seL4 runs in military helicopters. HACL* is in Firefox and Linux.

### The Verification Workflow

Writing verified code differs from normal programming:

```
1. SPECIFICATION
   Write a formal spec of what the code should do
   (This is often harder than writing the code)

2. IMPLEMENTATION
   Write the code that implements the spec

3. PROOF
   Prove the implementation satisfies the spec
   (Interactive: you guide the prover)
   (Automated: SMT solver finds proof or fails)

4. EXTRACTION
   Generate executable code from the verified artifact
   (Coq → OCaml/Haskell, F* → C/WASM)
```

### Proof Burden

The ratio of proof code to implementation code is sobering:

| Project | Implementation | Proof | Ratio |
|---------|---------------|-------|-------|
| seL4 | ~10K lines C | ~200K lines proof | 20:1 |
| CompCert | ~20K lines C | ~100K lines Coq | 5:1 |
| Typical F* | varies | 2-10x implementation | 2-10:1 |

This is why verification is reserved for **critical infrastructure**, not business logic. But the ratio is improving as tools mature.

### Lightweight Verification

Full proofs are expensive.
Lighter-weight approaches offer partial guarantees: | Approach | What You Get | Cost | |----------|-------------|------| | **Refinement types** (Liquid Haskell) | Prove properties via SMT | Low annotations | | **Property-based testing** (QuickCheck) | Find counterexamples | Write properties | | **Fuzzing** | Find crashes/bugs | CPU time | | **Model checking** | Explore state space | Build model | | **Design by contract** | Runtime checks from specs | Write contracts | Refinement types are the sweet spot for many applications: you get meaningful guarantees (array bounds, non-null, positive) without full proofs. Liquid Haskell and F* make this practical. ### When to Verify | Verify When... | Skip Verification When... | |----------------|---------------------------| | Security-critical (crypto, auth) | Prototype/MVP | | Safety-critical (medical, aerospace) | Business logic | | High-assurance infrastructure | UI code | | Correctness matters more than ship date | Deadline-driven | | Bugs are catastrophically expensive | Bugs are cheap to fix | Most code doesn't need formal verification. But for the code that does, types that can express and check proofs are invaluable. 
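Among these, property-based testing needs no new type system at all. A minimal hand-rolled sketch in Python, in the spirit of QuickCheck (`check` and both properties are illustrative; real projects would reach for a library such as Hypothesis):

```python
import random

# Property: reversing a list twice yields the original list.
def prop_reverse_involutive(xs):
    return list(reversed(list(reversed(xs)))) == xs

# Property: a (deliberately wrong) claim that sorting distributes
# over concatenation. False in general!
def prop_concat_sorted(xs, ys):
    return sorted(xs) + sorted(ys) == sorted(xs + ys)

def check(prop, arity, runs=200):
    """Throw random integer lists at a property; report a counterexample."""
    rng = random.Random(0)  # fixed seed for reproducibility
    for _ in range(runs):
        args = [[rng.randint(-5, 5) for _ in range(rng.randint(0, 6))]
                for _ in range(arity)]
        if not prop(*args):
            return ("FAIL", args)
    return ("OK", None)

print(check(prop_reverse_involutive, 1))  # passes every run
print(check(prop_concat_sorted, 2))       # finds a counterexample
```

The tool generates the inputs; you only state what must hold. That division of labor is what the table above summarizes as "find counterexamples" at the cost of "write properties".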
--- ## The Complexity Ranking | Rank | Concept | Learning | Implementing | Worth It For | |------|---------|----------|--------------|--------------| | 1 | ADTs + Pattern Matching | Low | Low | Everyone | | 2 | Generics | Low | Medium | Everyone | | 3 | Traits/Typeclasses | Medium | Medium | Library authors | | 4 | Affine Types (Rust) | Medium | Medium | Systems programmers | | 5 | GADTs | Hard | Medium | DSL/compiler writers | | 6 | HKT | Hard | Hard | FP enthusiasts | | 7 | Effect Systems | Hard | Hard | Language designers | | 8 | Refinement Types | Hard | Hard | Verified software | | 9 | Dependent Types | Very Hard | Very Hard | Researchers, proof engineers | | 10 | Session Types | Very Hard | Very Hard | Protocol verification | | 11 | Cubical/HoTT | Extreme | Extreme | Mathematics, foundations | --- ## What to Learn Based on Your Goals | Your Goal | Focus On | |-----------|----------| | Write better code in any language | ADTs, pattern matching, generics, traits | | Systems programming | Affine types (learn Rust) | | Library design | Generics, traits, associated types | | Functional programming | HKT, typeclasses, effects | | Build compilers/interpreters | GADTs, dependent types basics | | Formal verification | Refinement types, dependent types | | PL research | Everything, including HoTT | --- ## The Future Several trends are reshaping how we think about types: 1. **Effect systems going mainstream**: Unison, Koka showing the way. Expect more languages to track effects. 2. **Refinement types in practical languages**: Lightweight verification becoming accessible. 3. **Linear types spreading**: Rust proved affine types work at scale. Others will follow. 4. **Gradual dependent types**: Getting dependent types into mainstream languages incrementally. 5. **Better tooling**: Type errors becoming clearer. IDE support improving. The UX gap is closing. 
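Trend 2 can be previewed today: a refinement type constrains a base type with a predicate, such as `{n: int | n > 0}`. The idea can be approximated at runtime with a smart constructor. A minimal Python sketch (`Positive` and `safe_div` are hypothetical illustrations; a real checker like Liquid Haskell discharges the predicate at compile time via SMT instead of raising at runtime):

```python
# A "refinement" enforced at construction time: {n: int | n > 0}.
# Static refinement types discharge this check at compile time;
# this class is only a runtime approximation of the idea.
class Positive:
    def __init__(self, n: int):
        if not (isinstance(n, int) and n > 0):
            raise ValueError(f"expected a positive int, got {n!r}")
        self.n = n

def safe_div(num: int, den: Positive) -> float:
    # No zero check needed: the type rules out den.n == 0
    return num / den.n

print(safe_div(10, Positive(2)))  # 5.0
# Positive(0) raises ValueError: the illegal state is unconstructible
```

The payoff is the same as with static refinements: downstream code never re-validates, because the invariant is established once at the boundary.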
---

## Conclusion

Type systems exist on a spectrum from "helpful autocomplete" to "machine-checked mathematical proofs." Where you should be on that spectrum depends on what you're building.

For most code, Tier 1-2 concepts (ADTs, generics, traits, pattern matching) eliminate entire categories of bugs: null pointer exceptions, forgotten enum cases, type mismatches. They're available in Rust, Scala, Swift, Kotlin, and even TypeScript.

Tier 3 concepts (HKT, linear types, effects) require more investment but let you abstract over containers, track resources, and prove purity. Rust's ownership model shows that "hard" concepts can become mainstream when the tooling is right.

Tier 4+ concepts (dependent types, session types, HoTT) are mostly for researchers and specialists, but they're where tomorrow's mainstream features come from. Linear types were "research" until Rust. Effect systems might be next.

The best investment is understanding the *ideas*, not the syntax. Once you grok "make illegal states unrepresentable," you'll apply it in any language. Once you understand why linear types matter, you'll appreciate Rust's borrow checker instead of fighting it.

---

# Appendix: Type System Taxonomy

This appendix provides a reference taxonomy of type system dimensions. These concepts are useful for understanding how languages differ, but aren't prerequisites for the main content.

## Hindley-Milner Type Inference

This section covers how type inference works under the hood, useful for understanding compiler behavior but not required for using type systems effectively.

Static typing traditionally meant annotating everything. Java's infamous:

```java
Map<String, List<String>> map = new HashMap<String, List<String>>();
```

This verbosity is why many developers fled to dynamic languages. But dynamic typing means discovering type mismatches at runtime, often in production. What if the compiler could *figure out* the types?
In 1969, Roger Hindley discovered (and Robin Milner independently rediscovered in 1978) an algorithm that can infer the most general type for any expression in a certain class of type systems, without any annotations.

The key observation: even without annotations, code contains type information. If you write `x + 1`, the compiler knows `x` must be a number because `+` requires numbers. If you write `x.len()`, `x` must be something with a `len` method. These constraints propagate through your program.

The algorithm works by:

1. Assigning fresh type variables to unknown types (like algebra: let `x` be unknown)
2. Collecting constraints from how values are used (`x + 1` means `x` must be numeric)
3. Unifying constraints to find the most general solution (solving the equations)

The "most general" part matters. If you write a function that works on any list, the algorithm infers "list of anything," not "list of integers." You get maximum reusability automatically.

- The brevity of Python with the safety of static typing
- Write code without type annotations; the compiler figures them out
- Catch type errors at compile time, not runtime
- The inferred type is always the *most general*, so your function works for all types that fit

```rust
// Rust: the signature is annotated once; every use below is inferred
fn compose<A, B, C>(f: impl Fn(B) -> C, g: impl Fn(A) -> B) -> impl Fn(A) -> C {
    move |x| f(g(x))
}

let add_one = |x| x + 1; // inferred: i32 -> i32
let double = |x| x * 2;  // inferred: i32 -> i32
let add_one_then_double = compose(double, add_one);
// No type annotations needed, compiler infers everything
let result = add_one_then_double(5); // 12
```

```rust
// Even complex generic code needs minimal annotations
fn map<T, U>(items: Vec<T>, f: impl Fn(T) -> U) -> Vec<U> {
    items.into_iter().map(f).collect()
}

let numbers = vec![1, 2, 3];
let strings = map(numbers, |n| n.to_string());
// Compiler infers: T = i32, U = String
```

The trade-off: some advanced features ([GADTs](#generalized-algebraic-data-types-gadts),
[higher-rank types](#rank-n-polymorphism)) break inference and require annotations. But for everyday code, you get static typing's safety without its traditional verbosity. Available in ML, OCaml, Haskell, Rust, F#, Elm, and Scala.

### Beyond HM: Other Inference Strategies

Hindley-Milner is the gold standard for inference in purely functional languages, but other strategies exist:

| Strategy | How It Works | Used In |
|----------|--------------|---------|
| **Bidirectional** | Types flow both up (inference) and down (checking) | Rust, Scala, Agda |
| **Constraint-based** | Collect constraints, solve with SMT/unification | Gradual typing, refinement types |
| **Local** | Infer within expressions, require declarations at boundaries | Java (var), C++ (auto) |

**Bidirectional typing** is particularly important for modern languages. Instead of pure inference (bottom-up) or pure checking (top-down), types flow both ways. When you write `let x: Vec<u8> = vec![1, 2, 3]`, the expected type `Vec<u8>` flows *down* to help infer the element type. When you write `let x = vec![1, 2, 3]`, the literal types flow *up* to infer `Vec<i32>`.

This scales better than pure HM to richer type systems. GADTs, higher-rank types, and dependent types all work well with bidirectional typing because explicit annotations guide inference where needed.

---

## Orthogonal Dimensions

Type systems are not a linear progression. They combine orthogonal axes independently:

```
CHECKING:     Static ← Gradual → Dynamic
EQUALITY:     Nominal ← → Structural
POLYMORPHISM: None → Parametric → Bounded → Higher-Kinded
INFERENCE:    Explicit → Local → Bidirectional → Full (HM)
RESOURCES:    Unrestricted → Affine → Linear
EFFECTS:      Implicit → Monadic → Algebraic
```

Real languages pick a point on each axis. Rust is static + nominal + affine. TypeScript is gradual + structural. Haskell is static + nominal + higher-kinded + monadic. The combinations create the design space.

### 1.
Time of Checking | Approach | When Types Checked | Examples | |----------|-------------------|----------| | **Static** | Compile time | Rust, Haskell, Java | | **Dynamic** | Runtime | Python, Ruby, JavaScript | | **Gradual** | Both, with boundaries | TypeScript, Python+mypy | ### 2. Type Equality | Approach | Types Equal When... | Examples | |----------|---------------------|----------| | **Nominal** | Same declared name | Java, Rust, C# | | **Structural** | Same shape/fields | TypeScript, Go interfaces | ### 3. Polymorphism | Kind | What It Abstracts | Examples | |------|-------------------|----------| | **Parametric** | Type variables (`T`) | Generics in all typed languages | | **Ad-hoc** | Different impls per type | Overloading, typeclasses, traits | | **Subtype** | Substitutability | OOP inheritance, structural subtyping | | **Bounded** | Constrained type variables | `T: Ord`, `T extends Comparable` | ### 4. Type Inference | Strategy | How Types Inferred | Examples | |----------|-------------------|----------| | **Hindley-Milner** | Global, principal types | ML, Haskell, OCaml | | **Bidirectional** | Up and down the AST | Rust, Scala, Agda | | **Local** | Within expressions only | Java `var`, C++ `auto` | | **Constraint-based** | Solve constraint systems | TypeScript, gradual systems | ### 5. Predicate Refinement | Level | What Types Express | Examples | |-------|-------------------|----------| | **Simple** | Base types only | Most languages | | **Refinement** | Types + predicates (`{x: Int \| x > 0}`) | Liquid Haskell, F* | | **Dependent** | Types compute from values | Idris, Agda, Lean | ### 6. 
Substructural / Resource Tracking | Discipline | Usage Rule | Examples | |------------|-----------|----------| | **Unrestricted** | Any number of times | Most languages | | **Affine** | At most once | Rust ownership | | **Linear** | Exactly once | Linear Haskell, Rust borrows | | **Relevant** | At least once | Research systems | | **Ordered** | Once, in order | Stack disciplines | ### 7. Effect Tracking | Approach | What's Tracked | Examples | |----------|---------------|----------| | **None** | Effects implicit | Java, Python, Go | | **Monadic** | Effects in type wrappers | Haskell IO | | **Algebraic** | First-class effect handlers | Koka, OCaml 5 | ### 8. Flow Sensitivity | Approach | Type Changes With... | Examples | |----------|---------------------|----------| | **Insensitive** | Fixed at declaration | Java, C | | **Sensitive** | Control flow | TypeScript, Kotlin, Rust | ### 9. Concurrency / Communication | Approach | What's Typed | Examples | |----------|-------------|----------| | **Untyped** | No protocol checking | Most languages | | **Marker traits** | Send/Sync capabilities | Rust | | **Session types** | Protocol state machines | Research, Links | --- **Real languages combine these axes.** Rust is static + nominal + parametric + bidirectional + affine + flow-sensitive + marker traits. TypeScript is gradual + structural + parametric + constraint-based + flow-sensitive. There's no single "best" combination; each serves different goals. 
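One of these axes is easy to see concretely: flow sensitivity (axis 8). A flow-sensitive checker such as mypy or pyright narrows a union type along control-flow branches, so the same variable has a different static type in each branch. A minimal Python sketch (`describe` is an illustrative function, not from the article):

```python
from typing import Union

def describe(value: Union[int, str]) -> str:
    # Flow-sensitive checkers narrow the type per branch:
    if isinstance(value, int):
        # here the checker treats `value` as int, so arithmetic is allowed
        return f"number, doubled is {value * 2}"
    # here the checker knows `value` must be str
    return f"string of length {len(value)}"

print(describe(21))     # number, doubled is 42
print(describe("hey"))  # string of length 3
```

A flow-insensitive system would reject `value * 2` outright, because the declared type `Union[int, str]` never changes after declaration.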
## The Expressiveness Map

How type systems relate in terms of expressiveness versus annotation burden:

```
              EXPRESSIVENESS
              Low ──────────────────► High
           │
 Simple    │  ML        Haskell+Exts
 Inference │  │              │
           │  ▼              ▼
           │  Rust ────► Rust+GATs   Scala 3
           │  │                         │
           │  OCaml+Mods                │
           │                            ▼
 Needs     │  F*/Lean    ◄── Refinements
Annotations│      │
           │      ▼
           │  Idris/Agda ◄── Full Dependent
           │      │
           │      ▼
 Proof     │  Coq/Lean4  ◄── Proof Assistant
 Required  │      │
           │      ▼
           │  Cubical    ◄── HoTT
           ▼
      ANNOTATION BURDEN
```

The further right you go, the more you can express in types. The further down you go, the more work you must do to satisfy the type checker. The sweet spot depends on your domain.

---

## Dynamic Type Systems

In dynamic languages, types exist and are checked, just at runtime rather than compile time. Values carry type tags at runtime. Operations check these tags before executing:

```python
# Python: types checked at runtime
def add(a, b):
    return a + b

add(1, 2)      # Works: both ints
add("a", "b")  # Works: both strings
add(1, "b")    # TypeError at runtime!
```

The type error still happens. It just happens when you run the code, not when you compile it. This trades earlier error detection for flexibility and development speed.

Dynamic typing works well for:

- **Prototyping and exploration**: When you don't yet know what shape your data will take
- **Scripts and glue code**: Short-lived code where development speed matters more than maintenance
- **REPLs and interactive development**: Immediate feedback without compilation
- **Highly dynamic domains**: Serialization, ORMs, and metaprogramming where static types fight the problem

Dynamic typing is "types checked later." The question isn't "static vs dynamic" but "how much static?" Python with type hints, TypeScript with strict mode, Rust with full ownership tracking: these represent different points on a spectrum. Pick the point that matches your problem.

Python, Ruby, JavaScript, Lisp, Clojure, Erlang, Elixir.
Most have optional type systems now (Python's type hints, TypeScript for JavaScript).

## Gradual Typing

Gradual typing blends static and dynamic checking within the same language. You can add types incrementally, and the system inserts runtime checks at the boundaries between typed and untyped code.

In a gradually typed system, you can leave parts of your code untyped (using `any` or equivalent) while fully typing other parts. The type checker verifies the typed portions statically. At runtime, checks are inserted where typed code interacts with untyped code.

```typescript
// TypeScript: gradual typing in action
function greet(name: string): string {
  return `Hello, ${name}`;
}

// Fully typed: checked statically
greet("Ada"); // OK at compile time

// Escape hatch: 'any' bypasses static checking
function processUnknown(data: any): void {
  // No compile-time checking on 'data'
  console.log(data.someProperty); // Could fail at runtime
}

// The boundary: where typed meets untyped
function fromExternal(json: any): User {
  // Runtime validation needed here
  return json as User; // Risky! No guarantee json matches User
}
```

Note that TypeScript itself erases types at compile time and inserts no runtime checks of its own; sound gradual systems such as Typed Racket are the ones that generate the boundary checks described above.

The **gradual guarantee** is the formal property that makes this work: adding type annotations should not change program behavior (unless there's a type error). You can migrate from untyped to typed code one function at a time without breaking anything.

This enables incremental adoption:

1. Start with a dynamically typed codebase
2. Add types to critical paths first
3. Gradually expand type coverage
4. Runtime checks catch boundary violations

### Blame Tracking

When a type error occurs at a boundary, who's at fault? **Blame tracking** attributes errors to the untyped side of the boundary. If typed code calls untyped code and gets a wrong type back, blame falls on the untyped code.
```python
# Python with type hints
def typed_function(x: int) -> int:
    return x + 1

def untyped_function(y):
    return "not an int"  # Bug here

# Plain Python never enforces annotations at runtime, so this
# assignment silently succeeds. A static checker (mypy/pyright)
# flags it, and a sound blame-tracking system such as Typed Racket
# would raise at this boundary and blame untyped_function,
# the untyped side
result: int = untyped_function(5)
```

TypeScript, Python (with mypy/pyright), PHP (with Hack), Racket (Typed Racket), Dart (optionally typed before Dart 2), C# (with nullable reference types).

## Further Reading

**Books:**

- "Types and Programming Languages" by Benjamin Pierce, the textbook
- "Software Foundations", free online, interactive proof-based introduction
- "Programming Language Foundations in Agda", dependent types for programmers

**Languages to try:**

- **Rust**: Best practical introduction to affine types
- **Haskell**: HKT, typeclasses, GADTs, the functional programming standard
- **Idris 2**: Most accessible dependent types
- **Koka**: Clean effect system design

**Papers:**

- "Propositions as Types" by Philip Wadler, covers the Curry-Howard correspondence
- "Theorems for Free" by Philip Wadler, what parametricity guarantees
- "Linear Types Can Change the World", why linearity matters

---

### The Death of the Inner Self

*Published: 2025-12-23*

> A draft thesis that individuality and modern consciousness are historical coordination technologies now being eroded by computation, capital, and automated feedback loops.

URL: https://federicocarrone.com/articles/the-death-of-the-inner-self/

I am working on a series of articles that connect biology, computation, philosophy, the history of the West and the formation of individuality. The core argument is simple: many features of human life that appear stable and natural are historically produced. As society accelerates, a number of these features begin to lose their function and their permanence. This is just a draft of what I'm working on. Life is organized around information that replicates under constraint. Computation generalizes this biological logic.
It allows selection and optimization to occur faster and at larger scales by externalizing memory, comparison, and feedback. Problems that once required internal deliberation can be solved through external processes that test, filter, and iterate possibilities. Capital pushes this logic further. It reorganizes social life around continuous feedback, price signals, and competitive selection. As these forces compound, individuality starts to look less like a foundation and more like an interface that emerged to solve earlier coordination problems. Capital behaves as an impersonal intelligence oriented toward speed, abstraction, and self-optimization. As cognition, decision making, and coordination migrate into automated systems, the inner self loses its structural role. Over time, many assumptions we take for granted are worn down by this acceleration. Individuality and consciousness appear increasingly exposed to this process.

## The death of the inner self and individuality

Fish do not realize they live in water. The medium that sustains them is so constant that it disappears from perception. Some of the most important structures are overlooked for the same reason. Individuality and consciousness belong to that category. We tend to treat individuality and consciousness as self-evident facts. As if humans have always experienced themselves as bounded selves with an inner voice, a private mental space, and a continuous narrative identity. Because this experience feels natural, it is assumed to be timeless. However, for most of human history people did not describe themselves as individuals in the modern sense. Decisions were not understood as outcomes of inner deliberation, and agency was not located inside a private interior self. Action was organized through rituals, traditions, kinship, and prescribed roles. Meaning arrived from outside the person rather than from introspection. In many societies outside the Western trajectory this structure remains largely intact.
The idea of a "you" inside your head observing your own thoughts is therefore a learned construction. It depends on language, habits, metaphors, and social practices that had to be developed and stabilized over time. Inner speech, narrative memory, moral self-examination, and the sense of authorship over action emerged as cultural achievements layered on top of older biological processes. Modern societies actively reproduce this configuration. From early childhood people are trained to understand themselves as autonomous units with opinions, preferences, goals, and an inner life that belongs only to them. The training is so pervasive that it becomes invisible. Other ways of being human recede from view even though many have existed and some still persist. The conditions that once made individuality functional are weakening. Earlier systems relied on human subjects to think, decide, judge, and take responsibility. Cognition and coordination were constrained by human minds. Individuality emerged as a solution. A stable self enabled long-term planning, moral accounting, and institutional continuity. Earlier societies coordinated without modern consciousness. Contemporary systems increasingly coordinate without modern selves. Decision making proceeds without inner deliberation. Meaning is delivered through incentives, metrics, and feedback loops. At the cultural level individuality remains constantly invoked. People are urged to be themselves, express themselves, optimize themselves. Yet the channels for expression arrive pre-shaped, quantified, and monetized. What appears as selfhood increasingly takes the form of managed performance within narrow bounds. The modern self once felt inevitable because it solved concrete problems. It enabled abstraction, continuity, and responsibility at scale. Its future usefulness is far less certain.
--- ### Notes on permanence, time, and ergodicity *Published: 2025-12-15* > A framework for building institutions that compound under pressure: treat time as information, preserve loop integrity, and focus on domains where repetition improves judgment. URL: https://federicocarrone.com/articles/notes-on-culture-infrastructure-time-and-ergodicity/ [Ergodic Group](https://ergodicgroup.com/) is organized around the observation that certain systems change character through sustained engagement. In these systems, repetition refines execution, experience carries forward, and accumulated judgment reshapes future outcomes. Time is not neutral. It filters error, stabilizes standards, and reveals structural quality. Systems differ less in what they produce than in how they behave under repeated contact with reality. Contemporary society is accelerating across technical, cultural, and organizational dimensions. Cycles shorten, signals multiply, and coordination occurs under constant pressure to respond. This acceleration compounds itself, tightening feedback loops and compressing decision horizons. As pace increases, many structures continue operating while gradually shedding accumulated judgment and internal coherence. Activity persists while formation weakens. Systems appear functional even as their capacity to learn erodes. Acceleration also alters the shape of error and transmission. As pace increases, decisions become harder to reverse while feedback quality declines. Irreversibility moves upstream. Early mistakes remain cheap only briefly, after which correction costs rise sharply. At the same time, what spreads fastest diverges from what works best. Forms that replicate quickly outperform those that behave correctly under repetition. Compression, legibility, and ease of copying dominate transmission while durability becomes harder to observe. Imitation spreads faster than learning. What circulates most widely is rarely what compounds judgment over time. 
Acceleration operates as a selection mechanism at two levels. At the level of individual systems, time reveals whether repetition compounds judgment or merely increases exposure. At the level of entire sectors, time reveals which categories can sustain formation under continuous acceleration. Most cannot. Sectors that depend on local advantage, fragile differentiation, or temporary coordination tend to fragment, commoditize, or disappear rather than evolve. Acceleration often produces less transformation than it promises. Tools change rapidly while underlying constraints remain intact. Activity increases even as structural novelty diminishes. What presents itself as innovation is often the rapid circulation of forms rather than a reconfiguration of fundamentals. Under these conditions, endurance becomes informative. Systems that continue to behave correctly across long spans of stress and variation reveal alignment between structure, incentives, and reality. Continued correctness functions as a diagnostic signal. Duration matters because few systems remain exposed long enough for time to test them. What matters is how behavior evolves under repetition. The internet did not dissolve coherence. It revealed texture that homogeneity had previously concealed. What once appeared uniform now resolves into distinct patterns of intention, execution, and quality. As distribution becomes universal, distinction reemerges through fidelity rather than availability. Abundance exposes superficiality. Depth becomes legible through sustained correctness rather than momentary visibility. ## Two forms of time Time operates in human systems in two fundamentally different ways. Measured time is divisible and uniform. It is organized into intervals and governs schedules, deadlines, accounting periods, and discounting. It can be allocated, optimized, compressed, and exchanged. 
Planning systems and evaluation frameworks operate within measured time, assuming that value can be assessed independently of history. Lived time is accumulative and qualitative. It is shaped by what occurs within it. Learning, memory, and judgment develop through lived time. Each cycle alters what follows. Later moments differ in kind from earlier ones because experience reshapes perception, attention, and capacity. Processes that depend on formation operate in lived time. Each cycle changes the character of subsequent cycles. Experience accumulates rather than repeating identically. What is learned reshapes what becomes possible. These processes cannot be evaluated correctly through snapshots because their value emerges through accumulation rather than momentary performance. Systems organized entirely around measured time assume that intervals are interchangeable. This assumption holds only where repetition does not change outcomes and exposure does not reshape capability. When lived time is forced into measured time, formation fails. Standards cannot stabilize. Judgment cannot compound. The information produced by evaluation becomes distorted because it ignores history. Time reveals what a system actually is. Structure becomes legible through sustained activity. What appears coherent early may fail under variation, load, or shifting incentives. What endures across repeated contact with reality discloses properties that cannot be inferred in advance. Continued correctness produces evidence rather than claims. ## Formation under constraint Excellence emerges from sustained practice under appropriate constraints. Athletic, artistic, intellectual, and technical achievement follow this pattern. Initial talent offers limited advantage relative to disciplined engagement over extended periods of formation. Practice systems depend on complete cycles of effort, feedback, and adjustment. When these cycles are interrupted or prematurely evaluated, formation stalls. 
Learning fragments. Standards decay. Errors must remain survivable for learning to occur. Adjustment must be possible without each iteration becoming terminal. Judgment improves only when experience accumulates across attempts rather than being truncated by constant resetting. When systems optimize for immediate measurement, necessary errors are avoided rather than integrated. Practices converge toward forms that appear legible early yet fail under prolonged stress. The effort required to maintain genuine standards becomes unsustainable under continuous pressure for immediate results. Capital structures that impose fixed evaluation windows reinforce this dynamic. When outcomes must be made legible on predetermined schedules, decisions orient toward transferability rather than structural soundness. Systems develop according to measurement regimes rather than their own requirements for formation. Failure is deferred and rendered less visible. Practices generate two kinds of value. External value appears outside the practice through money, status, and recognition. Internal value emerges only through the practice itself, through excellence inseparable from the process that produced it. When structures privilege rapid extraction of external value, internal value formation is displaced. Standards that require time to stabilize are abandoned in favor of standards that generate immediate signals. What the practice uniquely develops is replaced by what can be transferred quickly. ## Four domains The work of Ergodic Group unfolds across four interdependent domains: mathematics, code, culture, and craft. These domains describe a group level system through which abstract structure becomes operative reality and through which reality reshapes future abstraction. Individual companies typically operate primarily within one domain. Advantage emerges through connection rather than internal completeness. Mathematics establishes structure and constraint. 
It defines relationships that remain valid as complexity increases. Mathematical coherence preserves alignment under transformation. A formula outlives its creator, becoming compressed truth that compounds through reuse.

Code translates structure into execution. It transforms logic into process, protocol, and system. Code brings abstraction into contact with reality through systems that must operate under stress. Over time, clarity and precision determine whether systems converge toward reliability or degrade under pressure.

Culture coordinates meaning across time and participation. It encodes standards and shared understanding that allow intent to persist as participants change. Culture resists flattening under acceleration by preserving depth, taste, and judgment.

Craft grounds abstraction through execution. It expresses standards in material reality, where precision and care impose irreversible costs. Materials resist. Execution reveals flaws. Craft encodes time, place, and skill into forms that can be shared.

These domains form a continuous loop across the group. Insight emerges from confronting execution with constraint. Standards stabilize through repetition. Precision advances through accumulated understanding. Learning compounds only when domains remain connected. What persists across time does so through specific mechanisms: algorithmic logic that can be re-implemented, textual knowledge that can be recopied, cultural practices that can be retransmitted, material techniques that can be relearned. These domains describe how structure, meaning, and capability transfer across generations when connected rather than isolated.

## Ergodicity as a filter

Ergodicity describes when repetition improves typical outcomes. In ergodic processes, learning transfers, experience accumulates, and later decisions benefit from earlier cycles. Competence compounds. Where repetition strengthens judgment, time improves performance.
Where outcomes fail to converge, time increases exposure without increasing capability. Ergodicity functions as a selection filter, indicating whether a system benefits from continued operation or is gradually revealed as fragile.

Under accelerating conditions, this filter concentrates value. Many sectors fragment and disappear as coordination costs rise and memory collapses. What remains are systems that preserve coherence under exposure. Infrastructure and culture persist under this pressure because they function as environments rather than sectors. Infrastructure coordinates action through rules and constraints that must hold under stress. Each correct operation adds evidence of reliability. Culture coordinates meaning through sustained production. Standards accumulate. Direction persists. Both absorb volatility and convert continuity into capability. Operational history becomes evidence. Coherence becomes legible. Time becomes an asset where learning transfers. They persist because they institutionalize maintenance rather than novelty, embedding accumulated judgment into structure and practice even when maintenance remains invisible to the market.

## Operation

Ergodic Group is organized around this understanding. Time is treated as an information process. Companies are built or acquired primarily within one domain. Advantage is created by connecting them so that learning, standards, and constraints transfer across domains rather than remaining isolated.

This understanding shapes operation. Loop integrity is maintained across transitions. Formation is allowed to occur in lived time rather than being forced into measured time. Attention concentrates on domains where repetition produces compounding advantage and where connection strengthens performance. Those domains win with time.
---

### Crypto doctrine

*Published: 2025-09-25*

> Crypto found product-market fit where trust is weakest: inflationary or censored economies, and internet-native communities that need programmable coordination and markets.

URL: https://federicocarrone.com/articles/crypto-doctrine/

# Crypto and the accelerated and chaotic 21st Century

We believe crypto has been incredibly successful at providing a trustless financial layer for the 21st century. In particular, it has found product-market fit in two main areas:

- In developing countries, providing aid and tools to individuals who need to fight inflation and censorship, and to companies and individuals who need to be able to do business.
- Internet-native communities that need a financial layer on the web, one that allows them to express themselves and coordinate at a scale that wasn’t possible before. They have created new financial assets and markets that seem absurd from the outside. Many times they are also absurd from the inside.

People who don’t live in a developing country or who didn’t grow up with the internet have enormous difficulty understanding crypto because they don’t have skin in its game. They believe crypto doesn’t have any “real” use case or that it is not serious enough. They are right. The thing is that we are living in a world that is becoming more absurd. Memes do not only make you laugh anymore; memes are now winning elections.

We’re sure that these two use cases will grow with time and probably new ones will be found. The world is becoming more chaotic and more divided each day. The stability that existed between the fall of the Soviet Union and the beginning of the pandemic appears to have become a thing of the past. Change will become the norm. And we love it. This will make crypto even bigger. One of its prime advantages is that it kills many of the middlemen and allows us to coordinate even in the harshest environments.
Trust assumptions are lowered thanks to economic incentives, compilers, distributed systems and cryptography. Crypto lowers the reliance on human beings. This empowers humans. It allows them to concentrate their disputes and efforts in subjective areas. Crypto creates safe zones where some parts of human activity become non-debatable (until quantum computers solve the discrete log problem).

Most of us are internet natives. We have been using IRC, 4chan, Reddit, Hacker News, Twitter, Bitcoin and Ethereum since their beginnings, and our organization has deep roots in unstable countries. In our roots we have a strange mix: we know what it is to live in very chaotic societies and how to develop businesses within them, and at the same time we are builders who love working at the frontier of engineering and scientific developments. We are the Fremen of crypto, raised in a harsh environment.

Open source and decentralization are not only philosophical ideas but necessary practical conditions to build crypto. Building in the open, helping onboard others and creating movements bigger than the original project are crucial for crypto projects to succeed long term. Sometimes it’s difficult for us to explain our actions to others who don’t follow the same ethos, since we are not maximizing the same outcomes.

Our main objective is to help these new internet highways get built in sustainable ways. Economic sustainability is one key aspect but there are others. We are a force that builds large technological projects but that also counterweights the natural tendency to centralize as a side effect of optimization. Centralization is easier and cheaper in the short run. If we wanted to optimize for money, there are easier ways to do it. The thing is that it is not our main objective. We only see money as a tool to achieve our objectives. With or without money you will find us building. You are invited to join us in our journey.
“Top-down management leveraging command-and-control hierarchies are for the mahogany boardrooms of yesteryear. We are navigators, adventurers, and explorers of the future. We are married to the sea” - Yearn's Blue Pill

---

### Transforming the Future with Zero-Knowledge Proofs, Fully Homomorphic Encryption and new Distributed Systems algorithms

*Published: 2023-04-13*

> Zero-knowledge proofs, FHE, and modern distributed-systems primitives can expand trust-minimized coordination by proving computation correctness without revealing underlying data.

URL: https://federicocarrone.com/articles/transforming-the-future-with-zero-knowledge-proofs-fully-homomorphic-encryption-and-new-distributed-systems-algorithms/

The evolution of every scientific discipline or engineering field experiences cycles akin to those observed in economies. Incremental advancements are made daily by corporations, individuals, and academic institutions. Occasionally, a researcher or engineer makes a groundbreaking discovery that alters the course of the field. One such example is Sir Isaac Newton, who made significant contributions to calculus, motion, optics, and gravitation during the time of the bubonic plague, which claimed millions of lives. His relentless pursuit of knowledge throughout the pandemic proved instrumental in shaping the development of mathematics, physics, and engineering. Our comfortable modern lives stand upon the foundation of these monumental discoveries.

The general public is aware of the big breakthroughs made in the aerospace industry, energy production, the internet of things, and last but not least artificial intelligence. However, most don’t know that during the COVID pandemic, enormous advances were made in cryptography.
47 years ago Diffie and Hellman wrote in their famous cryptography paper: “we stand today on the brink of a revolution in cryptography”, which enabled two people to exchange confidential information even when they can only communicate via a channel monitored by an adversary. This revolution enabled electronic commerce and the communication between citizens of the free world. We believe the discoveries made by researchers and engineers in cryptography during this COVID pandemic will be as important as the discoveries made by Diffie and Hellman in the upcoming decades.

One of the big discoveries has been how to make Zero-Knowledge Proofs fast enough for real-world applications. This technology has been around since 1985, but as Diffie also said, “Lots of people working in cryptography have no deep concern with real application issues. They are trying to discover things clever enough to write papers about”. Fortunately for humanity, researchers and engineers have made this technology practical enough in the last decade (especially the last 2 years) to be useful.

The financial system depends on the existence of intermediaries: an army of auditors, regulators, and accountants. The correct working of the financial machine depends on the integrity of its financial institutions. Integrity is maintained due to positive economic incentives and jail time, fines, and costly lawsuits if the intermediaries don’t do what the state and society expect from them.

Bitcoin, a result of the 2008 crisis, created a permissionless financial system where its users can send and receive digital money without intermediaries and without anybody being able to block transactions. In countries like Argentina, Nigeria, or Lebanon, where stagnation and inflation erode their citizens' trust in the financial system and the state, Bitcoin and stablecoins on top of Ethereum are used on a daily basis by the young population to save and avoid capital controls.
In developed countries, its usage is not as massive since the traditional financial system and the state are trusted by most citizens. However, the world is becoming more complex. Banks are failing in the US and Europe, a new war is taking place in Europe, debt levels are not sustainable in many countries, the fight between the left and the right is retaking the main stage, tension between the West and the East increases, and technological change keeps accelerating. New applications built on top of unstoppable and trustless technologies that don’t depend on social trust will grow and thrive in this type of environment. Everything is being questioned. Only things that can’t be questioned will fully resist the passage of time. This will happen not only in developing countries but also in developed ones. Systems like Bitcoin, where everyone can verify how it’s running, are more resilient and become more useful by the day in a world that is getting more complex.

Bitcoin's focus has been to become a new type of monetary asset and financial network. For this reason, the development of more complex programs on top of Bitcoin has always been restricted by design. Newer blockchains like Ethereum added the ability to create new types of applications. DeFi protocols that enabled lending and borrowing, the exchange of digital currencies, and the ability to buy, sell and trade digital collectibles and art rapidly grew on top of Ethereum. However, creating and transferring relevant amounts of assets on blockchains is costly. The ability to create more complex applications that sit on top of blockchains is also very limited. Applications can’t run more than a few milliseconds on Ethereum.

These systems do not rely on social integrity like traditional systems. Instead, they operate as a permissionless and censorship-resistant network, allowing anyone to add a node and submit updates to its state.
To ensure verification, each node must re-execute all transactions, which makes the system decentralized and secure, albeit slower than centralized systems. Consequently, this imposes a limitation on the types of applications that can be built on blockchains. Applications requiring frequent database state updates, such as those exceeding a few times per second, or machine learning algorithms, are not feasible on blockchain platforms.

This is where Zero Knowledge Proofs (ZKPs) and other cryptographic and distributed systems primitives will help society create tools that can be used by everyone. ZKPs enable a party to demonstrate a statement to other parties without revealing any information beyond the proof. In more concrete terms, this enables a person to show another person that the computation they did is correct without having to redo it and without even having to grant access to the data that was used. An important aspect of this is that the verification is done in a much faster time than the proving. In even simpler terms, it proves that the output of a certain computation is correct. The verification is way easier and faster to do than the execution or proving. Anybody can check the proof, and this saves computing time and money. At the beginning it’s difficult to grasp, even for engineers, that such a technology is even possible. The mathematics behind it, until recently, seemed magical, and that’s why it was called moon math.

Thanks to ZKPs, transferring money in blockchains similar to Bitcoin is cheaper and way faster since there is no need to re-execute each transaction on each node. Only one node is needed to process all the transactions and prove them using a ZKP, while the rest simply need to verify it, saving valuable computing resources. Among other things, ZKPs enable creating a financial system that doesn’t depend on social trust like traditional finance and that doesn’t depend as much on re-executing algorithms as Bitcoin.
Zero Knowledge Proofs facilitate the development of an entirely new range of applications that are executed and proven on a single computer outside the blockchain, with verification occurring within Ethereum. Verifying a proof is far cheaper and faster than proving or executing the computation. Ethereum will evolve from a slow yet secure distributed mainframe, where execution time is shared among all users to run small programs, into a distributed computer that stores and verifies proofs generated externally from the blockchain.

Not only will blockchains benefit from the development of new cryptographic primitives like Zero Knowledge Proofs (ZKPs), but other areas will also be significantly impacted. As AI-generated content begins to overshadow human-generated content on the internet, ZKPs will become essential for verifying that such content was produced by unbiased AI models. "Proof of humanity" systems are already employing ZKPs to ensure the accurate computation of a human accessing specific resources.

Hardware is another area where ZKPs will make an impact. Similar to how graphics cards in the 1990s revolutionized the video game industry, zero-knowledge hardware acceleration will be integrated into computers to enhance efficiency.

ZKPs can also be utilized to balance storage and computation securely. For instance, security cameras generate vast amounts of data. ZKPs can provide a compact proof that AI models did not detect any critical information in the video, allowing the system to delete the footage and save storage space.

ZKPs will even be used for national security purposes. As energy production shifts from centralized power plants to distributed sources like solar panels and wind turbines, verifying the proper execution of software on their controllers becomes vital. In the coming decades, ZKPs will play a crucial role in securing these devices.
Software industry regulations are inevitable, and industries such as online casinos and ad networks using Real-Time Bidding protocols will be legally required to demonstrate that they have not deceived their clients. Laws protecting users from large tech corporations are already in place in Europe, partly due to concerns about data misuse by foreign powers to influence political campaigns. Requirements for secure storage and processing of encrypted data will become increasingly necessary. Fully Homomorphic Encryption (FHE), a technology akin to ZKPs, will be one of the tools utilized for this purpose. FHE enables computation on encrypted data, ensuring privacy. As FHE becomes more efficient and practical, most databases will integrate some FHE functionality, preventing administrators from accessing user data directly.

Zero-knowledge proofs (ZKPs), which generate evidence for a third party to confirm the accurate execution of a computation, and Fully Homomorphic Encryption (FHE), which enables calculations on encrypted data, will be combined with distributed systems algorithms that are capable of tolerating significant network failures similar to those employed by Bitcoin. Together they will be utilized to comply with regulations while creating trustless applications.

In the past decade, we have successfully launched applications serving dozens of millions of users. Leveraging our expertise, we are now dedicated to providing both technical and financial support to help others create startups focused on developing and implementing these vital technologies. As society grapples with the challenges of our rapidly evolving world, these innovations will prove to be indispensable.

---

## Series: Concrete

Concrete is a systems programming language designed for safe and predictable code, with semantics defined by a small core calculus formalized and proven sound in Lean. It combines linear types, static capability tracking, and region-scoped borrowing.
### The Concrete Programming Language: Systems Programming for Formal Reasoning

*Published: 2025-12-26*

> Concrete is a systems language proposal built for machine reasoning, aiming to combine low-level performance with formal verification, safety, and expressive design.

URL: https://federicocarrone.com/series/concrete/the-concrete-programming-language-systems-programming-for-formal-reasoning/

There's a tension at the heart of systems programming. We want languages expressive enough to build complex systems, yet simple enough to reason about with confidence. We want performance without sacrificing safety. We want the freedom to write low-level code and the guarantees that come from formal verification. Concrete is an attempt to resolve these tensions through commitment to a single organizing principle: **every design choice must answer the question, can a machine reason about this?**

## On This Specification

This document describes what we're building, not what we've finished building. The kernel formalization in Lean is ongoing work. Until that formalization is complete, this specification likely contains mistakes, ambiguities, and internal contradictions.

We state this not as an apology but as a feature of our approach. Most language specifications accumulate contradictions silently over years, edge cases where the spec says one thing and the implementation does another, or where two parts of the spec conflict in ways nobody noticed. These contradictions become load-bearing bugs that can never be fixed without breaking existing code.

By designing Concrete around a formally verified kernel from the start, we force these contradictions into the open. When we formalize a feature in Lean, the proof assistant will reject inconsistencies. Features that seem reasonable on paper will turn out to be unsound, and we'll have to redesign them. This is the point.
We'd rather discover that our linearity rules have a hole *before* a million lines of code depend on the broken behavior. The specification and the formalization will co-evolve. As we prove properties in Lean, we'll update this document. As we write this document, we'll discover what needs proving. The goal is convergence: eventually, this specification will be a human-readable projection of a machine-checked artifact.

### Stability Promise

The kernel is versioned separately from the surface language. Once the kernel reaches 1.0, it is frozen. New surface features must elaborate to existing kernel constructs. If a feature can't be expressed in the kernel, the feature doesn't ship.

## The Core Idea

Most languages treat verification as something bolted on after the fact. You write code, then maybe you write tests, maybe you run a linter, maybe you bring in a theorem prover for critical sections. The language itself remains agnostic about provability. Concrete inverts this relationship. The language is *designed around* a verified core, a small kernel calculus formalized in Lean 4 with mechanically checked proofs of progress, preservation, linearity soundness, and effect soundness. The surface language exists only to elaborate into this kernel.

### What "Correct" Means

When we say a type-checked program is "correct by construction," we mean correct with respect to specific properties:

- **Memory safety**: no use-after-free, no double-free, no dangling references
- **Resource safety**: linear values consumed exactly once, no leaks
- **Effect correctness**: declared capabilities match actual effects

We do not guarantee termination. Recursive functions may diverge. We do not guarantee liveness or deadlock freedom. These properties are outside the current verification scope. The kernel proves progress (well-typed programs don't get stuck) and preservation (types are maintained during evaluation), which together yield memory and resource safety, not total correctness.
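For readers unfamiliar with these two metatheorems, their conventional shape in a Lean 4 development looks roughly like the following. This is an illustrative sketch only, not Concrete's actual kernel formalization; `Term`, `Ty`, `HasType`, `Step`, and `IsValue` are placeholder names, and the proofs are elided.

```lean
-- Illustrative sketch: the classic progress/preservation statements for a
-- small-step kernel calculus, stated with placeholder names.

-- Progress: a closed, well-typed term is either a value or can take a step.
theorem progress {e : Term} {t : Ty} (h : HasType [] e t) :
    IsValue e ∨ ∃ e', Step e e' := by
  sorry

-- Preservation: taking a step does not change a term's type.
theorem preservation {e e' : Term} {t : Ty}
    (h : HasType [] e t) (hs : Step e e') : HasType [] e' t := by
  sorry
```

Together, statements of this shape give the usual type-safety argument: a well-typed program can always keep stepping (or is finished) and never reaches a stuck state.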
### The Trust Boundary

The kernel type system and its properties are mechanically checked in Lean. What remains trusted: the Lean proof checker itself, the elaborator (surface language to kernel), and the code generator (kernel to machine code). Verifying the elaborator and code generator is future work.

## Design Principles

From this kernel-centric design, six principles follow:

1. **Pure by default** — Functions without capability annotations are pure: no side effects, no allocation
2. **Explicit capabilities** — All effects tracked in function signatures
3. **Linear by default** — Values consumed exactly once unless marked `Copy`
4. **No hidden control flow** — All function calls, cleanup, and allocation visible in source
5. **Fits in your head** — Small enough for one person to fully understand
6. **LL(1) grammar** — Parseable with single token lookahead, no ambiguity

## The Compilation Pipeline

```
Source Code (.concrete)
        ↓
Lexer/Parser (LL(1) recursive descent)
        ↓
Surface AST
        ↓
Elaboration
  - Type checking
  - Linearity checking
  - Capability checking
  - Borrow/region checking
  - Defer insertion points
  - Allocator binding resolution
        ↓
Kernel IR (core calculus)
        ↓
Kernel Checker  ← proven sound in Lean
        ↓
Code Generation
        ↓
Machine Code
```

The kernel checkpoint is the semantic gate. Everything before it transforms syntax; everything after it preserves meaning.

## No Hidden Control Flow

When you read Concrete code, what you see is what executes.

**No implicit function calls.** Operators are not overloaded methods. `a + b` on integers is primitive addition, not a call to `Add::add`. There are no implicit conversions that invoke code.

**No implicit destruction.** The compiler never inserts destructor calls. Rust's RAII inserts `drop()` at scope boundaries invisibly; Concrete requires you to write `defer destroy(x)`. You see the cleanup in the source.

**No implicit allocation.** Allocation happens when you call a function with `Alloc` capability bound to an allocator.
String concatenation, collection growth, closure capture—if they allocate, you see `with(Alloc)` in the signature.

**No invisible error handling.** Errors propagate only where `?` appears. There are no exceptions unwinding the stack behind your back.

The cost is verbosity. The benefit is that reading the code tells you what the code does. For auditing, debugging, and formal verification, this tradeoff is correct.

## Types

### Primitives

```
Bool
Int, Int8, Int16, Int32, Int64
Uint, Uint8, Uint16, Uint32, Uint64
Float32, Float64
Char, String
Unit
```

### Algebraic Data Types

```
type Option<T> { Some(T), None }
type Result<T, E> { Ok(T), Err(E) }
type List<T> { Nil, Cons(T, List<T>) }
```

### Records

```
type Point { x: Float64, y: Float64 }
```

### Standard Library Types

For domains where precision matters, the standard library includes:

- **Decimal**: fixed-point decimal arithmetic for financial calculations
- **BigInt**: arbitrary-precision integers
- **BigDecimal**: arbitrary-precision decimals

These avoid floating-point representation errors in financial systems and cryptographic applications.

## Linearity and Copy

All values in Concrete are linear by default. A linear value must be consumed exactly once, not zero times (that's a leak), not twice (that's a double-free). This is closer to Austral's strict linearity than Rust's affine types, which allow values to be dropped without explicit consumption. Consumption happens when you pass the value to a function that takes ownership, return it, destructure it via pattern matching, or explicitly call `destroy(x)`.

```
fn example!() {
    let f = open("data.txt")
    defer destroy(f)
    let content = read(&f)
    // destroy(f) runs here because of defer
}
```

If `f` isn't consumed on all paths, the program is rejected. If you try to use `f` after moving it, the program is rejected. This is compile-time enforcement, not runtime checking.

### The Copy Marker

Some types escape linear restrictions. The rules for `Copy` are:

1.
**Copy is explicit and opt-in.** You must mark a type as `Copy`; it is never inferred.
2. **Copy is structural.** A type can be `Copy` only if all its fields are `Copy`.
3. **Copy types cannot have destructors.** If a type defines `destroy`, it cannot be `Copy`.
4. **Copy types cannot contain linear fields.** A `Copy` record with a `File` field is rejected.

```
type Copy Point { x: Float64, y: Float64 }
```

The primitive numeric types and `Bool` are built-in `Copy` types. `String` is linear. For generic types, linearity depends on the type parameter: `Option<Int>` is `Copy` because `Int` is; `Option<File>` is linear because `File` is. `Copy` is not an escape hatch from thinking about resources. It's a marker for types that have no cleanup requirements and can be freely duplicated.

### Destructors

A linear type may define a destructor:

```
type File { handle: FileHandle }

destroy File with(File) {
    close_handle(self.handle)
}
```

The destructor takes ownership of `self`, may require capabilities, and runs exactly once when explicitly invoked. `destroy(x)` is only valid if the type defines a destructor. A type without a destructor must be consumed by moving, returning, or destructuring.

### Defer

The `defer` statement schedules cleanup at scope exit, borrowed directly from Zig and Go:

```
fn process_files!() {
    let f1 = open("a.txt")
    defer destroy(f1)
    let f2 = open("b.txt")
    defer destroy(f2)
    // When scope exits:
    // 1. destroy(f2) runs
    // 2. destroy(f1) runs
}
```

Multiple `defer` statements execute in reverse order (LIFO). `defer` runs at scope exit including early returns and error propagation.

### Defer Reserves the Value

When a value is scheduled with `defer destroy(x)`, it becomes reserved. The rules:

1. After `defer destroy(x)`, you cannot move `x`
2. After `defer destroy(x)`, you cannot destroy `x` again
3. After `defer destroy(x)`, you cannot `defer destroy(x)` again
4.
After `defer destroy(x)`, you cannot create borrows of `x` that might overlap the deferred destruction point

The value is still owned by the current scope until exit, but it is no longer available for use. This prevents double destruction and dangling borrows.

## Borrowing

Borrowing defers consumption without extending lifetime or weakening linearity. References let you use values without consuming them. Concrete's borrowing model draws from Rust but simplifies it: references exist within lexical regions that bound their lifetime, with no lifetime parameters in function signatures.

```
borrow f as fref in R {
    // fref has type &[File, R]
    // f is unusable in this block
    let len = length(fref)
}
// f is usable again
```

Functions that accept references are generic over the region, but implicitly:

```
fn length(file: &[File, R]) -> Uint { ... }
```

The function cannot store the reference because it cannot name `R` outside the call. For single-expression borrows, the region is anonymous:

```
let len = length(&f)  // borrows f for just this call
```

### Borrowing Rules

1. While borrowed, the original is unusable
2. Multiple immutable borrows allowed
3. Mutable borrows exclusive: one `&mut T` at a time, no simultaneous `&T`
4. References cannot escape their region
5. Nested borrows of the same owned value forbidden
6. Derived references can't outlive the original's region

Closures cannot capture references if the closure escapes the borrow region. This ensures references never outlive their lexical scope.

## Capabilities

### What Capabilities Are

A capability is a static permission annotation on a function. It declares which effects the function may perform. Capabilities are not runtime values—they cannot be created, passed, stored, or inspected at runtime. They exist only at the type level, checked by the compiler and erased before execution. Capabilities are predefined names. Users cannot define new capabilities or create composite capability types.
Function signatures may combine predefined capabilities using `+` in the `with()` clause, but only among names exported by the platform's capability universe. The set of capabilities is fixed per target platform and finite—user code cannot extend it. There is no capability arithmetic, no capability inheritance, no way to forge a capability your caller didn't have. Capabilities that can be manufactured at runtime aren't capabilities—they're tokens.

### Purity

Concrete is **pure by default**, following Austral's approach to effect tracking. A function without capability annotations cannot perform IO, cannot allocate, cannot mutate external state. It computes a result from its inputs, nothing more.

Purity in Concrete has a precise definition: a function is pure if and only if it declares no capabilities and does not require `Alloc`. Equivalently, purity means an empty capability set. Pure functions may use stack allocation and compile-time constants—these are not effects. Pure functions may diverge—termination is orthogonal to purity. A non-terminating function that performs no IO and touches no heap is still pure. This separates effect-freedom (what Concrete tracks) from totality (which Concrete does not guarantee). Purity enables equational reasoning: a pure function called twice with the same arguments yields the same result. Totality would enable stronger claims about program termination, but enforcing it would require restricting recursion, which conflicts with systems programming.

When a function needs effects, it declares them:

```
fn read_file(path: String) with(File) -> String { ... }
fn process_data() with(File, Network, Alloc) -> Result { ... }
```

Capabilities propagate monotonically. If `f` calls `g`, and `g` requires `File`, then `f` must declare `File` too. No implicit granting, no ambient authority. The compiler enforces this transitively.

### The Std Capability

For application entry points, Concrete provides a shorthand.
The `!` suffix declares the `Std` capability:

```
fn main!() {
    println("Hello")
}
```

This desugars to `fn main() with(Std)`. `Std` includes file operations, network, clock, environment, random, process control, console IO, and allocation, but excludes `Unsafe`.

**Library code should prefer explicit capability lists.** This is a social convention, not a mechanical enforcement. The compiler won't reject a library function that uses `Std`. But explicit capabilities make dependencies auditable. `Std` is a convenience for applications, not a license for libraries.

### Security Model

Capabilities don't sandbox code. If a dependency declares `with(Network)`, it gets network access. What capabilities provide is **auditability**. You can grep for `with(Network)` and find every function that touches the network. You can verify that your JSON parser has no capabilities. You can review dependency updates by diffing capability declarations.

### Capability Polymorphism

Currently, you cannot be generic over capability sets:

```
// Not allowed
fn map<T, U>(list: List<T>, f: fn(T) with(C) -> U) with(C) -> List<U>
```

Each capability set must be concrete. This means generic combinators must be duplicated per capability set. Capability polymorphism is future work; the theory is well understood (effect polymorphism in Koka, Eff, and Frank), but implementing it adds complexity to the type system and the Lean formalization.

### Parametricity

Generic functions cannot accidentally become effectful depending on instantiation. A function `fn map<T, U>(list: List<T>, f: fn(T) -> U) -> List<U>` is pure regardless of what `T` and `U` are. If `f` requires capabilities, that must be declared in the signature.

Capabilities are checked before monomorphization. When generic code is specialized to concrete types, capability requirements don't change. A pure generic function stays pure at every instantiation.

## Allocation

Allocation deserves special attention because it's often invisible.
In most languages, many operations allocate behind your back: string concatenation, collection growth, closure creation. Concrete treats allocation as a capability, with explicit allocator passing inspired by Zig. Functions that allocate declare `with(Alloc)`. The call site binds which allocator:

```
fn main!() {
    let arena = Arena.new()
    defer arena.deinit()
    let list = create_list() with(Alloc = arena)
    push(&mut list, 42) with(Alloc = arena)
}
```

Inside `with(Alloc)`, the bound allocator propagates through nested calls. At the boundary, you see exactly where allocation happens and which allocator serves it.

Stack allocation does not require `Alloc`:

```
fn example() {
    let x: Int = 42                  // stack
    let arr: [100]Uint8 = zeroed()   // stack
}
```

Allocation-free code is provably allocation-free.

### Allocator Binding Scope

Allocator binding is lexically scoped. A binding applies within the static extent of the call being evaluated and any nested calls that require `with(Alloc)`. A nested binding may shadow an outer binding:

```
fn outer() with(Alloc) {
    inner()                        // uses outer binding
    inner() with(Alloc = arena2)   // shadows within this call
}
```

Closures capture allocator bindings only if the closure is invoked within the lexical scope where the binding is in effect. If a closure escapes that scope, it cannot rely on an implicit allocator binding and must instead accept an explicit allocator parameter or be rejected by the type checker.

### Allocator Types

```
// General-purpose heap allocator
let gpa = GeneralPurposeAllocator.new()
defer gpa.deinit()

// Arena allocator, free everything at once
let arena = Arena.new(gpa)
defer arena.deinit()

// Fixed buffer allocator, no heap
let buf: [1024]Uint8 = zeroed()
let fba = FixedBufferAllocator.new(&buf)
```

All allocators implement a common `Allocator` trait.
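The allocator types combine with call-site binding. A hedged sketch of the intended usage (`build_index` and `parse_header` are hypothetical stand-ins for library code that requires `with(Alloc)`):

```
fn main!() {
    // Long-lived data goes through the general-purpose allocator
    let gpa = GeneralPurposeAllocator.new()
    defer gpa.deinit()
    let index = build_index() with(Alloc = gpa)
    defer destroy(index)    // linear value: must be consumed exactly once

    // Short-lived scratch work uses an arena backed by the gpa;
    // arena.deinit() releases everything the arena served in one step
    let arena = Arena.new(gpa)
    defer arena.deinit()
    let header = parse_header() with(Alloc = arena)
    defer destroy(header)
}
```

Each `with(Alloc = ...)` boundary makes visible which allocator serves the allocations inside that call, and linearity still requires the returned values to be consumed.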
### Allocator Interface

```
trait Allocator {
    fn alloc<T>(&mut self, count: Uint) -> Result<&mut [T], AllocError>
    fn free<T>(&mut self, ptr: &mut [T])
    fn realloc<T>(&mut self, ptr: &mut [T], new_count: Uint) -> Result<&mut [T], AllocError>
}
```

The interface is minimal. `alloc` returns a mutable slice or fails. `free` releases memory. `realloc` resizes in place or relocates. All three take `&mut self`—allocators are stateful resources, not ambient services.

Custom allocators implement this trait. The standard library allocators (`GeneralPurposeAllocator`, `Arena`, `FixedBufferAllocator`) are implementations, not special cases.

## Error Handling

Errors are values, using `Result` like Rust and the `?` operator for propagation:

```
fn parse(input: String) -> Result { ... }

fn load_config!() -> Result {
    let f = open("config.toml")?
    defer destroy(f)
    let content = read(&f)
    let config = parse(content)?
    Ok(config)
}
```

The `?` operator propagates errors. When `?` triggers an early return, all `defer` statements in scope run first. Cleanup happens even on error paths. No exceptions. No panic.

### Abort

Abort is immediate process termination, outside normal control flow and outside the language's semantic model. Following Zig's approach:

- Out-of-memory conditions trigger abort
- Stack overflow triggers abort
- Explicit `abort()` terminates immediately
- **Deferred cleanup does not run on abort**

`defer` is for normal control flow, not catastrophic failure. Abort is the escape hatch when `Result` isn't enough. The process stops. There are no guarantees about state after abort begins.

## What You're Giving Up

Concrete is not a general-purpose language. It's for code that must be correct: cryptographic implementations, financial systems, safety-critical software, blockchain infrastructure.

**No garbage collection.** Memory is managed through linear types and explicit destruction. No GC pauses, no unpredictable latency, no hidden memory pressure.
**No implicit allocation.** Allocation requires the `Alloc` capability. Grepping for `with(Alloc)` finds every function that might touch the heap.

**No interior mutability.** All mutation flows through `&mut` references. An immutable reference `&T` guarantees immutability, no hidden mutation behind an immutable facade. This forbids patterns like shared caches and memoization behind shared references. If you need a cache, pass `&mut`. If you need lazy initialization, initialize before borrowing. For advanced patterns that genuinely require interior mutability, the standard library provides `UnsafeCell` gated by the `Unsafe` capability.

**No reflection, no eval, no runtime metaprogramming.** All code paths are determined at compile time. There is no way to inspect types at runtime, call methods by name dynamically, or generate code during execution. If macros are added in a future version, they will be constrained to preserve the "can a machine reason about this?" principle:

- **Hygienic** — no accidental variable capture
- **Phase-separated** — macro expansion completes before type checking
- **Syntactic** — macros transform syntax trees, not strings
- **Capability-tracked** — procedural macros that execute arbitrary code at compile time will require capability annotations, extending effect tracking to the compile-time phase

**No implicit global state.** All global interactions (file system, network, clock, environment) are mediated through capabilities.

**No variable shadowing.** Each variable name is unique within its scope.

**No null.** Optional values use `Option`.

**No undefined behavior in safe code.** Kernel semantics are fully defined and proven sound. The `Unsafe` capability explicitly reintroduces the possibility of undefined behavior for FFI and low-level operations.

**No concurrency primitives.** The language provides no threads, no async/await, no channels. Concurrency is a library concern.
This may change, but any future concurrency model must preserve determinism and linearity, likely through structured or deterministic concurrency. This is a design constraint, not an open question.

### Anti-Features Summary

| Concrete does not have | Rationale |
|------------------------|-----------|
| Garbage collection | Predictable latency, explicit resource management |
| Hidden control flow | Auditability, debuggability |
| Hidden allocation | Performance visibility, allocator control |
| Interior mutability | Simple reasoning, verification tractability |
| Reflection / eval | Static analysis, all paths known at compile time |
| Global mutable state | Effect tracking, reproducibility |
| Variable shadowing | Clarity, fewer subtle bugs |
| Null | Type safety via `Option` |
| Exceptions | Errors as values, explicit propagation |
| Implicit conversions | No silent data loss or coercion |
| Function overloading | Except through traits with explicit bounds |
| Uninitialized variables | Memory safety |
| Macros | Undecided; if added, will be hygienic and capability-aware |
| Concurrency primitives | Undecided; must preserve linearity and determinism |
| Undefined behavior (in safe code) | Kernel semantics fully defined |

## Pattern Matching

Exhaustive pattern matching with linear type awareness:

```
fn describe(opt: Option<Int>) -> String {
    match opt {
        Some(n) => format("Got {}", n),
        None => "Nothing"
    }
}
```

Linear types in patterns must be consumed:

```
fn handle!(result: Result) {
    match result {
        Ok(data) => use_data(data),
        Err(f) => destroy(f)
    }
}
```

Borrowing in patterns:

```
fn peek(opt: &Option<Int>) -> Int {
    match opt {
        &Some(n) => n,
        &None => 0
    }
}
```

## Traits

Traits provide bounded polymorphism, similar to Rust's trait system:

```
trait Ord {
    fn compare(&self, other: &Self) -> Ordering
}

trait Show {
    fn show(&self) -> String
}

fn sort<T: Ord>(list: List<T>) with(Alloc) -> List<T> { ...
}
```

### Receiver Modes and Linear Types

Trait methods take the receiver in one of three forms:

- `&self` — borrows the value immutably
- `&mut self` — borrows the value mutably
- `self` — takes ownership, consuming the value

If a trait method takes `self`, calling it consumes the value. This follows linear consumption rules:

```
trait Consume {
    fn consume(self)
}

fn use_once<T: Consume>(x: T) {
    x.consume()     // x is consumed here
    // x.consume()  // ERROR: x already consumed
}
```

A trait implementation for a linear type must respect the receiver mode. An `&self` method cannot consume the value. An `&mut self` method cannot let the value escape. A `self` method consumes it.

## Type Inference

Type inference is **local only**. Function signatures must be fully annotated. Inside bodies, local types may be inferred:

```
fn process(data: List<Int>) with(Alloc) -> List<Int> {
    let doubled = map(data, fn(x) { x * 2 })         // inferred
    let filtered = filter(doubled, fn(x) { x > 0 })  // inferred
    filtered
}
```

You can always understand a function's interface without reading its body.

## Modules

```
module FileSystem

public fn open(path: String) with(File) -> Result { ... }
public fn read(file: &[File, R]) with(File) -> String { ... }
private fn validate(path: String) -> Bool { ... }
```

Visibility is `public` or `private` (default).

### Capabilities as Public Contract

Capabilities are part of a function's signature and therefore part of the public API contract. Changing the required capability set of a public function is a breaking change. This applies in both directions: adding a capability requirement breaks callers who don't have it; removing one changes the function's guarantees.

When reviewing dependency updates, diff the capability declarations. A library that adds `with(Network)` to a function that previously had none is a significant change, even if the types remain identical.
```
import FileSystem
import FileSystem.{open, read, write}
import FileSystem as FS
```

## Unsafe and FFI

The `Unsafe` capability gates operations the type system cannot verify: foreign function calls, raw pointer operations, type transmutation, inline assembly, and linearity bypasses.

```
fn transmute(value: T) with(Unsafe) -> U
fn ptr_read(ptr: Address[T]) with(Unsafe) -> T
fn ptr_write(ptr: Address[T], value: T) with(Unsafe)
```

`Unsafe` propagates through the call graph like any other capability. Grep for `with(Unsafe)` to find all trust boundaries.

### Raw Pointers

Raw pointers exist for FFI and low-level memory manipulation:

```
Address[T]  // raw pointer to T
```

Raw pointers are `Copy`. They carry no lifetime information and no linearity guarantees. This is safe because:

- Creating a raw pointer is safe. `address_of(r)` extracts an address.
- Holding a raw pointer is safe. It's a number.
- Using a raw pointer requires `Unsafe`. Dereference, arithmetic, and casting are gated.

```
fn to_ptr(r: &T) -> Address[T] {
    address_of(r)   // safe
}

fn deref(ptr: Address[T]) with(Unsafe) -> T {
    ptr_read(ptr)   // unsafe: no guarantee ptr is valid
}
```

`Copy` does not imply usable. Raw pointers can be freely duplicated because they carry no guarantees. Safety is enforced at the point of use, not at the point of creation.

### Foreign Functions

Declare foreign functions with `Unsafe` and the `foreign` directive:

```
fn malloc(size: Uint) with(Unsafe) -> Address[Unit] = foreign("malloc")
fn free(ptr: Address[Unit]) with(Unsafe) = foreign("free")
```

The compiler generates calling convention glue and links the symbol. Foreign signatures are restricted to C-compatible types. Details of the type mapping are deferred to a future FFI specification.

## Implementation

### Determinism

Concrete aims for **bit-for-bit reproducible builds**: same source + same compiler = identical binary. No timestamps, random seeds, or environment-dependent data in output.
For debugging, **deterministic replay**: random generation requires `Random` with an explicit seed, system time requires `Clock`. Same inputs produce identical execution.

### The Grammar

LL(1). Every parsing decision is made with a single token of lookahead. No ambiguity, no backtracking. This is a permanent design constraint, not an implementation detail. Future language evolution is bounded by what LL(1) can express. We accept this constraint for tooling simplicity and error message quality.

### Compilation Targets

**Native** via MLIR/LLVM, **C** for portability, **WebAssembly** for browser and edge. Cross-compilation is first-class.

### Tooling

Concrete ships with a package manager, formatter, linter, test runner, and REPL. These are part of the distribution, not external dependencies.

### Profiling and Tracing

Profiling and tracing are first-class:

- Built into the runtime, not bolted on
- Low overhead when disabled
- Structured output for tooling integration

Code is read more often than written, but executed more often than read. Performance visibility matters for systems programming.

## What You Can Say About Programs

If a program type-checks:

**"This function is pure."** No capabilities declared. No side effects, no IO, no allocation.

**"This resource is used exactly once."** Linear type. No leaks, no double-free, no use-after-free.

**"These are the only effects this code can perform."** The capability set is explicit and complete.

**"This code cannot escape the type system."** Unsafe operations require `with(Unsafe)`.

**"Allocation happens here, using this allocator."** The call site binds the allocator.

**"Cleanup happens here."** `defer destroy(x)` is visible.

**"This build is reproducible."** Same inputs, same binary.

Mechanical guarantees from a type system proven sound in Lean. Not conventions, proofs.
## Example

```
module Main

import FileSystem.{open, read, write}
import Parse.{parse_csv}

fn process_file(input_path: String, output_path: String) with(File, Alloc) -> Result {
    let in_file = open(input_path)?
    defer destroy(in_file)

    let content = read(&in_file)
    let data = parse_csv(content)?
    let output = transform(&data)

    let out_file = open(output_path)?
    defer destroy(out_file)

    write(&mut out_file, output)
    Ok(())
}

fn transform(data: &List) -> String { ... }

fn main!() {
    let arena = Arena.new()
    defer arena.deinit()

    match process_file("input.csv", "output.txt") with(Alloc = arena) {
        Ok(()) => println("Done"),
        Err(e) => println("Error: " + e.message())
    }
}
```

Everything is visible: resource acquisition, cleanup scheduling, error propagation, allocator binding.

## Influences

The kernel calculus is formalized in Lean 4. Coq could serve the same role; we chose Lean for its performance and active development.

Austral shaped the type system more than any other language. Linear types in Concrete are strict: every value must be consumed exactly once. Rust's affine types allow dropping values without explicit consumption; we don't. The capability system for effect tracking also comes from Austral.

From Rust: borrowing, traits, error handling, pattern matching. Concrete uses lexical regions instead of lifetime annotations, which simplifies the model but covers fewer cases. `Result` and the `?` operator are lifted directly.

Zig's influence shows in explicit allocator passing and defer. Zig functions that allocate take an allocator parameter; Concrete expresses the same idea through `with(Alloc)` and allocator binding at call sites.

Go had defer first. Go also shipped gofmt, which ended style debates by making one canonical format. We ship a formatter too.

The `!` syntax for impure functions comes from Roc. `fn main!()` marks impurity at a glance.

Koka, Eff, and Frank are the algebraic effects languages. Concrete's capabilities are a simplified version of their effect systems.
Capability polymorphism would bring us closer to their expressiveness; it's future work.

Haskell proved that pure-by-default is practical. Clean had uniqueness types (precursor to linear types) and purity before Haskell did. Cyclone pioneered region-based memory, the research line that led to Rust's lifetimes and our lexical regions. ATS showed linear types and theorem proving can coexist. Ada/SPARK proved formal verification works in production: avionics, rail, security-critical systems.

CompCert and seL4 established that you can mechanically verify real systems software. A verified C compiler and a verified microkernel. That's the standard we're aiming for.

These ideas work. We're combining them and proving the combination sound.

## Who Should Use This

Concrete trades convenience for explicitness, flexibility for auditability. Prototyping is slower. Some patterns become verbose. You'll miss interior mutability for certain data structures.

But for cryptographic primitives, consensus protocols, financial transaction systems, medical device firmware, the trade is worth it. Strong claims about program behavior, mechanically verified. A language you can trust the way you trust mathematics: not because someone promises it works, but because you can check the proof.
---

## Quick Reference

| Annotation | Meaning |
|------------|---------|
| `fn foo() -> T` | Pure function, no capabilities |
| `fn foo!() -> T` | Shorthand for `with(Std)` |
| `fn foo() with(C) -> T` | Requires capability set `C` |
| `with(Alloc)` | Function may allocate |
| `with(Alloc = a)` | Bind allocator `a` at call site |
| `T` | Linear type, consumed exactly once |
| `type Copy T` | Unrestricted type, freely duplicated |
| `&T` or `&[T, R]` | Immutable reference in region `R` |
| `&mut T` | Mutable reference |
| `Address[T]` | Raw pointer (unsafe to use) |
| `borrow x as y in R { }` | Explicit borrow with named region |
| `defer expr` | Run `expr` at scope exit |
| `destroy(x)` | Consume via destructor |
| `foreign("symbol")` | Foreign function binding |

---

## Appendix A: Standard Capabilities

The `Std` capability (accessed via `!`) bundles these individual capabilities:

| Capability | Gates |
|------------|-------|
| `File` | Open, read, write, close files. Directory operations. |
| `Network` | Sockets, HTTP, DNS resolution. |
| `Alloc` | Heap allocation. Requires allocator binding at call site. |
| `Clock` | System time, monotonic time, sleep. |
| `Random` | Random number generation. Requires explicit seed for reproducibility. |
| `Env` | Environment variables, command line arguments. |
| `Process` | Spawn processes, exit codes, signals. |
| `Console` | Stdin, stdout, stderr. |

Capabilities not in `Std`:

| Capability | Gates |
|------------|-------|
| `Unsafe` | Raw pointer operations, FFI calls, transmute, inline assembly. Never implicit. |

A function with no capability annotation is pure. A function with `!` has access to everything except `Unsafe`. A function with explicit `with(File, Alloc)` has exactly those capabilities.

Capabilities are not hierarchical. `Std` is a shorthand for a set, not a super-capability. You cannot request "half of Std."
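The three cases above look like this in source (a hedged sketch; function names and bodies are illustrative):

```
// No capability annotation: pure
fn checksum(data: &String) -> Uint { ... }

// Explicit list: exactly these capabilities, preferred for libraries
fn load(path: String) with(File, Alloc) -> Result { ... }

// Std shorthand: all standard capabilities, never Unsafe
fn main!() { ... }   // desugars to fn main() with(Std)
```

Auditing follows directly: a reviewer can trust `checksum` without reading its body, while `load` advertises exactly which effects need scrutiny.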
---

## Appendix B: Open Questions

These are unresolved design decisions:

**Concurrency**

No concurrency primitives exist. The language is currently single-threaded. Any future model must preserve:

- Linearity (no data races from aliasing)
- Determinism (reproducible execution)
- Effect tracking (concurrency as capability)

Candidates: structured concurrency (like Trio/libdill), deterministic parallelism (like Haskell's `par`), actor model with linear message passing. Not decided.

**Capability Polymorphism**

Currently impossible:

```
fn map<T, U>(list: List<T>, f: fn(T) with(C) -> U) with(C) -> List<U>
```

This forces duplicating combinators for each capability set. The theory exists (Koka, Eff, Frank), but adds complexity. Open question: is the duplication acceptable, or do we need polymorphism?

**Effect Handlers**

Capabilities track effects but don't handle them. Full algebraic effects would allow:

```
fn with_mock_filesystem(f: fn() with(File) -> T) -> T {
    handle File in f() {
        open(path) => resume(MockFile.new(path))
        read(file) => resume(mock_data)
    }
}
```

This enables testing, sandboxing, effect interception. Significant implementation complexity. Not committed.

**Module System**

Current design is minimal. Open questions:

- Parameterized modules (functors)?
- Module-level capability restrictions?
- Visibility modifiers beyond public/private?
- Separate compilation units?

**FFI Type Mapping**

The spec says "C-compatible types" without defining them. Need to specify:

- Integer mappings (is `Int` C's `int` or `intptr_t`?)
- Struct layout guarantees
- Calling conventions
- Nullable pointer representation
- String encoding at boundaries

**Variance**

Generic types have variance implications. `List<T>` is covariant in `T`. `fn(T) -> U` is contravariant in `T`. The spec doesn't address this. For linear types, variance interacts with consumption. Needs formalization.

**Macros**

No macro system.
Options:

- None (keep it simple)
- Hygienic macros (Scheme-style)
- Procedural macros (Rust-style)
- Compile-time evaluation (Zig-style comptime)

Procedural macros would need capability restrictions. Not decided.

---

## Appendix C: Glossary

**Affine type**: A type whose values can be used at most once. Rust's ownership model is affine: you can drop a value without consuming it.

**Capability**: A static, compile-time permission to perform an effect, declared in a function's signature. Functions declare required capabilities; callers must have them. Capabilities propagate: if `f` calls `g`, and `g` needs `File`, then `f` needs `File`.

**Consumption**: Using a linear value in a way that fulfills its "exactly once" obligation. Methods of consumption: pass to a function taking ownership, return, destructure via pattern match, call `destroy()`.

**Copy type**: A type exempt from linearity. Values can be duplicated freely. Must be explicitly marked. Cannot have destructors or linear fields.

**Destruction**: Consuming a linear value by invoking its destructor. `destroy(x)` calls the type's destructor and consumes `x`.

**Effect**: An observable interaction with the world outside pure computation: IO, allocation, mutation, non-determinism. Concrete tracks effects through capabilities.

**Elaboration**: The compiler phase that transforms surface syntax into kernel IR. Type checking, linearity checking, and capability checking happen here.

**Kernel**: The core calculus formalized in Lean. A small language with mechanically verified properties. The surface language elaborates into it.

**Lexical region**: A scope that bounds reference lifetimes. References created in a region cannot escape it. Unlike Rust's lifetime parameters, regions are always lexical and are never declared as explicit signature parameters.

**Linear type**: A type whose values must be used exactly once. Not zero (leak), not twice (double-use). Concrete's default.

**Purity**: Absence of effects.
A pure function computes a result from its inputs without IO, allocation, or mutation. In Concrete, functions without capability annotations are pure.

**Raw pointer**: An `Address[T]` value. Carries no lifetime or linearity information. Safe to create and hold; unsafe to use.

**Reference**: A borrowed view of a value. `&T` for immutable, `&mut T` for mutable. The original value is inaccessible while borrowed.

**Region**: See lexical region.

**Std**: The standard capability set. Shorthand for `File`, `Network`, `Alloc`, `Clock`, `Random`, `Env`, `Process`, `Console`. Excludes `Unsafe`.

**Unsafe**: The capability that permits operations the type system cannot verify: FFI, raw pointer dereference, transmute.

---

## Appendix D: Comparison Table

| Feature | Concrete | Rust | Zig | Austral | Go |
|---------|----------|------|-----|---------|-----|
| Memory safety | Linear types | Ownership + borrow checker | Runtime checks (optional) | Linear types | GC |
| Linearity | Strict (exactly once) | Affine (at most once) | None | Strict | None |
| GC | None | None | None | None | Yes |
| Effect tracking | Capabilities | None | None | Capabilities | None |
| Pure by default | Yes | No | No | Yes | No |
| Explicit allocation | Capability + binding | Global allocator | Allocator parameter | No | GC |
| Null | None (`Option<T>`) | None (`Option<T>`) | Optional (`?T`) | None (`Option[T]`) | Yes (`nil`) |
| Exceptions | None | Panic (discouraged) | None | None | Panic |
| Error handling | `Result` + `?` | `Result` + `?` | Error unions | `Result` | Multiple returns |
| Lifetime annotations | None (lexical regions) | Yes | None | None | N/A (GC) |
| Formal verification | Kernel in Lean | External tools | None | None | None |
| Defer | Yes | No (RAII) | Yes | No | Yes |
| Interior mutability | None | `Cell`, `RefCell`, etc. | Pointers | None | Unrestricted |
| Unsafe escape hatch | `with(Unsafe)` | `unsafe` blocks | No safe/unsafe distinction | `Unsafe_Module` | No distinction |
| Macros | None (undecided) | Procedural + declarative | Comptime | None | None |
| Concurrency | None (undecided) | `async`, threads, channels | Threads, async | None | Goroutines, channels |
| Formatter | Ships with language | rustfmt (separate) | zig fmt | None | gofmt |
| Grammar | LL(1) | Complex | Simple | Simple | LALR |

---

## Appendix E: Error Messages

These are representative error messages. The actual compiler may differ.

**Linearity violation: value not consumed**

```
error[E0201]: linear value `f` is never consumed
 --> src/main.concrete:4:9
  |
4 |     let f = open("data.txt")
  |         ^ this value has type `File` which is linear
  |
  = help: linear values must be consumed exactly once
  = help: add `defer destroy(f)` or pass `f` to a function that takes ownership
```

**Linearity violation: value consumed twice**

```
error[E0202]: value `f` consumed twice
 --> src/main.concrete:7:12
  |
5 |     let content = read_all(f)
  |                            - first consumption here
6 |
7 |     destroy(f)
  |             ^ second consumption here
  |
  = help: after passing `f` to `read_all`, you no longer own it
```

**Borrow escape**

```
error[E0301]: reference cannot escape its region
 --> src/main.concrete:3:12
  |
2 |     borrow data as r in R {
  |                         --- region R starts here
3 |         return r
  |                ^ cannot return reference with region R
4 |     }
  |     - region R ends here
  |
  = help: references are only valid within their borrow region
```

**Missing capability**

```
error[E0401]: function requires capability `Network` which is not available
 --> src/main.concrete:8:5
  |
8 |     fetch(url)
  |     ^^^^^^^^^^ requires `Network`
  |
  = note: the current function has capabilities: {File, Alloc}
  = help: add `Network` to the function's capability declaration:
  |
2 | fn process(url: String) with(File, Alloc, Network) -> Result
  |                                           +++++++++
```

**Capability leak through closure**

```
error[E0402]: closure captures capability
`File` but escapes its scope
 --> src/main.concrete:5:18
  |
5 |     let handler = fn() { read(&config_file) }
  |                   ^^^^ this closure requires `File`
6 |     return handler
  |            ------- closure escapes here
  |
  = help: closures that escape cannot capture capabilities
  = help: pass the file as a parameter instead
```

**Mutable borrow conflict**

```
error[E0302]: cannot borrow `data` as immutable because it is already borrowed as mutable
 --> src/main.concrete:4:17
  |
3 |     borrow mut data as m in R {
  |                - mutable borrow occurs here
4 |         let len = length(&data)
  |                          ^^^^^ immutable borrow attempted here
  |
  = help: mutable borrows are exclusive; no other borrows allowed
```

**Unsafe operation without capability**

```
error[E0501]: operation requires `Unsafe` capability
 --> src/main.concrete:6:5
  |
6 |     ptr_read(addr)
  |     ^^^^^^^^^^^^^^ unsafe operation
  |
  = note: reading from raw pointers may cause undefined behavior
  = help: add `Unsafe` to the function's capabilities:
  |
2 | fn dangerous(addr: Address[Int]) with(Unsafe) -> Int
  |                                  ++++++++++++
```

---

## References

### Languages

- [Lean 4](https://lean-lang.org/) — theorem prover and programming language
- [Austral](https://austral-lang.org/) — linear types and capabilities for systems programming
  - [Specification](https://austral-lang.org/spec/spec.html)
- [Rust](https://www.rust-lang.org/) — ownership, borrowing, traits
- [Zig](https://ziglang.org/) — explicit allocators, defer, no hidden control flow
- [Roc](https://www.roc-lang.org/) — pure functional, `!` for effects
- [Koka](https://koka-lang.github.io/koka/doc/index.html) — algebraic effects and handlers
- [Eff](https://www.eff-lang.org/) — algebraic effects research language
- [Frank](https://github.com/frank-lang/frank) — effects as calling conventions
- [Clean](https://clean.cs.ru.nl/) — uniqueness types, pure by default
- [ATS](https://www.ats-lang.org/) — linear types with theorem proving
- [Cyclone](https://cyclone.thelanguage.org/) — region-based memory for C
- 
[Ada/SPARK](https://www.adacore.com/about-spark) — formal verification in systems programming

### Verified Systems

- [CompCert](https://compcert.org/) — verified C compiler
- [seL4](https://sel4.systems/) — verified microkernel
- [CertiKOS](https://flint.cs.yale.edu/certikos/) — verified concurrent OS kernel
- [Iris](https://iris-project.org/) — higher-order concurrent separation logic

### Papers

- [Linear Logic](https://homepages.inf.ed.ac.uk/wadler/topics/linear-logic.html) — Wadler's papers on linear logic
- [Linearity and Uniqueness: An Entente Cordiale](https://granule-project.github.io/papers/esop22-paper.pdf) — linear vs unique vs affine
- [Cyclone](https://cyclone.thelanguage.org/) — region-based memory safety for C
- [An Introduction to Algebraic Effects and Handlers](https://www.eff-lang.org/handlers-tutorial.pdf) — Matija Pretnar
- [Capability Myths Demolished](https://srl.cs.jhu.edu/pubs/SRL2003-02.pdf) — what capabilities actually provide
- [The Next 700 Programming Languages](https://www.cs.cmu.edu/~crary/819-f09/Landin66.pdf) — Landin's classic
- [Hints on Programming Language Design](https://www.cs.yale.edu/flint/cs428/doc/HintsPL.pdf) — Tony Hoare
- [Growing a Language](https://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf) — Guy Steele
- [RustBelt: Securing the Foundations of the Rust Programming Language](https://plv.mpi-sws.org/rustbelt/popl18/paper.pdf) — formal verification of Rust
- [Stacked Borrows: An Aliasing Model for Rust](https://plv.mpi-sws.org/rustbelt/stacked-borrows/paper.pdf) — Ralf Jung on aliasing
- [Ownership is Theft: Experiences Building an Embedded OS in Rust](https://patpannuto.com/pubs/levy15ownership.pdf) — Tock OS
- [A History of Haskell: Being Lazy with Class](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf) — design decisions
- [Why Functional Programming Matters](https://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf) — John Hughes
- [On the Criteria To Be Used in Decomposing Systems into Modules](https://www.win.tue.nl/~wstomv/edu/2ip30/references/criteria_for_modularization.pdf) — Parnas
- [Clean](https://clean.cs.ru.nl/) — uniqueness types specification
- [Using Lightweight Formal Methods to Validate a Key-Value Storage Node](https://www.amazon.science/publications/using-lightweight-formal-methods-to-validate-a-key-value-storage-node-in-amazon-s3) — practical verification at AWS

### Blog Posts

**Language Design Philosophy**

- [Worse is Better](https://www.dreamsongs.com/RiseOfWorseIsBetter.html) — Richard Gabriel on simplicity vs correctness
- [The Hundred-Year Language](https://paulgraham.com/hundred.html) — Paul Graham
- [Less is more: language features](https://blog.ploeh.dk/2015/04/13/less-is-more-language-features/) — Mark Seemann on constraints
- [Out of the Tar Pit](https://curtclifton.net/papers/MosessonClifton06.pdf) — Moseley and Marks on complexity
- [Design Principles Behind Smalltalk](https://www.cs.virginia.edu/~evans/cs655/readings/smalltalk.html) — Dan Ingalls
- [What to Know Before Debating Type Systems](https://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/) — Chris Smith
- [Execution in the Kingdom of Nouns](https://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html) — Steve Yegge

**Austral and Linear Types**

- [Introducing Austral](https://borretti.me/article/introducing-austral) — Fernando Borretti's rationale
- [How Austral's Linear Type Checker Works](https://borretti.me/article/how-australs-linear-type-checker-works) — implementation decisions
- [How Capabilities Work in Austral](https://borretti.me/article/how-capabilities-work-austral) — how capabilities compose
- [Type Systems for Memory Safety](https://borretti.me/article/type-systems-memory-safety) — survey of approaches
- [Linear types can change the world!](https://homepages.inf.ed.ac.uk/wadler/topics/linear-logic.html) — Wadler
- [Retrofitting Linear Types](https://www.tweag.io/blog/2017-03-13-linear-types/) — adding linear types to Haskell

**Rust Design Decisions**

- [The Problem with Single-threaded Shared Mutability](https://manishearth.github.io/blog/2015/05/17/the-problem-with-shared-mutability/) — why Rust forbids it
- [Rust: A unique perspective](https://limpet.net/mbrubeck/2019/02/07/rust-a-unique-perspective.html) — ownership from first principles
- [The Edition Guide: Non-Lexical Lifetimes](https://doc.rust-lang.org/edition-guide/rust-2018/ownership-and-lifetimes/non-lexical-lifetimes.html) — why Rust moved beyond lexical scopes
- [Polonius: the future of the borrow checker](https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/) — Niko Matsakis
- [After NLL: Moving from borrowed data](https://smallcultfollowing.com/babysteps/blog/2018/11/10/after-nll-moving-from-borrowed-data-and-the-sentinel-pattern/) — borrow checker limitations
- [Ralf Jung's Blog](https://www.ralfj.de/blog/) — Stacked Borrows, unsafe, formal semantics
- [Learn Rust With Entirely Too Many Linked Lists](https://rust-unofficial.github.io/too-many-lists/) — ownership through pain
- [The Rustonomicon](https://doc.rust-lang.org/nomicon/) — dark arts of unsafe Rust

**Graydon Hoare (Rust creator)**

- [The Rust I Wanted Had No Future](https://graydon2.dreamwidth.org/307105.html) — original vision
- [Not Rust](https://graydon2.dreamwidth.org/307291.html) — what Rust deliberately avoided
- [What next for compiled languages?](https://graydon2.dreamwidth.org/253769.html) — language evolution

**Zig Design Decisions**

- [Allocgate](https://github.com/ziglang/zig/issues/10052) — why Zig's allocator design changed
- [What is Zig's Comptime](https://kristoff.it/blog/what-is-zig-comptime/) — compile-time execution design

**Effects and Capabilities**

- [Algebraic Effects for the Rest of Us](https://overreacted.io/algebraic-effects-for-the-rest-of-us/) — Dan Abramov
- [What Color is Your Function?](https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/) — Bob Nystrom on effect tracking
- [Structured Concurrency](https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/) — Nathaniel Smith
- [Eff Programming Language](https://www.eff-lang.org/) — algebraic effects research language
- [Koka: Programming with Row-polymorphic Effect Types](https://koka-lang.github.io/koka/doc/book.html) — official book

**Roc and Purity**

- [Roc FAQ](https://www.roc-lang.org/faq) — official FAQ and design rationale
- [Why Roc Uses Platform/App Split](https://www.roc-lang.org/platforms) — effect isolation design

**Go Design Decisions**

- [Go at Google: Language Design in the Service of Software Engineering](https://go.dev/talks/2012/splash.article) — Rob Pike
- [Simplicity is Complicated](https://go.dev/talks/2015/simplicity-is-complicated.slide) — Rob Pike
- [Errors are values](https://go.dev/blog/errors-are-values) — Rob Pike
- [Go Proverbs](https://go-proverbs.github.io/) — design philosophy
- [Toward Go 2](https://go.dev/blog/toward-go2) — Russ Cox on language evolution

**Memory and Allocators**

- [Untangling Lifetimes: The Arena Allocator](https://www.rfleury.com/p/untangling-lifetimes-the-arena-allocator) — Ryan Fleury
- [Always Bump Downwards](https://fitzgeraldnick.com/2019/11/01/always-bump-downwards.html) — Nick Fitzgerald
- [Memory Allocation Strategies](https://www.gingerbill.org/series/memory-allocation-strategies/) — Bill Hall's series

**Error Handling Design**

- [The Error Model](https://joeduffyblog.com/2016/02/07/the-error-model/) — Joe Duffy on Midori's approach
- [Error Handling in a Correctness-Critical Rust Project](https://sled.rs/errors) — sled database

**Formal Methods in Practice**

- [Proofs About Programs](https://www.hillelwayne.com/post/theorem-prover-showdown/) — Hillel Wayne
- [Formal Methods Only Solve Half My Problems](https://brooker.co.za/blog/2022/06/02/formal.html) — Marc Brooker at 
AWS - [How AWS Uses Formal Methods](https://lamport.azurewebsites.net/tla/amazon-excerpt.html) — Amazon's TLA+ experience **Lean 4** - [Functional Programming in Lean](https://lean-lang.org/functional_programming_in_lean/) — official book - [Theorem Proving in Lean 4](https://lean-lang.org/theorem_proving_in_lean4/) — official book - [Metaprogramming in Lean 4](https://github.com/leanprover-community/lean4-metaprogramming-book) — macros and tactics **Type System Design** - [Parse, Don't Validate](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) — Alexis King - [Names Are Not Type Safety](https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/) — Alexis King - [Types as Axioms, or: Playing God with Static Types](https://lexi-lambda.github.io/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/) — Alexis King - [The Expression Problem](https://homepages.inf.ed.ac.uk/wadler/papers/expression/expression.txt) — Philip Wadler **Bob Harper** - [Dynamic languages are static languages](https://existentialtype.wordpress.com/2011/03/19/dynamic-languages-are-static-languages/) - [Modules matter most](https://existentialtype.wordpress.com/2011/04/16/modules-matter-most/) ### Talks - [Effects for Less](https://www.youtube.com/watch?v=0jI-AlWEwYI) — Alexis King on effects (essential) - [The Road to Zig 1.0](https://www.youtube.com/watch?v=Unq712gqu2U) — Andrew Kelley - [Is It Time to Rewrite the OS in Rust?](https://www.youtube.com/watch?v=HgtRAbE1nBM) — Bryan Cantrill - [Propositions as Types](https://www.youtube.com/watch?v=IOiZatlZtGU) — Philip Wadler - [Correctness by Construction](https://www.youtube.com/watch?v=nV3r1rB5_6E) — Derek Dreyer on RustBelt - [Simple Made Easy](https://www.infoq.com/presentations/Simple-Made-Easy/) — Rich Hickey - [Constraints Liberate, Liberties Constrain](https://www.youtube.com/watch?v=GqmsQeSzMdw) — Runar Bjarnason - [Growing a Language](https://www.youtube.com/watch?v=_ahvzDzKdB0) — Guy Steele 
(watch this) - [Why Algebraic Effects Matter](https://www.youtube.com/watch?v=7GcrT0SBSnI) — Daan Leijen - [Outperforming Imperative with Pure Functional Languages](https://www.youtube.com/watch?v=vzfy4EKwG_Y) — Richard Feldman - [Why Roc?](https://www.youtube.com/watch?v=cpQwtwVKAfU) — Richard Feldman - [Preventing the Collapse of Civilization](https://www.youtube.com/watch?v=pW-SOdj4Kkk) — Jonathan Blow on why new languages matter - [Ideas about a new programming language for games](https://www.youtube.com/watch?v=TH9VCN6UkyQ) — Jonathan Blow on Jai - [Linear Types for Low-latency, High-throughput Systems](https://www.youtube.com/watch?v=t0mhvd3-60Y) — Jean-Philippe Bernardy - [seL4 and Formal Verification](https://www.youtube.com/watch?v=Sj3b8Sltx1s) — Gernot Heiser ### Books - [Types and Programming Languages](https://www.cis.upenn.edu/~bcpierce/tapl/) — Pierce - [Practical Foundations for Programming Languages](https://www.cs.cmu.edu/~rwh/pfpl/) — Harper - [Certified Programming with Dependent Types](https://adam.chlipala.net/cpdt/) — Chlipala - [Software Foundations](https://softwarefoundations.cis.upenn.edu/) — interactive Coq textbook - [Programming Language Foundations in Agda](https://plfa.github.io/) — Wadler and Kokke - [The Little Typer](https://mitpress.mit.edu/9780262536431/the-little-typer/) — Friedman and Christiansen - [Crafting Interpreters](https://craftinginterpreters.com/) — Bob Nystrom --- ## Series: Ethereum Modern economic systems rest on two foundations: tools that expand productive capacity and institutions that define who controls their output. The internet transformed how information moves, but it did not reconstruct the institutional machinery that governs ownership and exchange. Digital economic life therefore expanded without a durable system of rights, enforcement, or jurisdiction. 
Blockchain networks, and Ethereum in particular, address this gap by embedding institutional functions in software and enforcing them through economic incentives and cryptographic verification. ### The new financial backend of the world *Published: 2025-12-09* > Ethereum is emerging as a neutral financial backend, lowering the cost of global financial services by encoding ownership and obligations in shared infrastructure. URL: https://federicocarrone.com/series/ethereum/the-new-financial-backend-of-the-world/ **By Federico Carrone and Roberto Catalan** Ethereum is emerging as a general purpose financial backend that reduces the cost and complexity of building financial services while improving their speed and security. For decades the internet accelerated communication but did not create a neutral system for defining ownership or enforcing obligations. Economic activity moved online without the accompanying machinery of rights, records, and jurisdiction. Ethereum fills this gap by embedding these functions in software and enforcing them through a distributed validator set. Markets depend on property rights, and property rights depend on reliable systems for recording ownership, supporting transfer, and enforcing obligations. Prices then communicate scarcity and preference, enabling coordination at scale. Technological progress has repeatedly lowered the cost of transmitting information and synchronizing action. Ethereum extends this pattern by lowering the cost of establishing and verifying ownership across borders. ## From internet native to global infrastructure Ethereum’s early innovation was the introduction of programmable digital assets with defined economic properties. Issuers could establish monetary rules, engineer scarcity, and integrate assets into applications. Before Ethereum, such experimentation required constructing a network and persuading others to secure it, a process limited to technically sophisticated groups. 
Ethereum replaced infrastructure duplication with shared security and a general purpose environment, turning issuance from a capital intensive undertaking into a software driven activity. The more consequential development has been the recognition that Ethereum can reconstruct traditional financial services in a form that is more transparent and less operationally burdensome. Financial institutions devote substantial resources to authorization, accounting, monitoring, dispute resolution, and reporting. Consumer interfaces sit atop complex internal systems designed to prevent error and misconduct. Ethereum substitutes a portion of this apparatus with a shared ledger updated in real time, a programmable execution environment, and cryptographic enforcement. Administrative complexity is reduced because core functions are delegated to software rather than replicated within each institution. Ethereum does not remove institutions but changes which parts of the financial stack they must build themselves. Issuance becomes simpler, custody more secure, and administration less dependent on proprietary infrastructure. ## Software, trust and the reduction of friction Some economists describe transaction costs through three frictions: triangulation, transfer and trust. Triangulation concerns how economic actors identify each other and agree on terms. Transfer concerns how value moves between them. Trust concerns the enforcement of obligations. Traditional financial architecture manages these frictions through scale, proprietary systems, and coordination among intermediaries. Ethereum removes intermediaries and therefore lowers all three frictions. Open marketplaces support discovery of assets and prices. Digital value can settle globally within minutes without the layers of correspondent banking. 
Obligations can be executed automatically and verified publicly. These capabilities do not eliminate institutional functions but shift part of the work from organizations to software, reducing cost and operational risk. New entrants benefit immediately. They can rely on infrastructure maintained by thousands of engineers rather than building their own systems for settlement, custody, and enforcement. Business logic becomes code. Obligations can be automated. Settlement becomes immediate. Users retain custody. This expands the range of viable business models and allows firms to serve markets that incumbents consider too small or too complex. Having a single global ledger also changes operational dynamics. Many institutions operate multiple databases that require frequent reconciliation and remain vulnerable to error. Ethereum maintains a continuously updated and replicated record that cannot be amended retroactively. Redundancy and recoverability become default properties rather than costly internal functions. Security follows the same pattern. Instead of defending a central database, Ethereum distributes verification among many independent actors. Altering history requires coordination at scale and becomes prohibitively expensive. Confidence arises from system design rather than institutional promises. ## New financial services and global reach These features enable services that resemble established financial activities but operate with different cost structures. International transfers can use digital dollars rather than correspondent networks. Loans can enforce collateral rules in code. Local payment systems can interoperate without proprietary standards. Individuals in unstable economies can store value in digital instruments independent of local monetary fragility. Clearing, custody, reconciliation, monitoring, and enforcement shift from organizational processes into shared software. 
Companies can focus on product design and distribution rather than maintaining complex internal infrastructure. Scale is achieved by acquiring users, because infrastructure is shared. Value accrues to applications rather than to duplicated internal systems. The impact is most visible in markets with fragile financial systems. In economies with unstable currencies or slow payment networks, Ethereum provides immediate functional gains. In developed markets the benefits appear incremental but accumulate as more instruments and processes become programmable. ## Institutional transformation and long term dynamics Many financial instruments are heterogeneous. Corporate debt is a clear example. Terms differ by maturity, coupon, covenants, collateral, and risk. Trading depends on bilateral negotiation and intermediaries who maintain records and enforce obligations. Ethereum can represent these instruments digitally, track ownership, and execute terms automatically. Contracts retain their specificity, while administration becomes standardized and interoperable. This suggests a shift in institutional architecture. Regulation and legal systems remain central, but the boundary between what firms must build and what software can enforce changes. Institutions evolve from infrastructure providers to service designers. Cost structures diverge between firms that maintain legacy systems and those that rely on shared infrastructure. Ethereum already functions as an alternative financial rail. Its reliability, the presence of multiple independently developed clients, substantial real world usage, active research community, and commitment to openness and verification distinguish it from other blockchain networks. These qualities align with the requirements of durable financial infrastructure. ## Conclusion Ethereum converts core financial frictions into software functions. This changes the economics of building and operating financial services. 
Talent and capital shift from operations to innovation in product design. Institutions become lighter and more focused. Firms that adopt Ethereum will have lower operating costs and a head start on their competitors. Technological transitions begin in niches where incumbents do not meet demand. As systems mature, costs fall and broader adoption becomes feasible. Ethereum followed this path. It began with internet native communities, expanded across emerging markets where users lacked reliable financial tools, and is now positioned to upgrade mainstream markets by making financial companies easier to create and operate. The broader implication is that software is becoming the organizing principle of financial infrastructure. Ethereum makes this shift concrete. Whether it becomes foundational will depend on regulation and institutional adaptation, but the economic incentives are increasingly aligned with systems that are open, verifiable, and resilient. --- ### The missing institution of the Internet *Published: 2025-12-02* > The internet scaled information but not ownership institutions; Ethereum addresses this gap by embedding rights, records, and enforcement into programmable economic infrastructure. URL: https://federicocarrone.com/series/ethereum/the-missing-institution-of-the-internet/ **By Federico Carrone and Roberto Catalan** Modern economic systems rest on two foundations: tools that expand productive capacity and institutions that define who controls their output. The internet transformed how information moves, but it did not reconstruct the institutional machinery that governs ownership and exchange. Digital economic life therefore expanded without a durable system of rights, enforcement, or jurisdiction. Blockchain networks, and Ethereum in particular, address this gap by embedding institutional functions in software and enforcing them through economic incentives and cryptographic verification. 
## Technology, Culture and Institutional Design In most species, behavior is shaped by biology and fixed through genetic inheritance. Humans diverged by inventing technologies that alter their environment more rapidly than biological evolution can adapt to it. Fire, agriculture, medicine and computing enabled a physically vulnerable species to extend its productive frontiers. Equally significant was the emergence of institutions that facilitated cooperation beyond small groups. Human societies are organized not through inherited instinct but through constructed systems of norms, laws and symbolic abstractions that can be revised in response to changing conditions. Cultural evolution permits continuous redesign and operates on a faster timescale than genetic change. This dual process, technological augmentation and institutional invention, generated compounding effects. Tools expanded individual capacity and institutions aggregated that capacity into collective action. Property rights, contracts, markets and corporate entities emerged as mechanisms to coordinate behavior at scale by defining entitlements and aligning incentives. ## Property Rights and Markets as Social Technologies Economic development depends not only on productive capability but on credible commitments. Individuals and firms invest when they can expect to benefit from their efforts and be protected from arbitrary interference. Property rights provide that assurance by specifying ownership, use and exclusion. Markets, layered on top of these rights, coordinate production and exchange by allocating resources through price signals. These arrangements are often treated as natural features of economic life. They are engineered agreements constructed through law and political settlement. Their value lies in enabling investment, specialization and trade under uncertainty. 
Prices, money and contracts compress information about scarcity, preferences and risk, enabling production to be coordinated across large populations without centralized direction. The global expansion of trade in the twentieth century reflected these institutional foundations. Specialization increased productivity and interdependence reduced conflict by raising the cost of disruption. Innovations such as neutral jurisdictions and corporate structures enabled strangers to transact under shared rules. Legal entities functioned as containers that allowed participants from different regulatory environments to collaborate. This infrastructure, whether admired or criticized, underwrote the international economic order of the late twentieth century. ## The Missing Architecture of Digital Ownership The internet lowered the cost of communication and commerce across borders, but it did not establish a neutral mechanism for defining and enforcing claims on digital assets. Offline, ownership is adjudicated by courts, enforced by states and geographically bounded. Online, in the absence of a global authority, ownership defaults to either national legal systems or to the platforms that mediate activity. Corporations filled this vacuum by providing infrastructure for identity, communication and exchange. They set terms of access, mediate transactions and retain discretionary control over assets generated within their systems. Users and firms may create content, build businesses and accumulate value, but their rights are contingent on the policies of the platform operator. The experience of Zynga illustrates this dynamic. The company developed a profitable games business on Facebook and briefly achieved a valuation exceeding that of Electronic Arts. Its fortunes deteriorated when Facebook revised its policies and altered its revenue share. 
Zynga owned its intellectual property and its infrastructure but not the environment on which its business model depended, a common position for firms built on platform economies. In digital markets, platforms function as de facto landlords. This is not an isolated case but a structural feature of platform centered economies: extensive participation paired with limited control. ## Ethereum as an Institutional Experiment Ethereum is a response to this institutional absence. It provides a mechanism for creating, transferring and enforcing digital assets without reliance on corporate or national intermediaries. The system operates as a verifiable computing environment in which rules are encoded in software and enforced collectively by a distributed network. Traditional computing systems require users to trust the operator. Ethereum distributes computation across thousands of machines that execute identical code and verify each other in a continuous process. Outputs are accepted when consensus is reached and misbehavior is economically penalized. Under these conditions, property rights and contractual commitments can be represented as digital objects whose enforcement does not depend on courts or discretionary authority. This architecture automates functions normally performed by institutions. Auditors re-examine financial records to detect manipulation. Courts resolve disputes. Regulators impose compliance standards. These systems are essential but costly and slow. Ethereum replicates aspects of verification and enforcement at the system level using software, mathematics and economic incentives. The network is open to participation without authorization and resistant to censorship because no single entity can unilaterally block or rewrite transactions. These properties arise from the structure of the system rather than ideological intent. 
## The Emergence of a Digital Financial System The first adopters of Ethereum were technologists experimenting with new mechanisms for ownership and coordination. Most of the early culture and products were built by that community for its own use. Over time, a broader range of actors began using the system for financial services. The most consequential development has been the rise of stablecoins, digital representations of fiat currency backed by real world assets. Their combined market capitalization exceeds three hundred billion dollars, with a majority circulating on Ethereum. Transaction volumes on blockchain networks now approach those processed by major payment systems. Stablecoins replicate core financial functions such as store of value and transfer of funds without geographic restrictions and with continuous settlement. Their programmability enabled the construction of lending protocols that allow users to lend and borrow assets with risk parameters enforced in software rather than through institutional mediation. These systems differ from traditional financial infrastructure. Participation is global rather than jurisdictional. Switching costs are low because services are built on interoperable standards. Exit is immediate. Risk is transparent though often misunderstood. Compare that to countries like Argentina, where interoperability between banks and fintech wallets, something as trivial as scanning a QR code, is still an ongoing regulatory battle. Incumbents try to use their market position to avoid being interoperable. On Ethereum, interoperability is structural. Individuals can receive payment, convert assets, provide liquidity and borrow collateralized funds within minutes from a mobile device. In legacy systems, similar transactions take days and incur high fees. Adoption reflects demand for neutral infrastructure in environments where intermediation is unreliable. 
## Implications Several areas of financial activity are migrating to blockchain based systems, including remittances, trade finance and private credit. Others, such as corporate debt markets, remain fragmented and costly but exhibit characteristics that may make them suitable for digital reconstruction on top of Ethereum. Significant obstacles remain. Regulatory uncertainty, operational risk and user experience challenges constrain adoption. Scaling transaction throughput without compromising decentralization is an engineering problem that has not been solved conclusively. Software vulnerabilities and governance failures present meaningful risk. These challenges appear tractable. Early evidence suggests that elements of financial intermediation can be automated at lower cost and with greater transparency than existing systems. The trajectory of adoption will depend on institutional responses as much as technical progress. ## Artificial Intelligence and Coordination Artificial intelligence increases productive capacity by automating tasks but does not resolve questions of ownership, governance or compliance. Output may be generated more efficiently, but disputes over entitlement, liability and compensation persist. Artificial intelligence and blockchains, Ethereum in particular, are likely to be the two defining innovations of the coming decades. The two technologies address complementary human needs: productivity and coordination. Artificial intelligence will make people more productive, but it will not eliminate the bureaucratic machinery required to verify and enforce outcomes. Ethereum introduces a technology that complements AI: a system where humans and autonomous agents can coordinate, trade, and settle disputes directly through code, without relying on institutions to prove that everyone followed the rules. 
## Conclusion The internet lowered the cost of transmitting information but did not create institutions for defining and enforcing rights over digital assets. The result has been an economy coordinated by private platforms rather than neutral systems of governance. Ethereum reconstructs elements of property rights and contractual enforcement as public infrastructure encoded in software. Whether such systems become core infrastructure or remain specialized instruments will depend on institutional adaptation, regulation and technological progress. They have already demonstrated an alternative cost structure for financial coordination and introduced mechanisms for digital property that do not rely on centralized administration. The internet built an economy without institutions. Ethereum is an attempt to build them. --- ## Series: Leptokurtic Financial returns have fat tails. Crashes that Gaussian models call impossible happen regularly. This series assembles twenty centuries of financial data, builds a crash-detection toolkit from 17 methods, tests tail hedging with real options data, and then explains the geometric logic tying the whole picture together. ### At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality. *Published: 2026-03-05* > A 400-year journey through logarithms, Kelly, ergodicity, and tail risk to show why geometry sits at the core of finance and why Jensen's inequality ties the whole story together. URL: https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/ Never cross a river that is on average four feet deep. This is Nassim Taleb's warning about the difference between arithmetic averages and the reality of survival. If the river is eight feet deep in the middle and dry on the sides, the average tells you nothing about whether you will drown. You will drown in the middle, or you won't. 
There is no averaging across parallel universes where you both survive and die. This observation is not just a heuristic. It is a mathematical fact about the geometry of multiplicative processes, processes where wealth compounds, where growth feeds on itself, and where a loss of 50% requires a gain of 100% just to break even. The mathematics that governs survival in these environments was invented 400 years ago to help astronomers multiply large numbers, partially rediscovered in the 18th century by Daniel Bernoulli, formalized again in the 20th century through information theory, and then largely obscured by theories that optimized across hypothetical worlds instead of along a single path through time. This is the story of why geometry sits at the core of finance, why Jensen's inequality is the clearest expression of that geometry, and why reducing variance can be more valuable than increasing returns. ## I. The Invention of Linearization (1614) In the early 17th century, astronomers were drowning in calculation. Johannes Kepler had just published his laws of planetary motion, and navigators were trying to compute positions using spherical trigonometry. A single problem might require multiplying two seven-digit numbers, a process that took skilled calculators half an hour and was prone to error. John Napier, a Scottish laird and amateur mathematician, spent 20 years searching for a way to simplify this. His insight: if you could convert multiplication into addition, calculations would become trivial. He invented "logarithms" (from Greek *logos* = ratio, *arithmos* = number), a table that mapped every number to its "ratio-representative." The crucial property: $\log(ab) = \log(a) + \log(b)$. Multiplication becomes addition. Division becomes subtraction. Exponentiation becomes multiplication. But Napier's invention was more than a computational trick. He had discovered the mathematical tool for linearizing multiplicative processes. 
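Napier's identity is easy to verify directly. A minimal sketch (my own illustration, not from the article; the seven-digit numbers are arbitrary):

```python
import math

# Napier's identity: log(a*b) = log(a) + log(b).
# Multiplying two seven-digit numbers reduces to adding their logs
# and taking the antilog (in 1614, two table lookups and one addition).
a, b = 4829173.0, 6203911.0  # arbitrary seven-digit numbers

log_sum = math.log(a) + math.log(b)   # addition instead of multiplication
product_via_logs = math.exp(log_sum)  # antilog recovers the product

assert math.isclose(product_via_logs, a * b, rel_tol=1e-9)
```

The same trick turns division into subtraction and exponentiation into multiplication, which is why a single table could replace hours of hand calculation.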
Four hundred years later, this same tool would reveal why volatility destroys wealth and why tail hedging works. ## II. The Number That Grows Continuously (1685) In 1685, Jacob Bernoulli investigated a financial puzzle: if you lend money at 100% annual interest, what happens if you compound it more frequently? - Annually: $(1 + 1)^1 = 2$ - Monthly: $(1 + 1/12)^{12} \approx 2.61$ - Daily: $(1 + 1/365)^{365} \approx 2.71$ Bernoulli discovered that as the compounding interval shrinks toward zero, the result approaches a limit: $$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n \approx 2.71828$$ The number $e$ emerges naturally from continuous compound growth. It is the base of the logarithm that measures growth rates, the "natural" logarithm $\ln(x)$. Where Napier gave us the tool to linearize multiplication, Bernoulli gave us the base that emerges when growth becomes continuous. Together, Napier and Bernoulli provided the mathematical foundation: wealth grows multiplicatively, and logarithms convert this multiplicative growth into additive increments. ## III. Jensen's Breakthrough (1906) The missing geometric step arrived in 1906. Danish mathematician Johan Jensen proved the inequality that now bears his name. He showed that whenever a function is concave, averaging inputs before applying the function gives a larger result than applying the function first and then averaging. That result sounds abstract, but it is exactly the bridge this story needs. Napier gave us logarithms. Bernoulli showed why the natural logarithm belongs to compounding. Jensen explained why variability hurts once the relevant function bends downward. ## IV. Jensen's Inequality: The Geometry of Concavity A function is **concave** if it bends downward. If you draw a straight line between two points on the curve, that line sits below the curve itself. The logarithm is concave. That simple geometric fact turns out to matter enormously in finance, because wealth compounds. 
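The chord-below-the-curve picture can be checked numerically. A small sketch (my own illustration), using a 50% loss and a 50% gain as the two wealth factors:

```python
import math

# Concavity of ln: the chord between two points lies below the curve,
# so the average of the logs is below the log of the average.
a, b = 0.5, 1.5  # two wealth factors: a 50% loss and a 50% gain

avg_of_logs = (math.log(a) + math.log(b)) / 2  # midpoint of the chord
log_of_avg = math.log((a + b) / 2)             # the curve at the midpoint

print(f"average of logs: {avg_of_logs:.4f}")  # about -0.1438
print(f"log of average:  {log_of_avg:.4f}")   # 0.0000
assert avg_of_logs < log_of_avg               # chord below curve
```

The negative average of logs is exactly the penalty that the formal statement below quantifies.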
**Jensen's Inequality** states that for a concave function $\varphi$: $$\mathbb{E}[\varphi(X)] \leq \varphi(\mathbb{E}[X])$$ The notation is simple once you unpack it. $\mathbb{E}[X]$ means the expected value, or average, of $X$. The symbol $\varphi(X)$ means "evaluate the function $\varphi$ at $X$." So the inequality says: for a concave function, the expected value of the transformed variable is less than or equal to the transformed expected value. The gap between those two quantities is the **Jensen gap**. It is the mathematical penalty created by variability under concavity. Here is the simplest possible example. Suppose your wealth factor is either $1.5$ or $0.5$ with equal probability. In other words, you either gain 50% or lose 50%. - The arithmetic average wealth factor is $(1.5 + 0.5)/2 = 1.0$ - The log of that average is $\ln(1.0) = 0$ - But the average log is $\frac{\ln(1.5) + \ln(0.5)}{2} \approx -0.144$ So the average outcome looks flat, but the expected log-growth is negative. Under repeated exposure to the same kind of multiplicative gamble, that means long-run compound growth is negative even though the arithmetic average looks harmless. That is Jensen's inequality in action. To see this more viscerally, start with \$100: - After a 50% gain: $\$100 \times 1.5 = \$150$ - After a 50% loss: $\$150 \times 0.5 = \$75$ You end with \$75, a 25% total loss, despite the arithmetic average return being zero. The loss applied to a larger base (\$150) than the gain (\$100), so it took away more than the gain added. Compounding makes volatility expensive. Apply this to log-wealth: $$\mathbb{E}[\ln(1+R)] \leq \ln(1 + \mathbb{E}[R])$$ Here $R$ means the return in one period. If you gain 10%, then $R = 0.10$. If you lose 20%, then $R = -0.20$. The expression $1+R$ is your **wealth factor**, the number your money gets multiplied by over that period. The left side is the average log-growth rate, which is what determines long-run compounding. 
The right side is the log of one plus the average return. The inequality is strict whenever returns vary. If returns were perfectly constant, the two sides would be the same. Variability is what creates the gap. That distinction matters. The Jensen gap is the general geometric fact. **Variance drag** is the second-order approximation to that fact when returns are not too large. To estimate the size of the gap, expand $\ln(1+R)$ in a Taylor series around the mean return $\mu$: $$\ln(1+R) \approx \ln(1+\mu) + \frac{R-\mu}{1+\mu} - \frac{(R-\mu)^2}{2(1+\mu)^2} + ...$$ Now take expectations term by term. Because $\mu = \mathbb{E}[R]$, we have $\mathbb{E}[R-\mu] = 0$, so the linear term vanishes. And because $\sigma^2 = \mathbb{E}[(R-\mu)^2]$, the quadratic term becomes the variance. That gives: $$\mathbb{E}[\ln(1+R)] \approx \ln(1+\mu) - \frac{\sigma^2}{2(1+\mu)^2}$$ For small returns, and therefore for $|\mu| \ll 1$, we can use $\ln(1+\mu) \approx \mu$ and $(1+\mu)^2 \approx 1$. Then this simplifies to: $$G \approx \mu - \frac{\sigma^2}{2}$$ Here $\mu$ is the arithmetic mean return, $\sigma^2$ is the variance of returns, and $G$ is the approximate geometric growth rate. So the geometric growth rate is approximately equal to the arithmetic mean $\mu$ minus half the variance. **Variance drag is literally the second-order term in the Taylor expansion of a concave function.** This is Jensen's inequality showing up in a form that is easy to compute. Because the logarithm is concave, variance lowers expected log-return. A portfolio with $\mu = 10\%$ and $\sigma = 20\%$ has geometric growth of approximately $8\%$, which means a 2% annual drag from volatility alone. ## V. Kelly's Criterion: Optimization Under Concavity Long before Kelly, Daniel Bernoulli had already moved in the right direction. In 1738, in his analysis of the St. Petersburg paradox, he proposed logarithmic utility. 
He did not frame it in terms of time averages or ergodicity, but he correctly saw that the value of money is not linear and that multiplicative risk changes the problem. Claude Shannon is not just background here. His work is foundational. In 1948, Shannon created information theory, a mathematical framework for reasoning about signal, noise, and transmission. The deep idea is that information can be measured, that uncertainty has structure, and that better information changes what an optimal repeated decision looks like. That is the intellectual foundation Kelly stands on. In Shannon's framework, information has operational value because it improves your ability to act under uncertainty. Kelly's central insight was to take that logic and apply it to betting and capital allocation. If information improves the quality of your edge, then there is an optimal way to convert that edge into compounded wealth growth. In 1956, John Larry Kelly Jr., a researcher at Bell Labs, derived the optimal betting strategy for a gambler with a private wire giving him noisy information about horse races. His derivation came directly from Shannon's information theory, the mathematics of signal transmission through noisy channels. Kelly maximized the expected **logarithm** of wealth because wealth compounds multiplicatively, and the logarithm converts this into additive growth rates that can be averaged over time. The Kelly criterion maximizes: $$G = \mathbb{E}[\ln(W_t/W_{t-1})]$$ Here $W_t$ means your wealth at time $t$. So $W_t/W_{t-1}$ is simply "how much your wealth changed this period," and the logarithm turns that multiplicative change into something you can add across time. 
For a simple bet with probability $p$ of winning, odds $b$, and probability $q=1-p$ of losing, the optimal fraction $f^*$ of wealth to bet is: $$f^* = \frac{pb - q}{b} = \frac{p(b+1) - 1}{b}$$ For continuous returns, where $\mu$ is the expected return, $r$ is the risk-free rate, and $\sigma^2$ is the variance, the Kelly fraction becomes: $$f^* = \frac{\mu - r}{\sigma^2}$$ **Notice what appears in the denominator: variance.** Kelly's formula explicitly shows that optimal position size depends on the ratio of edge to variance. A strategy with twice the edge but twice the variance gets the same allocation as the original. Variance is not just a measure of "risk." It is a direct divisor of optimal exposure. Kelly betting grows wealth faster than any other strategy in the long run. It dominates all other strategies with probability approaching 1 as time goes to infinity. Edward O. Thorp was the person who carried this from theory into practice. He used Kelly-style reasoning first in blackjack and then in markets, showing that log-optimal sizing was not just an elegant theorem. It was a workable decision rule under uncertainty. Leo Breiman gave the result one of its clearest mathematical statements. In 1961, he showed that the log-optimal strategy asymptotically dominates alternative strategies under broad conditions. Kelly gave the rule. Thorp made it operational. Breiman helped make the long-run claim precise. ### Fractional Kelly Practitioners rarely use full Kelly. The optimal fraction maximizes growth rate but produces extreme volatility, and drawdowns of 50% or more are common. Instead, they use half-Kelly or quarter-Kelly: $$f_{half} = \frac{f^*}{2}$$ Under the standard local approximation around the Kelly optimum, this reduces growth rate by about 25% but cuts volatility in half. It also provides a safety margin against estimation error. If your estimate of $\mu$ or $\sigma$ is wrong, full Kelly can be catastrophic. 
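The two formulas above fit in a few lines. A sketch; the numeric inputs (a 55% coin at even odds, and the continuous-case values of $\mu$, $r$, $\sigma$) are illustrative assumptions, not recommendations:

```python
def kelly_discrete(p: float, b: float) -> float:
    """Optimal bet fraction for win probability p at odds b-to-1: (pb - q) / b."""
    return (p * b - (1.0 - p)) / b

def kelly_continuous(mu: float, r: float, sigma: float) -> float:
    """Kelly fraction for continuous returns: excess return divided by variance."""
    return (mu - r) / sigma ** 2

f_star = kelly_discrete(p=0.55, b=1.0)   # a 55% coin at even odds -> bet 10% of wealth
f_half = f_star / 2                      # fractional Kelly: trade growth for a safety margin

# Doubling both the edge and the variance leaves the allocation unchanged.
assert abs(kelly_continuous(0.12, 0.02, 0.10)
           - kelly_continuous(0.22, 0.02, 0.10 * 2 ** 0.5)) < 1e-9
```

The last assertion is the point of the denominator: variance divides the edge, so scaling both together changes nothing.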
Fractional Kelly is the recognition that maximizing geometric growth is the goal, but estimation uncertainty requires humility. ## VI. Ergodicity Economics: Time vs. Ensemble In 2019, physicist Ole Peters published a paper in *Nature Physics* that should have ended 250 years of economic confusion. Peters is one of the central modern figures in this article's argument because he does not merely add another risk model. He redefines the objective. His broader research program, including earlier work with Alexander Adamou, showed that expected utility theory, the foundation of modern economics, rests on the implicit assumption of **ergodicity**. To understand the force of that critique, it helps to name the benchmark. In 1944, John von Neumann and Oskar Morgenstern formalized **expected utility theory** in *Theory of Games and Economic Behavior*. Their framework asks which action maximizes average utility across possible states of the world. It became the dominant mathematical language of rational choice. The Peters critique is not aimed at a vague intuition. It is aimed at this formal benchmark. A process is ergodic if the time average equals the ensemble average. If 100 people flip a coin once, the average outcome equals one person flipping a coin 100 times. In ergodic systems, you can replace time with parallel trials. **Wealth growth is not ergodic.** If 100 people each bet their entire net worth on a fair coin flip, about 50 will be wiped out. If one person bets their entire net worth 100 times, they will be wiped out with certainty. The ensemble average (50% survive) looks fine. The time average (certain ruin) is catastrophic. This is where the distinction between **ensemble optimality** and **pathwise optimality** becomes decisive. A strategy can look optimal when you average across many parallel worlds, yet still be disastrous for one person living through one realized sequence of outcomes. 
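The coin-flip contrast above is easy to simulate. A sketch (sample sizes and the seed are arbitrary):

```python
import random

random.seed(42)  # arbitrary seed, for reproducibility

# Ensemble view: many people each bet their entire wealth on one fair coin flip.
n_people = 100_000
ensemble_avg = sum(2.0 if random.random() < 0.5 else 0.0
                   for _ in range(n_people)) / n_people  # hovers near 1.0

# Time view: one person bets everything 100 times in a row.
wealth = 1.0
for _ in range(100):
    wealth *= 2.0 if random.random() < 0.5 else 0.0

print(f"ensemble average wealth factor: {ensemble_avg:.3f}")
print(f"one realized path after 100 all-in bets: {wealth}")
```

The ensemble average stays near 1.0 (the bet is "fair"), while the single path hits the absorbing barrier at zero almost surely: one tail anywhere in the sequence wipes the bettor out for good.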
Peters' insight, sharpened in related work with Adamou, is that economists since Bernoulli (1738) have maximized expected *utility*, an ensemble average across parallel states of the world. But individual investors experience *time* averages. They cannot access parallel universes where they both survived and went bankrupt. Adamou's contribution matters here because the joint Peters-Adamou work did not just restate the ergodicity objection in abstract terms. It used specific paradoxes, especially the St. Petersburg paradox, to show how changing the time resolution of the problem changes what counts as a rational decision. That made the time-average interpretation concrete rather than merely philosophical. The correction is simple: maximize the time-average growth rate. This is exactly Kelly's criterion. The logarithmic utility that Daniel Bernoulli invented to solve the St. Petersburg paradox in 1738 was correct, but economists forgot *why*: it emerges naturally from the non-ergodicity of multiplicative growth. This also clarifies the limit of mean-variance thinking. Markowitz's framework is useful as a first approximation, but it treats risk as a tradeoff between average return and dispersion in a single period. It does not, by itself, encode the asymmetry of compounding through time or the special importance of ruin. Paul Samuelson spent years arguing against overextending Kelly logic. His objections were not trivial; they forced the distinction between maximizing expected utility and maximizing long-run growth into the open. Even if one ultimately sides with Kelly and Peters for multiplicative wealth, Samuelson is part of the reason the debate became intellectually sharp. ## VII. Absorbing Barriers and the River Taleb's river analogy makes the mathematics visceral: > Never cross a river that is on average four feet deep. 
If the river is eight feet deep in the middle and dry on the sides: - **Arithmetic mean:** $(0 + 8)/2 = 4$ feet, which seems safe - **Actual constraint:** if you are shorter than eight feet, the deep section still kills you The river's average depth is irrelevant. What matters is whether any point along the path is deep enough to kill you. The same logic applies to multiplicative wealth: a single ruinous outcome matters more than an average taken across hypothetical parallel paths. Mathematically, zero is an **absorbing barrier**. If wealth hits zero, the process stops. You cannot recover. The arithmetic mean ignores this because it averages across paths where you survived and paths where you died. The geometric mean, via the logarithm, assigns infinite negative utility to zero: $\ln(0) = -\infty$. This is why the Kelly criterion never bets the entire bankroll on any positive-expected-value bet, no matter how good the odds. The logarithm's concavity makes ruin a loss that no potential gain can compensate for. ## VIII. Fat Tails and Higher Moments Truncating the Taylor expansion of $\ln(1+R)$ at the variance term is adequate only when returns are thin-tailed and approximately normal. But financial returns live in **Extremistan** (Taleb's term for domains governed by fat-tailed distributions), where extreme events are far more likely than the normal distribution predicts. Benoit Mandelbrot is the foundational figure here. Long before Taleb, Mandelbrot argued that speculative prices do not behave like neat Gaussian variables. They jump, cluster, and produce extreme moves far more often than classical models would suggest. Once that is true, the simple variance-based approximation is no longer enough. The tails start to dominate the economics. Jean-Philippe Bouchaud pushed this critique further by arguing that the standard equilibrium picture of markets is too clean. His distinct contribution is to connect fat tails to market microstructure and crowd dynamics. 
Prices are shaped by crowding, feedback, market impact, and institutional structure, not just by tidy distributions around a stable mean. That matters because it means tail risk is not merely a statistical annoyance. It is built into how markets actually work. Didier Sornette adds another important layer. His distinct contribution is to model bubbles and crashes as endogenous critical phenomena generated by positive feedback, imitation, and unstable market structure. In his framework, some of the biggest crashes are not random bolts from the blue. They are the natural end point of a system that has become reflexive and fragile. When returns are fat-tailed, higher moments matter. The cleanest way to see that is to expand the logarithm directly around $R=0$. For $|R|<1$, $$\ln(1+R) = R - \frac{R^2}{2} + \frac{R^3}{3} - \frac{R^4}{4} + ...$$ Taking expectations gives $$\mathbb{E}[\ln(1+R)] = \mathbb{E}[R] - \frac{1}{2}\mathbb{E}[R^2] + \frac{1}{3}\mathbb{E}[R^3] - \frac{1}{4}\mathbb{E}[R^4] + ...$$ This version is more explicit than the shorthand mean-variance formula. The second moment enters with a negative sign, so dispersion hurts growth. The third moment enters with a positive sign, so positive skew helps and negative skew hurts. The fourth raw moment also enters with a negative sign, which means that large extreme moves, whether positive or negative, reduce expected log-growth unless they are offset elsewhere in the distribution. In practice, that means the simple mean-and-variance picture stops being enough once extreme moves become common. Standard option pricing models (Black-Scholes) assume log-normal returns with thin tails. In practice, traders partially correct for this with volatility smiles and skews, but any framework that stays too close to a thin-tailed world will still understate the probability of extreme moves. 
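The moment expansion above can be checked numerically. A sketch, using a hypothetical negatively skewed return sample (frequent small gains, one rare large loss):

```python
import math

def log_growth_from_moments(returns, order=4):
    """Truncated expansion: E[ln(1+R)] ~ sum over k of (-1)^(k+1) * E[R^k] / k."""
    n = len(returns)
    total = 0.0
    for k in range(1, order + 1):
        raw_moment = sum(r ** k for r in returns) / n
        total += (-1) ** (k + 1) * raw_moment / k
    return total

# Hypothetical sample: nineteen +2% periods and one -30% crash.
returns = [0.02] * 19 + [-0.30]

exact = sum(math.log(1 + r) for r in returns) / len(returns)
approx = log_growth_from_moments(returns)

# The single crash drags expected log-growth far below the arithmetic mean of 0.4%.
print(f"arithmetic mean:            {sum(returns) / len(returns):.4%}")
print(f"moment expansion (order 4): {approx:.4%}")
print(f"exact mean log-growth:      {exact:.4%}")
```

The fourth-order truncation already lands very close to the exact mean log-growth here, and both sit well below the arithmetic mean: the second and fourth moments enter with negative signs, and the negative skew makes the third-moment term negative too.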
The exact crash frequency matters less than the broader implication: left-tail events occur materially more often than a naive Gaussian calibration suggests. **Taleb's barbell strategy** is the practical philosophical response to this entire section. His central point is not just that tails are fatter than standard models admit. It is that the right response to a fat-tailed world is to organize a portfolio so that ordinary outcomes are survivable and extraordinary dislocations become opportunities rather than existential threats. A barbell does exactly that: most of the capital sits in positions that are robust to ordinary noise, while a small allocation buys extreme convexity. That structure matters because it changes the shape of the return distribution itself. The puts have bounded downside (premium paid) and very large upside in crashes. That creates positive skewness, limits exposure to ruinous left-tail states, and preserves the possibility of large gains when the system breaks. In Jensen-Kelly terms, Taleb's contribution is to insist that the objective is not to maximize a smooth average in a well-behaved world. It is to survive and compound in a discontinuous one. Hyman Minsky belongs in this picture too. His core idea was that stability breeds fragility. Long calm periods encourage leverage, maturity mismatch, and crowded positioning, which makes the eventual break far more violent. That is exactly the kind of environment where average outcomes look benign right up until the left tail arrives. Mandelbrot, Bouchaud, Sornette, and Minsky all push in the same direction: the left tail is part of the structure of the world itself, not a small correction to a calm baseline. ## IX. The Put Strategy as Jensen-Optimal Spitznagel's tail hedge, 100% SPY plus deep OTM puts, is where the abstract logic becomes a concrete portfolio. He is not simply buying disaster insurance in the conventional sense. 
He is taking the full chain, compounding, concavity, variance drag, fat tails, and non-ergodicity, and turning it into a specific capital-allocation rule. That is why Spitznagel is central to this argument rather than an optional practitioner example. Many investors understand the words "fat tails" and still build portfolios as though crashes are just unpleasant drawdowns around a stable mean. Spitznagel's contribution is to force the implication all the way through: if the left tail dominates long-run outcomes, then convexity is not a cosmetic add-on. It belongs inside the core architecture of the portfolio. Even in the simplest second-order approximation, the mechanism is clear. Recall the approximate growth formula: $$G \approx \mu - \frac{\sigma^2}{2}$$ In this approximation, the comparison is explicit: the hedge is beneficial if the reduction in the variance term, $\sigma^2/2$, is larger than the reduction in the mean term, $\mu$. But that is only the local mean-variance description. The deeper mechanism is that the hedge truncates the left tail, which protects compounding from the states that do the most long-run damage. In symbols: - **The cost:** Puts have negative expected return (they expire worthless ~95% of the time). This reduces $\mu$ by the premium paid, say 0.5% annually. - **The benefit:** In a crash, puts can pay off many multiples of premium, truncating the left tail. In a local approximation, that often reduces the effective variance term. More importantly, it removes the states that are most destructive to future compounding. Because the variance drag is quadratic in the local approximation, a modest reduction in tail risk can outweigh a linear premium cost. In the more realistic fat-tailed setting, the stronger statement is that sacrificing some carry can be worthwhile if it materially reduces exposure to ruinous states. 
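A back-of-the-envelope version of that comparison, using the second-order formula. All numbers here are illustrative assumptions, not the series' backtest results:

```python
# Second-order comparison of hedged vs. unhedged growth: G ~ mu - sigma^2 / 2.
mu, sigma = 0.10, 0.20        # unhedged portfolio mean and volatility (assumed)
premium = 0.005               # annual put premium drag on the mean (assumed)
sigma_hedged = 0.15           # volatility after the left tail is truncated (assumed)

g_unhedged = mu - sigma ** 2 / 2                    # 0.100 - 0.02000 -> 0.08000
g_hedged = (mu - premium) - sigma_hedged ** 2 / 2   # 0.095 - 0.01125 -> 0.08375

# The linear premium cost is beaten by the quadratic variance reduction.
assert g_hedged > g_unhedged
```

The direction of the inequality is the whole point: because variance enters quadratically in the local approximation, a modest cut in volatility can buy back more growth than the premium gives up.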
Whether the trade is favorable depends on the price of convexity, the strike selection, the roll discipline, and how persistently the market underprices left-tail risk. This is the geometry of the concave log function at work. The put strategy engineers a return distribution with less destructive left-tail exposure. In a local approximation, you can describe that as reducing the variance penalty more than the mean. In the fuller compounding picture, the more important fact is that the hedge protects you from the states that do disproportionate damage to long-run growth. This is also where Taleb and Spitznagel meet. Taleb supplies the philosophical and statistical doctrine: avoid ruin, respect discontinuities, seek convexity. Spitznagel supplies the portfolio expression of that doctrine: keep the growth engine, then add a small convex structure that changes what happens in the worst states. The combination is what makes tail hedging more than a fear trade. It becomes a compounding strategy. ## X. Beyond Equities: Where Convexity is Cheaper The SPY put strategy works, but it may not be optimal. The same Jensen-Kelly logic applies wherever there is reliable asymmetry between calm and crisis: **Rates:** Central banks have a reaction function. They always cut in crises. The Fed dropped rates 500bps in 2008 and 150bps in two weeks during COVID. Rate options may underprice these panic-cut scenarios because models assume mean-reversion to stable levels. The trade: earn the risk-free rate, buy OTM calls on SOFR futures. **FX Carry:** Currencies like AUD/JPY offer 3-5% annual carry. In stable times, you collect the differential. When risk-off hits, carry trades unwind violently. AUD/JPY dropped 40% in weeks during 2008. The carry itself can fund OTM puts on the high-yielder, creating a self-financing hedge. **Credit:** Investment-grade bonds earn spread, but credit events cluster. 
CDS protection pays off in a sharply convex way when spreads blow out (IG: 50→250bps, HY: 300→2000bps in 2008). The barbell here is: earn IG spread, buy HY CDS protection. **Commodities:** Oil exhibits extreme asymmetry. It grinds in a $60-80 range, then spikes to $140 (supply shock) or crashes to $20 (demand shock). Deep OTM strangles on crude capture both tails. The Universa insight is to scan across these markets for wherever tail convexity is cheapest *right now*. Sometimes that's equity vol (2007), sometimes credit (2006), sometimes rates (2019). ## XI. How This Fits the Series This article is the theoretical capstone of the previous three pieces in the *Leptokurtic* series. It is the point where the series stops presenting separate facts and starts presenting one unified structure. The first article, [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](/series/leptokurtic/nine-centuries-of-exchange-rates/), used the [forex-centuries](https://github.com/unbalancedparentheses/forex-centuries) repository to show that fat tails, devaluations, regime shifts, and currency breakages are not modern anomalies. They are the historical baseline. That article established the empirical backdrop: the world is structurally more discontinuous than Gaussian finance admits. The second article, [Detecting Crashes with Fat-Tail Statistics](/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/), used the [fatcrash](https://github.com/unbalancedparentheses/fatcrash) repository to test methods drawn from Sornette, Bouchaud, Taleb, and extreme value theory. That article moved from long-run historical evidence to live statistical diagnostics. It showed that crashes are not random noise around a stable mean. They often have detectable precursors, structural signatures, and tail behavior that standard tools miss. 
The third article, [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/), used the [options_backtester](https://github.com/lambdaclass/options_backtester) repository to test the put-overlay debate directly on real SPY options data. That article supplied the portfolio evidence: deep out-of-the-money convexity can improve the realized path of a portfolio when it is funded and sized in the way Spitznagel actually describes. This article explains *why* those three results belong together. The long-run historical data, the crash-detection toolkit, and the options backtests all point at the same underlying structure: wealth compounds multiplicatively, the logarithm is concave, and variability is not a cosmetic annoyance. It changes the geometry of the process. The previous three pieces supplied the data, the diagnostics, and the implementation. This piece supplies the unifying mathematics, and makes explicit why they are all consequences of the same underlying logic. ## XII. The Full Intellectual Lineage What looks like a single modern investing idea is actually a chain in which each figure adds one missing piece, or corrects one mistake, left by the previous framework: 1. **Napier (1614):** Napier gives the first indispensable tool. Logarithms linearize multiplicative processes, so $\log(ab) = \log(a) + \log(b)$. Without that move, there is no clean way to turn compounding into something that can be analyzed additively. 2. **Jacob Bernoulli (1685):** Bernoulli adds the natural growth constant $e$. That connects Napier's logarithmic tool to continuous compounding. Napier gives the language. Bernoulli gives the natural base for that language. 3. **Daniel Bernoulli (1738):** Daniel Bernoulli is the first major bridge from pure mathematics to decision theory. He takes the logarithm and applies it to risky choice, arguing that multiplicative risk changes rational behavior. 
He does not yet have Kelly or ergodicity, but he points in their direction. 4. **Jensen (1906):** Jensen supplies the missing geometric theorem. If the relevant function is concave, variability is penalized. That turns Daniel Bernoulli's logarithmic intuition into a general structural fact: once wealth is evaluated through a concave function, randomness has a systematic cost. 5. **von Neumann and Morgenstern (1944):** They formalize expected utility as the dominant benchmark for rational choice. This is the framework that later thinkers will refine, challenge, or partially reject. Their role is not to solve the compounding problem. Their role is to define the benchmark that Peters will later criticize. 6. **Shannon (1948):** Shannon makes uncertainty operational. Information is measurable, noise has structure, and better signals change what an optimal repeated decision looks like. This is the mathematical foundation that Kelly later turns into a capital-allocation rule. 7. **Markowitz (1952):** Markowitz gives finance a tractable one-period approximation through mean-variance analysis. That is a real advance, but it is also a simplification. He makes portfolio choice practical, while leaving compounding, path dependence, and ruin underemphasized. 8. **Kelly (1956):** Kelly takes Shannon's information-theoretic framework and translates it into repeated betting and investment. He shows how an edge should be converted into position size when the objective is long-run compound growth. This is where logarithms, information, and compounding become one explicit rule. 9. **Thorp and Breiman (1961 onward):** Thorp shows Kelly can be used in practice, and Breiman gives the long-run dominance result mathematical force. 10. **Mandelbrot (1963):** Mandelbrot challenges the statistical comfort behind standard finance. Returns are not well described by thin-tailed Gaussian assumptions. 
Once that is true, simple mean-variance reasoning becomes less reliable, and the left tail matters much more. 11. **Samuelson (1969):** Samuelson is the critic who forces the distinction between expected utility and long-run growth to be stated clearly. 12. **Minsky (1986):** Minsky adds the macro-financial mechanism: stability breeds fragility, so left-tail risk is generated by the system itself. 13. **Bouchaud (2003 to 2008):** Bouchaud links fat tails to market microstructure, feedback, and crowd behavior. 14. **Sornette (2003):** Sornette models bubbles and crashes as endogenous critical phenomena rather than exogenous shocks. 15. **Taleb (2007 to 2012):** Taleb is one of the true centers of gravity in the modern part of this story. He takes the statistical critique of fat tails and turns it into a doctrine of survival. Ruin, fragility, convexity, and asymmetry stop being technical side notes and become the core portfolio problem. Mandelbrot tells you the tails are fatter than you think. Taleb tells you that once you accept that fact, the whole logic of risk-taking has to change. 16. **Peters and Adamou (2011 to 2019):** Peters and Adamou are the other true center of gravity in the modern part of the argument. They reopen the foundations of decision theory by showing that non-ergodic multiplicative processes must be evaluated along time paths, not across hypothetical ensembles. This reconnects Kelly to a deeper justification: it is not just a clever betting rule. It is the correct objective for a non-ergodic compounding process. 17. **Spitznagel (2021):** Spitznagel is the implementation layer, and one of the central modern figures in the article's thesis. He takes the whole chain, logarithms, concavity, Kelly sizing, fat tails, fragility, and non-ergodicity, and turns it into a practical portfolio architecture built around convex protection and survival through crashes. 
He is the point where the mathematics ceases to be interpretation and becomes an actual portfolio design. Seen this way, the chain is continuous. Wealth compounds multiplicatively, not additively, so the basic object is a product of wealth factors through time. Logarithms turn that product into a sum, which makes long-run growth analyzable. Once the relevant function is concave, Jensen's inequality tells you that variability lowers time-average growth relative to the arithmetic average. Kelly converts that geometry into a sizing rule: maximize expected log-growth, which means take as much exposure as your edge justifies but no more than your variance can support. Peters and Adamou then show why this is not just a preference for one utility function. In a non-ergodic compounding process, time-average growth is the mathematically relevant objective because a single investor lives through one realized path, not across many parallel worlds. That makes survival a structural requirement rather than a matter of taste. Tail hedging is the portfolio implementation of that logic: accept a small recurring cost in ordinary states to reduce exposure to the rare left-tail states that do the most damage to long-run compounding. ## Conclusion: The Geometry of Survival Jensen's inequality is not just a mathematical curiosity. It is the geometry of survival in a multiplicative world. The concavity of the logarithm means that volatility is not just unpleasant. It is geometrically destructive, a quadratic drag on compound growth that compounds over time. The 250-year mistake in economics was maximizing the wrong average. The arithmetic mean looks at what happens across a population of investors in parallel. The geometric mean looks at what happens to you through time. For a single investor compounding over decades, only the time average matters. Tail hedging works because it respects this geometry. 
It accepts a small, certain reduction in arithmetic return (the put premium) in exchange for protection against the states that do the most damage to compound growth. In a local approximation, that looks like paying to reduce a quadratic variance penalty. In the fuller fat-tailed picture, it is better understood as paying to reduce exposure to ruinous left-tail paths. The logarithm was invented to help astronomers multiply. Four hundred years later, it reveals why you should buy insurance on your stock portfolio, and why, in a world of compounding returns, survival comes before growth. ## References - Bernoulli, J. (1685). *Ars Conjectandi* (posthumous, 1713) - Bernoulli, D. (1738). "Specimen Theoriae Novae de Mensura Sortis." - Jensen, J. L. W. V. (1906). "Sur les fonctions convexes et les inégalités entre les valeurs moyennes." *Acta Mathematica*, 30. - Breiman, L. (1961). "Optimal Gambling Systems for Favorable Games." - Bouchaud, J. P. (2008). "Economics Needs a Scientific Revolution." *Nature*, 455. - Bouchaud, J. P. & Potters, M. (2003). *Theory of Financial Risk and Derivative Pricing*. Cambridge University Press. - Kelly, J. L. (1956). "A New Interpretation of Information Rate." *Bell System Technical Journal*, 35(4). - Mandelbrot, B. (1963). "The Variation of Certain Speculative Prices." *The Journal of Business*, 36(4). - Mandelbrot, B. & Hudson, R. L. (2004). *The (Mis)Behavior of Markets*. Basic Books. - Markowitz, H. (1952). "Portfolio Selection." *The Journal of Finance*, 7(1). - Minsky, H. P. (1986). *Stabilizing an Unstable Economy*. Yale University Press. - Napier, J. (1614). *Mirifici Logarithmorum Canonis Descriptio* - Peters, O. & Adamou, A. (2011). "The Time Resolution of the St Petersburg Paradox." *Philosophical Transactions of the Royal Society A*, 369(1956). - Sornette, D. (2003). *Why Stock Markets Crash*. Princeton University Press. - Sornette, D. (2017). 
*Why Stock Markets Crash: Critical Events in Complex Financial Systems* (updated edition). Princeton University Press. - Peters, O. (2019). "The Ergodicity Problem in Economics." *Nature Physics*, 15. - Peters, O. & Gell-Mann, M. (2016). "Evaluating Gambles Using Dynamics." *Chaos*, 26(2). - Samuelson, P. A. (1969). "Lifetime Portfolio Selection by Dynamic Stochastic Programming." - Shannon, C. E. (1948). "A Mathematical Theory of Communication." *Bell System Technical Journal*, 27. - Spitznagel, M. (2021). *Safe Haven: Investing for Financial Storms*. Wiley. - Taleb, N. N. (2007). *The Black Swan: The Impact of the Highly Improbable*. Random House. - Taleb, N. N. (2012). *Antifragile: Things That Gain from Disorder*. Random House. - Thorp, E. O. (1997). "The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market." - von Neumann, J. & Morgenstern, O. (1944). *Theory of Games and Economic Behavior*. Princeton University Press. --- ### The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question *Published: 2026-02-26* > We tested Spitznagel's tail hedging strategy and AQR's critique with 17 years of real SPY options data. AQR is correct about self-funded protective puts, but that is not the portfolio Spitznagel and Universa describe. The externally funded deep OTM put overlay wins on every metric in our sample. Biweekly rebalancing is the practical default (Sharpe 1.1); weekly pushes the backtest Sharpe to 2.0 but should be read as an upper bound. Macro signals are useless for timing. URL: https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/ Stock markets crash. The S&P 500 price index fell about 57% from October 9, 2007 to March 9, 2009, and about 34% from February 19, 2020 to March 23, 2020.[^sp500_2007_2009][^sp500_2020] A **put option** is a contract that pays you when the market falls below a certain price (the "strike"). 
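At expiration, that payoff is a one-liner. A minimal sketch; the spot and strike levels are illustrative:

```python
def put_payoff(spot: float, strike: float) -> float:
    """Value of one put at expiration: it pays only when the market is below the strike."""
    return max(strike - spot, 0.0)

print(put_payoff(spot=450.0, strike=400.0))   # 0.0: market above the strike
print(put_payoff(spot=300.0, strike=400.0))   # 100.0: market crashed through it
```

Held alongside stocks, this payoff switches on exactly when the equity position is losing.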
If you hold stocks and also hold puts, the puts can offset some of your losses during a crash. The question is whether the cost of buying puts is worth the protection they provide. There are two sides. AQR Capital Management, one of the largest hedge funds in the world, published ["Chasing Your Own Tail (Risk)"](https://www.aqr.com/-/media/AQR/Documents/Insights/White-Papers/AQR-Chasing-Your-Own-Tail-Risk.pdf) ([Nielsen, Villalon, and Berger, 2011](https://www.aqr.com/-/media/AQR/Documents/Insights/White-Papers/AQR-Chasing-Your-Own-Tail-Risk.pdf)). They argue that buying puts systematically costs more than it saves. On the other side, Mark Spitznagel at Universa Investments, where Nassim Taleb is scientific advisor, argues that a small put allocation improves long-term returns ([Spitznagel, 2021](https://www.wiley.com/en-us/Safe+Haven%3A+Investing+for+Financial+Storms-p-9781119401797)). Universa reported a 3,612% gain in March 2020 (via an investor letter, as reported by Bloomberg).[^universa_2020] We tested both claims with [our open-source options backtester](https://github.com/lambdaclass/options_backtester) on 17 years of real SPY options data (2008 to 2025), covering three crashes: the 2008 financial crisis, COVID, and the 2022 bear market. Spitznagel is right about the strategy he actually proposes. AQR is right about self-funded protective puts, but that is not the portfolio Spitznagel and Universa say they run. ## Why puts are expensive To understand this debate, we need to start with how options are priced. An option's price depends heavily on **implied volatility** (IV): the market's estimate of how much the stock price will move in the future. Higher expected movement means the option is worth more, because there's a greater chance it will end up profitable. In practice, implied volatility is consistently higher than what actually materializes (**realized volatility**). 
This gap is called the **Variance Risk Premium** (VRP): $$\text{VRP} = \sigma^2\_{\text{implied}} - \sigma^2\_{\text{realized}}$$ Think of it this way: $\sigma^2\_{\text{implied}}$ is what the market *expects* the variance to be. $\sigma^2\_{\text{realized}}$ is what *actually happens*. The difference is the premium that option buyers pay over fair value. [Carr and Wu (2009)](https://academic.oup.com/rfs/article-abstract/22/3/1311/1581057) documented that this spread is persistently positive. Put buyers pay more than fair value on average. The reason is that investors are willing to overpay for crash protection, the same way homeowners overpay for fire insurance relative to the expected loss from fire. [Bollerslev, Tauchen, and Zhou (2009)](https://scholars.duke.edu/publication/732839) went further: they showed that the VRP is not just a cost — it *predicts* future stock returns. When the gap between implied and realized variance is wide, future equity returns tend to be higher. The same force that makes puts expensive (fear of crashes) also drives the equity premium that the stock portion of the portfolio earns. [Israelov (2019)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2934538) confirmed the negative average return of puts and titled his paper "Pathetic Protection: The Elusive Benefits of Protective Puts." The [CBOE S&P 500 Put Protection Index (PPUT)](https://cdn.cboe.com/api/global/us_indices/governance/Cboe_SP_500_Put_Protection_Indices_Methodology.pdf) formalizes this as a benchmark: it holds the S&P 500 and buys monthly at-the-money (ATM) puts. It has underperformed the unhedged index over most periods. But ATM puts are the most expensive possible hedge — they have the highest theta decay and, per dollar of premium, the least crash convexity. Testing ATM puts and concluding "puts don't work" is like testing a sedan on a racetrack and concluding "cars are slow." The deep OTM puts Spitznagel uses cost a fraction of ATM puts and offer far more convexity per dollar spent.
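The VRP definition reduces to a one-line computation once both vols are in hand. A minimal sketch using simulated daily returns in place of market data; the 18% implied vol and the 15% simulated realized vol are illustrative assumptions, not figures from our sample:

```python
import math
import random

def realized_vol_annualized(daily_returns):
    """Annualized realized volatility from daily returns (252 trading days)."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    var_daily = sum((r - mean) ** 2 for r in daily_returns) / n
    return math.sqrt(252 * var_daily)

random.seed(0)
# One simulated calm year: daily returns drawn at ~15% annualized vol (assumed)
daily = [random.gauss(0.0004, 0.15 / math.sqrt(252)) for _ in range(252)]

iv = 0.18                # implied vol quoted by the option market (assumed)
rv = realized_vol_annualized(daily)
vrp = iv**2 - rv**2      # implied variance minus realized variance
print(f"realized vol {rv:.3f}, VRP {vrp:.4f}")
```

With these assumed levels the VRP comes out positive: buyers paid for 18% vol while the year realized close to 15%.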
AQR's argument stops here. Puts lose money on average. Therefore they hurt portfolio performance. This reasoning is incomplete. It looks only at the average return of the put (the **first statistical moment**, the mean). It ignores what the put does to the volatility of the portfolio (the **second moment**, the variance). Compounding depends on both. ## How volatility destroys compounding If you invest money and earn the same return every year, your wealth compounds smoothly. But if returns fluctuate, even with the same *average*, you end up with less. This is called **variance drain**, and it's the key to understanding why Spitznagel's strategy works. The geometric (compound) growth rate of a portfolio is approximately (for returns that are small relative to 1, under lognormal assumptions): $$G \approx \mu - \frac{\sigma^2}{2}$$ Here $\mu$ is the arithmetic mean return (the simple average of all yearly returns) and $\sigma$ is the standard deviation of those returns (a measure of how much they fluctuate). The term $\frac{\sigma^2}{2}$ is the variance drain: the penalty that volatility imposes on compounding. A simple example shows why this happens. Start with 100 dollars. Gain 50% one year, lose 50% the next. - After year 1: $100 \times 1.5 = 150$ - After year 2: $150 \times 0.5 = 75$ The arithmetic average return is $\frac{+50\% + (-50\%)}{2} = 0\%$. But you do not end up with 100 dollars. You end up with 75 dollars. You lost 25% despite an average return of zero. The gain and loss were symmetric in percentage terms, but the loss applied to a larger base (150 dollars), so it took away more than the gain added. That is variance drain. 
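The worked example, and the gap between the arithmetic and compound averages, can be checked in a few lines:

```python
import math

def arithmetic_mean(returns):
    """Simple average of per-period returns."""
    return sum(returns) / len(returns)

def geometric_growth(returns):
    """Compound growth rate per period: the rate you actually experience."""
    avg_log = sum(math.log(1 + r) for r in returns) / len(returns)
    return math.exp(avg_log) - 1

returns = [0.50, -0.50]                       # gain 50%, then lose 50%
wealth = 100 * (1 + returns[0]) * (1 + returns[1])

print(wealth)                                 # 75.0
print(arithmetic_mean(returns))               # 0.0
print(round(geometric_growth(returns), 4))    # -0.134 per period
```

The arithmetic mean says you broke even; the compound rate says you lost about 13.4% per period. That difference is the variance drain at work.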
The drain is **quadratic** in volatility, meaning it grows with the *square* of the fluctuations: | Portfolio volatility ($\sigma$) | Variance drain ($\frac{\sigma^2}{2}$) | |:-------------------------------:|:-------------------------------------:| | 10% | 0.5%/yr | | 20% | 2.0%/yr | | 40% | 8.0%/yr | Doubling volatility quadruples the drain. This means that large drawdowns are disproportionately costly to long-run wealth. A single 50% crash costs more in compounding terms than ten 5% corrections, even if the total percentage lost is the same. On SPY (2008 to 2025): - Arithmetic mean ($\mu$): 12.50%/yr - Geometric mean ($G$): 11.07%/yr - Variance drain ($\frac{\sigma^2}{2}$): 1.43%/yr - Peak rolling drain during the 2008 crisis: 10.5%/yr **Spitznagel's thesis is about this second term.** If puts reduce portfolio variance by cutting off the worst drawdowns, the reduction in $\frac{\sigma^2}{2}$ can exceed the premium paid. A put costs money on average (it hurts $\mu$, the first moment). But by truncating the worst losses, it reduces the quadratic drag on the portfolio (it helps $\frac{\sigma^2}{2}$, the second moment). The net effect on compound growth $G$ can be positive because the variance drain grows with the *square* of the loss. Preventing a few large drawdowns saves more in compounding terms than the cumulative premium costs. ## Fat tails and put mispricing Taleb makes a related but distinct argument in [*The Black Swan*](https://en.wikipedia.org/wiki/The_Black_Swan:_The_Impact_of_the_Highly_Improbable) and [*Statistical Consequences of Fat Tails*](https://arxiv.org/abs/2001.10488). Standard option pricing models (like [Black-Scholes](https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model)) assume returns follow something close to a **normal (Gaussian) distribution**. In a normal distribution, events far from the average are extraordinarily rare. A 50% crash in the S&P 500 would be roughly a 4-sigma event, with a probability near one in 30,000 years. 
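The Gaussian waiting-time figure can be reproduced from the standard normal tail. A quick sketch, under the assumption of one Gaussian annual return draw per year:

```python
import math

def gaussian_tail(z: float) -> float:
    """P(Z <= -z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p = gaussian_tail(4.0)   # chance of a 4-sigma-or-worse year under Gaussian returns
years = 1 / p            # implied average waiting time between such years
print(f"P = {p:.2e}, about once every {years:,.0f} years")
```

This gives roughly 3.2e-5, or once in about 31,600 years, the "one in 30,000" order of magnitude quoted above.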
In reality, we have had three drawdowns exceeding 30% in the last 17 years alone. Real markets have **fat tails**: extreme events are far more frequent than Gaussian models predict. The probability of a large crash is not $10^{-5}$ (as Gaussian models suggest) but more like $10^{-1}$ or $10^{-2}$. This has a direct consequence for put pricing. Option markets do price skew: deep OTM puts trade at higher implied volatility than ATM options, reflecting some awareness of tail risk. But even after skew is priced, **deep OTM puts may still be cheap relative to the realized frequency of crashes**. The VRP shows that puts are expensive relative to realized volatility in normal times. But the relevant comparison for deep OTM puts is not average realized volatility; it is the actual frequency and magnitude of extreme drawdowns. Three drawdowns exceeding 30% in 17 years is far more than skew-adjusted models typically imply. Taleb calls this the difference between **Mediocristan** (where Gaussian statistics work, like human height) and **Extremistan** (where they do not, like financial returns). The S&P 500 lives in Extremistan. The variance drain argument (puts reduce $\frac{\sigma^2}{2}$) and the mispricing argument (puts are cheap relative to true tail probabilities) are independent. Either one alone could justify the strategy. Together they explain why the results are as strong as they are. ## Theoretical foundations The variance drain argument and the fat-tail mispricing argument have deeper roots than the Spitznagel-AQR debate suggests. **Ole Peters and ergodicity economics.** The variance drain formula $G \approx \mu - \frac{\sigma^2}{2}$ is a special case of a broader insight. [Peters (2019)](https://www.nature.com/articles/s41567-019-0732-0) argues that classical expected-value reasoning fails for multiplicative processes like portfolio growth. 
The ensemble average (what happens across many parallel investors) diverges from the time average (what happens to one investor over many periods). For a single investor compounding over decades, the time average is what matters — and it is always lower than the ensemble average when returns fluctuate. Spitznagel's strategy works because it improves the time-average growth rate, even though it reduces the ensemble-average return (by paying premium). Most of finance optimizes for the wrong average. **Bouchaud on fat tails and hedging.** [Bouchaud, Iori, and Sornette (1996)](https://www.cfm.com/wp-content/uploads/2022/12/237-1994-real-world-options-smile-and-residual-risk.pdf) showed that in fat-tailed markets, Black-Scholes delta hedging leaves large residual risk. The standard model assumes continuous rebalancing in a Gaussian world; real markets have jumps and heavy tails that make perfect hedging impossible. This means option sellers bear more risk than their models suggest — and option buyers (like tail hedgers) get more protection than the models price in. This is the theoretical basis for why deep OTM puts may be systematically cheap relative to true tail risk. **Sornette on endogenous crashes.** [Sornette (2003)](https://press.princeton.edu/books/paperback/9780691175959/why-stock-markets-crash) argues that large crashes are not exogenous shocks but endogenous instabilities — the result of self-reinforcing feedback loops (herding, leverage, procyclical risk management) that build up over months or years before releasing suddenly. His [Log-Periodic Power Law Singularity (LPPLS)](https://en.wikipedia.org/wiki/Didier_Sornette#Log-periodic_power_law_model) model attempts to detect these signatures. This is relevant to our macro-signal finding: standard indicators (VIX, yield curve, credit spreads) measure risk levels but not the endogenous buildup that precedes crashes. 
Sornette's approach is structurally different — it looks for acceleration patterns in price itself — though its real-time track record remains debated. **Rare disaster models.** [Barro (2006)](https://academic.oup.com/qje/article-abstract/121/3/823/1917876) formalized the idea that the equity premium itself may be compensation for rare catastrophic events. If investors demand higher average returns because crashes happen, then the equity premium and the tail-hedge premium are two sides of the same coin. [Kelly and Jiang (2014)](https://academic.oup.com/rfs/article/27/10/2841/1607080) showed that time-varying tail risk is priced in equity cross-sections. [Bollerslev, Tauchen, and Zhou (2009)](https://scholars.duke.edu/publication/732839) demonstrated that the variance risk premium predicts future stock returns — the same VRP that makes puts expensive also signals future equity returns. ## The core disagreement This is where the debate breaks down. The two sides are not testing the same portfolio. ### What AQR tests AQR tests portfolios where you **sell some of your stocks** to buy puts: $$R\_{\text{portfolio}} = (1-w) \cdot R\_{\text{SPY}} + w \cdot R\_{\text{puts}}$$ At $w = 1\%$: you hold 99% in stocks, 1% in puts. Total portfolio: 100%. This always loses. You are taking money out of your best asset (stocks, which go up on average) and putting it into an asset with negative expected return (puts, which expire worthless most of the time). The arithmetic is straightforward and AQR is correct about it. ### What Spitznagel actually does Spitznagel keeps **100% in stocks** and buys puts with a small separate budget on top: $$R\_{\text{portfolio}} = 1.0 \cdot R\_{\text{SPY}} + w \cdot R\_{\text{puts}}$$ The strategy requires a small amount of capital beyond the core equity position to fund the put premium. Some might call this leverage. But it is fundamentally different from ordinary leverage. Ordinary leverage means borrowing money to buy more stocks. 
If you borrow to hold 130% in stocks, your gains are 30% bigger but your losses are also 30% bigger. Drawdowns get worse in proportion to the leverage. The payoff is symmetric: leverage amplifies both good and bad outcomes equally. A put overlay works differently. It is **asymmetric**: - In calm markets, you bleed a small, known premium (the cost of the puts). - In a crash, the puts pay off at 10x to 50x the premium paid. This asymmetry happens because a deep OTM put's sensitivity to the market (its **delta**, $\Delta$) increases as prices fall. Delta measures how much the put's price changes per 1-dollar change in the stock. A deep OTM put starts with a delta near zero (barely reacts to market moves). As the market drops and the put moves closer to being "in the money," delta approaches $-1.0$ (moves dollar-for-dollar with the stock). A small position becomes a large hedge exactly when you need it. AQR's published analysis does not test this portfolio construction. That does not make AQR's portfolio math wrong. It means their critique applies to a self-funded hedge, not to the externally funded overlay Spitznagel describes. The public debate has been contentious. [Taleb and Asness clashed publicly in May 2020](https://www.bloomberg.com/news/articles/2020-05-21/taleb-spars-with-asness-on-twitter-over-tail-risk-hedges) over whether Universa's March 2020 returns proved the strategy works. [Aaron Brown (ex-AQR) wrote in Bloomberg](https://www.bloomberg.com/opinion/articles/2023-04-06/universa-s-3-126-black-swan-return-is-legit-but-with-an-asterisk) that Universa's percentage returns are "legit but with an asterisk" — the 3,612% is on the put allocation, not the total portfolio. [CalPERS' then-CIO Ben Meng argued](https://www.bloomberg.com/news/articles/2020-04-17/calpers-cio-says-his-hedges-worked-better-than-tail-risk-funds) that their alternative hedges outperformed tail-risk funds. 
These disagreements often reduce to framing: what denominator you use, and whether you funded the puts by selling stocks. AQR's follow-up work — [Israelov (2017)](https://www.aqr.com/Insights/Research/White-Papers/Pathetic-Protection-The-Elusive-Benefits-of-Protective-Puts) and [Hurst, Ooi, and Pedersen (2017)](https://www.aqr.com/insights/research/white-papers/tail-risk-hedging-contrasting-put-and-trend-strategies) — continues to test ATM or near-the-money puts in the sell-stocks-to-fund-puts framing. Neither paper tests the deep OTM overlay that Spitznagel actually runs. ## Results All tests use deep OTM puts (delta $-0.10$ to $-0.02$, 90 to 180 days to expiration, monthly roll) on real SPY options data from 2008 to 2025. A note on terminology: **"deep OTM"** means the put's strike price is far below the current market price. A put with delta $-0.02$ has roughly a 2% chance of ending up profitable. These puts are very cheap, but when the market crashes, they can multiply in value 10x to 50x. ### AQR framing: sell stocks to fund puts | Config | Annual Return | Excess vs SPY | Max Drawdown | |--------|:------------:|:-------------:|:------------:| | SPY only | +11.11% | | -51.9% | | 99.9% SPY + 0.1% deep OTM | +10.70% | -0.35% | -51.8% | | 99.5% SPY + 0.5% deep OTM | +9.23% | -1.81% | -50.3% | | 99% SPY + 1% deep OTM | +7.38% | -3.67% | -48.4% | | 96.7% SPY + 3.3% deep OTM | -1.28% | -12.33% | -39.6% | Every configuration underperforms. Performance degrades as the put allocation increases. At 3.3%, the strategy loses 1.28% per year. This happens because the steady premium bleed compounds downward while the equity exposure is reduced, so long non-crash periods grind the portfolio value lower. AQR is right about this framing. The disagreement is not about the math. It is about whether this is the portfolio Spitznagel and Universa are actually advocating. 
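The two formulas differ only in whether equity is sold to fund the puts. A toy sketch makes the contrast concrete; the scenario payoffs below are hypothetical illustrations, not backtest output:

```python
def aqr_framing(r_spy: float, r_puts: float, w: float) -> float:
    """Self-funded: sell stocks to buy puts, so the weights sum to 1."""
    return (1 - w) * r_spy + w * r_puts

def overlay_framing(r_spy: float, r_puts: float, w: float) -> float:
    """Externally funded: keep 100% SPY and add a put sleeve on top."""
    return 1.0 * r_spy + w * r_puts

# Hypothetical single-year scenarios: (SPY return, put sleeve return)
scenarios = {"calm": (0.10, -1.00),    # puts expire worthless
             "crash": (-0.40, 20.00)}  # deep OTM puts pay 20x premium

w = 0.01
for name, (spy, puts) in scenarios.items():
    print(f"{name}: self-funded {aqr_framing(spy, puts, w):+.3f}, "
          f"overlay {overlay_framing(spy, puts, w):+.3f}")
```

The overlay return is always the self-funded return plus $w$ times the SPY return, so the overlay keeps full exposure to the equity premium in the years when stocks go up, which is most of them.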
### Spitznagel framing: 100% stocks plus puts on top | Config | Annual Return | Excess vs SPY | Max Drawdown | |--------|:------------:|:-------------:|:------------:| | 100% SPY (baseline) | +11.05% | | -51.9% | | 100% SPY + 0.05% deep OTM | +11.53% | +0.49% | -51.8% | | 100% SPY + 0.1% deep OTM | +12.05% | +1.00% | -51.2% | | 100% SPY + 0.2% deep OTM | +13.02% | +1.98% | -50.0% | | 100% SPY + 0.5% deep OTM | +16.02% | +4.97% | -47.1% | | 100% SPY + 1.0% deep OTM | +21.08% | +10.03% | -42.4% | | 100% SPY + 2.0% deep OTM | +31.73% | +20.69% | -32.0% | | 100% SPY + 3.3% deep OTM | +46.60% | +35.55% | -29.2% | In this sample, every tested configuration outperforms gross of transaction costs and taxes. Both annual return and max drawdown improve at every budget level. At 0.5% annual premium budget, the total capital committed beyond 100% SPY is just the put premium, yet we see +4.97% excess return and a 4.8 percentage point improvement in max drawdown. The excess return is far larger than the premium spent, which is the signature of convexity. We explain why below. Standard OTM puts (closer to the money, delta $-0.25$ to $-0.10$) in the same framing: | Config | Annual Return | Excess vs SPY | Max Drawdown | |--------|:------------:|:-------------:|:------------:| | 100% SPY + 0.1% std OTM | +12.04% | +0.99% | -51.1% | | 100% SPY + 0.5% std OTM | +15.80% | +4.75% | -47.8% | | 100% SPY + 1.0% std OTM | +20.60% | +9.56% | -43.6% | Both types of puts work. Deep OTM puts produce more hedge per dollar spent because their delta increases more dramatically during a crash (they have more **convexity**, meaning their payoff accelerates as the market falls further). ### Convexity breakdown The following table shows the full picture. **Sharpe ratio** measures risk-adjusted return: $\text{Sharpe} = \frac{R\_\text{portfolio} - R\_\text{risk-free}}{\sigma\_\text{portfolio}}$. A higher Sharpe means more return per unit of risk. The risk-free rate is set to 4%. 
| Strategy | Premium %/yr | Annual % | Excess % | Return per 1% Premium | Max DD % | Vol % | Sharpe | |----------|:----------:|:--------:|:--------:|:-------------------:|:--------:|:-----:|:------:| | 100% SPY (baseline) | 0.00 | 11.05 | +0.00 | | -51.9 | 20.0 | 0.353 | | + 0.05% deep OTM | 0.05 | 11.53 | +0.49 | 9.8 | -51.8 | 19.7 | 0.382 | | + 0.1% deep OTM | 0.10 | 12.05 | +1.00 | 10.0 | -51.2 | 19.4 | 0.414 | | + 0.2% deep OTM | 0.20 | 13.02 | +1.98 | 9.9 | -50.0 | 19.0 | 0.476 | | + 0.5% deep OTM | 0.50 | 16.02 | +4.97 | 9.9 | -47.1 | 17.8 | 0.676 | | + 1.0% deep OTM | 1.00 | 21.08 | +10.03 | 10.0 | -42.4 | 16.7 | 1.020 | | + 2.0% deep OTM | 2.00 | 31.73 | +20.69 | 10.3 | -32.0 | 17.7 | 1.565 | | + 3.3% deep OTM | 3.30 | 46.60 | +35.55 | 10.8 | -29.2 | 22.7 | 1.879 | Two things stand out. First, in this gross backtest, the return per 1% of annual put premium is roughly **10x** across all budget levels. Each 1% of annual premium spent on deep OTM puts generates about 10% of excess return. This ratio is stable across budget levels, which means the convexity of the puts is consistent. It is not an artifact of a single configuration. Transaction costs and bid-ask spreads would reduce this ratio in live trading, but would need to consume the majority of the premium to eliminate the effect. Second, the Sharpe ratio increases monotonically from 0.353 (SPY alone) to 1.879 (3.3% budget). The strategy improves risk-adjusted returns at every level, not just raw returns. At 0.5% budget, the Sharpe is 0.676 versus 0.353 for unhedged SPY. ## Why it works: convexity, not leverage The word "leverage" is misleading here. The put premium is not the same as notional exposure. When you spend 0.5% of portfolio value on deep OTM puts, you are not adding 0.5% of equity exposure. You are buying contingent downside convexity: a payoff that is near zero most of the time and very large during crashes. 
If you instead spent 0.5% borrowing to buy more stocks, the excess return would be about 0.05%/yr (0.5% of the equity premium). Instead, we observe +4.97%/yr. The actual excess is roughly **100 times** what linear leverage would produce. The extra return is not coming from additional market exposure. It is coming from the put's convexity. This distinction matters because ordinary leverage and a put overlay have opposite effects on the two quantities that determine compound growth: $$G \approx \mu - \frac{\sigma^2}{2}$$ **Ordinary leverage** (borrowing to buy more stocks) scales both terms proportionally. If you use 1.5x leverage, $\mu$ increases by 50% but $\sigma$ also increases by 50%, so $\sigma^2$ increases by 125%. The variance drain grows faster than the return. This is why leveraged ETFs [underperform their stated multiple](https://www.investopedia.com/articles/financial-advisors/082515/why-leveraged-etfs-are-not-longterm-bet.asp) over long periods — they win on the first moment and lose on the second. **A put overlay** works on each moment independently. The premium is a small, linear cost to $\mu$ (the first moment). But the put's payoff during a crash truncates the left tail of the return distribution, which disproportionately reduces $\sigma^2$ (the second moment). Because the drain is quadratic in volatility, even a modest reduction in tail losses saves more in compounding terms than the premium costs. A concrete example: suppose SPY drops 50% in a year. Without puts, that single year's contribution to variance drain is roughly $0.50^2 / 2 = 12.5\%$. With puts that offset 10% of the decline (reducing the loss to 40%), the drain contribution drops to $0.40^2 / 2 = 8.0\%$, a savings of 4.5 percentage points — from a put position that cost 0.5% of the portfolio. 
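The closing arithmetic generalizes to a small function. The 0.5% premium below is the assumed annual budget from the example:

```python
def drain_contribution(loss: float) -> float:
    """One year's contribution to variance drain; loss is a positive fraction."""
    return loss ** 2 / 2

unhedged = drain_contribution(0.50)   # a 50% crash year
hedged = drain_contribution(0.40)     # puts offset 10 points of the decline
saved = unhedged - hedged
premium = 0.005                       # assumed annual put budget (0.5%)

print(f"drain {unhedged:.3f} vs {hedged:.3f}, saved {saved:.3f}")  # 0.125 vs 0.080, saved 0.045
print(round(saved / premium, 2))      # 9.0: the saving is ~9x the premium
```

Because the drain is quadratic, trimming the loss from 50% to 40% saves about nine times what the puts cost, in a single crash year.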
Taleb describes this structure as a **barbell** in [*Antifragile*](https://en.wikipedia.org/wiki/Antifragile_(book)): combine a large, safe position with a small, highly convex one, and avoid the middle. The Spitznagel portfolio is a barbell. The bulk (100%) is in a broad equity index. A small sliver (0.1% to 1%) is in deep OTM puts. The bulk earns the market return. The sliver has **bounded downside** (you can only lose the premium) and **convex upside** (the puts can return 10x to 50x during a crash). A "medium risk" portfolio with 80% stocks and 20% bonds reduces your exposure to crashes but also reduces your exposure to the equity premium. The barbell keeps full exposure to the equity premium while adding crash protection through a completely different mechanism. As described above, this asymmetry comes from the put's delta shifting from near zero to near $-1.0$ as the market crashes — a tiny position becomes a large hedge exactly when you need it. Borrowing cannot replicate this. Borrowing amplifies gains and losses symmetrically. Puts amplify only the crash payoff. **Ordinary leverage can wipe you out.** If you borrow to hold 150% in stocks and the market drops 50%, you lose 75% of your equity. A margin call forces you to sell at the bottom. **A put overlay cannot do this.** If you spend 0.5% of your portfolio on puts and those puts expire worthless, you lose 0.5%. That is the worst outcome. The maximum loss is the premium paid, which you know at purchase. There is no margin call. Comparing a 50% market decline: - **100% SPY + 0.5% in puts**: roughly a 47% loss (puts pay off during the decline) - **100.5% SPY via margin**: a 50.25% loss (leverage amplifies the decline) The put overlay reduces the drawdown. The margin position amplifies it. Similar total capital committed, opposite outcomes. ## Sensitivity and robustness A strategy that only works with one specific set of parameters is likely overfitted to the data. We tested this concern from multiple angles. 
### Parameter sensitivity We ran a 24-combination grid search across **DTE** (days to expiration: how far out the put expires), **delta** (how far out of the money), **exit timing** (when to close the position), and **budget** (how much to spend annually). The top 10 configurations by Sharpe ratio (SPY Sharpe: 0.353): | DTE | Delta | Exit DTE | Budget % | Annual % | Excess % | Max DD % | Vol % | Sharpe | |-----|:-----:|:--------:|:--------:|:--------:|:--------:|:--------:|:-----:|:------:| | 120-240 | (-0.10, -0.02) | 14 | 1.0 | 22.14 | +11.09 | -42.9 | 16.9 | 1.071 | | 90-180 | (-0.15, -0.05) | 60 | 1.0 | 21.70 | +10.65 | -42.5 | 16.6 | 1.066 | | 120-240 | (-0.10, -0.02) | 60 | 1.0 | 22.14 | +11.09 | -44.0 | 17.1 | 1.062 | | 120-240 | (-0.10, -0.02) | 30 | 1.0 | 22.05 | +11.00 | -43.7 | 17.0 | 1.061 | | 120-240 | (-0.15, -0.05) | 60 | 1.0 | 22.09 | +11.04 | -45.1 | 17.2 | 1.049 | | 120-240 | (-0.15, -0.05) | 30 | 1.0 | 21.92 | +10.87 | -45.2 | 17.1 | 1.048 | | 90-180 | (-0.15, -0.05) | 30 | 1.0 | 21.33 | +10.28 | -42.8 | 16.5 | 1.048 | | 90-180 | (-0.15, -0.05) | 14 | 1.0 | 21.29 | +10.24 | -42.5 | 16.5 | 1.048 | | 120-240 | (-0.15, -0.05) | 14 | 1.0 | 21.88 | +10.84 | -45.3 | 17.1 | 1.046 | | 90-180 | (-0.10, -0.02) | 60 | 1.0 | 21.41 | +10.37 | -43.3 | 16.9 | 1.031 | The patterns are clear. All top 10 are at the 1.0% budget, which is the highest tested. More convexity exposure produces better results. Both DTE ranges appear in the top 10, with slightly longer-dated puts performing slightly better. Both delta ranges work. Exit timing has little impact — DTE 14, 30, and 60 all appear. The single best configuration: DTE 120 to 240, delta (-0.10, -0.02), exit at DTE 14, 1% budget. This produces 22.14%/yr with a Sharpe of 1.071 and max drawdown of -42.9%. **All 24 parameter combinations beat SPY.** The worst configuration still outperforms by +4.96%/yr. All 24 have a higher Sharpe ratio than unhedged SPY. 
Within the tested parameter range on this asset and period, the result does not depend on picking the right parameters. This is encouraging but not definitive: 24 combinations on one asset over 17 years is a limited grid, and different markets or longer time horizons could narrow the margins. ### Rebalance frequency **Rebalancing** means closing existing put positions and buying new ones. Rebalance frequency is the single most impactful parameter after budget size. All prior results use monthly rebalancing. More frequent rebalancing captures more crash payoffs because you replace expired or decayed puts faster, maintaining continuous protection: | Frequency | Annual % | Excess % | Max DD % | Vol % | Sharpe | |-----------|:--------:|:--------:|:--------:|:-----:|:------:| | Monthly | 16.02 | +4.97 | -47.1 | 17.8 | 0.676 | | Biweekly | 24.59 | +13.54 | -44.6 | 18.6 | 1.106 | | Weekly | 41.61 | +30.56 | -38.8 | 19.0 | 1.981 | Biweekly rebalancing is the best practical middle ground in this sample. It improves materially on monthly (24.59%/yr vs. 16.02%/yr) while requiring far fewer rolls than weekly. Weekly rebalancing pushes the backtest further, to 41.61%/yr with a Sharpe of 1.981 and max drawdown improving from -47.1% to -38.8%, but this should be read as an upper-bound sensitivity result rather than a realistic default. With monthly rolls, there are gaps in coverage as puts decay (they lose value as time passes, a phenomenon called **theta decay**). More frequent rolls keep the hedge closer to full strength at all times. The practical tradeoff is transaction costs. Biweekly rolling means 26 trades per year versus 12 for monthly and 52 for weekly. At realistic bid/ask spreads for deep OTM puts, the transaction cost drag rises with turnover. Biweekly is easier to defend as a live implementation. Weekly may still outperform monthly after costs, but the gap is less certain once spread, slippage, and intermittent illiquidity are included. 
Quarterly and semi-annual rebalancing eliminate most of the benefit. Long gaps between rolls leave the portfolio unhedged for extended periods, which is precisely when a crash can strike. ### Profit targets We tested profit target exits at 3x, 5x, 10x, and 20x the premium paid (e.g., a put bought for 100 dollars is sold when it reaches 300 dollars at 3x). The result: profit targets barely matter at monthly rebalancing. Holding the puts until the DTE exit date produces results nearly identical to taking profits at any threshold. This is because the convex payoff of deep OTM puts concentrates in a few extreme events. Taking profits at 3x or 5x caps the upside on exactly the trades that drive the strategy's edge. The 50x payoff during a crash funds years of premium bleed. Capping it at 10x removes most of the value. ### Macro signal timing We tested whether macro indicators could improve put timing: buy more puts when a crash seems likely, fewer when it doesn't. Signals tested include VIX (the market's "fear gauge"), GDP growth, high-yield credit spreads, the yield curve (10Y-2Y treasury spread, which [inverts before recessions](https://www.newyorkfed.org/research/capital_markets/ycfaq.html)), non-financial corporate equity, the dollar index, the [Buffett Indicator](https://en.wikipedia.org/wiki/Buffett_indicator) (market cap/GDP), and Tobin's Q. None of them improve put timing. The unconditional strategy (fixed budget, no signal) outperforms every signal-conditioned variant. The reason is that crash timing is inherently unpredictable. The VIX was low before both the 2008 crisis and COVID. The Buffett Indicator has been elevated for decades. Credit spreads were tight in early 2020. These signals contain information about risk levels but not about timing. The put strategy works precisely because it does not try to time crashes. It pays a small, steady cost for permanent protection. 
### Out-of-sample validation A fair objection: 17 years with three crashes may overstate the long-run crash frequency. A 20-year period with no crashes would bleed premium with no payoff. The response is that the premium is small (0.1% to 0.5% per year), so even infrequent crashes are enough to break even. We tested this directly. We split the data in half and ran the same default configuration (0.5% budget, DTE 90 to 180, delta -0.10 to -0.02) on both periods without re-optimizing: | Period | Strategy | SPY B&H | Excess | Max DD | |--------|:--------:|:-------:|:------:|:------:| | 2008 to 2016 | 12.14% | 7.29% | +4.85% | -47.1% | | 2016 to 2025 | 20.02% | 14.92% | +5.09% | -22.3% | | Full period | 16.02% | 11.05% | +4.97% | -47.1% | The strategy beats SPY in both halves. The first half contains the GFC (the largest crash in the sample). The second half contains COVID and the 2022 bear market. The excess return is positive in both periods, which means the result is not driven by a single event. The strongest argument for overfitting remains that the entire edge comes from three crashes. If those crashes had been 20% milder, or if the next 17 years produce no drawdown worse than 25%, the strategy may underperform. What we can say is that the strategy is robust to parameter choice and survives an out-of-sample split, but this is still limited evidence rather than a definitive long-horizon proof. ## Limitations and open questions **Capacity and execution.** Deep OTM puts have limited liquidity, especially during stress. Bid-ask spreads on SPY puts with delta below $-0.05$ can exceed 20% of the mid price. During the March 2020 crash, some deep OTM strikes had no bids at all for hours. A strategy that works at \$10M may not scale to \$10B. Universa manages this by trading across multiple markets and maintaining dealer relationships, but capacity constraints are real. **Financing source.** Where the put budget comes from matters. 
Our backtest treats it as an external cost. In practice, the premium could come from reducing equity exposure (AQR framing), from a separate cash allocation, or from an institutional budget line. The choice affects both the portfolio math and the behavioral likelihood of maintaining the strategy through long bleed periods. **Tax and turnover.** Monthly rolling generates 12 short-term capital loss events per year. In taxable accounts, the interaction between put losses, put gains during crashes, and equity capital gains creates complex tax consequences. This drag is absent from our backtest. **Regime dependence of skew pricing.** The volatility skew (how much more expensive OTM puts are relative to ATM options) varies over time. After 2008, skew steepened dramatically — deep OTM puts became more expensive. If the market "learns" to price tail risk more accurately, the edge may compress. Conversely, long calm periods tend to flatten skew, making puts cheaper again. **Comparison with other tail hedges.** We only test put-based strategies. [Hurst, Ooi, and Pedersen (2017)](https://www.aqr.com/insights/research/white-papers/tail-risk-hedging-contrasting-put-and-trend-strategies) at AQR argue that trend-following (managed futures) provides crash protection more cheaply than puts because trend strategies *earn* a positive premium on average rather than bleeding one. A fair comparison would test both approaches on the same data. Our backtester currently does not support trend-following overlays. ## Future work: Beyond equities The SPY put strategy works, but it may not be the optimal application of Spitznagel's structure. The same logic (steady carry plus cheap convexity on extreme moves) applies wherever there is a reliable asymmetry between calm periods and crises. The best market depends on the regime. In a classic disinflationary recession, rates options may be superior. For portfolios already earning carry, FX may be the most natural fit. 
For institutions with access to OTC markets, credit can offer very strong crisis convexity. VIX is the most direct panic hedge, but often the hardest to own cheaply enough. Several markets exhibit structural tail properties that can be as strong as, or stronger than, equities: **Rates and rate futures.** Central banks tend to cut rates aggressively in crises. The Fed dropped rates from 5.25% to 0.25% during the 2008 crisis, and from 1.5% to 0% in two weeks during COVID. These moves are 10x to 20x larger than normal monthly rate changes. Rate options may underprice these panic-cut scenarios because standard models assume mean-reversion around stable levels. The trade would be: earn the risk-free rate (or hold short-term Treasuries), buy OTM calls on SOFR futures that pay off when rates collapse. The counterexample is stagflation: if inflation is high during a recession, central banks may not cut, and rate-based tail hedges would fail. This makes rates a conditional hedge rather than a universal one. **FX carry trades.** Currencies like AUD/JPY and MXN/JPY offer interest rate differentials of 3 to 5% annually. In stable times, carry traders collect this premium. When risk sentiment shifts, these positions unwind violently. The 2008 crisis saw AUD/JPY drop 40% in weeks. OTM puts on the high-yield currency may be systematically cheap relative to the crash risk because Gaussian models treat carry-trade unwinds as low-probability events. The carry itself could fund the protection. **Credit and CDS.** Investment-grade bonds earn a spread over Treasuries, but credit events are rare and clustered. The barbell structure here is: hold IG bonds for the spread, buy OTM protection on HY or IG CDS indices. The protection bleeds a small annual premium in calm markets. When credit stress hits, the payoff is convex: the 2008 crisis took IG CDS from 50bps to 250bps (5x) and HY CDS from 300bps to 2000bps (6.7x). 
CDS has a natural asymmetry similar to puts: bounded cost (the annual premium), unbounded upside in a credit crisis. **Volatility products.** Buying calls on the VIX is the most direct tail hedge. These options can be extremely convex: when volatility explodes, short-dated VIX calls can reprice very quickly. That is why they are so attractive during panics. But high convexity does not automatically mean a good trade. The key distinction is between **convexity** (how fast the payoff accelerates in a selloff) and **efficiency** (how much convexity you get for the premium you pay). The VIX itself trades around 12 to 15 in calm markets and can spike to 80+ during crashes (it hit 82.69 on March 16, 2020). At first glance, that makes VIX calls look like the perfect hedge. The confusion is that VIX options are not priced on spot VIX. They are priced on **VIX futures**. So even if spot VIX jumps from 15 to 80 intraday, the option payoff depends on how much the relevant VIX future moves, which is usually much less. This means the eye-catching spot spike overstates the actual option payout. Short-dated ATM or slightly OTM VIX calls can still have very strong convexity because they are sensitive to sharp near-term changes in implied volatility. But that convexity is usually expensive because it is obvious and heavily demanded by investors looking for crash insurance. On top of that, the VIX futures curve is often in **contango** in calm markets, which means forward volatility is already priced above spot. That carry drag makes long VIX exposure expensive to hold over time. So VIX calls are not weak because they lack convexity. They are often less efficient because the convexity is expensive, the payoff is filtered through the futures curve rather than spot VIX, and volatility mean-reverts quickly after the panic. Whether the remaining edge is still worth paying for is an empirical question. 
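The contango drag above can be approximated with made-up but typical curve levels (illustrative numbers, not market data):

```python
# If spot VIX and the futures curve are static, a long position in the
# front future "rolls down" toward spot, losing the spread every month.

spot_vix = 14.0       # assumed calm-market spot level
front_future = 15.0   # assumed one-month future in contango

monthly_roll_loss = (front_future - spot_vix) / front_future
annual_drag = 1 - (1 - monthly_roll_loss) ** 12

print(f"monthly roll loss: {monthly_roll_loss:.1%}")
print(f"annualized carry drag: {annual_drag:.0%}")
```

At these levels the drag compounds to roughly half the position per year, which is why long VIX exposure is so expensive to hold through calm periods.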
**Commodities.** Oil crashes during demand shocks, which tend to coincide with equity crashes: crude fell from $145 to $30 in 2008 and briefly went negative in April 2020. OTM puts on crude oil futures would pay off in exactly these scenarios. The directional thesis is weaker than rates or credit because supply shocks push oil the other way (up during crises like the 1973 embargo or 2022 Ukraine war). This makes crude a noisier hedge than the other markets listed here. **Cross-market diversification.** The strongest argument for testing multiple markets is not finding the single best hedge but combining several. Crises are correlated: when equities crash, rates get cut, carry trades unwind, credit spreads blow out, and the VIX spikes. The crash payoffs across markets are positively correlated, but the bleed costs are largely independent (rate option decay has nothing to do with FX option decay). A portfolio that spreads 0.5% of annual premium across four markets would bleed roughly the same total amount as concentrating in one, but the probability of at least one leg paying off in any given crisis is higher. This diversification of bleed with correlation of payoff is the multi-market version of Spitznagel's variance drain argument. If the question is "what is better than SPY puts?", the useful answer is to match the hedge to the portfolio and the regime: - If your core risk is an equity book and you want the simplest implementation, SPY puts remain the clean default. - If your main concern is a deflationary recession with aggressive central-bank cuts, rates options may be better. - If the portfolio already earns carry, FX options on carry trades may be the most natural extension because the carry can help fund the hedge. - If you are an institution with OTC access and real size, credit hedges may offer the best crisis convexity. - If you want the purest panic exposure and can tolerate rich pricing, VIX calls are the cleanest but often the least efficient. 
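The at-least-one-leg argument above can be sketched under a deliberately simplified independence assumption. Real crash payoffs are positively correlated, so this overstates the benefit, and the per-leg probability is arbitrary:

```python
# Spreading the same premium across several hedge legs raises the chance
# that at least one pays off in a given crisis. Under independence (an
# optimistic assumption, so an upper bound on the benefit):

p_leg = 0.6   # assumed chance a single hedge leg pays off in a crisis
n_legs = 4

p_one_leg = p_leg
p_at_least_one = 1 - (1 - p_leg) ** n_legs

print(f"one leg: {p_one_leg:.0%}, four legs: {p_at_least_one:.1%}")
```

Even with heavy correlation in practice, the direction of the effect survives: diversified bleed, concentrated payoff.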
So the practical conclusion is not that one asset dominates. It is that the "best" tail hedge depends on what you already own, what kind of crash you fear, and which markets you can actually trade cheaply and consistently. For most allocators, the strongest implementation is likely to start with equity puts, then diversify a small convexity budget across rates, FX, or credit only when the portfolio, regime, and execution capability justify it. Testing these alternatives requires different data (CME futures and options, CDS term structures, FX options, VIX futures and options) and modifications to the backtester. This is ongoing work. ## Implementation Our backtester uses monthly rebalancing: buy the lowest-premium deep OTM put available within the target delta and expiration range. This is a simple, mechanical strategy. **Funding assumption**: the put budget is treated as an external, fixed annual premium (e.g., 0.5% of portfolio value) rather than being funded by selling SPY. This is the key distinction from AQR's setup. The fair benchmark is not plain SPY alone but SPY plus the same external capital source without the puts (e.g., SPY + 0.5% in cash). Since the premium is small and cash earns the risk-free rate, the benchmark difference is minor (roughly 0.02% per year at 0.5% budget and 4% risk-free rate), but the framing matters: the outperformance comes from the convexity of the puts, not from deploying more total capital. If you instead fund the put budget by reducing SPY, you recover the AQR framing and the results degrade. **Methodology note**: no attempt is made to optimize timing; the strategy is purely rules-based. Real-world frictions (bid/ask spreads, slippage, and taxes) would reduce headline returns but should not remove the convexity effect. Universa's actual implementation is more sophisticated. They manage rolls continuously to maintain their desired exposure profile. 
They reinvest put profits into stocks at crash lows, buying when prices are depressed. They hedge across multiple markets, not just the S&P 500. Our backtest results are a lower bound on the performance of the actual strategy. ## Code The backtester and all notebooks are open source: [github.com/lambdaclass/options_backtester](https://github.com/lambdaclass/options_backtester) - [spitznagel_case.ipynb](https://github.com/lambdaclass/options_backtester/blob/master/notebooks/spitznagel_case.ipynb): full Spitznagel analysis with both framings, AQR vs Universa comparison, grid search, out-of-sample validation - [paper_comparison.ipynb](https://github.com/lambdaclass/options_backtester/blob/master/notebooks/paper_comparison.ipynb): 10 strategies tested against academic claims - [findings.ipynb](https://github.com/lambdaclass/options_backtester/blob/master/notebooks/findings.ipynb): allocation sweeps, macro signal evaluation, crash period analysis - [volatility_premium.ipynb](https://github.com/lambdaclass/options_backtester/blob/master/notebooks/volatility_premium.ipynb): variance risk premium deep dive - [trade_analysis.ipynb](https://github.com/lambdaclass/options_backtester/blob/master/notebooks/trade_analysis.ipynb): per-trade P&L with signal overlays The performance-critical paths (inventory joins, grid sweeps, filter evaluation) have an optional Rust core via PyO3 and Polars, providing 10-50x speedups. The parallel grid sweep uses Rayon for shared-memory parallelism. ## Disclaimer This article is research and educational material only. It is not financial advice, investment advice, or a recommendation to buy or sell any security or derivative. Past performance, whether backtested or live, does not guarantee future results. Options trading involves substantial risk of loss. The backtest results presented here are gross of transaction costs, taxes, and slippage, and may not be replicable in live trading. 
Consult a qualified financial advisor before making any investment decisions. ## References - Barro, R. (2006). [*Rare Disasters and Asset Markets in the Twentieth Century*](https://academic.oup.com/qje/article-abstract/121/3/823/1917876). Quarterly J. Economics, 121(3). - Bollerslev, T., Tauchen, G., and Zhou, H. (2009). [*Expected Stock Returns and Variance Risk Premia*](https://scholars.duke.edu/publication/732839). Review of Financial Studies, 22(11). - Bouchaud, J.-P., Iori, G., and Sornette, D. (1996). [*Real-World Options: Smile and Residual Risk*](https://www.cfm.com/wp-content/uploads/2022/12/237-1994-real-world-options-smile-and-residual-risk.pdf). Risk, 9(3). - Carr, P. and Wu, L. (2009). [*Variance Risk Premiums*](https://academic.oup.com/rfs/article-abstract/22/3/1311/1581057). Review of Financial Studies, 22(3). - Hurst, B., Ooi, Y. H., and Pedersen, L. H. (2017). [*Tail Risk Hedging: Contrasting Put and Trend Strategies*](https://www.aqr.com/insights/research/white-papers/tail-risk-hedging-contrasting-put-and-trend-strategies). AQR White Paper. - Israelov, R. (2019). [*Pathetic Protection: The Elusive Benefits of Protective Puts*](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2934538). J. Alternative Investments, 21(3). - Kelly, B. and Jiang, H. (2014). [*Tail Risk and Asset Prices*](https://academic.oup.com/rfs/article/27/10/2841/1607080). Review of Financial Studies, 27(10). - Nielsen, L., Villalon, D., and Berger, A. (2011). [*Chasing Your Own Tail (Risk)*](https://www.aqr.com/-/media/AQR/Documents/Insights/White-Papers/AQR-Chasing-Your-Own-Tail-Risk.pdf). AQR White Paper. - Peters, O. (2019). [*The Ergodicity Problem in Economics*](https://www.nature.com/articles/s41567-019-0732-0). Nature Physics, 15. - Sornette, D. (2003). [*Why Stock Markets Crash*](https://press.princeton.edu/books/paperback/9780691175959/why-stock-markets-crash). Princeton University Press. - Spitznagel, M. (2021). 
[*Safe Haven: Investing for Financial Storms*](https://www.wiley.com/en-us/Safe+Haven%3A+Investing+for+Financial+Storms-p-9781119401797). Wiley. - Taleb, N. N. (2007). [*The Black Swan: The Impact of the Highly Improbable*](https://en.wikipedia.org/wiki/The_Black_Swan:_The_Impact_of_the_Highly_Improbable). Random House. - Taleb, N. N. (2012). [*Antifragile: Things That Gain from Disorder*](https://en.wikipedia.org/wiki/Antifragile_(book)). Random House. - Taleb, N. N. (2020). [*Statistical Consequences of Fat Tails*](https://arxiv.org/abs/2001.10488). STEM Academic Press. - Whaley, R. (2002). *Return and Risk of CBOE Buy Write Monthly Index*. J. Derivatives. - See [REFERENCES.md](https://github.com/lambdaclass/options_backtester/blob/master/REFERENCES.md) for the full list. [^sp500_2007_2009]: The S&P 500 closed at 1,565.15 on October 9, 2007 and 676.53 on March 9, 2009, a 56.8% decline on closing prices. The SOA Research Brief Table 3 reports −59%, likely using intraday highs and lows. See [SOA Research Brief (Apr 16, 2020)](https://www.soa.org/globalassets/assets/files/resources/research-report/2020/2020-covid-19-research-brief-04-16.pdf). [^sp500_2020]: The same brief notes: "the S&P 500 cratered on March 23, down 34% from its February 19 level." See [SOA Research Brief (Apr 16, 2020)](https://www.soa.org/globalassets/assets/files/resources/research-report/2020/2020-covid-19-research-brief-04-16.pdf). [^universa_2020]: Bloomberg reports the fund "returned 3,612% in March" and that this came "according to an investor letter ... obtained by Bloomberg." See [Taleb-Advised Universa Tail Fund Returned 3,600% in March](https://www.bloomberg.com/news/articles/2020-04-08/taleb-advised-universa-tail-risk-fund-returned-3-600-in-march). 
--- ### Detecting Crashes with Fat-Tail Statistics *Published: 2026-02-19* > We built fatcrash, a Rust+Python toolkit with 17 crash detection methods: LPPLS, DFA, EVT, Hill, Kappa, Hurst, GSADF, momentum/reversal, price velocity, and 2 neural network methods. Tested on 96 drawdowns across BTC, SPY, Gold, 23 forex pairs, and equity crises with honest precision/recall/F1 metrics. Plus: which methods transfer to revenue and profit data. URL: https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/ Financial markets don't follow normal distributions. That is a claim about frequency, not just theory: it tells you how often catastrophic events happen. Under a Gaussian model, the 2008 financial crisis was a 25-sigma event. Something that should happen once every $10^{135}$ years. It happened on a Tuesday. The problem is that we keep using tools designed for thin-tailed worlds. **Value at Risk** (**VaR**) models that assume normality. Risk metrics that treat the 2008 crash as an "outlier" rather than a regular feature of financial returns. I built [fatcrash](https://github.com/unbalancedparentheses/fatcrash), a Rust+Python toolkit with 17 methods (15 classical + 2 neural network), to test whether fat-tail statistical methods can detect crashes before they happen. The performance-critical math (fitting, simulation, all rolling estimators) runs in Rust via PyO3; the neural network methods use PyTorch; everything else (data, viz, CLI) is Python. ## What are fat tails? A **fat-tailed distribution** is one where extreme events happen far more often than a bell curve (Gaussian distribution) would predict. In a normal distribution, an event five standard deviations from the mean is essentially impossible, roughly a one-in-3.5-million chance. In a fat-tailed distribution, such events are uncommon but not rare. They show up regularly in financial data. The technical way to describe this is through the **tail index**, usually written $\alpha$. 
A fat-tailed distribution follows a **power law** in the extremes: the probability of a loss larger than $x$ decays as $P(X > x) \sim x^{-\alpha}$. The smaller the $\alpha$, the fatter the tail and the more likely extreme events are. Here is a rough guide to what different values of $\alpha$ mean: - $\alpha < 2$: **Infinite variance.** The distribution is so fat-tailed that the variance doesn't converge with more data. Standard statistics like standard deviation and correlation become unreliable. This is [Cauchy distribution](https://en.wikipedia.org/wiki/Cauchy_distribution) territory. - $\alpha$ between 2 and 4: **Finite variance but infinite kurtosis.** Kurtosis measures how "peaked" a distribution is and how heavy its tails are. When kurtosis is infinite, sample estimates of it are unstable and misleading. This is where most financial assets live. - $\alpha > 4$: **Relatively thin tails.** Still fatter than Gaussian, but manageable with conventional tools. The 15 classical methods in fatcrash fall into three groups: **bubble detection** (finding the specific price pattern that precedes a crash), **regime detection** (spotting shifts in how the market behaves over time, including momentum reversals and volatility cascades), and **tail estimation** (measuring how fat the tails actually are). The remaining 2 are neural network methods that learn LPPLS-like patterns from data. Let's walk through each group. ## Bubble detection These methods look for structural patterns in prices, not statistics of returns. A bubble is a regime of super-exponential growth, prices rising faster and faster, that eventually becomes unsustainable. ### LPPLS: detecting bubbles before they burst The **Log-Periodic Power Law Singularity** model takes a fundamentally different approach from statistical methods. Instead of measuring properties of returns, it detects a specific pattern in prices: **the bubble signature**.
The theory, developed by [Didier Sornette](https://en.wikipedia.org/wiki/Didier_Sornette) at ETH Zurich and described in his book [*Why Stock Markets Crash*](https://press.princeton.edu/books/paperback/9780691175959/why-stock-markets-crash), proposes that during a bubble, prices follow super-exponential growth decorated with accelerating oscillations that converge toward a **critical time** $t_c$, the most likely crash date. Think of it like a wine glass vibrating at increasing frequency before it shatters. $$\ln p(t) = A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \ln(t_c - t) + \phi)$$ In plain language: the logarithm of price is the sum of a smooth power-law growth (the $B$ term) and an oscillation whose frequency accelerates as you approach $t_c$ (the cosine term). The seven parameters encode specific dynamics: - $t_c$: critical time (when the bubble is most likely to end) - $m$: power law exponent (must be 0.1-0.9 for a valid bubble) - $\omega$: log-periodic frequency (must be 6-13) - $B < 0$: required, indicates super-exponential growth - $A, C, \phi$: amplitude and phase parameters Fitting this is computationally expensive. For each candidate $(t_c, m, \omega)$, the linear parameters $(A, B, C_1, C_2)$ are solved analytically via OLS. The nonlinear search uses a population-based stochastic optimizer over the 3D space. The **Sornette filter** rejects fits that don't satisfy the physical constraints. The **DS LPPLS confidence indicator** fits this model across many overlapping time windows. If a high fraction of windows produce valid bubble fits, confidence is high. **In practice**: With the tightened Nielsen (2024) filter (omega [6,13]) and a critical-time proximity constraint, LPPLS achieves 74% recall and 37% precision (F1=50%) across 39 drawdowns. It detects the bubble regime itself (super-exponential growth + oscillations), which precedes both small corrections and major crashes. 
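The LPPLS equation above, transcribed directly as code. The parameter values are arbitrary placeholders, not a fitted bubble:

```python
import math

def lppls_log_price(t, tc, m, omega, A, B, C, phi):
    """ln p(t) = A + B*(tc-t)^m + C*(tc-t)^m * cos(omega*ln(tc-t) + phi)."""
    dt = tc - t
    if dt <= 0:
        raise ValueError("the model is only defined for t < tc")
    return A + B * dt ** m + C * dt ** m * math.cos(omega * math.log(dt) + phi)

# Sornette-filter style constraints on a candidate parameter set
# (illustrative values, not a real fit):
params = dict(tc=1000.0, m=0.5, omega=8.0, A=7.0, B=-0.3, C=0.05, phi=0.0)
valid = 0.1 <= params["m"] <= 0.9 and 6 <= params["omega"] <= 13 and params["B"] < 0
```

The full fit wraps this in the OLS-plus-stochastic-search procedure described above; the point here is only how the oscillation accelerates as $t \to t_c$.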
The LPPLS confidence indicator (multi-window aggregation) reaches 90% recall but 29% precision. The high false positive rate is inherent: LPPLS frequently detects "bubble signatures" during normal bull markets because super-exponential growth patterns are common in trending markets. ### GSADF: explosive unit roots The **Generalized Sup ADF** (Augmented Dickey-Fuller) test, introduced by [Phillips, Shi, and Yu (2015)](https://onlinelibrary.wiley.com/doi/abs/10.1111/iere.12132), detects explosive behavior in prices. To understand it, you need one concept: a **unit root**. A time series has a unit root if it follows a random walk, meaning today's value is yesterday's value plus random noise, with no tendency to return to a long-run average. An **explosive** series goes further: it grows faster than a random walk, each day's value is a *multiple* of yesterday's, like compound interest gone wild. GSADF runs backward-expanding [ADF unit root tests](https://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test) across all possible start and end dates, taking the supremum (the largest test statistic). If this supremum exceeds Monte Carlo critical values, the series is explosive. GSADF is complementary to LPPLS. LPPLS detects the specific log-periodic oscillation pattern. GSADF detects any form of explosive growth, regardless of the oscillation structure. **In practice**: GSADF detected 38% of drawdowns overall, but 59% of medium-sized drawdowns (15-30%). It achieves 38% precision, the highest among all methods, because explosive unit root tests are more specific than distributional measures. ## Regime detection These methods look at how the *temporal structure* of returns changes over time. Before a crash, markets often shift from noisy, mean-reverting behavior to strong, persistent trending, a regime change that tail estimators can't see because they only measure distributional shape, not temporal dependence. 
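The distinction above (distributional shape vs. temporal dependence) can be made concrete: shuffling a series preserves its histogram, which is all a tail estimator sees, but destroys its autocorrelation, which is what regime detectors measure. A minimal sketch with a synthetic AR(1) series:

```python
import random

random.seed(3)

# persistent AR(1) series: x_t = 0.8 * x_{t-1} + noise
x, prev = [], 0.0
for _ in range(10_000):
    prev = 0.8 * prev + random.gauss(0, 1)
    x.append(prev)

def lag1_autocorr(s):
    """Sample lag-1 autocorrelation."""
    m = sum(s) / len(s)
    num = sum((a - m) * (b - m) for a, b in zip(s, s[1:]))
    den = sum((a - m) ** 2 for a in s)
    return num / den

shuffled = x[:]
random.shuffle(shuffled)

ac_original = lag1_autocorr(x)         # near the AR coefficient, 0.8
ac_shuffled = lag1_autocorr(shuffled)  # near zero: same values, no memory
```

Both series would hand a tail estimator identical inputs; only the unshuffled one looks dangerous to DFA or Hurst.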
### DFA: detrended fluctuation analysis **Detrended Fluctuation Analysis**, introduced by [Peng et al. (1994)](https://en.wikipedia.org/wiki/Detrended_fluctuation_analysis), measures **long-range dependence** in non-stationary time series, that is, whether today's returns are correlated with returns from days or weeks ago. The method works by dividing the integrated series (cumulative sum of returns) into windows, fitting a local polynomial trend in each window, computing the root-mean-square residual (how much the data deviates from the local trend), and checking how that residual scales with window size: $$F(n) \sim n^{\alpha_{\text{DFA}}}$$ This formula says: the fluctuation $F$ at scale $n$ (window size) grows as a power of $n$. The **scaling exponent** $\alpha_{\text{DFA}}$ classifies the dynamics: - $\alpha_{\text{DFA}} = 0.5$: uncorrelated (random walk), no memory in the series - $\alpha_{\text{DFA}} > 0.5$: **persistent** (trends tend to continue), an up day makes another up day more likely - $\alpha_{\text{DFA}} < 0.5$: **anti-persistent** (mean-reverting), an up day makes a down day more likely Before a crash, markets often transition from mean-reverting to persistent dynamics. DFA picks up this regime shift. The key advantage over simpler methods is the **detrending step**: by removing local polynomial trends before measuring fluctuations, DFA separates genuine long-range dependence from spurious correlations caused by local trends. **In practice**: DFA was the best non-bubble crash detector in our tests (82% recall, 22% precision, F1=34%). It handles non-stationarity better than Hurst's R/S analysis because of that detrending step. The low precision reflects the fact that persistent dynamics are common in financial markets even outside crash windows.
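A compact pure-Python version of the DFA recipe above (order-1 detrending, slope via least squares). This is a didactic sketch; fatcrash's rolling estimator runs in Rust and handles details differently.

```python
import math
import random

def dfa_alpha(x, scales=(8, 16, 32, 64, 128)):
    """Order-1 DFA scaling exponent of the series x."""
    # 1. integrate the mean-centered series
    mean = sum(x) / len(x)
    y, acc = [], 0.0
    for v in x:
        acc += v - mean
        y.append(acc)
    log_n, log_f = [], []
    for n in scales:
        sq_sum, count = 0.0, 0
        for start in range(0, len(y) - n + 1, n):
            seg = y[start:start + n]
            # 2. least-squares linear trend in the window
            t_mean, s_mean = (n - 1) / 2, sum(seg) / n
            beta = sum((i - t_mean) * (s - s_mean) for i, s in enumerate(seg)) \
                / sum((i - t_mean) ** 2 for i in range(n))
            # 3. squared residuals around the local trend
            sq_sum += sum((s - (s_mean + beta * (i - t_mean))) ** 2
                          for i, s in enumerate(seg))
            count += n
        # F(n) = sqrt(mean squared residual)
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq_sum / count))
    # 4. alpha is the slope of log F(n) against log n
    ln_m = sum(log_n) / len(log_n)
    lf_m = sum(log_f) / len(log_f)
    return sum((a - ln_m) * (b - lf_m) for a, b in zip(log_n, log_f)) \
        / sum((a - ln_m) ** 2 for a in log_n)

random.seed(1)
white_noise = [random.gauss(0, 1) for _ in range(4096)]
```

For uncorrelated input like `white_noise` the estimate lands near 0.5; persistent input pushes it above, which is the pre-crash signature described in the text.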
### Hurst exponent: persistence detection The **Hurst exponent**, introduced by [Harold Edwin Hurst](https://en.wikipedia.org/wiki/Hurst_exponent) in 1951 while studying Nile river flooding patterns, measures long-range dependence via **rescaled range (R/S) analysis**. For a time series of length $n$, compute the range of cumulative deviations from the mean, rescale by the standard deviation, and measure how $R/S$ scales with $n$: $$\frac{R}{S} \sim n^H$$ This formula says: the rescaled range grows as a power of the sample size. $H = 0.5$ is a random walk. $H > 0.5$ is persistent. $H < 0.5$ is mean-reverting. Financial assets typically show $H$ between 0.55 and 0.85. A shift toward higher $H$ before a crash means the market is trending more strongly, which often accompanies bubble formation. **In practice**: Hurst detected 59% of drawdowns. It is simpler than DFA but less robust to non-stationarity, because it doesn't remove local trends before measuring the range. ### Spectral exponent: frequency domain The **GPH log-periodogram regression**, introduced by [Geweke and Porter-Hudak (1983)](https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9892.1983.tb00371.x), estimates the **long-memory parameter** $d$ from the frequency domain. Instead of looking at how correlations decay over time (as DFA and Hurst do), it looks at how much power the signal has at different frequencies. The relationship to Hurst: $d = H - 0.5$. Positive $d$ indicates long memory (persistence). Think of it as measuring the same phenomenon (long-range dependence) but through a different lens: time domain vs. frequency domain. **In practice**: It detected 28% of drawdowns, comparable to Hill. The frequency-domain approach is theoretically elegant but doesn't add much beyond what DFA already captures for crash detection. ### Momentum and reversal **Momentum** is the tendency for assets that have been rising to keep rising, and assets that have been falling to keep falling. 
[Jegadeesh and Titman (1993)](https://doi.org/10.1111/j.1540-6261.1993.tb04702.x) documented this effect in equities: buying past winners and selling past losers produces positive returns over 3-12 month horizons. For crash detection, the signal is not momentum itself but its **reversal**. When long-term momentum is strongly positive (the asset has been trending up for months) but short-term momentum turns sharply negative (the last few weeks show a sudden decline), the divergence signals a potential crash — the trend is breaking. The reversal signal is computed as: $$\text{reversal} = \text{mom}_{\text{long}} - \text{mom}_{\text{short}}$$ where $\text{mom}(k) = \ln(P_t / P_{t-k})$ is the log-return over $k$ periods. A large positive reversal means the long-term trend is still up but the short-term move is down — exactly the pattern seen at the onset of major crashes, when a strong bull market suddenly reverses. [Scowcroft and Sefton (2005)](https://doi.org/10.1057/palgrave.jam.2240167) showed that momentum returns are partly compensation for crash risk: momentum strategies are profitable on average but suffer catastrophic losses during market reversals. This is the same asymmetry in reverse — the crash unwinds the positions that momentum built up. **In practice**: Momentum reversal is not used as a standalone detector (it doesn't have its own precision/recall row). Instead, it feeds into the combined detector as an independent signal. A large positive reversal — strong 12-month returns but negative 1-month returns — is a danger sign that the existing regime methods (DFA, Hurst) can't see because they measure temporal dependence, not price-level divergence. ### Price velocity: cascade detection **Price velocity** measures the rate of change of realized volatility — not how volatile the market is, but how fast volatility is *accelerating*. 
A sudden spike in volatility acceleration often signals a forced-liquidation cascade: margin calls triggering sales, which trigger more margin calls, which trigger more sales. $$\text{velocity} = \frac{\sigma_t - \sigma_{t-\text{lag}}}{\sigma_{t-\text{lag}}}$$ where $\sigma_t$ is the realized volatility (standard deviation of recent returns) and $\text{lag}$ is the lookback for the rate of change. When velocity exceeds a threshold, the market is in a self-reinforcing volatility spiral. The Feb 5, 2018 "Volmageddon" is the canonical example: the VIX doubled in a single day, triggering the liquidation of short-volatility ETNs (XIV, SVXY), which forced further VIX buying, which triggered more liquidations. The Sep 2019 repo rate spike followed a similar pattern in interest rate markets. In both cases, the price velocity — the acceleration of volatility, not its level — distinguished a mechanical cascade from ordinary high-vol conditions. **In practice**: Like momentum reversal, price velocity is not a standalone detector. It feeds into the combined detector as an independent signal. Its value is in catching a specific failure mode that other methods miss: the mechanical cascade, where the crash *causes itself* through forced selling. Tail estimators measure the distribution of returns; DFA and Hurst measure persistence; velocity measures the feedback loop in real time. ## Tail estimation These methods directly measure the fatness of the tails, how extreme the extremes really are. They answer questions like: "How often should we expect a 10% daily loss?" and "Is the variance of this distribution even finite?" ### Hill estimator: measuring tail heaviness The [Hill estimator](https://en.wikipedia.org/wiki/Hill_estimator) ([Hill, 1975](http://www.econ.uiuc.edu/~econ536/Papers/hill75.pdf)) is the most widely used tail index estimator. It fits a power law to the extreme values of a distribution and estimates the exponent $\alpha$. 
The estimator works by sorting the data from largest to smallest, taking the $k$ largest observations (called **order statistics**, which is just a fancy name for sorted values), and computing: $$\hat{\alpha} = k \left( \sum_{i=1}^{k} \ln \frac{X_{(i)}}{X_{(k)}} \right)^{-1}$$ where $X_{(1)} \geq X_{(2)} \geq \ldots$ are the order statistics. In words: take the $k$ biggest values, compute how far each is from the $k$-th largest (in log scale), average those distances, and invert. A small average log-gap means the extreme values are tightly packed (thin tail), and a large gap means they're spread out (fat tail). The choice of $k$ matters enormously. Too small and the estimate is noisy (not enough data points). Too large and you're including observations from the body of the distribution, not the tail. A **Hill plot** ($\alpha$ vs. $k$) helps find the plateau where the estimate stabilizes. **In practice**: Hill alpha is useful for characterization: it tells you *what kind of distribution you're dealing with*. But as a standalone crash predictor, it's noisy. In our tests, it only detected 28% of drawdowns. ### Kappa metrics: how far from Gaussian? Two metrics answer this question from different angles. **Taleb's kappa**, introduced in [*Statistical Consequences of Fat Tails*](https://arxiv.org/abs/2001.10488) by [Nassim Nicholas Taleb](https://en.wikipedia.org/wiki/Nassim_Nicholas_Taleb), measures how fast the **mean absolute deviation** (MAD), the average distance of observations from their mean, converges as you add more data. For well-behaved distributions, the MAD stabilizes quickly. For fat-tailed ones, it doesn't, because new extreme observations keep pulling the average around. The formula compares the MAD at two sample sizes $n_0$ and $n$: $$\kappa = 2 - \frac{\log n - \log n_0}{\log M(n) - \log M(n_0)}$$ where $M(n)$ is the MAD for $n$ summands. For a Gaussian, $\kappa = 0$ (fast convergence). 
For a [Cauchy distribution](https://en.wikipedia.org/wiki/Cauchy_distribution), $\kappa = 1$ (no convergence at all). Values between 0 and 1 measure the degree of fat-tailedness. **Max-stability kappa** takes a different approach rooted in [extreme value theory](https://en.wikipedia.org/wiki/Extreme_value_theory). The intuition: in a fat-tailed distribution, the single most extreme observation dominates everything. If you split your data into subsamples and find the maximum of each subsample, those subsample maxima will be much smaller than the overall maximum, because the one truly extreme value ended up in just one subsample. In a Gaussian distribution, the subsample maxima would be closer to the overall maximum. Formally: split your data into $n$ subsamples. Find the maximum of each subsample. Compare the mean of those maxima to the overall maximum: $$\kappa_{\text{max}} = \frac{\text{mean of subsample maxima}}{\text{overall maximum}}$$ For a Gaussian distribution, this ratio converges to a known benchmark (approximately $1/\sqrt{n}$ for large samples). For fat-tailed distributions, $\kappa_{\text{max}}$ falls below the benchmark because extreme observations are *much* more extreme than what you'd see in any subsample. The ratio $\kappa_{\text{max}} / \text{benchmark}$ is the signal: - Near 1.0: behaves Gaussian - Below 0.8: significantly fat-tailed - Below 0.5: extremely fat-tailed, crisis regime fatcrash implements both variants. **In practice**: Max-stability kappa was the best tail-based method in our tests (49% overall detection rate). It's more robust than Hill because it doesn't depend on choosing $k$, and it directly benchmarks against Gaussian via Monte Carlo simulation. Taleb's kappa detected 33% of drawdowns but is more useful for long-term characterization than short-term prediction. 
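The max-stability idea is compact enough to sketch in a few lines. This is an illustrative reimplementation, not fatcrash's actual code; the subsample count, trial count, and the Cauchy stand-in data are arbitrary choices:

```python
import numpy as np

def max_stability_ratio(x, n_sub=10, rng=None):
    """Mean of subsample maxima divided by the overall maximum of |x|."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = np.abs(np.asarray(x, dtype=float))
    shuffled = x[rng.permutation(len(x))]               # randomize before splitting
    sub_maxima = [s.max() for s in np.array_split(shuffled, n_sub)]
    return np.mean(sub_maxima) / x.max()

def gaussian_benchmark(n, n_sub=10, trials=200, rng=None):
    """Monte Carlo estimate of the same ratio for Gaussian samples of size n."""
    if rng is None:
        rng = np.random.default_rng(1)
    return np.mean([max_stability_ratio(rng.standard_normal(n), n_sub, rng)
                    for _ in range(trials)])

rng = np.random.default_rng(42)
returns = rng.standard_cauchy(5_000)        # fat-tailed stand-in for crisis returns
signal = max_stability_ratio(returns, rng=rng) / gaussian_benchmark(5_000)
# signal well below 1.0 flags fat tails; near 1.0 looks Gaussian
```

For Cauchy data the single largest draw dwarfs every subsample maximum, so the ratio collapses relative to the Gaussian benchmark, which is exactly the below-0.8 regime described above.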
### Pickands estimator: domain-agnostic tail index The [Pickands estimator](https://en.wikipedia.org/wiki/Pickands%E2%80%93Balkema%E2%80%93de_Haan_theorem) ([Pickands, 1975](https://projecteuclid.org/journals/annals-of-statistics/volume-3/issue-1/Statistical-Inference-Using-Extreme-Order-Statistics/10.1214/aos/1176343003.full)) estimates the **extreme value index** $\gamma$ using just three order statistics, making it valid for all three domains of attraction (Frechet, Gumbel, Weibull, explained below in the EVT section): $$\hat{\gamma} = \frac{1}{\ln 2} \ln \frac{X_{(k)} - X_{(2k)}}{X_{(2k)} - X_{(4k)}}$$ In words: look at the gaps between the $k$-th, $2k$-th, and $4k$-th largest values. If the gap between the top values is much larger than the gap further down, the tail is fat ($\gamma > 0$). If the gaps are similar, the tail is exponential ($\gamma \approx 0$). If the top gap is smaller, the tail is bounded ($\gamma < 0$). Unlike Hill, which assumes the tail is Pareto (Frechet domain only), Pickands works regardless of the tail type. **In practice**: Pickands detected 49% of drawdowns, matching max-stability kappa. Its domain-agnostic nature makes it a useful cross-check on Hill. ### DEH moment estimator The **Dekkers-Einmahl-de Haan moment estimator** ([Dekkers, Einmahl, and de Haan, 1989](https://projecteuclid.org/journals/annals-of-statistics/volume-17/issue-4/A-Moment-Estimator-for-the-Index-of-an-Extreme-Value/10.1214/aos/1176347397.full)) uses first and second moments (averages and averages of squares) of log-spacings between order statistics. Like Pickands, it is valid for all domains of attraction, but it uses more data points from the tail, which makes it less volatile. **In practice**: It detected 46% of drawdowns. ### QQ estimator The **QQ estimator** computes the tail index from the slope of a log-log **QQ plot** (quantile-quantile plot) against exponential quantiles. 
A QQ plot compares the observed distribution against a theoretical one; if the points fall on a straight line, the distributions match. The slope of that line in log-log space gives you the tail index.

**In practice**: It detected 38% of drawdowns.

### Maximum-to-Sum ratio

The **Maximum-to-Sum ratio** is a direct diagnostic for infinite moments. For $n$ observations, compute:

$$R_n = \frac{\max(|X_i|)}{\sum(|X_i|)}$$

In words: what fraction of the total absolute value comes from the single largest observation? If one observation dominates the entire sum, the tails are extremely heavy. More precisely, $\max(|X_i|^p) / \sum |X_i|^p \to 0$ as $n$ grows if and only if $E|X|^p$ is finite. So if $R_n$ stays bounded away from zero as $n$ grows, the mean of $|X|$ is infinite (and hence so is the variance); computing the same ratio on squared observations tests the variance directly.

**In practice**: It detected 31% of drawdowns.

### EVT: quantifying worst-case scenarios

**Extreme Value Theory** ([EVT](https://en.wikipedia.org/wiki/Extreme_value_theory)) is the standard mathematical framework for modeling tail risk. Instead of fitting a distribution to all the data (where the bulk dominates and extreme events are treated as noise), EVT focuses only on the extremes. Two complementary approaches:

**GPD (Generalized Pareto Distribution)**: Pick a high threshold $u$ (say, losses worse than the 95th percentile). Fit the [GPD](https://en.wikipedia.org/wiki/Generalized_Pareto_distribution) to losses that exceed $u$. The [Pickands-Balkema-de Haan theorem](https://en.wikipedia.org/wiki/Pickands%E2%80%93Balkema%E2%80%93de_Haan_theorem) guarantees that for sufficiently high $u$, the exceedances follow a GPD regardless of the underlying distribution. The GPD has two parameters: scale ($\sigma$, how spread out the exceedances are) and shape ($\xi$, how fat the tail is). From these you get:

$$\text{VaR}_p = u + \frac{\sigma}{\xi}\left[\left(\frac{n}{N_u}(1-p)\right)^{-\xi} - 1\right]$$

$$\text{ES}_p = \frac{\text{VaR}_p + \sigma - \xi u}{1 - \xi}$$

**VaR** (Value at Risk) tells you the loss you won't exceed with probability $p$.
For example, a 99% VaR of 5% means that on 99% of days, you'll lose less than 5%. **ES** (**Expected Shortfall**, also called Conditional VaR) tells you the average loss *when you do* exceed VaR, answering the question "when things go badly, how bad do they get on average?" **GEV (Generalized Extreme Value)**: Instead of exceedances over a threshold, fit to **block maxima** (e.g., the worst loss each month). The [Fisher-Tippett-Gnedenko theorem](https://en.wikipedia.org/wiki/Fisher%E2%80%93Tippett%E2%80%93Gnedenko_theorem) guarantees that block maxima converge to a GEV distribution. The shape parameter $\xi$ tells you the tail type: - $\xi > 0$: **Frechet** (fat tail, power-law decay), typical for finance - $\xi \approx 0$: **Gumbel** (exponential tail) - $\xi < 0$: **Weibull** (bounded tail, there's a maximum possible value) **In practice**: GPD VaR detected 42% of drawdowns. It works well for medium corrections but struggles with major crashes because the pre-crash period is itself volatile, making the baseline VaR already elevated. ## Results on 39 drawdowns We tested all methods on 39 drawdowns across three assets (BTC, SPY, Gold). A **drawdown** is defined as a peak-to-trough decline in daily close; the pre-crash window is the 120 trading days before the peak, and the calm window is a similar period ending well before the peak. A method "detects" a crash if its signal during the pre-crash window is significantly elevated compared to the calm window. We also test each method on ~150 non-crash windows (50 per asset, sampled at least 180 days from any crash) to measure false positive rates. This gives us precision, recall, and F1 — not just recall. The table below shows the 13 classical methods that have standalone precision/recall/F1 scores. The 2 neural network methods (M-LNN, P-LNN) are evaluated separately; momentum reversal and price velocity are used as signals in the combined detector rather than standalone detectors. 
| Method | Precision | Recall | F1 | |--------|:---------:|:------:|:--:| | **LPPLS** | **37%** | **74%** | **50%** | | **LPPLS confidence** | **29%** | **90%** | **43%** | | GSADF | 38% | 38% | 38% | | **DFA** | **22%** | **82%** | **34%** | | Hurst | 19% | 59% | 28% | | Pickands | 19% | 49% | 27% | | Kappa | 19% | 49% | 27% | | DEH | 18% | 46% | 26% | | Spectral | 22% | 28% | 25% | | Taleb Kappa | 20% | 33% | 25% | | QQ | 16% | 38% | 23% | | GPD VaR | 12% | 42% | 19% | | Max-to-Sum | 12% | 31% | 18% | | Hill | 12% | 28% | 16% | **Precision** = how often a signal is correct (TP/(TP+FP)). **Recall** = how many crashes are caught (TP/(TP+FN)). **F1** = harmonic mean of both. **Why precision is low for tail/regime methods**: These methods detect distributional regime shifts (tail thickening, persistent dynamics), not crash-specific patterns. They fire in many non-crash periods because fat tails and persistence are pervasive in financial data. This is by design — they measure the distributional regime, not a specific crash. LPPLS and GSADF have higher precision because they detect bubble-specific structure. **The Sornette-Bouchaud debate on precision vs recall:** Sornette (the LPPLS inventor) argues that LPPLS is deliberately tuned for high recall at the cost of precision because the cost function is asymmetric — missing a crash is far more expensive than a false alarm. He calls false positives "failed predictions" and argues they are inevitable: bubbles can end in slow deflation rather than sharp crashes. His 2024 paper with Nielsen ([arXiv:2405.12803](https://arxiv.org/abs/2405.12803)) introduced the tightened omega [6,13] range specifically to improve precision without sacrificing recall. Bouchaud takes a more skeptical view. In his work at CFM and in papers with Potters, he emphasizes that fat-tail estimators (Hill, etc.) measure *unconditional* properties of returns and are poor at *conditional* crash prediction. 
His point is exactly what the data shows: tail estimators have decent recall but low precision because fat tails are always present, not just before crashes. He favors portfolio-level risk measures (drawdown control, volatility targeting) over point-in-time crash prediction.

Both perspectives are reflected in fatcrash: LPPLS targets the mechanism (Sornette's approach), tail estimators measure the regime (which Bouchaud correctly notes is always fat-tailed), and the aggregator combines both — using Sornette-style bubble detection as the primary signal and Bouchaud-style regime measurement as confirmation.

Recall by crash size shows that LPPLS confidence catches 93% of small, 94% of medium, and 75% of major crashes. DFA catches 86% of small and 88% of medium crashes.

### Major known crashes

Testing on four major crashes with pre-crash vs. calm period comparison:

| Crash | Kappa | GPD VaR | LPPLS | Hill |
|-------|:---:|:---:|:---:|:---:|
| 2017 BTC Bubble | detected | detected | detected | missed |
| 2021 BTC Crash | detected | detected | detected | missed |
| 2008 Financial Crisis | detected | detected | detected | detected |
| COVID Crash 2020 | detected | — | detected | missed |

Kappa and LPPLS each detected all four. GPD VaR detected 3 of 4. Hill detected 1 of 4. These are recall numbers — false positive rates are reported in the table above.

### Why Hill underperforms

Hill measures the tail index of the *return distribution*, but this property changes slowly. A 6-month pre-crash window doesn't necessarily have fatter tails than a 6-month calm window, because the calm window might include its own mini-shocks. The Hill estimator is useful for long-term characterization (this asset has $\alpha=3$, that one has $\alpha=4$) but not for short-term prediction.

### Why LPPLS leads on F1

LPPLS detects *structure*, not statistics. It's looking for a specific pattern: accelerating growth with log-periodic oscillations.
This pattern appears before both 10% corrections and 80% crashes. The tail-based methods need to see the tail *thickening*, which requires the crash to already be underway. LPPLS sees the bubble building.

With the tightened Nielsen (2024) filter (omega restricted to [6,13] instead of the original loose [2,25], plus a critical-time proximity constraint requiring the critical time $t_c$ to fall no more than 40% of the window length beyond the end of the fitting window), LPPLS achieves the best F1 score (50%) by balancing recall (74%) and precision (37%). The LPPLS confidence indicator trades precision (29%) for higher recall (90%) by aggregating across many sub-windows.

The relatively low precision is inherent: LPPLS frequently detects "bubble signatures" during normal bull markets because super-exponential growth patterns are common. The solution is combining it with the tail-based and regime methods. If LPPLS says "bubble" and kappa says "tails thickening" and DFA shows persistent dynamics, the signal is more reliable.

### Why DFA is the best non-bubble method

DFA detects regime shifts in the correlation structure of returns. Before a crash, markets transition from noisy mean-reverting behavior to strongly persistent trending. This transition is invisible to tail estimators like Hill or kappa, which measure distributional shape. DFA measures temporal dependence.

The detrending step gives DFA an edge over Hurst's R/S analysis (82% recall vs 59%) because raw R/S conflates local trends with long-range dependence. DFA strips out the local trends and measures the residual scaling. DFA's 82% recall is high but its 22% precision means it also fires in many non-crash periods — persistent dynamics are common in financial markets. DFA is particularly strong on small and medium drawdowns (86% and 88% recall), where tail-based methods struggle because the distributional shift is subtle.
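The mechanics of DFA fit in a short function. A minimal sketch, assuming daily log-returns as input; the scale grid is an arbitrary choice and this is not fatcrash's implementation:

```python
import numpy as np

def dfa_exponent(returns, scales=(16, 32, 64, 128, 256)):
    """Detrended fluctuation analysis: slope of log F(s) against log s.
    alpha near 0.5 is noise-like; alpha above 0.5 is persistent trending."""
    y = np.cumsum(np.asarray(returns, dtype=float) - np.mean(returns))  # profile
    flucts = []
    for s in scales:
        n_seg = len(y) // s
        segments = y[:n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        resid = [seg - np.polyval(np.polyfit(t, seg, 1), t)   # remove local trend
                 for seg in segments]
        flucts.append(np.sqrt(np.mean(np.concatenate(resid) ** 2)))
    slope, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.standard_normal(4_096)             # uncorrelated: alpha near 0.5
walk = np.cumsum(rng.standard_normal(4_096))   # integrated noise: alpha near 1.5
```

Feeding it uncorrelated noise recovers an exponent near 0.5, while an integrated (trending) series pushes the exponent well above 1, which is the shift toward persistence described above.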
### Combined detector For the combined detector, signals are grouped into four independent categories: **bubble** (LPPLS, GSADF, plus 2 neural network variants), **tail** (Hill, Pickands, DEH, QQ, Max-to-Sum, Taleb Kappa, Max-Stability Kappa, GPD VaR), **regime** (DFA, Hurst, Spectral, momentum reversal), and **structure** (multiscale agreement across daily, 3-day, and weekly frequencies, LPPLS critical time proximity, and price velocity). When three or more categories independently signal elevated risk, the combined detector applies a +15% bonus to the crash probability. | | Small (<15%) | Medium (15-30%) | Major (>30%) | Overall | |--|:---:|:---:|:---:|:---:| | Combined (agreement bonus) | 64% | 94% | 75% | **79%** | The combined detector reaches 79% overall, with 94% on medium-sized drawdowns. The agreement requirement filters out most of LPPLS's false positives while retaining most of its true positives. The gap between small (64%) and medium/major (94%/75%) drawdowns reflects the fact that small corrections often happen without prior tail thickening or regime change. They are genuine surprises, and no method should be expected to predict all of them. ## Long timescales ### 54 years of forex (GBP/USD) We tested on GBP/USD daily data from 1971 to 2025 (13,791 trading days): | Decade | Hill alpha | Kappa/benchmark | Notable | |--------|:---------:|:---------------:|---------| | 1970s | 2.92 | 0.78 | Oil crises, IMF bailout | | 1980s | 4.36 | 0.68 | Plaza Accord | | 1990s | 4.51 | 0.83 | Black Wednesday | | 2000s | 2.90 | 0.57 | 2008 crisis | | 2010s | 3.86 | 0.35 | Brexit | | 2020s | 3.39 | 0.71 | Truss mini-budget | Every decade shows fat tails. In our labeling, all six GBP/USD crisis events were detected (6/6): - 1976 IMF Crisis - 1985 Plaza Accord - 1992 Black Wednesday - 2008 Financial Crisis - 2016 Brexit Vote - 2022 Truss Mini-Budget ### All methods on daily forex (1971-2025) We ran all methods on 23 currency pairs from FRED daily data. 
The table shows the key estimators from each category — tail (Hill, QQ, DEH), regime (Hurst, DFA), and bubble (GSADF): | Pair | Hill $\alpha$ | QQ $\alpha$ | DEH $\gamma$ | Hurst $H$ | DFA $\alpha$ | GSADF | |------|:---:|:---:|:---:|:---:|:---:|:---:| | VEF/USD | 1.20 | 0.82 | 1.06 | 0.53 | 0.82 | bubble | | HKD/USD | 1.73 | 2.12 | 0.24 | 0.54 | 0.62 | bubble | | KRW/USD | 1.90 | 1.93 | 0.44 | 0.67 | 0.60 | bubble | | MXN/USD | 2.04 | 1.98 | 0.44 | 0.56 | 0.57 | bubble | | LKR/USD | 2.14 | 1.97 | 0.51 | 0.58 | 0.66 | bubble | | TWD/USD | 2.31 | 2.62 | 0.21 | 0.60 | 0.63 | bubble | | THB/USD | 2.38 | 2.43 | 0.33 | 0.58 | 0.59 | bubble | | MYR/USD | 2.42 | 2.46 | 0.33 | 0.58 | 0.60 | bubble | | AUD/USD | 2.58 | 2.30 | 0.44 | 0.56 | 0.56 | bubble | | INR/USD | 2.62 | 2.56 | 0.34 | 0.57 | 0.58 | bubble | | CNY/USD | 2.79 | 1.70 | 0.69 | 0.59 | 0.71 | bubble | | BRL/USD | 2.80 | 3.12 | 0.15 | 0.56 | 0.58 | bubble | | NZD/USD | 2.89 | 2.46 | 0.44 | 0.57 | 0.56 | — | | ZAR/USD | 3.19 | 3.43 | 0.15 | 0.58 | 0.54 | bubble | | NOK/USD | 3.39 | 3.44 | 0.22 | 0.57 | 0.53 | bubble | | SEK/USD | 3.50 | 2.88 | 0.41 | 0.58 | 0.55 | — | | SGD/USD | 3.59 | 3.66 | 0.18 | 0.56 | 0.53 | bubble | | CHF/USD | 3.81 | 3.59 | 0.28 | 0.57 | 0.54 | — | | CAD/USD | 3.84 | 3.58 | 0.27 | 0.57 | 0.53 | bubble | | DKK/USD | 3.84 | 3.23 | 0.37 | 0.58 | 0.55 | — | | JPY/USD | 3.94 | 4.02 | 0.18 | 0.58 | 0.58 | bubble | | GBP/USD | 4.13 | 4.11 | 0.19 | 0.58 | 0.55 | bubble | | EUR/USD | 4.88 | 4.90 | 0.12 | 0.56 | 0.54 | — | All 23 pairs show fat tails: DEH $\gamma > 0$ for 23/23, and two independent tail index estimators converge (mean Hill $\alpha$ = 2.95, mean QQ $\alpha$ = 2.84). All 23 show persistent dynamics: Hurst $H > 0.5$ and DFA $\alpha > 0.5$ for every pair. GSADF detected explosive episodes in 18 of 23 pairs. VEF/USD (Venezuela) is the extreme case: Hill $\alpha$ = 1.20, QQ $\alpha$ = 0.82, DEH $\gamma$ = 1.06 — every estimator confirms infinite variance. 
At the other end, EUR/USD has the thinnest tails (Hill $\alpha$ = 4.88) but is still fat-tailed by any standard. KRW/USD has the fattest tails among liquid pairs (Hill $\alpha$ = 1.90). CNY/USD shows the strongest persistence (DFA = 0.71), consistent with managed float dynamics. ### 500 years, 138 countries Using the [Clio Infra exchange rate dataset](https://clio-infra.eu/) (1500-2013), we ran all tail and regime methods on every country with 50+ years of data. Results: | Tail regime | Countries | Percentage | |------------|:---------:|:----------:| | $\alpha < 2$ (infinite variance) | 98 | **71%** | | $\alpha$ 2-4 (fat tails, finite variance) | 37 | 27% | | $\alpha > 4$ (moderate tails) | 3 | 2% | **71% of countries have exchange rate distributions with infinite variance.** The median $\alpha$ across all 138 countries is 1.57. GEV confirms this: 81% of countries show Frechet-type (fat) tails with median $\xi = 0.76$. The most extreme cases: Returns in this table are log-returns, which can exceed -100% or +100%. A log-return of -2,748% means the currency lost virtually all its value (the price ratio $e^{-27.48} \approx 0$). This convention is standard in fat-tail analysis because log-returns are additive across time periods. | Country | Years of data | Hill $\alpha$ | Taleb $\kappa$ | Worst year | Best year | |---------|:---:|:---:|:---:|:---:|:---:| | Syria | 61 | 0.32 | — | -2% | +105% | | Iraq | 61 | 0.40 | — | -38% | +883% | | Germany | 153 | 0.52 | 1.00 | **-2,748%** | +2,104% | | Nicaragua | 76 | 0.52 | — | -2,159% | +787% | | Zimbabwe | 56 | 0.55 | — | -12% | +1,345% | | Hungary | 66 | 0.56 | — | -944% | +247% | | Peru | 64 | 0.72 | — | -1,660% | +426% | | Bolivia | 63 | 0.76 | — | -1,794% | +494% | | Brazil | 129 | 1.03 | — | **-3,536%** | +318% | | Argentina | 102 | 1.28 | 1.00 | -2,748% | +388% | Germany's -2,748% in a single year is the Weimar hyperinflation. Brazil's -3,536% reflects the cruzeiro collapse. These aren't outliers. 
They're exactly what a distribution with $\alpha < 1$ predicts. Running all classical methods on the top 30 countries by data length confirms that fat tails and persistence go hand in hand: | Country | Years | Hill $\alpha$ | Hurst $H$ | Taleb $\kappa$ | Verdict | |---------|:---:|:---:|:---:|:---:|---------| | Germany | 153 | 0.52 | 0.56 | 1.00 | extreme, persistent | | Austria | 104 | 0.63 | 0.61 | 1.00 | extreme, persistent | | Belgium | 114 | 0.89 | 0.64 | 0.86 | extreme, persistent | | Finland | 100 | 0.94 | 0.58 | 0.43 | extreme, persistent | | Italy | 95 | 0.77 | 0.80 | 0.95 | extreme, persistent | | Portugal | 88 | 0.98 | 0.85 | 1.00 | extreme, persistent | | Greece | 87 | 0.77 | 0.76 | 0.81 | extreme, persistent | | Argentina | 102 | 1.28 | 0.71 | 1.00 | extreme, persistent | | Mexico | 113 | 1.06 | 0.70 | 0.92 | extreme, persistent | | UK | 223 | 2.42 | 0.47 | 0.04 | fat-tailed | | Canada | 100 | 3.70 | 0.50 | 0.00 | fat-tailed | Of the top 30 countries, 19 have $\alpha < 2$ (infinite variance), 25 have Hurst $H > 0.5$ (persistent dynamics), and 28 have QQ $\alpha < 4$ (heavy tails confirmed by multiple estimators). Germany, Austria, Argentina, and Portugal saturate at Taleb $\kappa = 1.0$ — Cauchy-like behavior where the CLT does not operate at any practical sample size. Italy ($H$ = 0.80) and Portugal ($H$ = 0.85) show the strongest persistence over century-scale data. #### Century-by-century: United Kingdom (1789-2013) The UK has 224 years of continuous exchange rate data: | Century | Hill $\alpha$ | Regime | |---------|:---:|-----------| | 1800s | 1.19 | Infinite variance (Napoleonic wars, banking crises) | | 1900s | 3.17 | Fat but finite (Bretton Woods stability) | | 2000s | 2.04 | Back to borderline infinite variance | Even within a single country, tail regimes shift across centuries. 
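A comparison like the century table above reduces to running the Hill estimator on each subperiod. A minimal sketch on synthetic Pareto data (the true tail indices of 4 and 1.2 are invented for illustration; `k` uses a common top-5% rule of thumb):

```python
import numpy as np

def hill_alpha(x, k=None):
    """Hill tail-index estimate from the k largest absolute values."""
    x = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]   # descending order stats
    if k is None:
        k = max(10, len(x) // 20)                           # top 5% of the sample
    return k / np.sum(np.log(x[:k] / x[k]))                 # inverse mean log-spacing

rng = np.random.default_rng(0)
stable_era = 1 + rng.pareto(4.0, 25_000)   # exact Pareto, true alpha = 4
crisis_era = 1 + rng.pareto(1.2, 25_000)   # exact Pareto, true alpha = 1.2
# hill_alpha(stable_era) recovers ~4; hill_alpha(crisis_era) recovers ~1.2
```

Splitting a long price history into eras and comparing the estimates per era is all the century-by-century table is doing, with real returns in place of the synthetic draws.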
### Inflation: 500 years, 82 countries Inflation data from Clio Infra (1500-2010): | Statistic | Value | |-----------|-------| | Countries analyzed | 82 | | Countries with hyperinflation (>100%/yr) | **32** (39%) | | Countries with $\alpha < 2$ | **36** (44%) | | Median $\alpha$ | 2.14 | The most extreme inflation tails: | Country | Years | Hill $\alpha$ | Max inflation | |---------|:-----:|:---:|:---:| | Nicaragua | 71 | 0.30 | 13,110%/yr | | Zimbabwe | 83 | 0.44 | 24,411%/yr | | Germany | **494** | 0.57 | **211,427,400,000%/yr** | | Brazil | 226 | 0.63 | 2,948%/yr | | Peru | 363 | 0.80 | 7,482%/yr | | Argentina | 274 | 0.85 | 3,079%/yr | | China | 336 | 0.86 | 1,579%/yr | | Poland | 414 | 0.97 | 4,738%/yr | Germany has 494 years of inflation data with $\alpha = 0.57$. Its maximum annual inflation was 211 billion percent (Weimar 1923). With $\alpha < 1$, neither the mean nor the variance of this distribution converges. You cannot compute a confidence interval. You cannot build a VaR model. The standard toolkit breaks down. ## Extended validation: 96 crash windows across forex and equities The original 39-drawdown evaluation used only BTC, SPY, and Gold. To test whether the results generalize, we extended the dataset with 23 FRED daily forex pairs (1971-2025) and 6 equity files covering the 2008, 2020, and 2022 crises. This yielded 96 total crash windows and 631 non-crash windows. LPPLS recall held at 89% on the extended dataset and 90% combined. The precision and F1 patterns remained stable: LPPLS leads on recall, GSADF leads on precision, DFA is the best non-bubble method. The combined detector's agreement bonus continues to filter false positives effectively. The forex pairs confirmed that fat tails and persistence are universal: Hill $\alpha < 4$ for all 23 pairs, Hurst $H > 0.5$ for all 23. EM currency pairs (BRL, MXN, KRW, ZAR) show the fattest tails, as expected from their crisis histories. 
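Universality claims like these are cheap to spot-check with the Maximum-to-Sum ratio described earlier. A minimal sketch, using Gaussian and Cauchy samples as stand-ins for a thin-tailed and a fat-tailed series:

```python
import numpy as np

def max_to_sum(x):
    """Fraction of total absolute mass contributed by the largest observation."""
    x = np.abs(np.asarray(x, dtype=float))
    return x.max() / x.sum()

rng = np.random.default_rng(0)
thin = rng.standard_normal(50_000)    # finite moments: ratio decays toward zero
fat = rng.standard_cauchy(50_000)     # infinite mean: one draw keeps dominating

r_thin, r_fat = max_to_sum(thin), max_to_sum(fat)
# r_fat stays orders of magnitude above r_thin as the sample grows
```

On real return series the same one-liner makes "fat tails are universal" a checkable statement rather than a slogan.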
## Beyond market prices: detecting problems in revenue and profit data These methods were built for market prices, but most transfer to any time series where you need to detect structural problems — company revenue, profit margins, unit economics, or any financial metric that changes over time. The key distinction: **market prices** reflect collective speculative behavior (herding, positive feedback loops), while **revenue and profit** reflect real economic activity (customer demand, operational execution, competitive dynamics). ### What transfers to company data **Tail estimation (Hill, DEH, QQ, Pickands, Kappa, Max-to-Sum, GPD/GEV)** — Yes. Revenue growth rates have fat tails. [Gabaix (2011)](https://academic.oup.com/qje/article/126/1/185/1903368) showed that idiosyncratic firm-level shocks drive aggregate fluctuations precisely because firm-size distributions are fat-tailed. A company with Hill $\alpha < 2$ on quarterly revenue growth has a distribution where a single catastrophic quarter can dominate the entire history. EVT gives you calibrated worst-case scenarios: fit GPD to the worst quarterly declines for a valid tail risk estimate. **Persistence detection (DFA, Hurst, Spectral)** — Yes. Revenue series often show strong persistence ($H > 0.5$) due to contracts, recurring revenue, and customer stickiness. A shift from persistent ($H > 0.5$) to anti-persistent ($H < 0.5$) could signal fundamental deterioration — the business is losing its growth momentum. DFA handles the non-stationarity inherent in growing companies better than raw Hurst. **GSADF** — Partially. It detects unsustainable exponential growth. Applied to revenue, it could flag "growth bubbles" — growth rates that would require capturing 100% of the addressable market to sustain. Useful for evaluating whether a company's growth trajectory is explosive (and therefore unsustainable) or merely strong. **Momentum and velocity** — Partially. 
Revenue momentum (trailing growth rates) is meaningful for company analysis. A reversal in revenue momentum — strong long-term growth suddenly decelerating — is a classic warning sign. Price velocity (volatility acceleration) is less directly applicable to revenue data, which doesn't exhibit the forced-liquidation cascades it was designed to detect.

**LPPLS and neural network methods (M-LNN, P-LNN)** — No. These model speculative bubble dynamics: herding, log-periodic oscillations, reflexive feedback loops. Revenue doesn't exhibit these patterns. It's driven by real economic activity, not reflexive speculation. Don't apply LPPLS to your quarterly revenue.

### Practical example

For a company's quarterly revenue time series:

```python
import numpy as np
from fatcrash._core import hill_estimator, dfa_exponent, hurst_exponent

# quarterly_revenue: your 1-D array of revenue levels, oldest first
growth = np.diff(np.log(quarterly_revenue))  # quarterly revenue growth rates

hill_estimator(growth)   # Tail index — are revenue shocks fat-tailed?
dfa_exponent(growth)     # Persistence — is growth momentum persistent or fading?
hurst_exponent(growth)   # Same question, different method (cross-check)
```

The practical challenge: quarterly data gives roughly 80 observations over 20 years (vs. 5,000+ daily prices). Tail estimators need at least 100 data points to be reliable. Use monthly revenue or longer history when possible. For shorter series, DFA and Hurst are more robust than Hill because they measure temporal structure rather than distributional shape.

**What to watch for:**

- **Hill $\alpha$ dropping below 3**: Revenue shocks are getting more extreme. The distribution is shifting toward heavier tails.
- **DFA shifting from > 0.5 to < 0.5**: Growth momentum is breaking down. Revenue used to be self-reinforcing; now it's mean-reverting.
- **Max-to-Sum ratio rising**: A single quarter is starting to dominate the entire history — either a massive win or a massive loss.
- **GPD VaR spiking**: The worst-case quarterly decline is getting worse, even accounting for the fat tails. These methods won't tell you *why* revenue is deteriorating — you still need business context for that. But they can tell you *that* something structural has changed in the data before it becomes obvious in the headline numbers. ## Conclusions 1. **Fat tails are universal.** Every asset, every timescale, every country. From daily BTC returns to 500 years of exchange rates. 71% of countries have exchange rate distributions with infinite variance. That is the default state of financial markets, not an anomaly. 2. **LPPLS has the best F1 score (50%)** because it detects bubble *structure*, not tail *statistics*. With tightened filters (Nielsen omega [6,13], tc constraint), it achieves 74% recall and 37% precision. The LPPLS confidence indicator trades precision (29%) for recall (90%). Both have substantial false positive rates — bubble signatures appear during normal bull markets too. 3. **DFA is the best non-bubble method (82% recall, F1=34%).** It detects regime shifts in temporal dependence, not distributional shape. The detrending step makes it robust to non-stationarity where simpler methods like Hurst (59% recall) are confused by local trends. Low precision (22%) reflects that persistent dynamics are common in financial markets. 4. **Tail-based methods have moderate recall but low precision.** Kappa and Pickands (49% recall each), DEH (46%), Hill (28%). These methods detect distributional regime shifts that are pervasive in financial data, not crash-specific patterns. They are most valuable as ensemble components, not standalone detectors. 5. **The combined detector reaches 79% recall.** When bubble, tail, regime, and structural methods independently agree, the signal is more reliable. The agreement requirement filters most false positives while preserving 94% detection on medium-sized drawdowns. 
Momentum reversal and price velocity add structural signals that capture crash dynamics invisible to distributional methods — trend breaks and forced-liquidation cascades, respectively. No single method is sufficient. These results were validated on an extended dataset of 96 crash windows across 23 forex pairs and equity crises, with LPPLS recall stable at 89-90%. 6. **Standard risk models are wrong for most countries.** [Modern Portfolio Theory](https://en.wikipedia.org/wiki/Modern_portfolio_theory), [CAPM](https://en.wikipedia.org/wiki/Capital_asset_pricing_model), [Black-Scholes](https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model): all assume finite variance. For 71% of the world's currencies, this assumption is empirically false. The math doesn't just give wrong answers; it gives answers to the wrong question. 7. **Hyperinflation isn't rare.** 39% of countries experienced >100% annual inflation at some point. Germany's 211 billion percent is extreme, but dozens of countries experienced four- and five-digit inflation. Any model that treats these as "outliers" is a model that doesn't understand the data it's modeling. 8. **Most methods transfer beyond market prices.** Tail estimation, persistence detection, momentum, and EVT work on any time series — revenue, profit, unit economics. LPPLS doesn't transfer (it models speculative dynamics), but Hill, DFA, Hurst, momentum reversal, and GPD work on company-level data. The challenge is sample size: quarterly data gives ~80 observations vs. 5,000+ daily prices. Use monthly data when possible. Beyond crash detection, fatcrash includes two portfolio-level tools. **Constant volatility targeting** ([Hallerbach, 2012](https://doi.org/10.2139/ssrn.2042750)) sizes positions inversely to realized volatility: when vol spikes, reduce exposure; when vol is low, increase it. This is the Bouchaud-school response to crash risk — don't predict crashes, just mechanically reduce exposure when the market gets rough. 
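A minimal sketch of that idea. The target level, lookback, and leverage cap here are illustrative assumptions, not fatcrash's actual defaults:

```python
import numpy as np

def vol_target_weights(returns, target_vol=0.10, lookback=20, max_leverage=2.0):
    """Daily position weight = target vol / trailing realized vol, capped."""
    r = np.asarray(returns, dtype=float)
    ann = np.sqrt(252)                      # annualize daily volatility
    w = np.full(len(r), 1.0)                # full weight until history accrues
    for t in range(lookback, len(r)):
        realized = r[t - lookback:t].std() * ann
        if realized > 0:
            w[t] = min(target_vol / realized, max_leverage)
    return w

rng = np.random.default_rng(0)
calm = rng.normal(0.0, 0.005, 250)     # roughly 8% annualized volatility
rough = rng.normal(0.0, 0.03, 250)     # roughly 48% annualized volatility
weights = vol_target_weights(np.concatenate([calm, rough]))
# exposure drops mechanically once the rough regime enters the lookback window
```

When volatility spikes, the weight falls roughly in proportion, which is the "reduce exposure when the market gets rough" behavior described above, with no crash prediction anywhere in the loop.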
The **rebalance risk signal** ([Rattray, Granger, Harvey, and Van Hemert, 2020](https://doi.org/10.3905/jpm.2020.1.131)) addresses a subtler problem: mechanical rebalancing (buying stocks after they fall to maintain a target allocation) is negative convexity. During persistent drawdowns where DFA shows trending dynamics and momentum is negative, "buying the dip" amplifies losses. The signal combines DFA persistence with momentum direction to warn when rebalancing is dangerous.

The code is open source: [github.com/unbalancedparentheses/fatcrash](https://github.com/unbalancedparentheses/fatcrash). The forex data comes from [forex-centuries](https://github.com/unbalancedparentheses/forex-centuries).

## References

- Jegadeesh, N. and Titman, S. (1993). [*Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency*](https://doi.org/10.1111/j.1540-6261.1993.tb04702.x). Journal of Finance, 48(1).
- Scowcroft, A. and Sefton, J. (2005). [*Understanding Momentum*](https://doi.org/10.1057/palgrave.jam.2240167). Journal of Asset Management, 6(3).
- Hallerbach, W. (2012). [*A Proof of the Optimality of Volatility Weighting over Time*](https://doi.org/10.2139/ssrn.2042750). Journal of Investment Strategies, 1(4).
- Rattray, S., Granger, N., Harvey, C. R., and Van Hemert, O. (2020). [*Strategic Rebalancing*](https://doi.org/10.3905/jpm.2020.1.131). Journal of Portfolio Management, 46(6).
- Jordà, Ò., Schularick, M. and Taylor, A. M. (2019). [*The Rate of Return on Everything, 1870-2015*](https://doi.org/10.1093/qje/qjy004). Quarterly Journal of Economics, 134(3).
- Nielsen, M. and Sornette, D. (2024). [*Deep LPPLS*](https://arxiv.org/abs/2405.12803). arXiv:2405.12803.

## Disclaimer

This article is for educational and research purposes only. Nothing here constitutes financial advice, trading recommendations, or an invitation to buy or sell any asset.
The detection rates reported are retrospective, based on labeled historical events, and should not be interpreted as predictive accuracy for future markets. Always consult a qualified financial advisor before making investment decisions.

---

### Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal

*Published: 2026-02-12*

> We assembled forex-centuries, an open dataset of exchange rates, gold, silver, interest rates, commodity prices, GDP, sovereign debt, and more — 27 sources spanning 1 CE to 2026, covering 240 countries. Fat tails are universal. Pegged currencies are the most dangerous. Every currency loses against gold.

URL: https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/

In 1252, Florence minted the gold florin. Within decades it became the dominant trade currency of medieval Europe. Merchants in Bruges, Venice, and Constantinople quoted prices against it. By the 1400s, the florin's dominance had faded, replaced by the Venetian ducat. Then the Spanish real. Then the Dutch guilder. Then sterling. Then the dollar. Each transition involved devaluations, defaults, and crises that ruined anyone holding the wrong currency at the wrong time.

We have data on all of this. Not estimates. Actual recorded exchange rates, starting from 1106. And not just exchange rates: gold and silver prices from 1257, interest rates from 1311, commodity prices from 1260, GDP per capita from the year 1 CE, sovereign debt ratios from 1800, and crisis indices covering two centuries of banking panics, currency collapses, and sovereign defaults.

I assembled [forex-centuries](https://github.com/unbalancedparentheses/forex-centuries), the most comprehensive open-source collection of long-run financial and economic data available. 27 sources, 1,100+ files, ~240 countries, spanning twenty centuries.
Exchange rates, precious metals, interest rates, commodity prices, inflation, GDP, real wages, sovereign debt, regime classifications, and real effective exchange rates — all in one repository with an automated build pipeline, weekly CI updates, and reproducible analysis.

No other free repository combines this breadth of asset classes across this depth of history. The only comparable product is [Global Financial Data](https://globalfinancialdata.com/) (commercial, institutional pricing). The goal: provide the raw material for studying how currencies and financial systems behave over centuries, not decades.

## The dataset

### Exchange rates

| Source | Period | Description |
|--------|--------|-------------|
| [MEMDB Spufford](https://memdb.libraries.rutgers.edu/spufford-currency) | 1106-1500 | 13,197 medieval exchange quotations across 521 places |
| [MEMDB Metz](https://memdb.libraries.rutgers.edu/metz-currency) | 1350-1800 | 50,559 records from the Lower Rhine region |
| [Clio Infra](https://clio-infra.eu/) | 1500-2016 | Exchange rates vs GBP and USD, inflation, bonds, debt, GDP, real wages |
| [CFS Historical Financial Statistics](https://centerforfinancialstability.org/hfs.php) | ~1500-1950 | Official and market exchange rates, interest rates, money supply, central bank balance sheets |
| [MeasuringWorth](https://www.measuringworth.com/datasets/exchangeglobal/) | 1791-2025 | 41 currencies vs USD, gold (5 series), interest rates, CPI |
| [Bank of England](https://www.bankofengland.co.uk/statistics/research-datasets) | 1791-2016 | UK millennium dataset (90+ sheets) |
| [JST Macrohistory](https://www.macrohistory.net/database/) | 1870-2017 | 18 advanced economies, 59 macro variables |
| [Sveriges Riksbank](https://www.riksbank.se/en-gb/statistics/interest-rates-and-exchange-rates/) | 1900-2026 | 53 SEK bilateral series, 295,000 observations |
| [Penn World Table](https://www.rug.nl/ggdc/productivity/pwt/) | 1950-2023 | 185 countries, exchange rates and PPP |
| [IMF IFS](https://data.imf.org/ifs) | 1955-2025 | 173 currencies, monthly, 158,000 observations |
| [BIS](https://data.bis.org/topics/EER) | 1957-2026 | Bilateral and effective rates, ~190 economies, 2.66 million rows |
| [World Bank FX](https://data.worldbank.org/indicator/PA.NUS.FCRF) | 1960-present | Official rates, all member countries |
| [Bruegel/Darvas REER](https://www.bruegel.org/publications/datasets/real-effective-exchange-rates-for-178-countries-a-new-database) | 1960s-2026 | Real effective exchange rates, 178 countries, monthly |
| [Global Macro Database](https://www.globalmacrodata.com/) | 1960-2024 | 243 countries, USD FX and REER |
| [FRED Daily](https://fred.stlouisfed.org/categories/94) | 1971-2025 | 23 daily currency pairs and 2 USD indices |

### Precious metals

| Source | Period | Description |
|--------|--------|-------------|
| [MeasuringWorth](https://www.measuringworth.com/datasets/gold/) | 1257-2025 | Gold prices (5 series: British official, London, US, New York, gold/silver ratio) |
| [FreeGoldAPI](https://freegoldapi.com/) | 1258-2025 | 768-year gold, silver prices, and gold/silver ratio |
| [LBMA](https://www.lbma.org.uk/prices-and-data/precious-metal-prices) | 1968-2025 | Daily gold and silver in USD, GBP, EUR |
| [DataHub Gold](https://github.com/datasets/gold-prices) | 1833-2025 | Monthly gold prices USD |

### Interest rates

| Source | Period | Description |
|--------|--------|-------------|
| [Schmelzing (BoE)](https://www.bankofengland.co.uk/working-paper/2020/eight-centuries-of-global-real-interest-rates-r-g-and-the-suprasecular-decline-1311-2018) | 1311-2018 | Real interest rates for 8 countries (Italy, UK, Netherlands, Germany, France, Spain, Japan, US) |
| [MeasuringWorth](https://www.measuringworth.com/datasets/) | 1729-2025 | UK and US nominal short and long term rates |

### Commodity prices

| Source | Period | Description |
|--------|--------|-------------|
| [Allen-Unger GCPD](https://datasets.iisg.amsterdam/dataset.xhtml?persistentId=hdl:10622/3SV0BO) | 1260-1914 | 973 commodity price series (wheat, rye, silver, coal, spices) across European and Asian cities |
| [World Bank Pink Sheet](https://www.worldbank.org/en/research/commodity-markets) | 1960-present | ~70 commodity prices (oil, metals, agriculture), monthly and annual |

### GDP, inflation, wages, and debt

| Source | Period | Description |
|--------|--------|-------------|
| [Maddison Project](https://www.rug.nl/ggdc/historicaldevelopment/maddison/) | 1 CE-2022 | GDP per capita, 178 countries |
| [Riksbank Historical Monetary Statistics](https://www.riksbank.se/en-gb/statistics/historical-monetary-statistics-of-sweden/) | 1277-2020 | Swedish FX, CPI, wages, GDP, money supply, stocks, bonds (3 volumes, 13 files) |
| [IMF HPDD](https://data.imf.org/) | 1800-2015 | Sovereign debt-to-GDP, 191 countries |
| [Reinhart-Rogoff](https://carmenreinhart.com/data/) | 1800-2016 | Debt/GDP, inflation, crisis indices, gold standard dates, regime classifications |

### Regime classifications and crises

| Source | Period | Description |
|--------|--------|-------------|
| [IRR Regimes](https://www.ilzetzki.com/irr-data) | 1940-2021 | De facto exchange rate regime classifications, ~190 countries |
| [Reinhart-Rogoff](https://carmenreinhart.com/data/) | 1800-2016 | Banking, currency, and debt crisis indices; capital control indicators |

The longest individual exchange rate series is the United States at 526 years (1500-2025). The United Kingdom has 236 years of continuous data (1789-2025). The medieval data covers exchange quotations across Florence, Bruges, Venice, London, and hundreds of other cities across Europe, Byzantium, and North Africa. For GDP per capita, the Maddison Project traces 178 countries back to 1 CE. For interest rates, Schmelzing's dataset provides 707 years of real rates across eight countries.
For commodity prices, the Allen-Unger database contains 973 individual series spanning the period when Europe transitioned from medieval to modern market economies.

All derived exchange rate data uses a consistent quoting convention: foreign currency per 1 USD. The build pipeline normalizes, deduplicates, and cross-validates across sources. Where sources overlap, [MeasuringWorth](https://www.measuringworth.com/datasets/exchangeglobal/) takes priority (carefully curated for continuity), then [Clio Infra](https://clio-infra.eu/), then the [Global Macro Database](https://www.globalmacrodata.com/).

## What fat tails mean

Before diving into the results, a brief primer on the key concepts. If you have read the companion article on [crash detection](/articles/detecting-crashes-with-fat-tail-statistics/), you can skip this section.

### Log returns

Throughout this article, all return calculations use **log returns** (also called continuously compounded returns). If $P_t$ is the exchange rate at time $t$ and $P_{t-1}$ is the rate at the previous period, the log return is:

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right)$$

Log returns are the standard choice for financial analysis because they are additive over time (you can sum daily log returns to get the yearly return) and symmetric (a +10% followed by a -10% move doesn't leave you back where you started in simple percentage terms, but log returns account for this correctly).

### Excess kurtosis

**Kurtosis** measures how much of a distribution's weight sits in the tails versus near the center. A normal (Gaussian) distribution has a kurtosis of 3. **Excess kurtosis** subtracts that baseline:

$$\text{Excess kurtosis} = \frac{\mathbb{E}\left[(X - \mu)^4\right]}{\left(\mathbb{E}\left[(X - \mu)^2\right]\right)^2} - 3$$

where $\mu$ is the mean and $\mathbb{E}$ denotes the expected value. If excess kurtosis is 0, the distribution has Gaussian-like tails.
If it is positive, the tails are heavier: extreme events happen more often than a bell curve predicts. An excess kurtosis of 5 means the extreme observations are far more frequent than normal. An excess kurtosis of 4,110, as we will see for Sri Lanka, means the distribution looks nothing like a Gaussian.

### Tail index and power laws

Many financial distributions follow a **power law** in the tails. The probability of seeing a value larger than $x$ falls as:

$$P(X > x) \sim x^{-\alpha}$$

The exponent $\alpha$ is the **tail index**. It tells you how heavy the tail is:

- $\alpha < 2$: **Infinite variance.** The distribution is so fat-tailed that the variance does not converge no matter how many observations you collect. Computing a standard deviation is meaningless because the number you get depends on your sample size.
- $\alpha$ between 2 and 4: **Finite variance but infinite kurtosis.** Fat tails, but the standard deviation at least converges. This is where many financial assets live.
- $\alpha > 4$: **Moderate tails.** Still fatter than Gaussian, but standard tools become somewhat reasonable.

We estimate $\alpha$ using the **Hill estimator**. Sort the $n$ observations in descending order, take the $k$ largest, and compute:

$$\hat{\alpha} = k \left( \sum_{i=1}^{k} \ln \frac{X_{(i)}}{X_{(k)}} \right)^{-1}$$

where $X_{(1)} \geq X_{(2)} \geq \ldots$ are the order statistics. The choice of $k$ matters: too small and the estimate is noisy; too large and you include observations from the body of the distribution. A Hill plot ($\alpha$ versus $k$) helps find the stable region.

### Annualized volatility

**Annualized volatility** is the standard deviation of returns scaled to one year. If $\sigma_d$ is the standard deviation of daily log returns, the annualized figure is:

$$\sigma_{\text{ann}} = \sigma_d \times \sqrt{252}$$

where 252 is the approximate number of trading days per year.
This gives a single number summarizing "how much does this currency move in a typical year." EUR/USD at 10% annualized volatility means a one-standard-deviation annual move is about 10 cents on the dollar. Venezuela at 380% means the currency can lose most of its value in a single year.

### 3-sigma tail ratio

The **3-sigma tail ratio** compares how often extreme events actually occur versus how often a Gaussian would predict. Under a normal distribution, returns beyond 3 standard deviations from the mean should happen about 0.27% of the time. If the actual frequency is 1.08%, the tail ratio is 4x. A ratio above 1 means fatter tails than normal; a ratio below 1 means thinner tails (which can happen in pegged currencies that suppress small moves but concentrate risk in huge jumps).

## Daily data (1971-2025)

The central finding from running [fatcrash](https://github.com/unbalancedparentheses/fatcrash) on this data: **fat tails are universal across all currencies, all time scales, and all centuries.**

Every one of the 23 [FRED](https://fred.stlouisfed.org/categories/94) daily currency pairs has heavier tails than a Gaussian distribution. For most floating pairs, three-sigma events happen about 3 to 6 times more often than a normal distribution predicts. Pegged and tightly managed pairs are a special case: they can show tail ratios below 1 even with extreme kurtosis because daily returns are compressed near zero. Even EUR/USD, the most liquid pair in the world, has excess kurtosis of 2.5 and 4x too many tail events.

The most extreme daily tails:

| Currency | Ann. Vol | Excess Kurtosis | 3-sigma Tail Ratio |
|----------|:--------:|:---------------:|:------------------:|
| LKR/USD | 11.7% | 4,110 | 2.98x |
| CNY/USD | 8.2% | 3,846 | 0.53x |
| VEF/USD | 380.7% | 2,560 | 0.28x |
| THB/USD | 8.7% | 279 | 5.32x |
| HKD/USD | 3.2% | 261 | 5.09x |
| KRW/USD | 10.8% | 140 | 4.37x |

Sri Lanka (LKR) has an excess kurtosis of 4,110.
For context, a Gaussian distribution has excess kurtosis of 0. The Student's t-distribution with 3 degrees of freedom, often used as a "fat-tailed alternative" in finance, has excess kurtosis of infinity (undefined). LKR's distribution is empirically closer to the Student's t than to the Gaussian.

Note the paradox in the tail ratio column: CNY and VEF have enormous kurtosis but tail ratios *below* 1. This happens because these currencies are pegged or managed. Their daily returns are almost always exactly zero (the peg holds), so the standard deviation is very small. The rare days when the peg breaks produce moves that are extreme in absolute terms but may not exceed 3 of those tiny standard deviations as often as you would expect, because the distribution is a spike at zero with a few catastrophic outliers rather than a smooth bell curve. The kurtosis captures this shape; the simple tail ratio can miss it.

## Yearly data (1791-2025)

On yearly timescales, the tails are even more extreme. Using [MeasuringWorth](https://www.measuringworth.com/datasets/exchangeglobal/) data:

| Country | Years | Excess Kurtosis | Worst Year |
|---------|:-----:|:---------------:|:----------:|
| Mexico | 234 | 83.7 | -86% |
| Austria | 234 | 49.9 | -100% |
| Israel | 234 | 46.7 | -100% |
| Germany | 234 | 37.8 | -100% |
| Peru | 234 | 36.2 | -100% |
| Argentina | 234 | 19.9 | -100% |
| Brazil | 234 | 13.3 | -100% |
| United Kingdom | 234 | 5.2 | -33% |

Germany's kurtosis of 37.8 reflects a single year: 1923, when the mark lost effectively all its value. But this is exactly the point. A model that treats 1923 as an outlier to be excluded is a model that doesn't understand what kind of distribution it is dealing with. In a fat-tailed distribution, the extreme observation *is* the distribution. Remove it and you are estimating the wrong thing.
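A small synthetic illustration makes the point concrete (the numbers here are simulated, not the actual German series): 233 calm Gaussian years plus one 1923-style wipeout, run through the excess-kurtosis definition given earlier.

```python
import numpy as np

def excess_kurtosis(x: np.ndarray) -> float:
    # E[(X - mu)^4] / E[(X - mu)^2]^2 - 3, matching the formula above.
    d = x - x.mean()
    return (d**4).mean() / (d**2).mean()**2 - 3

rng = np.random.default_rng(1)
# 233 "ordinary" yearly log returns, plus one hyperinflation-style year.
ordinary = rng.normal(0.0, 0.05, 233)
with_crash = np.append(ordinary, np.log(1e-9))  # roughly a -100% year

print(excess_kurtosis(ordinary))    # near 0: Gaussian-like sample
print(excess_kurtosis(with_crash))  # huge: one observation dominates
```

Dropping the single extreme year collapses the kurtosis back to roughly zero, which is exactly why "removing the outlier" amounts to estimating a different distribution.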
As [Taleb (2020)](https://arxiv.org/abs/2001.10488) emphasizes, the sample mean and sample variance of a power-law distribution are dominated by the largest observation. That single observation contains more information about the tail than all the other observations combined.

### Five centuries of tail estimates (1500-2013)

Using fatcrash's Hill estimator on the [Clio Infra](https://clio-infra.eu/) data (1500-2013, 138 countries with 50+ years of data):

| Tail regime | Countries | Percentage |
|-------------|:---------:|:----------:|
| $\alpha < 2$ (infinite variance) | 98 | **71%** |
| $\alpha$ 2-4 (fat tails, finite variance) | 37 | 27% |
| $\alpha > 4$ (moderate tails) | 3 | 2% |

71% of countries have exchange rate distributions where the variance does not converge. The median tail index $\alpha$ across all 138 countries is 1.57.

For these currencies, computing a standard deviation is meaningless. The number you get depends on how many observations you have and which ones you happened to include. Add one more observation and the "standard deviation" can double. This goes far beyond a marginal statistical issue: the foundational assumption behind Modern Portfolio Theory, CAPM, Black-Scholes, and every VaR model — that returns have a finite, estimable variance — is empirically false for the majority of the world's currencies.

## The peg paradox

The most counterintuitive finding: currencies with the lowest daily volatility have some of the highest excess kurtosis. HKD/USD has 3.2% annualized volatility (barely moves) but excess kurtosis of 261. CNY/USD has 8.2% volatility but excess kurtosis of 3,846. Freely floating currencies like GBP/USD, with 10.1% volatility, have excess kurtosis of 5.4.

A **peg** compresses daily volatility to near zero. The central bank intervenes to keep the rate fixed, absorbing shocks with reserves instead of letting the price adjust.
But when the peg breaks — when the central bank runs out of reserves or loses the political will to defend it — the move is catastrophic. The distribution is a spike at zero with rare but enormous outliers.

This is exactly the kind of distribution that **Value at Risk** (VaR) models fail on. VaR estimates the loss you won't exceed with some high probability (say 99%) on a given day. It says risk is low (because daily moves are tiny). The true risk is high (because the peg can snap at any time, and when it does, the loss is 10 to 50 standard deviations).

The regime data from [Ilzetzki, Reinhart, and Rogoff (2019)](https://www.nber.org/papers/w23134), covering 1940-2021, confirms this pattern:

| Regime | N Countries | Volatility | Excess Kurtosis |
|--------|:-----------:|:----------:|:---------------:|
| Free float | 9 | 10.8% | 0.8 |
| Managed float | 33 | 56.7% | 129.3 |
| Crawling peg | 31 | 53.6% | 139.7 |
| Peg | 37 | 59.5% | 132.9 |
| Freely falling | 12 | 225.1% | 16.0 |

Free-floating currencies have low kurtosis (0.8) because the market absorbs shocks continuously through small daily moves. Pegged and managed currencies suppress daily volatility but accumulate stress that releases in catastrophic jumps, producing kurtosis 160 times higher than free floats.

This is [Taleb's](https://arxiv.org/abs/2001.10488) fragility argument in data. Suppressing volatility does not reduce risk. It concentrates risk into rare, large events. The peg creates an illusion of stability that makes the eventual break more damaging, both financially and psychologically, because nobody is positioned for it. Small, frequent adjustments are antifragile: each one releases pressure and provides information. Rigid pegs are fragile: they hide information until the system snaps.

The "freely falling" category is instructive too. These are currencies in hyperinflation or free-fall collapse. They have the highest volatility (225%) but moderate kurtosis (16.0).
When a currency is already collapsing every day, there is no pent-up pressure left. The tails are fat but not as extreme as a peg because the large moves are continuous rather than sudden.

## Every currency loses against gold

Using gold price data from [DataHub](https://datahub.io/core/gold-prices) and [MeasuringWorth](https://www.measuringworth.com/datasets/gold/) cross-referenced with exchange rates, we computed the cumulative gold purchasing power retained by major currencies.

The British pound has data going back to 1257. For roughly 600 years (1257 to the early 1900s), the pound maintained its gold purchasing power relatively well, fluctuating within a band. The **gold standard** enforced discipline: because the pound was defined as a fixed weight of gold, debasement required a deliberate political act (reducing the gold content of coins, or suspending convertibility).

Then the 20th century happened: two world wars, the end of the gold standard, and continuous inflation. The pound has lost the vast majority of its gold purchasing power since 1900.

Every other currency follows the same arc, just faster. The dollar has lost most of its gold value since 1789. The yen, rupee, and most Latin American currencies show steeper declines. No currency in the dataset has gained gold purchasing power over its full history.

This should be read as an empirical observation, not an argument for or against the gold standard. Over multi-century timescales, fiat currencies consistently debase against a fixed-supply asset. The rate of debasement varies enormously, but the long-run direction is consistent. [Reinhart and Rogoff (2009)](https://press.princeton.edu/books/paperback/9780691152646/this-time-is-different) document this pattern systematically across eight centuries: governments debase currencies to finance wars, bail out banks, and cover deficits.
The mechanism changes (coin clipping in the medieval period, money printing in the modern era), yet the long-run outcome is the same. The dataset includes monthly gold inflation data for 174 currencies from 1940 to 2025 and yearly data for 243 countries from 1257 to 2025.

## Cross-currency correlations

The daily correlation matrix reveals clear geographic and economic clusters:

- **European cluster**: GBP, CHF, NOK, SEK, DKK, EUR move together, reflecting economic integration and policy coordination.
- **Asian managed currencies**: SGD, MYR, THB form a tight cluster. These currencies are managed with reference to trade-weighted baskets that include each other.
- **Asian pegged**: CNY, HKD, LKR are nearly uncorrelated with everything else. Their pegs decouple them from market forces most of the time.
- **Commodity/Antipodean**: CAD, AUD, NZD cluster together, driven by commodity export exposure.
- **Latin America**: BRL and MXN show some correlation but are more idiosyncratic, driven by country-specific crises.
- **Venezuela**: VEF is essentially uncorrelated with all other currencies. Hyperinflation creates noise that overwhelms any trade or policy signal.

For portfolio construction, the implication is that holding multiple European currencies does not diversify FX risk much. Holding a European and an Asian currency does. But the "low correlation" of pegged currencies is misleading. In a global crisis, correlations spike as all pegs come under simultaneous pressure. The correlations that matter most are the ones you observe in the worst 1% of days, not the average.

## Medieval exchange data

The oldest records in the dataset come from the [Medieval and Early Modern Data Bank](https://memdb.libraries.rutgers.edu/) (MEMDB). [Spufford's](https://memdb.libraries.rutgers.edu/about-spufford-currency) collection contains 13,197 exchange quotations from 521 places across Europe, Byzantium, and North Africa, from 1106 to 1500.
[Metz's](https://memdb.libraries.rutgers.edu/metz-currency) collection adds 50,559 records from the Lower Rhine region, from 1350 to 1800, documenting trades in Reichstaler, ducats, marks, and dozens of other coin types.

This data matters for a specific reason. Modern exchange rate theory is built on a sample that starts, at earliest, in 1971 (the end of Bretton Woods). Occasionally researchers go back to 1945 or 1900. That gives us, at best, 125 years of data in a world where currency systems have existed for nearly a millennium.

The [Allen-Unger Global Commodity Prices Database](https://datasets.iisg.amsterdam/dataset.xhtml?persistentId=hdl:10622/3SV0BO) extends this even further: 973 commodity price series from 1260 to 1914, documenting the prices of wheat, rye, silver, coal, spices, and dozens of other goods across European and Asian cities. Combined with the medieval exchange rates, we can study how commodity prices and exchange rates interacted during the very period when modern financial markets were emerging.

The medieval data shows that the basic patterns we observe today — fat tails, sudden devaluations, contagion across trading partners — are not artifacts of modern fiat money or floating exchange rates. The Florentine banking crises of the 1340s, when the Bardi and Peruzzi banks collapsed after England defaulted on war debts, caused exchange rate disruptions across Europe that look structurally similar to modern currency crises despite the very different institutions involved. This tail behavior predates the post-1971 monetary regime; it appears to be a feature of exchange rates as such.

[Denzel (2010)](https://eh.net/book_reviews/handbook-of-world-exchange-rates-1590-1914/) documents this pattern further in his *Handbook of World Exchange Rates, 1590-1914*, showing that the early modern period had its own currency crises, debasements, and contagion episodes.
The mechanisms of medieval and early modern exchange rate volatility (coin debasement, sovereign default, banking panic) differ from modern ones (central bank policy, capital flows, speculative attacks), but the statistical signature is the same: power-law tails and clustered extremes.

## Why this data matters

Most financial research operates on 20 to 50 years of data. This is a problem for studying events that happen once every 30 to 80 years. If you have 40 years of data and the event you care about has a 2% annual probability, you might see it once, or not at all. Your sample is too short to estimate the tail.

With 2,000 years of data across 240 countries and 27 sources, we have a much larger sample of extreme events. The [Clio Infra](https://clio-infra.eu/) data alone contains dozens of hyperinflations, currency collapses, and regime transitions. [Reinhart and Rogoff's](https://carmenreinhart.com/data/) crisis indices catalogue two centuries of banking panics, sovereign defaults, and currency crashes. [Schmelzing's](https://www.bankofengland.co.uk/working-paper/2020/eight-centuries-of-global-real-interest-rates-r-g-and-the-suprasecular-decline-1311-2018) interest rate data shows that the secular decline in rates is a 700-year trend, not a 40-year anomaly. Combined with the [FRED](https://fred.stlouisfed.org/categories/94) daily data, we can study the same phenomenon at both high frequency (daily) and long duration (centuries).

Three things become clear with this much data:

1. **Standard risk models underestimate tail risk by orders of magnitude.** When 71% of countries have infinite-variance exchange rate distributions ($\alpha < 2$), any model that assumes finite variance (CAPM, Black-Scholes, mean-variance optimization) is systematically wrong. It does not give conservative estimates. It gives estimates of the wrong quantity.
You cannot approximate a distribution with $\alpha = 1.5$ using a Gaussian any more than you can approximate a Cauchy distribution with a Gaussian. The moments do not exist.
2. **Stability is not safety.** Pegged currencies look stable until they are not. Managed currencies look stable until they are not. Low volatility regimes have the highest kurtosis. The absence of small crises is often the precondition for a large one. This pattern — suppressed volatility leading to explosive tail events — appears in every subset of the data: daily, yearly, medieval, modern, developed, developing.
3. **Currency debasement is the norm, not the exception.** Over multi-century timescales, every currency in the dataset loses purchasing power against gold. The rate varies enormously (some countries lose 99% in a decade, others take centuries), but the direction is uniform. Any financial plan that assumes a stable currency over a 30-year horizon is making an assumption that contradicts 2,000 years of data.
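The non-convergence in point 1 is easy to reproduce. The sketch below is my own illustration (not code from the fatcrash repo): it draws exact Pareto samples with $\alpha = 1.5$ by inverse-CDF sampling, shows that the sample standard deviation keeps drifting with sample size, and checks that the Hill estimator defined earlier still recovers the tail index.

```python
import numpy as np

rng = np.random.default_rng(42)

def pareto_samples(alpha: float, n: int) -> np.ndarray:
    # Inverse-CDF sampling for P(X > x) = x^(-alpha), x >= 1.
    return rng.uniform(size=n) ** (-1.0 / alpha)

def hill_estimator(x: np.ndarray, k: int) -> float:
    # Hill estimate from the k largest order statistics:
    # alpha_hat = k / sum_{i=1..k} ln(X_(i) / X_(k))
    top = np.sort(x)[-k:]          # ascending; top[0] is X_(k)
    return k / np.sum(np.log(top / top[0]))

alpha = 1.5  # infinite variance: the second moment diverges

# The "standard deviation" tends to keep growing as the sample grows,
# because each larger sample catches a larger extreme observation.
for n in (1_000, 30_000, 1_000_000):
    print(n, pareto_samples(alpha, n).std())

# The Hill estimator, by contrast, targets a quantity that exists.
print(hill_estimator(pareto_samples(alpha, 1_000_000), k=10_000))
```

The contrast is the whole argument: the standard deviation of an $\alpha < 2$ sample is a number about your sample, not about the distribution, while the tail index is estimable.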
## Code and data

The full dataset and build pipeline are open source: [github.com/unbalancedparentheses/forex-centuries](https://github.com/unbalancedparentheses/forex-centuries)

The repository includes:

- Raw data from all 27 sources (1,100+ files), 23 of which are automatically fetched by a weekly CI pipeline
- A 12-step ETL pipeline that normalizes, deduplicates, and cross-validates
- Derived datasets:
  - Unified yearly panel (24,656 rows, 243 countries), normalized daily rates (271,228 rows), log returns, rolling volatility
  - Regime classifications, gold inflation for 174 currencies, correlation matrices
  - Momentum analysis: 3/6/12 month momentum and reversals for 23 FRED currencies
  - Sigma event frequencies: observed 2-5$\sigma$ events vs Gaussian expected counts per currency
  - JST asset class returns: real returns on equities, housing, bonds, and bills across 18 countries (1870-2017)
  - 20-year rolling stock-bond correlations from JST data
- 9 charts: fat-tail histograms, QQ-plots, peg paradox scatter, tail ratio bars, rolling volatility, correlation heatmap, gold erosion, regime timeline
- 52 data validation checks with outlier detection and cross-source consistency
- 17 unit tests with synthetic data
- Quickstart scripts (pure Python and pandas versions)
- A Jupyter exploration notebook

The companion crash detection toolkit: [github.com/unbalancedparentheses/fatcrash](https://github.com/unbalancedparentheses/fatcrash)

## References

- Spufford, P. (1986). [*Handbook of Medieval Exchange*](https://books.google.com/books/about/Handbook_of_Medieval_Exchange.html?id=IvgoAAAAYAAJ). Royal Historical Society.
- Metz, R. (1990). [*Geld, Währung und Preisentwicklung*](https://memdb.libraries.rutgers.edu/metz-currency). (Rhine region, 1350-1800).
- Denzel, M. (2010). [*Handbook of World Exchange Rates, 1590-1914*](https://eh.net/book_reviews/handbook-of-world-exchange-rates-1590-1914/). Ashgate.
- Reinhart, C. and Rogoff, K. (2009). [*This Time Is Different: Eight Centuries of Financial Folly*](https://press.princeton.edu/books/paperback/9780691152646/this-time-is-different). Princeton University Press.
- Ilzetzki, E., Reinhart, C. and Rogoff, K. (2019). [*Exchange Arrangements Entering the 21st Century: Which Anchor Will Hold?*](https://www.nber.org/papers/w23134) Quarterly Journal of Economics, 134(2).
- Jordà, Ò., Schularick, M. and Taylor, A. M. (2017). [*Macrofinancial History and the New Business Cycle Facts*](https://www.macrohistory.net/database/). NBER Macroeconomics Annual.
- Thomas, R. and Dimsdale, N. (2017). [*A Millennium of UK Data*](https://www.bankofengland.co.uk/statistics/research-datasets). Bank of England.
- Officer, L. and Williamson, S. [MeasuringWorth](https://www.measuringworth.com/datasets/exchangeglobal/).
- Schmelzing, P. (2020). [*Eight Centuries of Global Real Interest Rates, R-G, and the 'Suprasecular' Decline, 1311-2018*](https://www.bankofengland.co.uk/working-paper/2020/eight-centuries-of-global-real-interest-rates-r-g-and-the-suprasecular-decline-1311-2018). Bank of England Staff Working Paper No. 845.
- Bolt, J. and van Zanden, J.L. (2024). [*Maddison Project Database 2023*](https://www.rug.nl/ggdc/historicaldevelopment/maddison/). University of Groningen.
- Allen, R.C. and Unger, R.W. [*Global Commodity Prices Database*](https://datasets.iisg.amsterdam/dataset.xhtml?persistentId=hdl:10622/3SV0BO). International Institute of Social History.
- Darvas, Z. (2012). [*Real Effective Exchange Rates for 178 Countries: A New Database*](https://www.bruegel.org/publications/datasets/real-effective-exchange-rates-for-178-countries-a-new-database). Bruegel Working Paper 2012/06.
- Edvinsson, R., Jacobson, T. and Waldenström, D. (2010). [*Historical Monetary and Financial Statistics for Sweden*](https://www.riksbank.se/en-gb/statistics/historical-monetary-statistics-of-sweden/). Sveriges Riksbank.
- Mitchener, K.J. and Weidenmier, M.D. (2015). [*Historical Financial Statistics*](https://centerforfinancialstability.org/hfs.php). Center for Financial Stability.
- Jegadeesh, N. and Titman, S. (1993). [*Returns to Buying Winners and Selling Losers*](https://doi.org/10.1111/j.1540-6261.1993.tb04702.x). Journal of Finance, 48(1).
- Jordà, Ò., Schularick, M. and Taylor, A. M. (2019). [*The Rate of Return on Everything, 1870-2015*](https://doi.org/10.1093/qje/qjy004). Quarterly Journal of Economics, 134(3).
- Rattray, S., Granger, N., Harvey, C. R., and Van Hemert, O. (2020). [*Strategic Rebalancing*](https://doi.org/10.3905/jpm.2020.1.131). Journal of Portfolio Management, 46(6).
- Taleb, N. N. (2020). [*Statistical Consequences of Fat Tails*](https://arxiv.org/abs/2001.10488). STEM Academic Press.

---

## Talks

### Next 10 Years of Ethereum

*Published: 2025-11-15*

> Talk at Devconnect about the next 10 years of Ethereum

URL: https://federicocarrone.com/talks/next-10-years-of-ethereum/

---

### Ethereum's Native Rollup Roadmap with Justin Drake

*Published: 2025-10-01*

> Discussion about Ethereum's native rollup roadmap

URL: https://federicocarrone.com/talks/2025-podcast-with-justin-drake/

---

## About

### About

> Federico Carrone. Building formally verified infrastructure, writing about what compounds over time.
URL: https://federicocarrone.com/about/ --- ## Keywords ### ai - [Legibility Kills What It Measures](https://federicocarrone.com/articles/legibility-kills-what-it-measures/): Article - [Friction as Luxury: What We Lose When AI Gives Us What We Want](https://federicocarrone.com/articles/friction-as-luxury/): Article - [Is Consciousness a Network Effect?](https://federicocarrone.com/articles/consciousness-as-network-effect/): Article - [China is trying to commoditize the complement](https://federicocarrone.com/articles/china-commoditizing-the-complement/): Article - [Unprepared for What's Coming](https://federicocarrone.com/articles/unprepared/): Article ### AQR - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) ### backtesting - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) ### bitcoin - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### Bouchaud - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### commodity prices - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### consciousness - [Is Consciousness a Network Effect?](https://federicocarrone.com/articles/consciousness-as-network-effect/): Article - [The Death of the Inner Self](https://federicocarrone.com/articles/the-death-of-the-inner-self/): Article ### coordination - [The Death of the Inner Self](https://federicocarrone.com/articles/the-death-of-the-inner-self/): 
Article - [Crypto doctrine](https://federicocarrone.com/articles/crypto-doctrine/): Article - [Transforming the Future with Zero-Knowledge Proofs, Fully Homomorphic Encryption and new Distributed Systems algorithms](https://federicocarrone.com/articles/transforming-the-future-with-zero-knowledge-proofs-fully-homomorphic-encryption-and-new-distributed-systems-algorithms/): Article ### crash detection - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### crypto - [Crypto doctrine](https://federicocarrone.com/articles/crypto-doctrine/): Article - [The new financial backend of the world](https://federicocarrone.com/series/ethereum/the-new-financial-backend-of-the-world/): Series (Ethereum) - [The missing institution of the Internet](https://federicocarrone.com/series/ethereum/the-missing-institution-of-the-internet/): Series (Ethereum) ### cryptography - [Transforming the Future with Zero-Knowledge Proofs, Fully Homomorphic Encryption and new Distributed Systems algorithms](https://federicocarrone.com/articles/transforming-the-future-with-zero-knowledge-proofs-fully-homomorphic-encryption-and-new-distributed-systems-algorithms/): Article ### culture - [Legibility Kills What It Measures](https://federicocarrone.com/articles/legibility-kills-what-it-measures/): Article - [Friction as Luxury: What We Lose When AI Gives Us What We Want](https://federicocarrone.com/articles/friction-as-luxury/): Article - [Notes on permanence, time, and ergodicity](https://federicocarrone.com/articles/notes-on-culture-infrastructure-time-and-ergodicity/): Article - [Crypto doctrine](https://federicocarrone.com/articles/crypto-doctrine/): Article ### currency crisis - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### DFA - [Detecting Crashes with 
Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### disruption - [Unprepared for What's Coming](https://federicocarrone.com/articles/unprepared/): Article ### economy - [The new financial backend of the world](https://federicocarrone.com/series/ethereum/the-new-financial-backend-of-the-world/): Series (Ethereum) - [The missing institution of the Internet](https://federicocarrone.com/series/ethereum/the-missing-institution-of-the-internet/): Series (Ethereum) ### elixir - [Building a SaaS with Elixir/Phoenix and React](https://federicocarrone.com/articles/building-a-saas-with-elixir-phoenix-and-react/): Article ### ergodic - [Notes on permanence, time, and ergodicity](https://federicocarrone.com/articles/notes-on-culture-infrastructure-time-and-ergodicity/): Article ### ergodicity economics - [At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) ### ethereum - [Crypto doctrine](https://federicocarrone.com/articles/crypto-doctrine/): Article - [The new financial backend of the world](https://federicocarrone.com/series/ethereum/the-new-financial-backend-of-the-world/): Series (Ethereum) - [The missing institution of the Internet](https://federicocarrone.com/series/ethereum/the-missing-institution-of-the-internet/): Series (Ethereum) ### exchange rates - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### extreme value theory - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### fat tails - [Detecting Crashes with Fat-Tail 
Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### finance - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### forex - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### formal-verification - [The Concrete Programming Language: Systems Programming for Formal Reasoning](https://federicocarrone.com/series/concrete/the-concrete-programming-language-systems-programming-for-formal-reasoning/): Series (Concrete) ### functional programming - [Type Systems: From Generics to Dependent Types](https://federicocarrone.com/articles/type-systems/): Article ### geopolitics - [China is trying to commoditize the complement](https://federicocarrone.com/articles/china-commoditizing-the-complement/): Article ### gold - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### Hurst - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### individuality - [The Death of the Inner Self](https://federicocarrone.com/articles/the-death-of-the-inner-self/): Article ### infrastructure - [China is trying to commoditize the complement](https://federicocarrone.com/articles/china-commoditizing-the-complement/): Article - [Notes on permanence, time, and ergodicity](https://federicocarrone.com/articles/notes-on-culture-infrastructure-time-and-ergodicity/): 
Article ### interest rates - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### Jensen's inequality - [At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) ### Kelly criterion - [At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) ### legibility - [Legibility Kills What It Measures](https://federicocarrone.com/articles/legibility-kills-what-it-measures/): Article ### logarithms - [At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) ### LPPLS - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### medieval finance - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### momentum - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### nix - [Building a SaaS with Elixir/Phoenix and React](https://federicocarrone.com/articles/building-a-saas-with-elixir-phoenix-and-react/): Article ### open data - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series 
(Leptokurtic) ### options - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) ### Peters - [At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) ### philosophy - [Legibility Kills What It Measures](https://federicocarrone.com/articles/legibility-kills-what-it-measures/): Article - [Friction as Luxury: What We Lose When AI Gives Us What We Want](https://federicocarrone.com/articles/friction-as-luxury/): Article - [Is Consciousness a Network Effect?](https://federicocarrone.com/articles/consciousness-as-network-effect/): Article - [Notes on permanence, time, and ergodicity](https://federicocarrone.com/articles/notes-on-culture-infrastructure-time-and-ergodicity/): Article ### phoenix - [Building a SaaS with Elixir/Phoenix and React](https://federicocarrone.com/articles/building-a-saas-with-elixir-phoenix-and-react/): Article ### profit - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### programming languages - [Type Systems: From Generics to Dependent Types](https://federicocarrone.com/articles/type-systems/): Article ### programming-languages - [The Concrete Programming Language: Systems Programming for Formal Reasoning](https://federicocarrone.com/series/concrete/the-concrete-programming-language-systems-programming-for-formal-reasoning/): Series (Concrete) ### react - [Building a SaaS with Elixir/Phoenix and React](https://federicocarrone.com/articles/building-a-saas-with-elixir-phoenix-and-react/): Article ### revenue - [Detecting Crashes with Fat-Tail 
Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### rust - [Type Systems: From Generics to Dependent Types](https://federicocarrone.com/articles/type-systems/): Article - [The Concrete Programming Language: Systems Programming for Formal Reasoning](https://federicocarrone.com/series/concrete/the-concrete-programming-language-systems-programming-for-formal-reasoning/): Series (Concrete) - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### SaaS - [Building a SaaS with Elixir/Phoenix and React](https://federicocarrone.com/articles/building-a-saas-with-elixir-phoenix-and-react/): Article ### silver - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### Sornette - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### sovereign debt - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### Spitznagel - [At the Core of Finance Lies Geometry. 
In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) ### SPY - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) ### tacit knowledge - [Legibility Kills What It Measures](https://federicocarrone.com/articles/legibility-kills-what-it-measures/): Article ### tail hedging - [At the Core of Finance Lies Geometry. In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) ### Taleb - [At the Core of Finance Lies Geometry. 
In the End, It’s All Jensen’s Inequality.](https://federicocarrone.com/series/leptokurtic/at-the-core-of-finance-lies-geometry-in-the-end-its-all-jensens-inequality/): Series (Leptokurtic) - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) ### technology - [China is trying to commoditize the complement](https://federicocarrone.com/articles/china-commoditizing-the-complement/): Article - [Unprepared for What's Coming](https://federicocarrone.com/articles/unprepared/): Article ### trust - [Transforming the Future with Zero-Knowledge Proofs, Fully Homomorphic Encryption and new Distributed Systems algorithms](https://federicocarrone.com/articles/transforming-the-future-with-zero-knowledge-proofs-fully-homomorphic-encryption-and-new-distributed-systems-algorithms/): Article ### type systems - [Type Systems: From Generics to Dependent Types](https://federicocarrone.com/articles/type-systems/): Article - [The Concrete Programming Language: Systems Programming for Formal Reasoning](https://federicocarrone.com/series/concrete/the-concrete-programming-language-systems-programming-for-formal-reasoning/): Series (Concrete) ### volatility - [The Tail Hedge Debate: Spitznagel Is Right, AQR Is Answering the Wrong Question](https://federicocarrone.com/series/leptokurtic/the-tail-hedge-debate-spitznagel-is-right/): Series (Leptokurtic) - [Twenty Centuries of Financial Data: What 240 Countries and 2,000 Years Reveal](https://federicocarrone.com/series/leptokurtic/nine-centuries-of-exchange-rates/): Series (Leptokurtic) ### volatility targeting - [Detecting Crashes with Fat-Tail Statistics](https://federicocarrone.com/series/leptokurtic/detecting-crashes-with-fat-tail-statistics/): Series (Leptokurtic) --- ## Reading Book recommendations from Federico Carrone URL: https://federicocarrone.com/reading/ ### Philosophy - The Discourses | by Epictetus | rating 4.40 A practical 
manual for distinguishing what is within our control from what is not. Reading it feels like being corrected by someone who has thought more clearly about the same problems you face. The Stoic framework here is less about suppressing emotion and more about directing attention where it can actually matter. - Antifragile | by Nassim Nicholas Taleb | rating 4.10 The central insight is that some systems benefit from volatility and disorder rather than merely surviving them. This reframes how I think about building organizations and making decisions under uncertainty. The distinction between fragile, robust, and antifragile is more useful than any risk management framework I have encountered. - Inventing the Individual | by Larry Siedentop | rating 4.10 A history of how Western individualism emerged from Christian thought over centuries. Siedentop challenges the assumption that individualism is a modern invention, tracing its roots to early Christianity and medieval theology. Useful context for understanding debates about consciousness and selfhood. - Fooled by Randomness | by Nassim Nicholas Taleb | rating 4.08 My biggest recommendation to crypto founders. Taleb shows how we systematically confuse luck with skill, especially in domains with high randomness. Essential for anyone operating in markets where survivorship bias distorts our understanding of what actually works. - The Protestant Ethic and the Spirit of Capitalism | by Max Weber | rating 3.91 Essential for understanding recent crypto Twitter debates and the cultural foundations of modern capitalism. Weber traces how religious ideas shaped economic behavior, creating a framework that still influences how we think about work, wealth, and moral obligation. ### Science & Mathematics - The Feynman Lectures on Physics | by Richard Feynman, Robert B. Leighton, and Matthew Sands | rating 4.61 The best introduction to understanding the fundamental laws governing the physical world. 
Feynman had a gift for making complex ideas accessible without dumbing them down. Reading these lectures fosters both healthy skepticism and justified certainty about how things actually work. - Mathematics: Its Content, Methods and Meaning | by A.D. Aleksandrov, A.N. Kolmogorov, and M.A. Lavrent'ev | rating 4.43 My favorite book for explaining the beauty and utility of mathematics. Written by three Soviet mathematicians, it offers a panoramic view of the major fields without sacrificing depth. For founders, mathematical thinking provides a strategic advantage that compounds over time. - Nonlinear Dynamics and Chaos | by Steven Strogatz | rating 4.38 The best introduction to chaos theory and dynamical systems. Strogatz makes complex mathematical concepts accessible through intuition and examples drawn from physics, biology, and engineering. Understanding nonlinear dynamics changes how you see feedback loops, tipping points, and emergent behavior in any complex system. - Infinite Powers | by Steven Strogatz | rating 4.29 A captivating history of calculus and why it matters. Strogatz shows how the language of infinity has shaped our understanding of everything from planetary motion to GPS satellites. The book reveals calculus not as abstract manipulation but as humanity's most powerful tool for decoding the universe. - Scale | by Geoffrey West | rating 4.11 Uncovers universal laws governing biology, cities, and economies. West shows how the same mathematical patterns appear across vastly different systems, from metabolic rates in organisms to innovation in cities. Understanding these scaling laws is crucial for navigating a changing world. - Sync | by Steven Strogatz | rating 4.07 An exploration of how spontaneous order emerges from chaos. Strogatz examines synchronization across nature: fireflies flashing in unison, cardiac pacemaker cells, circadian rhythms, and even the wobble of the Millennium Bridge. 
The patterns reveal deep mathematical principles governing self-organization in complex systems. - What Evolution Is | by Ernst Mayr | rating 4.02 A definitive explanation of evolutionary biology from one of its greatest practitioners. Mayr distills a lifetime of work into a clear account of how evolution operates, addressing common misconceptions along the way. Essential for understanding the process that shaped all life on Earth. - An Introduction to Information Theory | by John R. Pierce | rating 3.90 A clear introduction to Shannon's information theory without requiring advanced mathematics. Pierce explains entropy, channel capacity, and coding with remarkable clarity. Understanding information theory is fundamental for anyone working with data, communication systems, or trying to grasp the mathematical limits of what can be transmitted or compressed. ### Engineering - Systems Performance: Enterprise and the Cloud | by Brendan Gregg | rating 4.49 A must-read for engineers deploying production code. Gregg covers performance analysis methodology, tools, and techniques at every layer of the stack. This book will change how you think about observability, bottlenecks, and system behavior under load. ### Economics & Money - The Last Economy | by Emad Mostaque | rating 4.39 A guide to understanding how AI will fundamentally transform economics. Mostaque argues that we are entering a new era where traditional economic models break down as intelligent systems reshape production, labor, and value creation. Essential reading for anyone trying to understand the economic implications of artificial intelligence. - The Use of Knowledge in Society | by F.A. Hayek | rating 4.37 A short essay that explains why decentralized coordination through prices often outperforms central planning. The argument is about information: knowledge is dispersed across millions of minds and cannot be aggregated into a single plan without losing most of what makes it valuable. 
- Principles for Dealing with the Changing World Order | by Ray Dalio | rating 4.27 A study of the rise and decline of reserve currencies and the empires behind them. Dalio examines cycles spanning centuries to identify patterns that might indicate where we are in the current cycle. Especially relevant during pivotal societal moments. - Seeing Like a State | by James C. Scott | rating 4.21 How large-scale schemes to improve the human condition fail when they ignore local knowledge and complexity. The concept of legibility, making society readable to administrators, explains many pathologies of modern institutions. - The Sovereign Individual | by James Dale Davidson and William Rees-Mogg | rating 4.18 Written in 1997, this book predicted much of what the internet would do to the relationship between individuals and states. Its framework for understanding how technology shifts power remains remarkably useful for analyzing current dynamics. - More Money Than God | by Sebastian Mallaby | rating 4.11 The definitive history of hedge funds, from Alfred Winslow Jones's original market-neutral fund through the quant revolution and the 2008 crisis. Mallaby shows how hedge funds pioneered risk management techniques, exploited market inefficiencies, and shaped modern finance, while repeatedly blowing up in spectacular fashion. - Broken Money | by Lyn Alden | rating 4.00 A clear explanation of how money systems have become dysfunctional and what potential fixes exist. Alden combines engineering precision with financial depth to analyze monetary systems from first principles. - Boom: Bubbles and the End of Stagnation | by Byrne Hobart and Tobias Huber | rating 3.82 A fresh perspective on financial bubbles as engines of progress rather than purely destructive forces. The Bitcoin chapter is particularly insightful for understanding how speculative energy can drive technological adoption. 
### History - The Age of Revolution, The Age of Capital, and The Age of Empire | by Eric Hobsbawm | rating 4.23 A trilogy covering 1789-1914 that provides the historical framework to understand how the modern world took shape. Hobsbawm traces how the dual revolutions (French and Industrial) transformed everything from politics to daily life. Essential context for understanding current changes. ### Business & Strategy - Never Split the Difference | by Chris Voss | rating 4.34 Negotiation as a fundamental human skill, not just a business tactic. Voss, a former FBI hostage negotiator, shows that negotiation principles apply to nearly all human interactions. The techniques here are immediately practical. - The Luxury Strategy | by Jean-Noël Kapferer and Vincent Bastien | rating 4.24 An amazing book on the difference between premium and luxury. The authors argue that luxury brands follow different rules than traditional marketing, anti-laws that seem counterintuitive but explain why certain brands maintain their power across generations. - Working Backwards | by Colin Bryar and Bill Carr | rating 4.20 The most important book for the Ethereum community and anyone building products. Former Amazon executives explain the internal mechanisms that allowed Amazon to innovate consistently, including the famous six-page memo and working backwards from the customer. - Zero to One | by Peter Thiel | rating 4.15 The counterintuitive idea that competition is for losers. Thiel argues that the most valuable companies create something entirely new rather than competing in existing markets. Building a monopoly through uniqueness is more sustainable than fighting for market share. - Only the Paranoid Survive | by Andrew S. Grove | rating 3.98 Grove's framework for detecting strategic inflection points, moments when the fundamentals of a business change. Learning to recognize these transitions early is essential for building organizations that survive and adapt over time. 
--- ## Screens Anime, animation, movies, series, and games recommendations from Federico Carrone URL: https://federicocarrone.com/screens/ ### Anime - Attack on Titan | rating 9.1 URL: https://www.imdb.com/title/tt2560140/ The best exploration of freedom, sovereignty, and the price of survival I have seen in any medium. The political arcs in the final seasons are more honest about power than most prestige drama. - Cowboy Bebop | rating 8.9 URL: https://www.imdb.com/title/tt0213338/ Style as substance. Every episode is a meditation on being unable to escape the past, wrapped in jazz and noir. The ending is perfect. - Berserk | rating 8.7 URL: https://www.imdb.com/title/tt0318871/ Ambition, betrayal, and what it costs to impose your will on the world. The Golden Age arc is one of the great tragic narratives in fiction. - Ghost in the Shell: Stand Alone Complex | rating 8.5 URL: https://www.imdb.com/title/tt0346314/ The film asked the philosophical questions. SAC builds the world where those questions have policy implications. The Laughing Man arc is one of the best treatments of information warfare in fiction. - Ghost in the Shell | rating 8.0 URL: https://www.imdb.com/title/tt0113568/ The original questions about consciousness and identity in a networked world. Still more philosophically serious than most AI discourse today. - Akira | rating 8.0 URL: https://www.imdb.com/title/tt0094625/ Power without institutions to contain it. Visually unmatched four decades later. The animation alone changed what the medium could be. - Legend of the Galactic Heroes: Die Neue These | rating 7.8 URL: https://www.imdb.com/title/tt7407236/ Democracy versus autocracy argued honestly, with neither side caricatured. Closer to Thucydides than to space opera. ### Animation - Rick and Morty | rating 9.1 URL: https://www.imdb.com/title/tt2861424/ Nihilism played for laughs until it stops being funny. 
The best episodes land because they take the consequences of infinite possibility seriously. - Arcane | rating 9.0 URL: https://www.imdb.com/title/tt11126994/ Class conflict, institutional failure, and what happens when the people with nothing to lose get access to power. The animation sets a new standard. - Gravity Falls | rating 8.9 URL: https://www.imdb.com/title/tt1865718/ Deceptively smart. Mystery, conspiracy, and genuine emotional weight hidden inside a children's show. Knew exactly when to end. - BoJack Horseman | rating 8.8 URL: https://www.imdb.com/title/tt3398228/ The most honest show about self-destruction I have seen. It refuses to let its protagonist off the hook, which is rare and necessary. - Samurai Jack | rating 8.5 URL: https://www.imdb.com/title/tt0278238/ Pure visual storytelling. Entire episodes with almost no dialogue that work better than most scripts. Patience as aesthetic principle. - Love, Death & Robots | rating 8.4 URL: https://www.imdb.com/title/tt9561862/ Short-form science fiction that takes animation seriously as a medium. Beyond the Aquila Rift and Zima Blue do more in fifteen minutes than most feature films. - Cyberpunk: Edgerunners | rating 8.3 URL: https://www.imdb.com/title/tt12590266/ What happens when a system is designed to grind people down and someone decides not to comply. Studio Trigger at their most kinetic. - Final Space | rating 8.2 URL: https://www.imdb.com/title/tt6317068/ Starts as absurd comedy and quietly becomes one of the most emotionally committed animated shows. The tonal shift works because it was always there. - Daria | rating 7.8 URL: https://www.imdb.com/title/tt0118298/ Deadpan intelligence against a world that rewards conformity. Still relevant decades later, which says something about the world. ### Movies: Crime & Drama - The Godfather | rating 9.2 URL: https://www.imdb.com/title/tt0068646/ Power, family, and the corruption that comes from believing you can keep them separate. 
The transition from Michael's idealism to his cold pragmatism is the most important arc in American cinema. - Pulp Fiction | rating 8.8 URL: https://www.imdb.com/title/tt0110912/ Proved that structure itself could be a creative act. The nonlinear storytelling changed what audiences were willing to follow. - City of God | rating 8.6 URL: https://www.imdb.com/title/tt0317248/ Growing up in a Latin American city where institutions have failed, told without sentimentality. The most honest film about what happens when the state abandons a place. - The Departed | rating 8.5 URL: https://www.imdb.com/title/tt0407887/ Identity as performance. Everyone is pretending to be someone else, and the system rewards the best liars. - Oldboy | rating 8.4 URL: https://www.imdb.com/title/tt0364569/ Revenge as self-destruction. The corridor fight scene is famous, but the real brutality is in the ending. - Reservoir Dogs | rating 8.3 URL: https://www.imdb.com/title/tt0105236/ Trust and betrayal in a closed system. Tarantino's tightest script. Everything that matters happens off-screen or in dialogue. - Snatch | rating 8.3 URL: https://www.imdb.com/title/tt0208092/ Every plan fails, every failure creates an opportunity, and somehow it all resolves. Funnier and more rewatchable than it has any right to be. - There Will Be Blood | rating 8.2 URL: https://www.imdb.com/title/tt0469494/ Ambition that consumes everything around it, including itself. Daniel Day-Lewis gave the definitive performance of a man who wins by becoming what he despises. - Taxi Driver | rating 8.2 URL: https://www.imdb.com/title/tt0075314/ Alienation in a city full of people. Travis Bickle's loneliness is not romantic. It is dangerous, and the film never pretends otherwise. - The Wolf of Wall Street | rating 8.2 URL: https://www.imdb.com/title/tt0993846/ The system does not punish Belfort. It absorbs him. The audience's enjoyment is the point Scorsese is making. 
- Lock, Stock and Two Smoking Barrels | rating 8.2 URL: https://www.imdb.com/title/tt0120735/ Guy Ritchie's debut, tighter and funnier than everything that followed. Cascading consequences played as comedy. - Nueve Reinas | rating 8.1 URL: https://www.imdb.com/title/tt0247586/ Argentine con-artist cinema at its best. Trust is the currency, and the film itself cons the audience. If you grew up in Buenos Aires, you recognize every character. - The Irishman | rating 7.8 URL: https://www.imdb.com/title/tt1302006/ The Godfather told from the perspective of old age. All that power and violence, and in the end you are alone in a nursing home with the door open. - The Girl with the Dragon Tattoo | rating 7.8 URL: https://www.imdb.com/title/tt1568346/ Fincher's coldest film. Lisbeth Salander is one of the great characters in contemporary fiction. Competence as survival mechanism. - Zodiac | rating 7.7 URL: https://www.imdb.com/title/tt0443706/ Obsession without resolution. The real subject is not the killer but what the search does to the people who cannot stop looking. - Once Upon a Time in Hollywood | rating 7.6 URL: https://www.imdb.com/title/tt7131622/ Tarantino's most melancholic film. A love letter to a world that is ending, told by people who do not yet know it. - Gangs of New York | rating 7.5 URL: https://www.imdb.com/title/tt0217505/ How institutions are built on violence and then erase the memory of that violence. Daniel Day-Lewis carries a messy film through sheer force. ### Movies: Sci-Fi & Thriller - The Dark Knight | rating 9.1 URL: https://www.imdb.com/title/tt0468569/ The Joker's argument that civilization is a thin veneer over chaos is never actually refuted. The film's real tension is that he might be right. - Inception | rating 8.8 URL: https://www.imdb.com/title/tt1375666/ Ideas as infrastructure. Nolan built a world where the architecture of thought is literally constructed and the rules must be internally consistent. 
The heist is secondary to the world-building. - Fight Club | rating 8.8 URL: https://www.imdb.com/title/tt0137523/ Consumer nihilism and the desire for authenticity through destruction. The twist is less interesting than the critique it enables. - The Good, the Bad and the Ugly | rating 8.8 URL: https://www.imdb.com/title/tt0060196/ Three strategies for surviving in a world without law. Leone understood that morality is a luxury of stable systems. - The Matrix | rating 8.7 URL: https://www.imdb.com/title/tt0133093/ The red pill as epistemological rupture. Still the best popular treatment of simulation, reality, and the cost of knowing the difference. - Apocalypse Now | rating 8.5 URL: https://www.imdb.com/title/tt0078788/ The journey upriver is a journey toward the logic that institutions try to suppress. Kurtz understood something that the army could not afford to acknowledge. - Gladiator | rating 8.5 URL: https://www.imdb.com/title/tt0172495/ Duty surviving the collapse of the institution that gave it meaning. The stoic framework is not subtextual. It is the whole point. - Django Unchained | rating 8.5 URL: https://www.imdb.com/title/tt1853728/ Tarantino using genre to confront history directly. Christoph Waltz makes the best case for competence as moral action. - Dune: Part Two | rating 8.5 URL: https://www.imdb.com/title/tt15239678/ The rare blockbuster that takes its source material's pessimism seriously. Paul's arc is a warning about charismatic leadership, not a celebration of it. - Inglourious Basterds | rating 8.4 URL: https://www.imdb.com/title/tt0361748/ Language as weapon. The opening scene is one of the greatest exercises in sustained tension ever filmed. Hans Landa is terrifying because he is brilliant. - Full Metal Jacket | rating 8.3 URL: https://www.imdb.com/title/tt0093058/ Two films in one: the making of a soldier and the unmaking of everything that process promised. Kubrick's coldest dissection of institutional violence. 
- Shutter Island | rating 8.2 URL: https://www.imdb.com/title/tt1130884/ The question is not what is real, but whether knowing the truth is survivable. Scorsese's most underrated film. - Dune: Part One | rating 8.0 URL: https://www.imdb.com/title/tt1160419/ Villeneuve proved that science fiction does not have to be fast to be immersive. The pacing is the point. It demands patience. - Sin City | rating 8.0 URL: https://www.imdb.com/title/tt0401792/ Noir as pure form. The visual language is so committed that the story almost becomes secondary to the aesthetic. - Drive | rating 7.8 URL: https://www.imdb.com/title/tt0780504/ Minimalism as characterization. The driver says almost nothing, and every silence means more than dialogue would. Refn understood that restraint is its own kind of violence. - Watchmen | rating 7.6 URL: https://www.imdb.com/title/tt0409459/ The deconstruction of heroism that most superhero films pretend does not exist. Rorschach's moral absolutism against Ozymandias's utilitarian calculus is a genuine philosophical conflict. - The Assassination of Jesse James | rating 7.5 URL: https://www.imdb.com/title/tt0443680/ Mythology and the people who get destroyed by proximity to it. The most beautiful cinematography in any Western. Patience required and rewarded. ### Movies: Comedy & Indie - The Big Lebowski | rating 8.1 URL: https://www.imdb.com/title/tt0118715/ The Dude's refusal to participate in anyone else's urgency is either profound laziness or a radical philosophical stance. The Coen brothers never tell you which. - The Grand Budapest Hotel | rating 8.1 URL: https://www.imdb.com/title/tt2278388/ Civilization as aesthetic practice. Gustave H. maintains his standards precisely because the world around him is collapsing. Anderson's most emotionally serious film. - Little Miss Sunshine | rating 7.8 URL: https://www.imdb.com/title/tt0449059/ A family of failures who discover that losing together is better than winning alone. 
The ending is one of the great acts of collective defiance in comedy. - Midnight in Paris | rating 7.7 URL: https://www.imdb.com/title/tt1605783/ Nostalgia as trap. Every era idealizes the one before it. Gil's realization that he is doing exactly what he criticizes is the only honest way to end the film. - Babel | rating 7.4 URL: https://www.imdb.com/title/tt0449467/ Interconnected failures across borders. The point is not that we are all connected but that connection does not imply understanding. - Blue Jasmine | rating 7.3 URL: https://www.imdb.com/title/tt2334873/ What happens when the story you tell yourself about your life stops being sustainable. Blanchett's performance is a controlled demolition. - The Darjeeling Limited | rating 7.2 URL: https://www.imdb.com/title/tt0838221/ Three brothers trying to reconnect through a spiritual journey that never becomes spiritual. The baggage metaphor is literal, which is the joke. - Vicky Cristina Barcelona | rating 7.1 URL: https://www.imdb.com/title/tt0497465/ Two approaches to life, safety versus passion, tested against a third person who refuses to choose. Bardem and Cruz make chaos look inevitable. ### Series - Band of Brothers | rating 9.4 URL: https://www.imdb.com/title/tt0185906/ The definitive treatment of what holds a unit together under conditions designed to destroy it. Leadership, loyalty, and the cost of both. - The Wire | rating 9.3 URL: https://www.imdb.com/title/tt0306414/ The only television show that treats institutions as its real characters. Every season adds a system (police, docks, politics, schools, media) and shows how each one fails the people inside it. - The Sopranos | rating 9.2 URL: https://www.imdb.com/title/tt0141842/ Therapy as narrative device. Tony Soprano cannot change because the system that made him rewards exactly what therapy asks him to confront. The show invented modern television. 
- Game of Thrones | rating 9.2 URL: https://www.imdb.com/title/tt0944947/ The first four seasons are the best treatment of political realism in popular fiction. Power is not claimed by the worthy. It is seized by whoever understands the game. - Sherlock | rating 9.1 URL: https://www.imdb.com/title/tt1475582/ Intelligence as performance. Cumberbatch and Freeman's chemistry carries it, but the best episodes work because the puzzles are genuinely clever. - The Office | rating 9.0 URL: https://www.imdb.com/title/tt0386676/ The most accurate depiction of institutional life ever made, disguised as comedy. Michael Scott's need to be loved is funnier and sadder than anything scripted as drama. - Seinfeld | rating 8.9 URL: https://www.imdb.com/title/tt0098904/ A show about nothing that invented the language of observational comedy for a generation. The characters are terrible people, and the show never asks you to forgive them. - Succession | rating 8.8 URL: https://www.imdb.com/title/tt7660850/ Power, inheritance, and the impossibility of earning approval from someone who sees love as weakness. The best dialogue on television since Deadwood. - Peaky Blinders | rating 8.8 URL: https://www.imdb.com/title/tt2442560/ Ambition as engine and trap. Tommy Shelby keeps building and can never stop, which is the most honest portrait of a certain kind of entrepreneur. - Boardwalk Empire | rating 8.6 URL: https://www.imdb.com/title/tt0979432/ Prohibition-era America as a case study in how banning something creates the institutions that profit from its absence. Scorsese's visual language applied to television. - Bron/Broen (The Bridge) | rating 8.6 URL: https://www.imdb.com/title/tt1733785/ Scandinavian noir at its best. Saga Norén's inability to perform social norms makes her a better detective, not worse. The original that launched a genre. - Homeland | rating 8.3 URL: https://www.imdb.com/title/tt1796960/ Paranoia as professional requirement. The first two seasons are extraordinary. 
Carrie's instability is inseparable from her insight, and the show takes that seriously. - The Killing | rating 8.3 URL: https://www.imdb.com/title/tt1637727/ Slow, atmospheric, and willing to let the investigation feel as frustrating as real investigations do. One of the few crime shows that respects the audience's patience. - Turn: Washington's Spies | rating 8.1 URL: https://www.imdb.com/title/tt2543328/ Espionage as the foundation of a nation. The show makes a convincing case that intelligence work, not battlefield heroics, won the American Revolution. ### Games - Red Dead Redemption 2 | rating 97 URL: https://www.metacritic.com/game/red-dead-redemption-2/ The most fully realized open world ever built. A meditation on loyalty, decline, and the end of the frontier. Arthur Morgan's arc is one of the great tragic narratives in games. - Grand Theft Auto V | rating 97 URL: https://www.metacritic.com/game/grand-theft-auto-v/ American capitalism as open world satire. Three protagonists, three relationships with money and violence, none of them redeemable. Rockstar built a system so detailed that the parody became indistinguishable from the thing it mocks. - Civilization VI | rating 88 URL: https://www.metacritic.com/game/sid-meiers-civilization-vi/ Institutions, trade, diplomacy, and war across six thousand years. The game that makes you understand why empires expand, overextend, and collapse. Every session is a lesson in compounding decisions. - Age of Empires II: Definitive Edition | rating 88 URL: https://www.metacritic.com/game/age-of-empires-ii-definitive-edition/ The RTS that taught a generation how civilizations rise and fall. Resource management, military strategy, and historical campaigns that still hold up decades later. The definitive edition proved the design was timeless. - Diablo II | rating 88 URL: https://www.metacritic.com/game/diablo-ii/ The game that defined the action RPG. Loot, builds, and one more run. 
Blizzard at their peak, before they forgot what made them great. Every ARPG since is either copying it or reacting to it. - Total War: Warhammer III | rating 87 URL: https://www.metacritic.com/game/total-war-warhammer-iii/ The culmination of the Total War fantasy trilogy. Grand strategy meets spectacle, with the Immortal Empires campaign offering more strategic depth than any other game in the genre. - Command & Conquer: Red Alert 2 | rating 86 URL: https://www.metacritic.com/game/command-conquer-red-alert-2/ Cold War absurdity as real-time strategy. Soviet Tesla coils versus Allied Prism tanks, played completely straight. The best C&C game and the peak of Westwood Studios before EA buried them. --- ## Listening Podcasts and videos I recommend URL: https://federicocarrone.com/listening/ ### 2026 - Coding Agents & Language Evolution: Navigating Uncharted Waters • José Valim • GOTO 2025 | Feb 2026 URL: https://youtube.com/watch?v=VZcDxkFj_9E Valim thinks clearly about what AI coding agents mean for language design. Relevant to anyone building programming languages right now. - Jiang Xueqin: Humanity's patterns, the nature of reality, and the battle for your mind. | Feb 2026 URL: https://youtube.com/watch?v=CRw5CCq8Uf4 A rare conversation that connects pattern recognition, consciousness, and information warfare without losing rigor. - State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490 | Feb 2026 URL: https://youtube.com/watch?v=EV7WhVT270Q Comprehensive overview of where AI actually stands. Scaling laws, geopolitics, and the gap between hype and deployment reality. - Software is Eating Labor | Feb 2026 URL: https://youtube.com/watch?v=dhyhR4Bzc0I The structural argument for what happens when AI automates cognitive work. Not hype, an honest assessment of labor market disruption. - The Global World Order Is Collapsing- And It's Much Bigger Than Trump!
| Feb 2026 URL: https://youtube.com/watch?v=ribrY5okACk Institutional decay at the global level. The framing goes beyond personalities to the structural forces reshaping the international order. - 60 Minute Business Masterclass (worth more than your Stanford MBA) | Feb 2026 URL: https://youtube.com/watch?v=VFZb42SsZWc Dense, practical, and more useful than most business books. Worth the time if you run anything. - How AI WIPES Out Capitalism | Emad Mostaque | Feb 2026 URL: https://youtube.com/watch?v=vfhszRuMA8Y Mostaque's thesis on AI and economic transformation. Provocative framing, but the mechanism he describes for how intelligence commoditization changes everything is worth engaging with seriously. - "We have 900 days left." | Emad Mostaque | Feb 2026 URL: https://youtube.com/watch?v=zQThHCB_aec The urgency argument for AI timelines. Whether you agree with the timeline or not, the structural reasoning is worth understanding. - Ben Horowitz and David Solomon: The Sweetest Macro Spot in 40 Years | Feb 2026 URL: https://youtube.com/watch?v=jLVgGGz5bvk Macro environment from the perspective of people who deploy capital at scale. The optimism is backed by structural reasoning, not sentiment. - What The Keymaker Scene in The Matrix ACTUALLY Means | Feb 2026 URL: https://youtube.com/watch?v=WRKsGogQYZo A close reading of one of the best action sequences in cinema. Shows how much philosophical architecture is embedded in what looks like a chase scene. - How one artist invented modern pop culture (Moebius documentary) | Feb 2026 URL: https://youtube.com/watch?v=QWaCsteIYig Moebius shaped Alien, Blade Runner, The Fifth Element, and most science fiction visual language. Essential context for understanding where the aesthetic of the future comes from. - What most people Misunderstand about the Collapse of the Rules Based Order: Featuring Michael Every | Feb 2026 URL: https://youtube.com/watch?v=re5Ys6NYQKo Michael Every is one of the sharpest macro thinkers working.
His framework for understanding institutional collapse is grounded in history, not ideology. - Marc Andreessen's 2026 Outlook: AI Timelines, US vs. China, and The Price of AI | Feb 2026 URL: https://youtube.com/watch?v=xRh2sVcNXQ8 Andreessen on AI geopolitics and deployment timelines. Useful for calibrating expectations against someone who sees the deal flow. - The Engineering State vs The Lawyerly State with Dan Wang | Feb 2026 URL: https://youtube.com/watch?v=GqcG2otUtyo Dan Wang's distinction between states that build things and states that litigate things. The implications for industrial policy and technological competitiveness are significant. - Anthropic's Amodei on AI: Power and Risk | Feb 2026 URL: https://youtube.com/watch?v=Ckt1cj0xjRM Amodei articulates the dual nature of AI development, enormous capability and genuine risk, without collapsing into either pure optimism or doom. - This Painting Is Beyond Insane | Feb 2026 URL: https://youtube.com/watch?v=k8kjPxrXuFU Art analysis that makes you see more than you would on your own. The kind of close attention to craft that applies to any discipline. - How Did The World Get So Ugly? | Feb 2026 URL: https://youtube.com/watch?v=tWYxrowovts The decline of public aesthetic standards and what it reveals about institutional priorities. Connects to the argument in my Friction as Luxury essay. - Yuval Noah Harari Warns AI Will Take Over Language, Law, and Power at WEF | Feb 2026 URL: https://youtube.com/watch?v=QxCpNpOV4Jo Harari's argument about AI capturing the instruments of meaning-making: language, law, narrative. The institutional implications are underexplored elsewhere. --- ## Backlog Books, series, movies, anime, and animation I want to read or watch URL: https://federicocarrone.com/backlog/ ### Anime - Fullmetal Alchemist: Brotherhood | 2009 | rating 9.1 URL: https://www.imdb.com/title/tt1355642/ Alchemy as metaphor for equivalent exchange. 
Two brothers trying to undo a mistake learn that every system has a price, and the price is never what you expect. - Hunter x Hunter | 2011 | rating 9.0 URL: https://www.imdb.com/title/tt2098220/ Starts as adventure, becomes a meditation on power systems and what happens when the rules of a world are taken to their logical extremes. The Chimera Ant arc is one of the best arcs in any medium. - One Piece | 1999 | rating 9.0 URL: https://www.imdb.com/title/tt0388629/ Freedom, institutional corruption, and the world government as antagonist. Over 1000 episodes. Irrecommendable by length, undeniable by ambition. - Legend of the Galactic Heroes | 1988 | rating 9.0 URL: https://www.imdb.com/title/tt0096633/ The original 110-episode OVA. Democracy versus autocracy across a galactic war, with neither side caricatured. Political philosophy as space opera. - Spirited Away | 2001 | rating 8.9 URL: https://www.imdb.com/title/tt0245429/ Miyazaki's masterpiece. A child navigating an alien economy where identity is literally taken from you if you forget who you are. Capitalism as spirit world. - Frieren: Beyond Journey's End | 2023 | rating 8.9 URL: https://www.imdb.com/title/tt22248376/ An elf who outlives everyone she knows, learning too late what human connection meant. Time, memory, and the difference between measured years and lived ones. - Death Note | 2006 | rating 8.8 URL: https://www.imdb.com/title/tt0877057/ What happens when one person gets absolute power to enforce their moral vision. A cat-and-mouse game that is really about whether justice can exist outside institutions. - Vinland Saga | 2019 | rating 8.8 URL: https://www.imdb.com/title/tt10233448/ Ambition, violence, and the question of whether a person built by war can choose peace. Thematically close to Berserk but with a redemption arc. - Mushishi | 2005 | rating 8.8 URL: https://www.imdb.com/title/tt0807832/ Contemplative, episodic, and patient. 
A wandering specialist studies organisms that exist beyond human understanding. About accepting what cannot be controlled. - Steins;Gate | 2011 | rating 8.8 URL: https://www.imdb.com/title/tt1910272/ Time, causality, and irreversible consequences. The protagonist learns that you cannot engineer outcomes without destroying something. Connects to ergodic thinking. - Jujutsu Kaisen | 2020 | rating 8.8 URL: https://www.imdb.com/title/tt12343534/ Cursed energy, sorcerers, and institutional politics within the jujutsu world. The power system is inventive and the fight choreography is among the best in modern anime. MAPPA at full capacity. - Monster | 2004 | rating 8.7 URL: https://www.imdb.com/title/tt0434706/ A doctor chasing a serial killer across post-reunification Europe. Institutional corruption, moral responsibility, and the question of whether evil is systemic or individual. - Perfect Blue | 1997 | rating 8.6 URL: https://www.imdb.com/title/tt0156887/ Satoshi Kon on identity fracture. A pop singer turned actress loses the boundary between performance and self. Influenced Black Swan and half of modern psychological horror. - Code Geass | 2006 | rating 8.6 URL: https://www.imdb.com/title/tt0994314/ Political strategy, revolution, and institutional manipulation. A brilliant strategist uses an occupied nation's rebellion as a chess game. Power as performance. - Millennium Actress | 2001 | rating 8.6 URL: https://www.imdb.com/title/tt0291350/ Satoshi Kon. Time, memory, and how narrative shapes identity. An actress's life and her film roles blur until the distinction stops mattering. - Made in Abyss | 2017 | rating 8.6 URL: https://www.imdb.com/title/tt7222086/ The deeper you descend, the higher the cost to return. Irreversible consequences made literal. Deceptively cute art hiding genuinely dark themes about the price of knowledge. 
- Demon Slayer | 2019 | rating 8.6 URL: https://www.imdb.com/title/tt9335498/ A boy becomes a demon slayer after his family is massacred and his sister is turned. Gorgeous animation by Ufotable, but the real draw is the emotional sincerity. Shonen conventions played completely straight and better for it. - Planetes | 2003 | rating 8.5 URL: https://www.imdb.com/title/tt0816398/ Hard sci-fi about orbital debris collectors. Work, class, and institutional neglect of infrastructure. The most LambdaClass anime on any list. - Neon Genesis Evangelion | 1995 | rating 8.5 URL: https://www.imdb.com/title/tt0112159/ The most influential anime of the last 30 years. Institutional dysfunction, individual psychology, and the impossibility of piloting a machine designed to save humanity when you cannot save yourself. - Mob Psycho 100 | 2016 | rating 8.5 URL: https://www.imdb.com/title/tt5897304/ A psychic teenager with unlimited power who just wants to be normal. The argument that power without emotional maturity is meaningless, played for both comedy and genuine depth. - Samurai Champloo | 2004 | rating 8.5 URL: https://www.imdb.com/title/tt0423731/ Watanabe's follow-up to Bebop. Edo-period Japan remixed with hip-hop. Style, anachronism, and rootlessness. - Tatami Galaxy | 2010 | rating 8.5 URL: https://www.imdb.com/title/tt1607949/ Yuasa. A college student relives his university years choosing different paths each time, learning that the optimal choice does not exist. Connects to ergodic thinking about sample paths. - Ping Pong the Animation | 2014 | rating 8.5 URL: https://www.imdb.com/title/tt3592052/ Yuasa on talent, effort, and what excellence actually costs. Five players, five philosophies of competition. Visually radical and emotionally precise. - Nana | 2006 | rating 8.5 URL: https://www.imdb.com/title/tt0810548/ Brutal and honest about ambition, relationships, and the gap between who you want to be and who you become. Two women with the same name, opposite temperaments. 
- Link Click | 2021 | rating 8.5 URL: https://www.imdb.com/title/tt14976292/ Chinese anime about two men who can enter photographs and relive the past. Time travel as emotional archaeology. Every decision to change the past creates consequences that ripple forward. - Kaiji: Ultimate Survivor | 2007 | rating 8.4 URL: https://www.imdb.com/title/tt1087727/ Game theory under existential stakes. A man drowning in debt enters underground gambling games where the system is rigged. Decision-making when survival is the only metric. - Shinsekai Yori | 2012 | rating 8.4 URL: https://www.imdb.com/title/tt2419314/ A society built on suppressing dangerous knowledge to maintain stability. What happens when the system that protects you is also what oppresses you. Slow, unsettling, and philosophically serious. - Paranoia Agent | 2004 | rating 8.4 URL: https://www.imdb.com/title/tt0433722/ Satoshi Kon's TV series. Collective delusion, social pressure, and institutional failure in modern Japan. A mysterious attacker that may be a shared hallucination. - Ranking of Kings | 2021 | rating 8.4 URL: https://www.imdb.com/title/tt13409432/ A deaf, physically weak prince in a world that values strength above all else. Subverts every expectation about power and leadership. Deceptively simple art hiding genuine narrative ambition. - Chainsaw Man | 2022 | rating 8.3 URL: https://www.imdb.com/title/tt13616990/ A teenager merges with a devil and works for a government agency that hunts them. Nihilistic, visceral, and genuinely unpredictable. Subverts every shonen convention. - Dandadan | 2024 | rating 8.3 URL: https://www.imdb.com/title/tt30217403/ Aliens, ghosts, and teenagers. Absurdist action with genuine emotional core. The animation quality is extraordinary. - Odd Taxi | 2021 | rating 8.3 URL: https://www.imdb.com/title/tt14134550/ A walrus taxi driver gets pulled into a missing persons case. Anthropomorphic animals, interconnected storylines, and a mystery that rewards attention to detail. 
The most structurally tight anime in years. - Paprika | 2006 | rating 8.2 URL: https://www.imdb.com/title/tt0851578/ Satoshi Kon. A device that lets therapists enter patients' dreams is stolen. Dreams invade reality. Adjacent to Ghost in the Shell from the subconscious side. Influenced Inception directly. - Claymore | 2007 | rating 8.2 URL: https://www.imdb.com/title/tt0988824/ Women engineered to fight monsters, slowly becoming what they hunt. Berserk-adjacent in its bleakness. The manga is the complete story; the anime ends mid-arc. - Summer Time Rendering | 2022 | rating 8.2 URL: https://www.imdb.com/title/tt15686254/ A boy returns to his island hometown for a funeral and discovers something is copying and replacing the residents. Time loops, body horror, and escalating stakes. Tighter than most shows attempting the same premise. - Serial Experiments Lain | 1998 | rating 8.1 URL: https://www.imdb.com/title/tt0500092/ Identity dissolving into networks. A girl discovers she exists more fully online than in reality. Adjacent to the Death of the Inner Self essay. Prophetic about the internet's effect on selfhood. - Psycho-Pass | 2012 | rating 8.1 URL: https://www.imdb.com/title/tt2379308/ A society where an AI system judges criminal intent before crimes happen. What happens when the institution designed to maintain order becomes the source of injustice. - Pluto | 2023 | rating 8.1 URL: https://www.imdb.com/title/tt26737616/ Urasawa reimagines an Astro Boy arc as a detective thriller. Robots, war trauma, and the question of whether artificial beings can grieve. From the creator of Monster. - 91 Days | 2016 | rating 8.1 URL: https://www.imdb.com/title/tt5765640/ Prohibition-era mafia revenge in 13 episodes. Tight, self-contained, and willing to let the consequences of violence be permanent. Fits alongside The Godfather and Boardwalk Empire. 
- Delicious in Dungeon | 2024 | rating 8.1 URL: https://www.imdb.com/title/tt21621494/ An adventuring party eats the monsters in a dungeon to survive. Comedy premise, serious world-building. Studio Trigger treating fantasy ecology as a real system. - Tokyo Godfathers | 2003 | rating 8.1 URL: https://www.imdb.com/title/tt0388473/ Satoshi Kon's most grounded film. Three homeless people find an abandoned baby on Christmas Eve. Dignity under institutional abandonment, held together by chance and stubbornness. - One Outs | 2008 | rating 8.1 URL: https://www.imdb.com/title/tt1324426/ Game theory applied to baseball. A genius gambler enters professional baseball with a contract designed to bankrupt him. Pure strategic thinking in an adversarial system. - Texhnolyze | 2003 | rating 8.0 URL: https://www.imdb.com/title/tt0397238/ Nihilism, institutional collapse, and technology replacing humanity. Set in an underground city where cybernetic limbs are the only economy. The darkest anime on this list. - Afro Samurai | 2007 | rating 7.8 URL: https://www.imdb.com/title/tt0465316/ Hip-hop, samurai, and a revenge quest through a feudal-futuristic Japan. Samuel L. Jackson voices the lead. Style-driven and unapologetically violent. - Trigun | 1998 | rating 7.8 URL: https://www.imdb.com/title/tt0251439/ Bebop-era sibling. A legendary outlaw who refuses to kill, testing pacifism as philosophy in a world that punishes it. Starts comedic, becomes serious. - Tekkonkinkreet | 2006 | rating 7.5 URL: https://www.imdb.com/title/tt0831888/ Two orphans defending their territory in a city being consumed by development. Urban decay, childhood, and violence as the only language the system understands. - Mutafukaz | 2017 | rating 6.7 URL: https://www.imdb.com/title/tt4717402/ French-Japanese co-production set in a Los Angeles analogue. Conspiracy, aliens, and life at the margins. Visually inventive, narratively chaotic. 
### Animation - Scavengers Reign | 2023 | rating 8.6 URL: https://www.imdb.com/title/tt21056886/ Survivors of a crashed spacecraft adapt to an alien planet with its own ruthless ecology. Slow, wordless, and beautiful. Animation as nature documentary for a world that doesn't exist. - Common Side Effects | 2025 | rating 8.5 URL: https://www.imdb.com/title/tt28093628/ From the creator of BoJack Horseman. Pharmaceutical conspiracies and American dysfunction. Adult animation that takes its premise seriously. ### Series - Dark | 2017 | rating 8.7 URL: https://www.imdb.com/title/tt5753856/ Four families in a German town connected across multiple timescales. Time travel as determinism. The most carefully plotted show since The Wire, where every detail in episode one pays off by the finale. - Severance | 2022 | rating 8.6 URL: https://www.imdb.com/title/tt11280740/ Workers have their memories surgically split between office and personal life. What happens to identity when you are literally two people. Corporate dystopia as psychological horror. - Shogun | 2024 | rating 8.6 URL: https://www.imdb.com/title/tt2788316/ Political maneuvering in feudal Japan. An English navigator caught between warring lords, where every conversation is a negotiation and every alliance is temporary. Patience rewarded. - Andor | 2022 | rating 8.6 URL: https://www.imdb.com/title/tt9253284/ Star Wars stripped of mythology and rebuilt as political thriller. How rebellion forms inside systems designed to prevent it. The best thing the franchise has produced since the original trilogy. - The Penguin | 2024 | rating 8.6 URL: https://www.imdb.com/title/tt15435876/ Colin Farrell's Oz Cobb climbs Gotham's criminal hierarchy after the events of The Batman. Crime drama that barely needs the DC label. Power acquisition as character study. - Mr. Robot | 2015 | rating 8.5 URL: https://www.imdb.com/title/tt4158110/ A hacker tries to destroy the financial system.
The most technically accurate depiction of cybersecurity in fiction, wrapped in a story about loneliness, identity, and whether systemic change is possible from inside the system. - The Bear | 2022 | rating 8.5 URL: https://www.imdb.com/title/tt14452776/ A fine dining chef returns to run his family's Chicago sandwich shop. Trauma, perfectionism, and the kitchen as pressure cooker for human dysfunction. The most stressful show on television. - What We Do in the Shadows | 2019 | rating 8.5 URL: https://www.imdb.com/title/tt7908628/ Vampire roommates in Staten Island navigating modern life. Mockumentary comedy that gets funnier as the characters deepen. Six seasons of consistently inventive writing. - Fallout | 2024 | rating 8.3 URL: https://www.imdb.com/title/tt12637874/ Post-nuclear America where corporations survived the apocalypse and rebuilt the same extractive systems. Dark comedy about institutional persistence. Based on the game series but stands alone. - House of the Dragon | 2022 | rating 8.3 URL: https://www.imdb.com/title/tt11198330/ Targaryen civil war, 200 years before Game of Thrones. Succession politics where everyone has dragons. The question is not who has power but what power costs the people who hold it. - Slow Horses | 2022 | rating 8.3 URL: https://www.imdb.com/title/tt5875444/ Disgraced MI5 agents exiled to a dead-end office who keep stumbling into real operations. Gary Oldman leading an ensemble of institutional rejects. British intelligence as bureaucratic comedy of errors. - Ripley | 2024 | rating 8.1 URL: https://www.imdb.com/title/tt11016042/ Andrew Scott as Tom Ripley in black and white Italy. Patricia Highsmith's sociopath rendered with visual precision. Identity theft as art form. - Squid Game | 2021 | rating 8.0 URL: https://www.imdb.com/title/tt10919420/ Desperate people compete in children's games for money while the wealthy watch. Class violence made literal. 
The premise is the critique: the system already treats people this way, just less visibly. - Silo | 2023 | rating 7.9 URL: https://www.imdb.com/title/tt14688458/ Ten thousand people live in an underground silo with strict rules about what can be discussed. Institutional secrecy, forbidden knowledge, and the cost of asking questions the system does not want answered. - A Knight of the Seven Kingdoms | 2025 URL: https://www.imdb.com/title/tt23776532/ Prequel to Game of Thrones set a century earlier. Hedge knights, Targaryen politics, and the Westeros political landscape before the events of the main series. ### Movies - 12 Angry Men | 1957 | rating 9.0 URL: https://www.imdb.com/title/tt0050083/ One room, twelve men, one decision. The best film about persuasion, systems, and how institutions actually function at the micro level. - Se7en | 1995 | rating 8.6 URL: https://www.imdb.com/title/tt0114369/ Fincher's masterpiece. Obsession, structure, and a world that punishes the people who try to impose meaning on it. - Parasite | 2019 | rating 8.5 URL: https://www.imdb.com/title/tt6751668/ Class conflict, institutional failure, and what happens when people with nothing to lose infiltrate the world of those who have everything. Bong Joon-ho's tightest film. - The Lives of Others | 2006 | rating 8.4 URL: https://www.imdb.com/title/tt0405094/ A Stasi officer surveilling an artist in East Berlin slowly begins to question the system he serves. Power, surveillance, and the cost of seeing clearly inside a corrupt institution. - A Clockwork Orange | 1971 | rating 8.3 URL: https://www.imdb.com/title/tt0066921/ Kubrick on institutional violence, free will, and whether a society that removes the capacity for evil also removes the capacity for good. - Heat | 1995 | rating 8.3 URL: https://www.imdb.com/title/tt0113277/ The definitive cops-and-robbers film. Professionalism as moral framework. 
De Niro and Pacino across a table, two men who understand each other better than anyone else in their lives. - Oppenheimer | 2023 | rating 8.2 URL: https://www.imdb.com/title/tt15398776/ Nolan on the man who built the bomb and the institutions that consumed him afterward. Power, moral responsibility, and what happens when the thing you created is taken from you. - No Country for Old Men | 2007 | rating 8.2 URL: https://www.imdb.com/title/tt0477348/ The Coen Brothers on fate, violence, and a world that no longer follows rules anyone understands. Chigurh is the logical endpoint of the argument they have been making across their filmography. - El Secreto de sus Ojos | 2009 | rating 8.2 URL: https://www.imdb.com/title/tt1305806/ The best Argentine thriller. Memory, justice, and obsession across decades. If you liked Nueve Reinas for its Buenos Aires DNA, this is the next one. - Network | 1976 | rating 8.1 URL: https://www.imdb.com/title/tt0074958/ A television anchor loses his mind on air and the network turns it into ratings. Written in 1976, more accurate about media incentives now than it was then. - Blade Runner | 1982 | rating 8.1 URL: https://www.imdb.com/title/tt0083658/ The film Ghost in the Shell was responding to. Identity, consciousness, and what it means to be human in a world where the line between artificial and real has dissolved. - Stalker | 1979 | rating 8.1 URL: https://www.imdb.com/title/tt0079944/ Three men walk into a zone where the rules of reality bend. Tarkovsky on desire, faith, and what people actually want when the constraints are removed. - Prisoners | 2013 | rating 8.1 URL: https://www.imdb.com/title/tt1392214/ Villeneuve's darkest film. A father's moral collapse when institutions fail to find his daughter. How far you go when the system cannot help you. - Relatos Salvajes | 2014 | rating 8.1 URL: https://www.imdb.com/title/tt3011894/ Six stories about frustration, revenge, and what happens when people stop complying with social norms. 
Dark comedy at its most Argentine. - Blade Runner 2049 | 2017 | rating 8.0 URL: https://www.imdb.com/title/tt1856101/ Villeneuve's continuation asks whether a manufactured memory can ground a real identity. Arguably his best work. The pacing and visual language match Dune's ambition. - Arrival | 2016 | rating 7.9 URL: https://www.imdb.com/title/tt2543164/ Linguistic determinism, time, and the question of whether you would choose suffering if you knew it was coming. The most intellectually serious sci-fi film of the last decade. - A Prophet | 2009 | rating 7.9 URL: https://www.imdb.com/title/tt1235166/ French prison film about power acquisition from nothing. Institutional dynamics inside a closed system. Closer to City of God than to anything else. - Children of Men | 2006 | rating 7.9 URL: https://www.imdb.com/title/tt0206634/ Institutional collapse in a near-future where humanity has stopped reproducing. The long tracking shots are technically extraordinary, but the real subject is what holds civilization together when hope disappears. - The Social Network | 2010 | rating 7.8 URL: https://www.imdb.com/title/tt1285016/ Fincher and Sorkin on the founding of Facebook. Ambition, betrayal, and the loneliness of building something that connects everyone except yourself. - All Quiet on the Western Front | 2022 | rating 7.8 URL: https://www.imdb.com/title/tt1016150/ German adaptation of Remarque's novel. The machinery of war consuming the young men fed into it. Visceral and unsparing, with no interest in heroism. - Ex Machina | 2014 | rating 7.7 URL: https://www.imdb.com/title/tt0470752/ A programmer evaluates whether an AI is conscious. The real test is not what the AI knows but what the human refuses to see. The cleanest Turing test film. - Everything Everywhere All at Once | 2022 | rating 7.7 URL: https://www.imdb.com/title/tt6710474/ A laundromat owner discovers she can access alternate versions of herself across the multiverse. 
Absurdist action comedy that becomes genuinely moving. The argument that paying attention to the small things is the only meaningful response to nihilism. - Aftersun | 2022 | rating 7.6 URL: https://www.imdb.com/title/tt19770238/ A daughter rewatches home videos of a vacation with her father, trying to understand what she could not see as a child. Memory, depression, and the distance between who someone appears to be and who they are. - The Banshees of Inisherin | 2022 | rating 7.6 URL: https://www.imdb.com/title/tt11813216/ A man on a small Irish island is told by his lifelong friend that the friendship is over. Stubbornness, meaning, and what happens when someone decides they want more from life than pleasantness. - Arco | 2024 | rating 7.5 URL: https://www.imdb.com/title/tt14883538/ Argentine film about the 2001 economic crisis seen through one family's collapse. If you grew up there, you lived it. If you didn't, this is the closest you'll get. - Tár | 2022 | rating 7.4 URL: https://www.imdb.com/title/tt14444726/ Cate Blanchett as a world-renowned conductor whose institutional power begins to unravel. The anatomy of how authority is constructed and how it collapses. The most precise film about cancel culture that never uses the word. - The Zone of Interest | 2023 | rating 7.3 URL: https://www.imdb.com/title/tt7160372/ The commandant of Auschwitz and his family living their comfortable domestic life next to the camp. Evil as banality, filmed with clinical detachment. What you do not see is the point. ### Games - Baldur's Gate 3 | 2023 | rating 97 URL: https://store.steampowered.com/app/1086940/Baldurs_Gate_3/ The most complete RPG in years. Consequences that actually matter, characters that remember what you did, and a level of systemic depth that rewards creative problem-solving. Peak of the genre. - Elden Ring | 2022 | rating 96 URL: https://store.steampowered.com/app/1245620/ELDEN_RING/ FromSoftware's open world. 
Exploration, difficulty, and a world that refuses to explain itself. The most complete world-building achievement in the medium, designed by Miyazaki with lore by George R.R. Martin. - The Witcher 3: Wild Hunt | 2015 | rating 92 URL: https://store.steampowered.com/app/292030/The_Witcher_3_Wild_Hunt/ Moral ambiguity as game design. Every choice has consequences, none are clean, and the world does not wait for you to decide. The closest games have come to prestige television. - Crusader Kings III | 2020 | rating 91 URL: https://store.steampowered.com/app/1158310/Crusader_Kings_III/ Political simulation across centuries. Dynasties rise, overextend, and collapse through marriage, murder, and mismanagement. Succession the show as emergent gameplay. - Factorio | 2020 | rating 90 URL: https://store.steampowered.com/app/427520/Factorio/ Systems building in its purest form. Design, optimize, scale, and watch complexity emerge from simple rules. The game equivalent of infrastructure engineering. - Disco Elysium | 2019 | rating 91 URL: https://store.steampowered.com/app/632470/Disco_Elysium__The_Final_Cut/ Essentially an interactive novel. A detective with amnesia investigates a murder in a politically fractured city. Deeply political, philosophically dense, nothing else like it in any medium. - Pentiment | 2022 | rating 86 URL: https://store.steampowered.com/app/1205520/Pentiment/ A small, slow historical mystery set in a Bavarian abbey. Ideas, legacy, and the tension between institutional authority and individual conscience. Obsidian at their most literary. ### Books - Nonlinear Dynamics and Chaos | 1994 | rating 4.38 URL: https://www.goodreads.com/book/show/116164.Nonlinear_Dynamics_and_Chaos Strogatz's textbook on dynamical systems. The mathematical foundations of how complex behavior emerges from simple rules. Prerequisite for thinking seriously about complexity. 
- The Maniac | 2023 | rating 4.36 URL: https://www.penguinrandomhouse.com/books/725022/the-maniac-by-benjamin-labatut/ Labatut on von Neumann, the hydrogen bomb, and the birth of modern computation. Fictionalized history where genius and madness are indistinguishable. The intellectual ancestry of everything we are building now.
- Gödel, Escher, Bach | 1979 | rating 4.29 URL: https://www.hachettebookgroup.com/titles/douglas-r-hofstadter/godel-escher-bach/9780465026562/ Hofstadter on self-reference, formal systems, and consciousness. How meaning emerges from meaningless symbols. The book that launched a generation of interdisciplinary thinking.
- Chip War | 2022 | rating 4.25 URL: https://www.simonandschuster.com/books/Chip-War/Chris-Miller/9781982172008 How semiconductors became the most contested technology on earth. The geopolitical history of chips, from Texas Instruments to TSMC, and why the US-China competition over fabrication capacity is the defining industrial conflict of this era.
- Meditations | 180 | rating 4.28 URL: https://www.penguinrandomhouse.com/books/292839/meditations-by-marcus-aurelius/ Marcus Aurelius writing to himself about duty, impermanence, and self-governance. A Roman emperor's private journal. The companion piece to Epictetus.
- The Machiavellians | 1943 | rating 4.27 URL: https://books.apple.com/us/book/the-machiavellians/id6446856171 Burnham on the elite theorists (Mosca, Pareto, Michels) who argued that all political systems are oligarchies regardless of ideology. Power analysis without illusions.
- Determined | 2023 | rating 4.23 URL: https://www.penguinrandomhouse.com/books/592344/determined-by-robert-m-sapolsky/ Sapolsky's full case against free will, drawing on neuroscience, genetics, and evolutionary biology. Rigorous and readable. If he is right, every system of punishment and reward needs rethinking.
- The WEIRDest People in the World | 2020 | rating 4.21 URL: https://www.goodreads.com/book/show/51710349-the-weirdest-people-in-the-world Henrich on how Western, Educated, Industrialized, Rich, Democratic psychology became the global default. The Catholic Church's marriage policies as the origin of individualism. Changes how you see institutions.
- Seeing Like a State | 1998 | rating 4.21 URL: https://yalebooks.yale.edu/book/9780300078152/seeing-like-a-state/ Scott on how states simplify complex realities to make them legible and controllable, and the catastrophes that follow. Essential for anyone building systems that govern human behavior.
- The Greeks and the Irrational | 1951 | rating 4.20 URL: https://www.ucpress.edu/books/the-greeks-and-the-irrational/paper Dodds on the role of irrationality, madness, and divine possession in Greek culture. The counterargument to the myth of Greece as pure rationalism.
- The Dawn of Everything | 2021 | rating 4.20 URL: https://www.goodreads.com/book/show/56269264-the-dawn-of-everything Graeber and Wengrow rewrite the standard narrative of human prehistory. Societies were not stuck on a ladder from bands to states. Political organization was always a choice, and people knew it.
- The Sovereign Individual | 1997 | rating 4.19 URL: https://www.simonandschuster.com/books/The-Sovereign-Individual/James-Dale-Davidson/9780684832722 Davidson and Rees-Mogg predicted in 1997 that digital technology would erode the nation-state's monopoly on violence and taxation. Written before Bitcoin, more relevant after it.
- The Revolt of the Public | 2014 | rating 4.19 URL: https://www.stripepress.com/books/the-revolt-of-the-public Gurri on how the information revolution destroyed the authority of institutions without replacing them. The best framework for understanding the last decade of politics.
- The Beginning of Infinity | 2011 | rating 4.17 URL: https://www.penguinrandomhouse.com/books/210236/the-beginning-of-infinity-by-david-deutsch/ Deutsch on knowledge creation, the nature of explanation, and why progress has no limit. The most optimistic serious book about the future of civilization.
- Four Thousand Weeks | 2021 | rating 4.17 URL: https://www.goodreads.com/book/show/54785515-four-thousand-weeks Burkeman on time, finitude, and the impossibility of optimization. The argument that productivity culture is a defense mechanism against mortality. Philosophy of time that actually lands.
- Debt: The First 5,000 Years | 2011 | rating 4.16 URL: https://www.goodreads.com/book/show/6617037-debt Graeber on how debt preceded money, not the other way around. Moral obligation, violence, and the institutional machinery that turns human relationships into accounting. Pairs with The Dawn of Everything.
- History of Western Philosophy | 1945 | rating 4.13 URL: https://www.simonandschuster.com/books/History-of-Western-Philosophy/Bertrand-Russell/9780671201586 Russell's opinionated survey from the pre-Socratics to logical positivism. Better as intellectual history than neutral philosophy. The prose alone is worth reading.
- A Source Book in Chinese Philosophy | 1963 | rating 4.12 URL: https://books.apple.com/us/book/a-source-book-in-chinese-philosophy/id395891594 Wing-Tsit Chan's anthology of primary texts from Confucius through neo-Confucianism. The standard entry point for engaging with Chinese philosophical traditions directly.
- Complexity: A Guided Tour | 2009 | rating 4.11 URL: https://academic.oup.com/book/35598 Melanie Mitchell's accessible introduction to complexity science. Emergence, self-organization, and computation in biological and social systems. From the Santa Fe Institute tradition.
- When We Cease to Understand the World | 2020 | rating 4.10 URL: https://www.goodreads.com/book/show/62069739-when-we-cease-to-understand-the-world Labatut's earlier work. Fictionalized accounts of Schwarzschild, Heisenberg, Grothendieck, and others at the edge of knowledge. Science as encounter with the incomprehensible. Bridges genius and madness beautifully.
- The Man Who Solved the Market | 2019 | rating 4.05 URL: https://www.goodreads.com/book/show/43889703-the-man-who-solved-the-market The story of Jim Simons and Renaissance Technologies. How a mathematician built the most successful hedge fund in history using signal processing, not financial theory.
- Sync | 2003 | rating 4.07 URL: https://www.goodreads.com/book/show/354421.Sync Strogatz on spontaneous synchronization in nature. Fireflies, neurons, bridges, and planets all following the same mathematical patterns. How order emerges without a conductor.
- Amp It Up | 2022 | rating 4.01 URL: https://ampitupbook.com/ Frank Slootman on operational intensity. The CEO of Snowflake, Data Domain, and ServiceNow on raising the bar, cutting complexity, and refusing to accept mediocre performance.
- Against the Gods | 1996 | rating 3.95 URL: https://www.goodreads.com/book/show/128429.Against_the_Gods Bernstein's history of risk. From ancient gambling to modern financial theory. How humanity learned to quantify uncertainty and what that changed about institutions and decision-making.
- From Bacteria to Bach and Back | 2017 | rating 3.79 URL: https://books.apple.com/us/book/from-bacteria-to-bach-and-back/id1090862474 Dennett on how minds, meaning, and culture evolved from mindless processes. Competence without comprehension as the engine of both biology and technology.
- The Technological Republic | 2025 | rating 3.58 URL: https://www.penguinrandomhouse.com/books/721900/the-technological-republic-by-alex-karp/ Alex Karp (Palantir CEO) on the relationship between technology companies and democratic governance. The argument that defense tech is inseparable from political freedom.
- The Age of AI | 2021 | rating 3.43 URL: https://ageofaibook.com/ Kissinger, Schmidt, and Huttenlocher on AI's implications for society, security, and the global order. The geopolitical perspective on artificial intelligence from establishment thinkers.

---

## Sources

> Podcasts, publications, blogs, and references I follow across different fields.

URL: https://federicocarrone.com/sources/

### Podcasts

- Cognitive Revolution URL: https://www.cognitiverevolution.ai/
- Philosophize This! URL: https://www.philosophizethis.org/
- The Jolly Swagman URL: https://podcasts.apple.com/au/podcast/the-jolly-swagman-podcast/id1267280945
- Sean Carroll's Mindscape URL: https://www.preposterousuniverse.com/podcast/
- History of Philosophy Without Any Gaps URL: https://historyofphilosophy.net/
- Entitled Opinions URL: https://entitledopinions.stanford.edu/
- Invest Like the Best URL: https://www.joincolossus.com/podcast
- Lunch with the FT URL: https://www.ft.com/lunch-with-the-ft
- The Jim Rutt Show URL: https://jimruttshow.com/
- Manifold (Stripe Press) URL: https://press.stripe.com/manifold
- Acquired URL: https://www.acquired.fm/
- Dwarkesh Podcast URL: https://www.dwarkeshpatel.com/podcast
- Conversations with Tyler URL: https://conversationswithtyler.com/
- EconTalk URL: https://www.econtalk.org/
- Odd Lots URL: https://www.bloomberg.com/oddlots-podcast
- In Our Time URL: https://www.bbc.co.uk/programmes/b006qykl

### Publications

- The Economist URL: https://www.economist.com/
- Financial Times URL: https://www.ft.com/
- FT Alphaville URL: https://www.ft.com/alphaville
- Palladium Magazine URL: https://www.palladiummag.com/
- Le Grand Continent URL: https://legrandcontinent.eu/
- Works in Progress URL: https://worksinprogress.co/
- Construction Physics URL: https://constructionphysics.substack.com/
- Epsilon Theory URL: https://www.epsilontheory.com/
- Compact URL: https://compactmag.com/
- Phenomenal World URL: https://www.phenomenalworld.org/
- Asterisk Magazine URL: https://asteriskmag.com/
- Doomberg URL: https://doomberg.substack.com/
- Howard Marks Memos (Oaktree) URL: https://www.oaktreecapital.com/insights
- Bismarck Analysis URL: https://www.bismarckanalysis.com/
- Money Stuff (Matt Levine) URL: https://www.bloomberg.com/account/newsletters/money-stuff
- The Diff (Byrne Hobart) URL: https://thediff.co/

### Technology & Systems

- Stratechery URL: https://stratechery.com/
- Interconnects URL: https://www.interconnects.ai/
- The Gradient URL: https://thegradient.pub/
- Pirate Wires URL: https://www.piratewires.com/

### Philosophy & Culture

- The Point Magazine URL: https://www.thepointmag.com/
- Liberties Journal URL: https://libertiesjournal.com/
- The Hedgehog Review URL: https://hedgehogreview.com/
- n+1 URL: https://www.nplusonemag.com/
- The Philosopher's Magazine URL: https://www.philosophersmag.com/

### Complexity & Science

- Santa Fe Institute Complexity Podcast URL: https://www.santafe.edu/engage/learn/podcasts
- Quanta Magazine URL: https://www.quantamagazine.org/
- Melanie Mitchell (SFI) URL: https://www.santafe.edu/people/profile/melanie-mitchell

### Ergodicity Economics

- Ergodicity Economics (Ole Peters) URL: https://ergodicityeconomics.com/
- London Mathematical Laboratory Newsletter URL: https://www.lml.org.uk/newsletter

### Historical Monetary Economics

- Barry Eichengreen URL: https://econ.berkeley.edu/people/faculty/barry-eichengreen
- This Time Is Different (Reinhart & Rogoff) URL: https://books.apple.com/us/book/this-time-is-different/id493550544
- Chartbook (Adam Tooze) URL: https://adamtooze.substack.com/

### Complexity Science Applied to Markets

- Doyne Farmer URL: https://www.santafe.edu/people/profile/doyne-farmer
- W. Brian Arthur URL: https://www.santafe.edu/people/profile/w-brian-arthur
- The Origin of Wealth (Eric Beinhocker) URL: https://www.barnesandnoble.com/w/the-origin-of-wealth-eric-d-beinhocker/1100241782

### Formal Methods & Programming Language Theory

- SIGPLAN Blog URL: https://sigplan.org/blogs/
- Hillel Wayne's Newsletter URL: https://buttondown.email/hillelwayne
- Robert Harper's Blog URL: https://existentialtype.wordpress.com/
- TYPES Mailing List URL: https://lists.seas.upenn.edu/mailman/listinfo/types-list
- Papers We Love URL: https://paperswelove.org/

### Cryptography & Distributed Systems

- Real World Cryptography URL: https://www.realworldcryptography.com/
- IACR ePrint URL: https://eprint.iacr.org/
- Decentralized Thoughts URL: https://decentralizedthoughts.github.io/
- ZKProof Community Blog URL: https://zkproof.org/blog/

### Geopolitics & Political Economy

- Foreign Affairs URL: https://www.foreignaffairs.com/
- Branko Milanovic (Global Inequality) URL: https://glineq.blogspot.com/
- Noahpinion URL: https://www.noahpinion.blog/
- Limes URL: https://www.limesonline.com/

### Wine & Terroir

- Jancis Robinson URL: https://www.jancisrobinson.com/
- Wine Scholar Guild URL: https://winescholarguild.com/
- Jon Bonne URL: https://jonbonne.substack.com/
- Decanter URL: https://www.decanter.com/

### Military History & Strategy

- War on the Rocks URL: https://warontherocks.com/
- The Strategy Bridge URL: https://thestrategybridge.org/
- Clausewitz, On War URL: https://academic.oup.com/japan/book/15906
- Supplying War (Martin van Creveld) URL: https://www.cambridge.org/core/books/supplying-war/DBED529A53D50FF141AD049D5B6BA060
- The Transformation of War (Martin van Creveld) URL: https://www.simonandschuster.com/books/The-Transformation-of-War/Martin-van-Creveld/9780684830426

### Art History

- The Burlington Magazine URL: https://www.burlington.org.uk/
- Farewell to an Idea (T.J. Clark) URL: https://yalebooks.yale.edu/book/9780300077422/farewell-to-an-idea/
- Ways of Seeing (John Berger) URL: https://www.penguinrandomhouse.com/books/168996/ways-of-seeing-by-john-berger/

### Long-form Journalism

- Harper's Magazine URL: https://harpers.org/
- London Review of Books URL: https://www.lrb.co.uk/
- The New York Review of Books URL: https://www.nybooks.com/
- Delayed Gratification URL: https://www.slow-journalism.com/

---