The AI Training Data Trap for Programming Languages Has an Exit

Rain, Steam and Speed, J. M. W. Turner, 1844

Edgar Luque recently wrote about how AI creates a new adoption barrier for programming languages. His claim is that AI coding assistants need training data, training data only exists for popular languages, and so new languages get bad AI support, which prevents adoption, which in turn prevents training data from accumulating: a self-reinforcing loop that locks in whatever is already dominant.

If you are building a new general-purpose language that competes with Python, Go, or Rust on roughly the same terms, Luque’s analysis is devastating. What makes it worse than previous adoption barriers is that you cannot community-effort your way out of it. The AI training pipelines belong to a handful of companies, and those companies will always prioritize the languages where the most data already exists.

But there is a blind spot in the argument.

#Where I Think Luque Is Wrong

Luque assumes that languages are passive consumers of AI support. But what if a language is designed so that machines can reason about it better than they reason about established languages, because it is simple and explicit enough that an LLM can work largely from the specification?

Established languages carry enormous amounts of tacit knowledge: idioms, conventions, workarounds, and unwritten rules that live only in the collective practice of millions of developers. Think about Python’s for/else and the fact that most experienced developers avoid it entirely, or Go’s error handling conventions that are nowhere in the language spec, or the subtle difference in Rust between when you should use unwrap and when you should propagate with ?. An LLM needs a massive corpus precisely because it needs to soak up all of this unwritten knowledge through sheer exposure. As I wrote in Legibility Kills What It Measures, Michael Polanyi called this the tacit dimension: “we know more than we can tell.” The bigger a language’s tacit dimension, the more training data you need before an AI can navigate it.
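The Rust example is easy to make concrete. The rule of thumb — `unwrap` only where failure is locally impossible, `?` everywhere the caller should decide — appears nowhere in the language specification; it is pure community convention. A minimal sketch (the `parse_port` function is my own illustration, not from any particular codebase):

```rust
use std::num::ParseIntError;

// Convention, not spec: `unwrap` asserts "this cannot fail here";
// `?` hands the error to the caller. Nothing in the language forces
// either choice -- the rule lives in community practice.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let n: u16 = s.trim().parse()?; // propagate: the input is untrusted
    Ok(n)
}

fn main() {
    // `unwrap` is considered idiomatic when failure is locally impossible:
    let default: u16 = "8080".parse().unwrap(); // a literal, cannot fail
    assert_eq!(default, 8080);

    assert_eq!(parse_port(" 443 "), Ok(443));
    assert!(parse_port("not-a-port").is_err());
}
```

Both programs type-check either way; nothing but accumulated practice tells a model which form reviewers will accept.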

A language that strips out enough tacit knowledge changes the loop. When more of what matters is already in the grammar, the types, and the capability annotations, the specification carries much more of the burden. That is the bet behind Concrete.

#What Concrete Does Differently

Every design choice points in the same direction: minimize ambiguity, maximize what a tool can figure out just by reading the code.

LL(1) grammar. The entire language parses with one token of lookahead. No ambiguous constructs, no context-dependent parsing rules. The syntactic surface is genuinely small.

Explicit control flow. What you read is what executes. No implicit destructors at scope exit, no exception unwinding through invisible paths, no operator overloading quietly changing what + does.

Explicit capabilities. If a function reads a file, allocates memory, or touches the network, the signature says so: with(File), with(Network), with(Alloc). You do not need to trace the call graph to know what a function might do.

Linear types. Owned values must be consumed exactly once. The compiler rejects code that forgets to clean up a resource or uses one after it was moved. This is the kind of bug that LLMs are particularly bad at catching, because it requires tracking state across an entire function body.

One way to do things. No closures and lambdas competing for the same job, no exceptions and result types overlapping, no five different iteration styles. Result<T, E> with ? propagation, and nothing else. Less surface area means fewer opportunities to pick the wrong approach.
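To give a feel for the capability idea without inventing Concrete syntax, here is a rough Rust approximation: effects modeled as explicit capability tokens threaded through signatures. The token types and function names are made up for illustration; real Concrete `with(...)` annotations are a language feature, not a library pattern.

```rust
// A hedged sketch, not Concrete syntax: approximating `with(Network)`
// style capabilities in Rust by passing explicit zero-sized tokens.
// Everything here is invented for illustration.
struct NetworkCap; // possession = permission to touch the network
struct AllocCap;   // possession = permission to allocate

// The signature alone says this function may allocate but never
// touches the network: callers see the effect set without the body.
fn build_greeting(_alloc: &AllocCap, name: &str) -> String {
    format!("hello, {name}")
}

// This one declares both effects up front.
fn send_greeting(_net: &NetworkCap, alloc: &AllocCap, name: &str) -> String {
    let msg = build_greeting(alloc, name);
    // (a real version would write `msg` to a socket here)
    msg
}

fn main() {
    // Capabilities are minted once at the program root and passed down.
    let net = NetworkCap;
    let alloc = AllocCap;
    assert_eq!(send_greeting(&net, &alloc, "world"), "hello, world");
}
```

Even in this hand-rolled form, the payoff is visible: a reader, human or machine, learns a function's possible effects from its signature alone.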

Rust is the closest existing language to this list, which is exactly why it is the right comparison. But Rust still carries a large tacit layer around the core language: implicit destructors via Drop that run at scope exit, no capability system, operator overloading through traits, lifetime elision rules, Deref coercions, macro-heavy APIs, and ecosystem conventions around async, error handling, and trait patterns. Rust reduced tacit knowledge compared to C++, but a substantial amount still lives in practice rather than in the spec. Concrete pushes further in the same direction.
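The implicit-destructor point is worth seeing directly. In the Rust sketch below, nothing at the end of the inner scope mentions cleanup, yet both destructors run, in reverse declaration order; that rule lives in the reader's head, not in the source (the `Loud` type is my own example):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A type that records when its destructor runs.
struct Loud {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Loud {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Loud { name: "first", log: log.clone() };
        let _second = Loud { name: "second", log: log.clone() };
        // Scope ends here: invisible control flow runs both drops,
        // in reverse declaration order. The source never says so.
    }
    let result = log.borrow().clone();
    result
}

fn main() {
    assert_eq!(drop_order(), vec!["second", "first"]);
}
```

This is exactly the kind of behavior an explicit-control-flow language would surface in the code itself rather than leave to a rule the reader must already know.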

None of these features were designed for AI. I designed them because I think they make a better language for humans. They make behavior easier to see, APIs easier to review, and bugs easier to catch before runtime. Those same properties also make the language easier for a machine to generate correctly.

#Why This Changes the AI Problem

When an LLM generates Python, it leans heavily on patterns absorbed from millions of files. It has to, because no specification captures how experienced Python developers actually write Python. A language whose grammar fits in a few pages and whose type and capability system encodes most of the rules changes the economics. You can paste the entire language spec into a context window. The model does not need to have seen a million Concrete programs to know what is legal; it can read the rules and apply them.

That matters more now than it would have three years ago. Context windows are long enough to hold a full language spec alongside the code being generated. Tool use lets the model call the compiler mid-generation and read the errors back. Iterative repair workflows are standard. All of these trends help every language, but they help a small, explicit language disproportionately, because the spec fits in context and the compiler errors are precise enough to actually drive the fix loop.

The bottleneck shifts from “how much code exists in this language” to “how much of the language can be recovered from the spec and the compiler.” New languages will lose if the contest is raw corpus size. They can still compete if the contest is whether a model can generate valid code from explicit rules.

#Errors Have To Be Legible

In Python or JavaScript, many wrong programs still make it past generation and into execution. The LLM generates a function that forgets to close a file handle, or swallows an exception, or mutates shared state in a way that only breaks under concurrency. The bug surfaces later, often outside the moment when the code was written.

In Concrete, the compiler catches it. Forgot to consume a linear value? Compile error. Called a function that does I/O without the right capability? Compile error. Used a value after it was moved? Compile error. The error messages tell the LLM exactly what to fix.

The useful question is not just whether the LLM gets it right on the first try. It is whether the generate-check-fix loop converges quickly. Precise compiler errors help. Silent runtime failures do not.
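Rust's move semantics give a close analogue of the linear-types case, with one caveat: Rust's ownership is affine (a value may also be silently dropped), so this is an approximation of, not a match for, "consumed exactly once." The `Connection` type is invented for illustration:

```rust
// Approximating "consumed exactly once" with Rust moves.
struct Connection {
    id: u32,
}

// Taking `self`-like ownership by value: calling this consumes the
// connection, so it cannot be used again afterwards.
fn close(conn: Connection) -> u32 {
    conn.id // the value is gone once this returns
}

fn main() {
    let conn = Connection { id: 7 };
    let id = close(conn);
    assert_eq!(id, 7);

    // A second use is rejected before the program ever runs:
    //
    //     close(conn);
    //     // error[E0382]: use of moved value: `conn`
    //
    // The error names the value and the move site -- the kind of
    // precise signal a generate-check-fix loop can act on.
}
```

The commented-out line is the interesting one: the mistake never reaches execution, and the diagnostic points at exactly what to change.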

Concrete goes further. It is written in Lean 4 and its core calculus is formalized and proven sound. As the volume of machine-generated code grows, tests and human review will fall further behind. If you want strong correctness guarantees at that scale, formal verification is the endgame.

#What Follows From This

Luque suggests that new languages can survive by retreating to niches where AI matters less. Concrete’s position is the opposite: target a world where AI matters more, where most code is machine-generated, and build the language so it works with that reality instead of hiding from it.

The AI adoption barrier is real. For most new languages, it makes adoption harder in a way they cannot easily fix. My view is that languages designed for machine generation and machine verification can compete on a different axis entirely. That is the case I am making for Concrete.