Number Theory and the Arithmetic of Recurrence

Figure: Saturn and its rings, imaged by the Cassini orbiter in 2008. The dark Cassini Division, the gap between the bright inner ring and the outer one, is a band swept nearly empty by orbital resonance with the moon Mimas. NASA / JPL / Space Science Institute, public domain.

The first three essays were about one engine: iteration. A rule is applied again and again, and invariant objects appear. Fixed points. Attractors. Invariant sets. Invariant measures. Scaling laws. None of those systems knew anything about themselves. The logistic map does not contain its own formula. It is a rule being turned, and we watched what the turning left behind.

This essay is still about that engine, but it follows it to a place that looks, at first, like a different subject entirely: number theory.

The reason number theory has to enter is simple once you say it out loud. The moment a system repeats in time, integers appear. One cycle. Two cycles. Three. “Come back after $q$ steps.” “Line up after $p$ turns of one oscillator and $q$ turns of another.” Repetition is counting, and counting is arithmetic. So as soon as we take the single trajectory seriously, asking whether it returns, how nearly it returns, and how often, we are doing number theory whether we meant to or not.

That is the whole content of this essay. Whether a motion is stable can come down to whether a single ratio is rational, and if it is irrational, how irrational it is. By the end, a frequency being “hard to approximate by fractions” will be the difference between a structure that survives a disturbance and one that disintegrates.

There is also a second, deeper role for arithmetic waiting at the far end. The same integers that count returns can also be used to encode things: formulas, programs, proofs. When that happens, a system can begin to act on descriptions of itself, and the series changes engines. But that is the next essay. Here, arithmetic is still the clock and the ruler of dynamics, not yet a language a system speaks about itself.

#The Minimum Vocabulary

A frequency is a rate of rotation or oscillation: how many full turns or cycles happen per unit of time. A clock hand has a frequency. A planet orbiting the sun has one. A pendulum has one.

Two frequencies are commensurable if their ratio is a rational number, like $2/3$ or $7/4$. They are incommensurable if their ratio is irrational, like $\sqrt{2}$ or the golden ratio. This distinction will turn out to decide whether two repeating motions ever line up again exactly.

A resonance occurs when integer combinations of frequencies nearly cancel:

$$k\cdot\omega\approx 0.$$

Do not let the notation hide the picture. $\omega$ is a list of frequencies, $k$ is a list of whole numbers, and the dot product asks: is there a small whole-number recipe that makes these frequencies almost cancel out? When the answer is yes, a small push delivered at the right rhythm can build up instead of averaging away. A child on a swing learns this physically: push at the right frequency and small pushes accumulate into a large arc.

For a first intuition from linear algebra, think of solving a system of equations and finding a denominator that is nearly zero. The answer blows up. Near-resonances are exactly the places where the denominators of perturbation theory become nearly zero, and so the corrections blow up. We will make that precise later.

Diophantine approximation is the part of number theory that asks how well an irrational number can be approximated by fractions $p/q$. This is the technical heart of the essay. Near-rational frequencies create near-returns, near-returns create near-resonances, and near-resonances decide stability. So “how close is this number to a fraction?” is not idle arithmetic. It is the control parameter.

#Two Engines, And Why This One Reaches Arithmetic

Before going further, a word on where this essay sits, because the series is about to change character.

The claim of the series has never been that chaos, fractals, power laws, Gödel sentences, and recursive types are secretly the same object. They are not. They live in different mathematical worlds. The claim is narrower and more defensible: two related mechanisms keep forcing invariant objects to appear.

The first mechanism is iteration. Take a rule,

$$x\mapsto f(x),$$

and apply it over and over. That alone is enough to produce everything in the first three essays: fixed points, attractors, cycles, bifurcations, chaos, fractals, invariant measures, and power-law scaling. The system never represents itself. It is simply turned.

The second mechanism is self-reference: a system becomes rich enough to represent its own rules, and then act on those representations. That is the world of Gödel numbering, Turing machines, quines, and recursive types. It is the subject of the next essay, not this one.

This essay is the bridge between them, and number theory is the bridge’s keystone. Here is why. Number theory plays two completely different roles depending on which engine you are running.

| Role of arithmetic | What arithmetic does | Engine |
| --- | --- | --- |
| Timekeeper | counts returns, periods, near-returns | iteration |
| Stability filter | controls small denominators in perturbation theory | iteration |
| Encoding substrate | turns formulas and proofs into numbers | self-reference |
| Self-reference medium | lets statements talk about their own codes | self-reference |

The top two rows are this essay. Integers count how a repeated motion comes back, and how irrational a frequency is decides whether a structure survives being disturbed. The bottom two rows are the next essay, where the same integers stop being a clock and become an alphabet. Keeping these roles apart is the single most useful habit for reading the rest of the series. The same subject, number theory, hinges the two halves because it can do both jobs.

A fixed point, by itself, is often a dull equation. The mathematics comes alive when you ask how it is reached, how it breaks under a small change, and whether the same shape reappears one level up. That is the spirit of everything below.

#Why Number Theory Suddenly Appears

At first, number theory seems unrelated to dynamics. It is not.

Number theory is the study of integers and the structures built from them: divisibility, primes, congruences, rational numbers, irrational numbers, and how well one can be approximated by the other. That sounds far from rolling marbles and folding intervals. But recall the observation we started with: the instant a system repeats, integers walk in. One return. Two returns. A motion that closes after $q$ steps. Two oscillators that realign after $p$ turns of one and $q$ of the other. Repetition counts, and counting is the door.

The basic question we need is the simplest one number theory can ask:

How close is a real number to a fraction?

That is Diophantine approximation, named after Diophantus. We will see that it controls whether a repeating motion comes back near where it started, and how a small disturbance lands when it does.

#A circle is the simplest place repetition lives

To see arithmetic enter, take the cleanest possible repeating system: a point hopping around a circle by a fixed amount each step.

$$\theta_{n+1}=\theta_n+\omega \pmod 1.$$

Read $\theta$ as a position on a clock face, except the clock is normalized so that $0$ and $1$ are the same point, the way 12 o’clock and 0 o’clock are the same place. Each step adds $\omega$ and, when the total passes $1$, wraps back around. The number $\omega$ is the rotation number: how far around you move per step. The phrase $\pmod 1$ just means “keep only the fractional part,” because going all the way around brings you back.

This is the one-step version of every clock, orbit, and oscillation in the essay. Everything else is this with more dimensions.

Now ask the only interesting question: does the point ever return exactly to where it started?

Case one: a rational rotation. Suppose

$$\omega=\frac{p}{q},$$

a fraction in lowest terms. Then after $q$ steps,

$$\theta_q=\theta_0+q\cdot\frac{p}{q}=\theta_0+p\equiv \theta_0 \pmod 1.$$

The $+p$ is a whole number of full turns, which $\pmod 1$ erases. So the point lands exactly back on its start. The orbit is periodic, and its period is the denominator $q$. Rational rotation means exact recurrence. The denominator literally tells you how many steps the cycle takes.

Case two: an irrational rotation. Suppose $\omega$ is irrational. Then the orbit never returns exactly, because $q\omega$ is never a whole number for any positive integer $q$, so $\theta_q$ is never exactly $\theta_0$. But something subtler happens: the orbit comes arbitrarily close to every point on the circle. Given enough steps, it will pass as near as you like to any target. Mathematicians say the orbit is dense. It never repeats, yet it eventually visits the whole circle.

So already, from one line of arithmetic, we have three regimes:

  1. rational rotation: exact recurrence,
  2. irrational rotation: no exact recurrence, but the orbit fills the circle,
  3. well-approximated irrational rotation: no exact recurrence, but very close near-returns.

The third regime is where the real action is, and it is the one that needs Diophantine approximation to describe. Dynamics often does not care whether a motion returns exactly. It cares whether it returns close enough, often enough, for a small repeated push to add up. That is a question about how well $\omega$ can be approximated by fractions.
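The first two regimes can be checked directly. The following is a minimal sketch, not part of the original essay: the helper names `orbit` and `circle_distance`, the starting points, and the choice $\omega=3/7$ are illustrative assumptions. Exact fractions are used in the rational case so that "returns exactly" is not blurred by floating-point rounding.

```python
from fractions import Fraction
import math


def orbit(theta0, omega, n_steps):
    """Iterate theta -> theta + omega (mod 1) and return [theta_0, ..., theta_n]."""
    thetas = [theta0]
    for _ in range(n_steps):
        thetas.append((thetas[-1] + omega) % 1)
    return thetas


def circle_distance(a, b):
    """Distance between two points on a circle of circumference 1."""
    d = abs(a - b) % 1
    return min(d, 1 - d)


# Rational rotation: omega = 3/7 in lowest terms, tracked with exact fractions.
start = Fraction(1, 10)
rational_orbit = orbit(start, Fraction(3, 7), 7)
first_return = next(n for n in range(1, 8) if rational_orbit[n] == start)

# Irrational rotation: the golden ratio (a float, so only approximately irrational).
phi = (1 + math.sqrt(5)) / 2
golden_orbit = orbit(0.1, phi % 1, 200)
closest = min(circle_distance(theta, 0.1) for theta in golden_orbit[1:])

print("rational rotation first returns after", first_return, "steps")
print("closest golden near-return within 200 steps:", closest)
```

The rational orbit closes after exactly as many steps as its denominator, while the golden orbit only comes close. How close, and at which step counts, is the subject of the third regime.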

#Measuring “how close to a fraction”

For an irrational $\omega$, the precise quantity is

$$|q\omega-p|,$$

asking how small this can be made by choosing integers $p$ and $q$. It is the gap between $\omega$ and the fraction $p/q$, scaled up by the denominator $q$:

$$\left|\omega-\frac{p}{q}\right|=\frac{|q\omega-p|}{q}.$$

Why does this number matter dynamically? Because if $|q\omega-p|$ is tiny, then after $q$ steps the point is extremely close to its start:

$$\theta_q=\theta_0+q\omega\approx \theta_0+p\equiv\theta_0\pmod 1.$$

That is a near-period. The motion does not close, but it almost does, and “almost” can be enough. If some small external forcing nudges the system once per rotation, then at a near-period the nudges arrive at nearly the same phase again and again, all pushing in nearly the same direction. Over many near-returns, those aligned pushes accumulate. That accumulation is what “resonance” means, written in the language of arithmetic. A frequency that admits very good rational approximations is a frequency that resonates easily.
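To see which step counts $q$ give near-periods, one can simply scan $q$ and record each time $|q\omega-p|$ sets a new record low. This is a rough sketch; the function name `record_near_returns` and the cutoff of 100 are assumptions made for illustration.

```python
import math


def record_near_returns(omega, q_max):
    """List (q, p, |q*omega - p|) each time the near-return error sets a new record."""
    records = []
    best = math.inf
    for q in range(1, q_max + 1):
        p = round(q * omega)          # nearest whole number of turns
        error = abs(q * omega - p)
        if error < best:
            best = error
            records.append((q, p, error))
    return records


phi = (1 + math.sqrt(5)) / 2
golden_records = record_near_returns(phi, 100)
for q, p, error in golden_records:
    print(f"q={q:3d}  p={p:3d}  |q*phi - p| = {error:.5f}")
```

For the golden ratio the record-setting step counts come out as Fibonacci numbers, and the error shrinks only slowly from one record to the next.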

#Continued fractions: the right ruler for irrationality

To talk about how well a number can be approximated, we need the right tool. That tool is the continued fraction.

Every irrational number has a unique continued fraction expansion, written compactly as

$$\omega=[a_0;a_1,a_2,a_3,\ldots],$$

which unpacks into a nested tower of fractions,

$$\omega=a_0+\cfrac{1}{a_1+\cfrac{1}{a_2+\cfrac{1}{a_3+\cdots}}}.$$

The whole numbers $a_0,a_1,a_2,\ldots$ are called the partial quotients. If you stop the tower early, you get a fraction called a convergent, and the convergents are, in a precise sense, the best possible rational approximations of $\omega$ for their size of denominator. They are the fractions a clockmaker would choose.

The crucial fact is that the size of the partial quotients tells you how approximable the number is. If $p_n/q_n$ is the convergent obtained by stopping just before $a_{n+1}$, then $|\omega-p_n/q_n|<1/(a_{n+1}q_n^2)$, so a large partial quotient $a_{n+1}$ means the convergent just before it is far more accurate than its denominator alone would suggest. A number with large partial quotients is easy to approximate by fractions, and so it resonates easily.

Two examples make this concrete:

$$\sqrt{2}=[1;2,2,2,\ldots],\qquad \varphi=[1;1,1,1,\ldots].$$

The golden ratio $\varphi=(1+\sqrt 5)/2$ has the smallest possible partial quotients, all ones, forever. Since large partial quotients are what produce exceptionally good rational approximations, having all ones means the golden ratio never offers one. It is, in a precise and provable sense, the hardest irrational number to approximate by fractions. The poetic phrase is that it is “the most irrational number.” The concrete content is that $|q\varphi-p|$ refuses to get small too quickly as $q$ grows.
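The comparison can be made numerical. The sketch below builds convergents from a list of partial quotients using the standard recurrence and reports $q\,|q\omega-p|$ for each one; a smaller value means a better approximation for that size of denominator. The function names and the number of terms are assumptions for illustration, and floating-point arithmetic limits how far the lists can usefully be extended.

```python
import math


def convergents(partial_quotients):
    """Return the convergents (p, q) of [a0; a1, a2, ...] via the standard recurrence."""
    p_prev, p = 1, partial_quotients[0]
    q_prev, q = 0, 1
    result = [(p, q)]
    for a in partial_quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append((p, q))
    return result


def scaled_errors(omega, partial_quotients):
    """For each convergent p/q, return q * |q*omega - p|."""
    return [q * abs(q * omega - p) for p, q in convergents(partial_quotients)]


phi = (1 + math.sqrt(5)) / 2
golden = scaled_errors(phi, [1] * 20)                    # [1; 1, 1, 1, ...]
root_two = scaled_errors(math.sqrt(2), [1] + [2] * 11)   # [1; 2, 2, 2, ...]

print("golden ratio:", golden[-1], " compare 1/sqrt(5) =", 1 / math.sqrt(5))
print("sqrt(2):     ", root_two[-1], " compare 1/(2*sqrt(2)) =", 1 / (2 * math.sqrt(2)))
```

The scaled error for $\sqrt 2$ settles near $1/(2\sqrt 2)\approx 0.354$, while for the golden ratio it settles near $1/\sqrt 5\approx 0.447$: even the best fractions for $\varphi$ are slightly worse, relative to their denominators, than the best fractions for $\sqrt 2$.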

This is not a curiosity; it is why phyllotaxis works. A sunflower head or an aloe rosette places each new seed or leaf one fixed angle of rotation from the last. If that angle is a rational fraction of a full turn, the seeds line up into a few radial spokes and waste space. The golden angle is the fraction $1/\varphi^2\approx 0.382$ of a full turn, about $137.5^\circ$; its continued fraction $[0;2,1,1,1,\ldots]$ has the same all-ones tail as $\varphi$, so it is just as hard to approximate. It is the rotation that is worst at letting seeds line up, so they pack with no preferred direction and no large gaps. Phyllotaxis is the circle-rotation map run in a flower, choosing the most irrational rotation number for the most even packing. The same fact that makes the golden ratio fill a seed head evenly will, in a moment, make a particular orbit in the solar system the most robust one.
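A quick way to see the difference between spokes and even packing is to look only at the angles and measure the gaps between neighboring seeds. This sketch ignores the radial placement entirely, which is a simplifying assumption, and the names `angular_gaps`, `golden_gaps`, and `spoke_gaps` are illustrative.

```python
import math


def angular_gaps(turn_fraction, n_seeds):
    """Place n_seeds on a circle, each turn_fraction of a turn after the last,
    and return the sorted gaps between neighboring distinct angles."""
    angles = sorted({(n * turn_fraction) % 1 for n in range(n_seeds)})
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(1 - angles[-1] + angles[0])   # wrap-around gap
    return sorted(gaps)


phi = (1 + math.sqrt(5)) / 2
golden_gaps = angular_gaps(1 / phi**2, 100)   # golden angle, about 137.5 degrees
spoke_gaps = angular_gaps(3 / 8, 100)         # a rational angle: 3/8 of a turn

print("rational angle:", len(spoke_gaps), "distinct directions, largest gap", spoke_gaps[-1])
print("golden angle:  ", len(golden_gaps), "distinct directions, largest gap", golden_gaps[-1])
```

With the rational angle, one hundred seeds share only eight directions and leave gaps of an eighth of a turn. With the golden angle every seed gets its own direction, and the largest gap stays within a small multiple of the smallest.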

Figure: a conceptual plot comparing small divisors for the golden ratio and an easily approximated irrational number.

#Simulation: Small Divisors

Small values of $|q\omega-p|$ are near-resonances. The golden ratio avoids exceptionally small denominators better than many other irrationals.

The plot shows

$$|q\omega-p|$$

with $p$ chosen as the nearest integer to $q\omega$, swept over many denominators $q$. Compare the golden ratio against a number engineered to have a very good rational approximation:

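```python
import numpy as np
import matplotlib.pyplot as plt

phi = (1 + np.sqrt(5)) / 2
# sqrt(2) shifted slightly, which moves it closer to the fraction 99/70
easy = np.sqrt(2) + 1 / 10_000
qs = np.arange(1, 400)  # denominators q = 1, ..., 399

def small_divisors(omega):
    """Return |q*omega - p| for every q in qs, with p the nearest integer."""
    ps = np.round(qs * omega)
    return np.abs(qs * omega - ps)

# Log scale on the vertical axis so the deep dips are visible.
plt.semilogy(qs, small_divisors(phi), label="golden ratio")
plt.semilogy(qs, small_divisors(easy), label="easier approximation")
plt.xlabel("q")
plt.ylabel("|qω - p|")
plt.legend()
plt.show()
```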

What the reader should see: irrational numbers are not interchangeable. Some produce denominators that dip dangerously small; the golden ratio keeps its distance. In a moment, those small denominators are exactly where stability breaks.

#The ruler is itself a dynamical system

Here is a loop back to the earlier essays that is too pretty to skip. Continued fractions are not produced by hand. They come out of a dynamical system, the Gauss map:

$$G(x)=\left\{\frac{1}{x}\right\},$$

where $\{\cdot\}$ means “take the fractional part.” Feed in a number between $0$ and $1$, take its reciprocal, throw away the integer part, and repeat. Each application peels off the next partial quotient of the continued fraction. The tool we use to measure irrationality is itself a rule being iterated.
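Here is a small sketch of that peeling process. The integer part that each step throws away is recorded as the next partial quotient. The function names are illustrative assumptions, and because the map is chaotic it amplifies floating-point rounding, so only the first dozen or so digits from a float input should be trusted.

```python
import math


def gauss_map(x):
    """One step of the Gauss map G(x) = {1/x} for 0 < x < 1."""
    return (1 / x) % 1


def digits_from_gauss_map(x, n_digits):
    """Read off partial quotients a1, a2, ... of x in (0, 1) by iterating G."""
    digits = []
    for _ in range(n_digits):
        if x == 0:                         # x was rational: the expansion has ended
            break
        digits.append(math.floor(1 / x))   # the integer part that G throws away
        x = gauss_map(x)
    return digits


phi = (1 + math.sqrt(5)) / 2
root_two_digits = digits_from_gauss_map(math.sqrt(2) - 1, 10)   # fractional part of sqrt(2)
golden_digits = digits_from_gauss_map(phi - 1, 10)              # fractional part of phi
print(root_two_digits)
print(golden_digits)
```

Starting from the fractional parts of $\sqrt 2$ and $\varphi$, the digits that come out are all twos and all ones respectively, matching the expansions quoted earlier.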

And like the chaotic maps of the second essay, the Gauss map has an invariant measure, the Gauss measure:

$$d\mu(x)=\frac{1}{\log 2}\frac{dx}{1+x}.$$

This is the distribution left unchanged when you push a whole density of starting numbers through $G$ once, exactly the kind of invariant object the power-laws essay introduced as a fixed point of the operator that moves distributions forward. So continued fractions tie together recurrence, chaos, invariant measures, and arithmetic in one small map. The instrument we built to study the ladder turns out to be another rung of it.
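The invariance can be probed empirically. The sketch below pushes a cloud of uniformly random starting points through $G$ a few times and compares the fraction that lands in $[0,1/2]$ with the value the Gauss measure assigns to that interval, $\mu([0,1/2])=\log(3/2)/\log 2$. The sample size, seed, and number of iterations are arbitrary choices, and this is a statistical check under those assumptions, not a proof.

```python
import math
import random


def gauss_map(x):
    """One step of the Gauss map G(x) = {1/x} for 0 < x < 1."""
    return (1 / x) % 1


rng = random.Random(0)
# Uniform starting points in (0, 1]; `1 - rng.random()` avoids an exact zero.
points = [1 - rng.random() for _ in range(20_000)]

# Push the whole cloud forward; drop any point that has landed exactly on 0.
for _ in range(10):
    points = [gauss_map(x) for x in points if x > 0]

observed = sum(x < 0.5 for x in points) / len(points)
predicted = math.log(1.5) / math.log(2)    # Gauss measure of [0, 1/2]

print("observed fraction below 1/2: ", observed)
print("Gauss measure of [0, 1/2]:   ", predicted)
```

A uniform cloud starts with half its points below $1/2$. After a few applications of $G$ the fraction drifts to roughly $0.585$ and stays there, which is what an invariant distribution looks like from the outside.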

#Simulation: Gauss Map

The Gauss map $G(x)=\{1/x\}$ generates continued-fraction digits, showing that rational approximation is itself driven by a chaotic dynamical system.

#From One Frequency to Many: KAM Theory

So far there has been one rotation number on one circle. Real mechanical systems, a planet pulled by several others, a set of coupled pendulums, an asteroid between Jupiter and the sun, have several frequencies at once. The question of recurrence becomes the question of whether all of those frequencies realign, and that is where the deepest result in this essay lives: KAM theory, named for Kolmogorov, Arnold, and Moser.

To get there, we need a little of the natural language of mechanics.

The systems of classical mechanics are usually written in Hamiltonian form, an energy-preserving bookkeeping invented for exactly these problems. Their state lives in phase space, which records not just where things are but how fast they are moving. For one particle in one dimension, phase space has two coordinates,

$$q=\text{position},\qquad p=\text{momentum},$$

and for many moving parts it has many such pairs. A phase-space point is a complete snapshot: positions and momenta together.

A Hamiltonian system is called integrable when it is solvable cleanly enough that its motion lies on a smooth surface called an invariant torus. A torus is the shape of a donut’s surface, and in higher dimensions it is the product of several circles. The point of the name is that motion on it is quasi-periodic: the system winds around the donut with several frequencies at once, like several independent clock hands turning at incommensurable rates, never quite repeating but never wandering off the surface either.

The circle rotation from before is the one-clock version. The torus version is

$$\theta_{n+1}=\theta_n+\omega \pmod 1,\qquad \theta=(\theta_1,\ldots,\theta_d),\quad \omega=(\omega_1,\ldots,\omega_d).$$

Each component is one angle on one circle; each step advances all of them. If the frequencies satisfy rational relations, the orbit is confined to a lower-dimensional piece of the torus, and with enough such relations it closes into a periodic loop. If they satisfy none, it winds quasi-periodically and fills the torus, the multi-clock version of the dense orbit on a single circle.
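Both behaviors show up already with two angles. In this sketch the frequency pairs, the $10\times 10$ grid used to measure coverage, and the step count are all illustrative assumptions.

```python
from fractions import Fraction
import math


def torus_step(theta, omega):
    """Advance every angle by its own frequency, wrapping each one mod 1."""
    return tuple((t + w) % 1 for t, w in zip(theta, omega))


# Rational frequencies: exact fractions, so closure can be tested exactly.
omega_rational = (Fraction(1, 3), Fraction(1, 4))
start = (Fraction(0), Fraction(0))
theta = start
period = 0
while True:
    theta = torus_step(theta, omega_rational)
    period += 1
    if theta == start:
        break

# Frequencies with no rational relation (up to float precision):
# count how many cells of a 10 x 10 grid the orbit visits.
omega_irrational = (math.sqrt(2) % 1, math.sqrt(3) % 1)
theta = (0.0, 0.0)
visited = set()
for _ in range(20_000):
    theta = torus_step(theta, omega_irrational)
    visited.add((int(theta[0] * 10), int(theta[1] * 10)))

print("rational orbit closes after", period, "steps")
print("irrational orbit visits", len(visited), "of 100 grid cells")
```

The rational orbit closes after the least common multiple of the two denominators, while the irrational one eventually enters every cell of the grid.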

#The question KAM answers

Integrable systems are rare and special. Real systems are integrable systems plus a small disturbance: the planets would trace perfect ellipses if they only felt the sun, but they also tug faintly on each other. The natural question is whether the beautiful invariant tori survive that tug, or whether the smallest perturbation tears them apart and lets the system wander freely, which over astronomical time would mean the solar system is not stable.

KAM theory’s answer is strange and precise: it depends on arithmetic. Tori whose frequencies are “sufficiently irrational” survive a small perturbation. Tori whose frequencies are close to a resonance are destroyed first. Stability is sorted not by energy or size but by how well a frequency vector can be approximated by integer relations.

For this section, keep one picture in mind: a torus is a smooth track in phase space, and KAM asks whether that track survives when the system is nudged. The arithmetic enters because repeated nudges are harmless when they arrive out of phase, but dangerous when they keep arriving almost in sync.

A resonance among several frequencies is the multi-dimensional version of the near-period. It occurs when

$$k\cdot \omega \approx 0$$

for some vector of integers $k$ that is not all zeros. Written out in two dimensions,

$$k_1\omega_1+k_2\omega_2\approx 0,$$

which says the ratio $\omega_1/\omega_2$ is nearly the rational number $-k_2/k_1$; the two clocks nearly relock after a whole number of turns each. The dot product $k\cdot\omega$ is the multi-frequency generalization of the quantity $q\omega-p$ from the circle, which is the special case of the frequency vector $(\omega,1)$ with $k=(q,-p)$. Exactly zero is an exact resonance; nearly zero is a near-resonance.
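A brute-force search makes the "whole-number recipe" concrete. The sketch below tries every integer vector $k$ with entries up to a cutoff and keeps the one for which $|k\cdot\omega|$ is smallest. The function name, the cutoffs, and the sample frequency vectors are assumptions for illustration; a real system would supply its own frequencies.

```python
import itertools
import math


def strongest_resonance(omega, k_max):
    """Search integer vectors k with entries in [-k_max, k_max], k != 0,
    and return (|k . omega|, k) for the one that comes closest to cancelling."""
    best = (math.inf, None)
    for k in itertools.product(range(-k_max, k_max + 1), repeat=len(omega)):
        if not any(k):
            continue                       # skip the all-zeros vector
        value = abs(sum(ki * wi for ki, wi in zip(k, omega)))
        if value < best[0]:
            best = (value, k)
    return best


phi = (1 + math.sqrt(5)) / 2
exact = strongest_resonance((2.0, 3.0), 5)     # 3*2 - 2*3 = 0: an exact resonance
golden = strongest_resonance((phi, 1.0), 13)   # k = (q, -p) recovers q*phi - p

print("frequencies (2, 3):  ", exact)
print("frequencies (phi, 1):", golden)
```

For the frequency vector $(\varphi,1)$ the search lands on $k=\pm(8,-13)$, which is the circle quantity $|8\varphi-13|$ in multi-frequency notation.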

#Why near-resonance is dangerous: small divisors

This is the mechanism, and the technical core of the whole essay, so it pays to go slowly.

When physicists want to understand a slightly-perturbed system, they try to “change coordinates” to make the perturbation disappear, one Fourier piece at a time. A typical piece of the perturbation looks like a wave around the torus,

$$a_k\, e^{i k\cdot\theta},$$

and the coordinate change that cancels it comes out looking like

$$\frac{a_k}{i\,k\cdot\omega}\, e^{i k\cdot\theta}.$$

The exact details vary by system, but the shape is robust and is all we need: the correction has $k\cdot\omega$ in the denominator. Now read off the three cases.

If $k\cdot\omega$ is comfortably away from zero, the correction is small and controlled. The perturbation can be absorbed.

If $k\cdot\omega$ is tiny, the correction is enormous, because we are dividing by a near-zero number. A small perturbation produces a large response.

If $k\cdot\omega$ is exactly zero, the correction is infinite. The perturbation is in perfect resonance with the motion and cannot be removed by any small change of coordinates at all.

This is the small-divisor problem, and it is the same arithmetic we met on the circle, now with stakes. There, $|q\omega-p|$ being small meant a near-period. Here, $|k\cdot\omega|$ being small means a near-resonance that perturbation theory divides by. The survival of a geometric structure, an entire invariant torus in phase space, comes down to whether these denominators can be kept away from zero.
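It may help to see where the denominator comes from in a stripped-down, one-mode form. Signs and constant factors depend on conventions, and the symbols $S$ and $s_k$ are names introduced here for the unknown function generating the coordinate change and its Fourier coefficient. We look for an $S(\theta)$ whose derivative along the unperturbed rotation reproduces the Fourier piece to be cancelled:

$$\omega\cdot\frac{\partial S}{\partial\theta}=a_k\, e^{i k\cdot\theta}.$$

Try $S=s_k\, e^{i k\cdot\theta}$. Differentiating the exponential brings down a factor of $i k$, so

$$\omega\cdot\frac{\partial S}{\partial\theta}=i\,(k\cdot\omega)\, s_k\, e^{i k\cdot\theta},$$

and matching coefficients gives

$$s_k=\frac{a_k}{i\,k\cdot\omega}.$$

The division by $k\cdot\omega$ is forced by a single derivative, which is why the same denominator appears in system after system.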

The engineering analogy is resonance in a bridge or circuit. A tiny input is not dangerous by itself; it becomes dangerous when the timing lines up so the response adds instead of cancels. KAM is the high-dimensional, geometric version of that timing problem.
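The timing problem can be simulated on the circle rotation itself. The sketch below adds up unit pushes whose phase advances by a fixed multiple of $\omega$ each step and records the largest total ever reached. Summing the geometric series shows that this total can never exceed $1/|\sin(\pi h\omega)|$, where $h$ is the harmonic of the push, so the small divisor appears directly as the cap on the response. The function names and the choice of harmonics 7 and 8 are illustrative assumptions.

```python
import cmath
import math


def peak_accumulated_push(omega, harmonic, n_steps):
    """Largest |sum_{n<N} exp(2*pi*i*harmonic*n*omega)| over N = 1..n_steps."""
    total = 0j
    peak = 0.0
    for n in range(n_steps):
        total += cmath.exp(2j * math.pi * harmonic * n * omega)
        peak = max(peak, abs(total))
    return peak


def small_divisor_bound(omega, harmonic):
    """Geometric-series bound 1 / |sin(pi * harmonic * omega)| on that sum."""
    return 1 / abs(math.sin(math.pi * harmonic * omega))


phi = (1 + math.sqrt(5)) / 2
peak7 = peak_accumulated_push(phi, 7, 200)
peak8 = peak_accumulated_push(phi, 8, 200)
bound8 = small_divisor_bound(phi, 8)

print("harmonic 7: peak response", peak7)
print("harmonic 8: peak response", peak8, " bound", bound8)
```

Harmonic 8 sits close to a near-return of the golden rotation, so its pushes pile up several times higher than those at harmonic 7, which arrive well out of phase and mostly cancel.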

#The Diophantine condition

So KAM needs frequencies whose small divisors stay controlled. The precise requirement is a Diophantine condition:

$$|k\cdot \omega| \geq \frac{\gamma}{|k|^\tau}\quad\text{for all nonzero integer vectors }k.$$

In words: resonances are allowed to creep closer to zero as the integer vector $k$ grows larger, but not too fast. The constants $\gamma$ and $\tau$ measure how stubbornly irrational the frequency vector is. This inequality is not decorative. It is the promise that keeps the denominators in the coordinate change from collapsing, which is exactly what KAM needs to prove a torus survives.

Which frequencies satisfy it best? The maximally irrational ones. For systems with two frequencies, the torus that is typically the most robust is the one whose frequency ratio is the golden ratio,

$$\varphi=\frac{1+\sqrt 5}{2}=[1;1,1,1,\ldots].$$

Its all-ones continued fraction makes it the hardest number to approximate by rationals, which makes its small divisors the largest, which makes its torus the last to break. The golden torus is robust not because gold is mystical but because the arithmetic of $\varphi$ is the worst possible case for resonance. This is the same fact that packed the sunflower; here it stabilizes an orbit.
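For two frequencies the condition can be explored numerically in its circle form, $|q\omega-p|\geq\gamma/q^{\tau}$. The sketch below estimates the largest $\gamma$ compatible with all denominators up to a cutoff. The choice $\tau=1$, the cutoff, and the function name are assumptions, and a finite search only gives an upper estimate of $\gamma$, not a proof that the condition holds for every $q$.

```python
import math


def diophantine_constant(omega, q_max, tau=1.0):
    """Finite-range estimate of gamma: the minimum of q**tau * |q*omega - p|
    over 1 <= q <= q_max, with p the nearest integer to q*omega."""
    return min(
        q**tau * abs(q * omega - round(q * omega))
        for q in range(1, q_max + 1)
    )


phi = (1 + math.sqrt(5)) / 2
gamma_golden = diophantine_constant(phi, 1000)
gamma_pi = diophantine_constant(math.pi, 1000)

print("golden ratio: gamma estimate", gamma_golden)
print("pi:           gamma estimate", gamma_pi)
```

The estimate for the golden ratio stays at about $0.38$ no matter how far the search runs, while for $\pi$ the unusually good fraction $355/113$ drags it down by two orders of magnitude.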

#What the surviving phase space looks like

KAM does not promise a clean split into “all stable” or “all chaotic.” It promises a mixture, and the mixture has an arithmetic skeleton. Sufficiently irrational tori survive as smooth barriers. Tori near resonances dissolve, and around each resonance appear islands and thin chaotic layers. Between them, the surviving Diophantine tori act as walls that trajectories cannot cross. Discrete arithmetic facts, which ratios are near rational and which are stubbornly irrational, carve continuous geometric structure into the phase space. That is one of the most beautiful statements in mathematical physics: a number-theoretic property of a frequency decides a geometric property of a flow.

Saturn’s rings make the same point visible from a backyard telescope. The dark Cassini Division in the photograph at the top of this essay is a band swept nearly empty because particles orbiting there would fall into a 2:1 resonance with the moon Mimas, completing two orbits for each one of Mimas, so that the same nudge arrives at nearly the same phase orbit after orbit until it accumulates.

There is a standard way to write the setup. Start from an integrable system in action-angle coordinates, where the unperturbed motion is just rotation,

$$\dot{\theta}=\omega(I),\qquad \dot{I}=0,$$

so the angles turn at frequencies $\omega$ and the actions $I$ stay fixed, pinning the motion to a torus. Now add a small perturbation,

$$H(I,\theta)=H_0(I)+\epsilon H_1(I,\theta).$$

Removing it order by order produces exactly the small-divisor denominators $k\cdot\omega$ described above. Where those denominators stay controlled, in the sense of the Diophantine condition, the procedure converges and the torus survives, slightly bent. Where they vanish or shrink too fast, the expansion breaks and the torus is destroyed.

#KAM as a fixed-point statement

This connects back to the spine of the series; it is another fixed-point story wearing unfamiliar clothes.

KAM does not say the old tori sit unchanged. It says many of them deform into nearby tori, and asks whether there exists a change of coordinates that carries the perturbed motion back to clean quasi-periodic rotation on the deformed torus. In fixed-point language, you are solving for an invariant object: a map $K$ that embeds an abstract torus into the real phase space so that running the physical dynamics matches simply rotating the abstract torus. Schematically,

$$\text{flow}\circ K = K\circ \text{rotation}_\omega.$$

Read it as: embed the torus, then flow, and you get the same result as rotating first, then embedding. The unknown $K$ is not a number, and not even a point. It is a whole geometric surface left invariant by the dynamics. KAM is the question of whether that invariant surface survives perturbation, and the answer is decided by the arithmetic of $\omega$.
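The equation can be checked numerically in the one case where $K$ is known in closed form: an unperturbed toy system whose time-one map is a rigid rotation of the $(q,p)$ plane. Everything in this sketch (the toy map `flow_step`, the circular embedding `embed`, the radius, the choice of $\omega$) is an assumption made for illustration. KAM theory concerns the perturbed case, where $K$ is not given and has to be solved for.

```python
import math


def flow_step(point, omega):
    """Time-one map of a toy integrable system: rigid rotation of the (q, p)
    plane by omega turns."""
    q, p = point
    angle = 2 * math.pi * omega
    return (q * math.cos(angle) + p * math.sin(angle),
            -q * math.sin(angle) + p * math.cos(angle))


def embed(theta, radius=1.5):
    """Candidate embedding K: send the abstract angle theta to a circle in phase space."""
    return (radius * math.cos(2 * math.pi * theta),
            -radius * math.sin(2 * math.pi * theta))


omega = (math.sqrt(5) - 1) / 2
residual = 0.0
for i in range(100):
    theta = i / 100
    left = flow_step(embed(theta), omega)    # embed, then flow
    right = embed(theta + omega)             # rotate, then embed
    residual = max(residual, math.dist(left, right))

print("largest mismatch between the two sides:", residual)
```

The mismatch is at the level of floating-point rounding, which is what it means for the embedded circle to be invariant. After a perturbation, the question becomes whether some deformed `embed` can still drive that mismatch to zero.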

The takeaway is modest but important: KAM is not asking a student to memorize a theorem. It is showing that stability can depend on arithmetic. Two systems can look physically similar, but the one whose frequencies avoid near-integer relations can keep a smooth invariant torus while the resonant one breaks into islands and chaos.

The ninth lesson:

Number theory shapes phase space because arithmetic controls resonance.

#The Bridge: From Non-Ergodicity to Recurrence

Why does number theory belong here, right after the previous essay on non-ergodicity, rather than as a detour?

That essay ended with a warning: an ensemble average can be the wrong object when the lived path is a single trajectory through time. The instant the single trajectory becomes the thing you care about, recurrence becomes the question. Does the path return near where it started? Does it revisit a region often enough for averages to settle? Does it miss whole regions entirely? Does a small periodic forcing keep hitting the system at nearly the same phase?

These are time questions first. They become number-theory questions the moment the motion is rotational, oscillatory, or quasi-periodic, because then “return near the start” translates directly into arithmetic. For the circle rotation,

$$\theta_n=\theta_0+n\omega \pmod 1,$$

a near return means $\theta_n\approx\theta_0$, which after subtracting $\theta_0$ becomes $n\omega\approx p$ for some integer $p$, that is,

$$|n\omega-p|\approx 0.$$

So the lived-path question “does this trajectory come back near where it started?” is the same question as “how well can $\omega$ be approximated by rationals?” That is the concrete reason recurrence drags arithmetic onto the ladder. A rational ratio gives eventual exact closure. An irrational ratio gives no closure, but irrationality is not one thing: some irrationals are easy to approximate and resonate readily, while others, the golden ratio chief among them, resist. That difference controls near-recurrence; near-recurrence controls resonance; resonance controls stability.

So the chain is tight. Once time matters, repeated motion matters. Once repeated motion matters, the question is whether cycles line up, nearly line up, or never line up, and that question is arithmetic. The integer vector $k$ is not decorative notation. It is the bookkeeper of every possible way the frequencies could realign.

#Arithmetic As The Hinge

This is the point where the series is about to change engines, and number theory is the hinge it turns on.

Everything above used arithmetic in one way: as the timekeeper and stability filter of dynamics. Integers count returns. Fractions describe frequency ratios. Continued fractions measure near-resonance. Diophantine inequalities decide which structures survive a disturbance. In every case, arithmetic stayed outside the system, a ruler we held up against the motion.

But arithmetic can be used a second way, and the difference is the whole pivot of the series.

| Role of arithmetic | What arithmetic does | Example |
| --- | --- | --- |
| Timekeeper | counts returns, periods, resonances | $q\omega\approx p$ |
| Stability filter | controls small denominators | $\lvert k\cdot\omega\rvert\geq\gamma/\lvert k\rvert^\tau$ |
| Encoding substrate | turns syntax into numbers | Gödel numbering |
| Self-reference medium | lets statements refer to their own codes | the diagonal lemma |

The first two rows are this essay, the iteration engine. The bottom two rows are the next essay, the self-reference engine, and they rest on a single new idea: the same integers that count symbols going around a clock can also count symbols inside a formula. A statement, a proof, a program is just a finite string of symbols, and a finite string of symbols can be packed into a single integer. Once that is done, arithmetic stops being a ruler held up against the system and becomes a language the system can speak about itself. Numbers can encode statements about numbers. A rule can be fed its own description.
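As a small preview of how a string of symbols can be packed into one integer, here is a sketch of one classical scheme: give each symbol a positive code and use those codes as exponents on successive primes. The eight-symbol alphabet and the function names are hypothetical choices made only for this illustration, and `decode` assumes its input was produced by `encode`.

```python
def primes():
    """Yield 2, 3, 5, 7, ... by trial division (fine for short strings)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


ALPHABET = "0S+*=()x"   # hypothetical symbol set; codes are positions 1..8


def encode(formula):
    """Pack a string into one integer: the i-th prime raised to the i-th symbol's code."""
    code = 1
    for prime, symbol in zip(primes(), formula):
        code *= prime ** (ALPHABET.index(symbol) + 1)
    return code


def decode(code):
    """Recover the string by reading off the exponent of each successive prime."""
    symbols = []
    for prime in primes():
        if code == 1:
            break
        exponent = 0
        while code % prime == 0:
            code //= prime
            exponent += 1
        symbols.append(ALPHABET[exponent - 1])
    return "".join(symbols)


number = encode("S0=S0")
print(number, "->", decode(number))
```

Because every integer factors into primes in exactly one way, the string can always be read back out of the number. That is the sense in which arithmetic can carry syntax.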

That is the threshold of representational closure, and it is where iteration’s invariant objects, attractors and tori, give way to a new family: Gödel sentences, halting problems, quines, the Y combinator, and recursive types. The same fixed-point question carries across, but the engine driving it is different.

We have now followed the iteration engine as far as it goes, from a number pulled toward $1$ all the way to an invariant torus held together by the golden ratio. The next essay turns the hinge: it lets arithmetic encode syntax, and watches self-reference become the second engine that manufactures fixed points.

#Further Reading

For number theory inside dynamics:

  1. Vladimir Arnold, Mathematical Methods of Classical Mechanics. The canonical route into Hamiltonian mechanics, action-angle variables, and the small-divisor problem.
  2. Jürgen Moser, Stable and Random Motions in Dynamical Systems. A classic treatment of KAM ideas and the survival of invariant tori.
  3. Hendrik Broer and Floris Takens, Dynamical Systems and Chaos. Useful for connecting invariant tori, resonance, and Diophantine conditions to bifurcations.
  4. A. Ya. Khinchin, Continued Fractions. The short classic on continued fractions, convergents, and how well numbers can be approximated by rationals.
  5. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers. Background on Diophantine approximation and the special role of the golden ratio.