A Proof of Ostrowski’s Theorem

So this coming Friday I’ve been asked to be a backup speaker for the local maths club, in case our current speaker cancels (again). Since I like cake, and I like to be given a fairly sliced piece of cake, I decided to talk about Sperner’s lemma and it’s applications to fair division problems.

This stuff is very well known, and there’s a popular expository article about it here. So I won’t talk too much about that.

On the other hand, something I will talk about is a fantastic related problem with an equally fantastic solution. It goes something like this. Suppose you have a square cake; for which integers {n} can we divide the cake into {n} triangles of equal area?

This was first posed in the American Mathematical Monthly after (I assume) a heated debate during somebody’s birthday celebrations. It was eventually solved by a mathematician named Paul Monsky in 1970, with an astonishingly clever mix of combinatorics and valuation theory.

It’s quite easy to show that you can perform such a division for even {n}, and Monsky shows that these are the only {n} for which it is furthermorepossible. You should definitely go read his proof – the crux of the argument relies on colouring the points in {\mathbb{R}^2} with three colours, such that on any given line one finds exactly two of the colours. Monsky then uses Sperner’s lemma – and some very special properties of the colouring he constructs – to conclude.

None of this is too astonishing. What really surprised me, however, is that to construct such a colouring he uses (of all things!) {p}-adic valuations on {\mathbb{R}}. And while this might seem esoteric, in the half century since he published, nobody seems to have found a more elementary proof!

Another paper on similar colourings of projective planes also finds strong links to non-archimedian valuations, which leads me to believe that there is something very strange going on here! I haven’t read this later paper very closely, but I really should.

In any case, these things have gotten me thinking a lot about different kinds of norms on {\mathbb{Q}}. And I felt like sharing a proof of the following well known theorem of Ostrowski:

Theorem 1 Every non-trivial valuation on {\mathbb{Q}} is equivalent to either the standard valuation {|\cdot|_\infty} or one of the {p}-adic valuations {|\cdot|_p}.

This is pretty cool, and I think it provides some motivation as to why we should care about {p}-adic stuff at all. But first, let’s unpack things a little bit. First of all, what is a valuation?

Definition 2 A valuation on a field {\mathbb{F}} is a map {|\cdot| : \mathbb{F} \rightarrow \mathbb{R}} satisfying:

\displaystyle  \begin{aligned} |x| &\geq 0 \\ |x| &= 0 \iff x = 0 \\ |xy| &= |x||y| \\ |x + y| &\leq |x| + |y| \end{aligned}

A valuation is non-archimedian if it satisfies the ultrametric inequality:

\displaystyle  |x + y| \leq \max(|x|, |y|)

We also say that two valuations {|\cdot|_1} and {|\cdot|_2} are equivalent if there exists some {c \in \mathbb{R}} such that {|\cdot|_1 = |\cdot|_2^c}, or the valuations generate the same topology on the underlying field.

The standard valuation {|\cdot|_\infty} on {\mathbb{Q}} is given by the Euclidean norm. This valuation is obviously archimedian. On the other hand, for any prime {p}, we can construct a non-archimedian valuation on {\mathbb{Q}} as follows:

Any rational can be written in the form {x = p^n\frac a b}, where {\gcd(ab, p) = 1}. We define the {p}-adic valuation of {x} as {|x|_p = p^{-n}}. It is not difficult to show that this is indeed a valuation, and that it is non-archimedian.

Ostrowski’s theorem asserts that these are essentially the only valuations we can find on {\mathbb{Q}}. We will split the proof into the archimedian and non-archimedian cases; curiously, despite the strange nature of non-archimedian valuations, this second case is actually markedly easier to show. But first, a lemma.

Lemma 3 Suppose {|\cdot|} is a non-trivial valuation on {\mathbb{Q}}. Then for any {m,n \in \mathbb{Z}}, we have:

\displaystyle  |m| \leq \max(1, |n|)^{\log(m)/\log(n)}

Proof: Note that {m = a_0 + a_1n + \dots + a_rn^r} for some integers {0 \leq a_i < n} and {a_r \neq 0}. We also note that {|a_i| \leq a_i|1| < n}, and {r < \log(m)/\log(n)}. Hence:

\displaystyle  \begin{aligned} |m| &\leq |a_0| + |a_1n| + \dots + |a_rn^r| \\ &\leq n(r+1)\max(1, |n|)^r \\ &\leq n \left(1 + \frac{\log(m)}{\log(n)}\right) \max(1, |n|)^{\log(m)/\log(n)} \end{aligned}

We can now apply the tensor power trick. Replace {m} with {m^t} for some integer {t} and take {t^{th}} roots of both sides:

\displaystyle  |m| \leq n^{1/t} \left(1 + t\frac{\log(m)}{\log(n)}\right)^{1/t} \max(1, |n|)^{\log(m)/\log(n)}

Taking the limit as {t \rightarrow \infty} concludes the proof. \Box

Another useful lemma we will need is the following characterisation of non-archimedian valuations over {\mathbb{Q}}:

Lemma 4 {|\cdot|} is a non-archimedian valuation on {\mathbb{Q}} if and only if {|n| \leq 1} for all {n \in \mathbb{Z}}.

Proof: The forwards direction is trivial, so suppose {|n| \leq 1} for all {n \in \mathbb{Z}}. Now consider some {x, y \in \mathbb{Q}}. By the binomial theorem, we have:

\displaystyle  \begin{aligned} |(x+y)|^n &= \left|\sum_i {n \choose i} x^i y^{n-i}\right| \\ &\leq \sum_i |x^iy^{n-i}| \\ &\leq (n+1)\max(|x|, |y|)^n \end{aligned}

Taking {n^{\text{th}}} roots of both sides (the tensor power trick again!) concludes the proof. \Box

We are now in a position to prove the archimedian side of the proof directly:

Theorem 5 If {|\cdot|} is a non-trivial archimedian valuation on {\mathbb{Q}}, then {|\cdot|} is equivalent to {|\cdot|_\infty}.

Proof: First of all, we note that {|n| \geq 1} for all {n \in \mathbb{N}}, as if this was not the case we would have {|m| < 1} for some {m}. By lemma 3, this would imply that {n \leq 1} for all {n \in \mathbb{Z}}, contradicting the assumption that {|\cdot|} is archimedian. Consider some {m, n \in \mathbb{Z}}. By lemma 3, we have:

\displaystyle  \begin{aligned} |m| &\leq \max(1, |n|)^{\log(m)/\log(n)} \\ &\leq |n|^{\log(m)/\log(n)} \end{aligned}

Hence {|m|^{1/\log(m)} \leq |n|^{1/\log(n)}}. Note that this expression does not depend on the choice of {m} or {n}, thus there exists a constant {c \in \mathbb{R}} such that {|n|^{1/\log(n)} = c} for all {n>1 \in \mathbb{Z}}. But then {|n| = | |n|_\infty | = c^{\log|n|_\infty} = |n|_\infty^{\log(c)}} for all such {n}, and we are done. \Box

Finally, a cute algebraic proof wraps up the non-archimedian part of Ostrowski’s theorem.

Theorem 6 If {|\cdot|} is a non-trivial non-archimedian valuation on {\mathbb{Q}}, then {|\cdot|} is equivalent to {|\cdot|_p} for some prime {p}.

Proof: Consider the set {\mathfrak{a} = \{ n \in \mathbb{Z} : |n| < 1 \}}. This set is non-empty as the valuation is non-trivial. We also know by lemma 4 that {|n| \leq 1} for all {n \in \mathbb{Z}} and so {\mathfrak{a}} is an ideal; in particular, it is a prime ideal. Thus {\mathfrak{a} = p\mathbb{Z}} for some prime {p}. Consider some rational {x = p^n\frac{a}{b}}, where {\gcd(p, ab) = 1}. Then we have:

\displaystyle  \begin{aligned} |x| &= |p^n|\cdot\left|\frac{b}{c}\right| = |p|^n \cdot |1| \\ &= e^{n\log|p|} \\ &= p^{-n \cdot (-\frac{\log|p|}{\log(p)})} \\ &= |x|_p^{-\frac{\log|p|}{\log(p)}} \end{aligned}

\Box