Locally Euclidean

A Proof of Ostrowski’s Theorem

So this coming Friday I’ve been asked to be a backup speaker for the local maths club, in case our current speaker cancels (again). Since I like cake, and I like to be given a fairly sliced piece of cake, I decided to talk about Sperner’s lemma and it’s applications to fair division problems.

This stuff is very well known, and there’s a popular expository article about it here. So I won’t talk too much about that.

On the other hand, something I will talk about is a fantastic related problem with an equally fantastic solution. It goes something like this. Suppose you have a square cake; for which integers {n} can we divide the cake into {n} triangles of equal area?

This was first posed in the American Mathematical Monthly after (I assume) a heated debate during somebody’s birthday celebrations. It was eventually solved by a mathematician named Paul Monsky in 1970, with an astonishingly clever mix of combinatorics and valuation theory.

It’s quite easy to show that you can perform such a division for even {n}, and Monsky shows that these are the only {n} for which it is furthermorepossible. You should definitely go read his proof – the crux of the argument relies on colouring the points in {\mathbb{R}^2} with three colours, such that on any given line one finds exactly two of the colours. Monsky then uses Sperner’s lemma – and some very special properties of the colouring he constructs – to conclude.

None of this is too astonishing. What really surprised me, however, is that to construct such a colouring he uses (of all things!) {p}-adic valuations on {\mathbb{R}}. And while this might seem esoteric, in the half century since he published, nobody seems to have found a more elementary proof!

Another paper on similar colourings of projective planes also finds strong links to non-archimedian valuations, which leads me to believe that there is something very strange going on here! I haven’t read this later paper very closely, but I really should.

In any case, these things have gotten me thinking a lot about different kinds of norms on {\mathbb{Q}}. And I felt like sharing a proof of the following well known theorem of Ostrowski:

Theorem 1 Every non-trivial valuation on {\mathbb{Q}} is equivalent to either the standard valuation {|\cdot|_\infty} or one of the {p}-adic valuations {|\cdot|_p}.

This is pretty cool, and I think it provides some motivation as to why we should care about {p}-adic stuff at all. But first, let’s unpack things a little bit. First of all, what is a valuation?

Definition 2 A valuation on a field {\mathbb{F}} is a map {|\cdot| : \mathbb{F} \rightarrow \mathbb{R}} satisfying:

\displaystyle  \begin{aligned} |x| &\geq 0 \\ |x| &= 0 \iff x = 0 \\ |xy| &= |x||y| \\ |x + y| &\leq |x| + |y| \end{aligned}

A valuation is non-archimedian if it satisfies the ultrametric inequality:

\displaystyle  |x + y| \leq \max(|x|, |y|)

We also say that two valuations {|\cdot|_1} and {|\cdot|_2} are equivalent if there exists some {c \in \mathbb{R}} such that {|\cdot|_1 = |\cdot|_2^c}, or the valuations generate the same topology on the underlying field.

The standard valuation {|\cdot|_\infty} on {\mathbb{Q}} is given by the Euclidean norm. This valuation is obviously archimedian. On the other hand, for any prime {p}, we can construct a non-archimedian valuation on {\mathbb{Q}} as follows:

Any rational can be written in the form {x = p^n\frac a b}, where {\gcd(ab, p) = 1}. We define the {p}-adic valuation of {x} as {|x|_p = p^{-n}}. It is not difficult to show that this is indeed a valuation, and that it is non-archimedian.

Ostrowski’s theorem asserts that these are essentially the only valuations we can find on {\mathbb{Q}}. We will split the proof into the archimedian and non-archimedian cases; curiously, despite the strange nature of non-archimedian valuations, this second case is actually markedly easier to show. But first, a lemma.

Lemma 3 Suppose {|\cdot|} is a non-trivial valuation on {\mathbb{Q}}. Then for any {m,n \in \mathbb{Z}}, we have:

\displaystyle  |m| \leq \max(1, |n|)^{\log(m)/\log(n)}

Proof: Note that {m = a_0 + a_1n + \dots + a_rn^r} for some integers {0 \leq a_i < n} and {a_r \neq 0}. We also note that {|a_i| \leq a_i|1| < n}, and {r < \log(m)/\log(n)}. Hence:

\displaystyle  \begin{aligned} |m| &\leq |a_0| + |a_1n| + \dots + |a_rn^r| \\ &\leq n(r+1)\max(1, |n|)^r \\ &\leq n \left(1 + \frac{\log(m)}{\log(n)}\right) \max(1, |n|)^{\log(m)/\log(n)} \end{aligned}

We can now apply the tensor power trick. Replace {m} with {m^t} for some integer {t} and take {t^{th}} roots of both sides:

\displaystyle  |m| \leq n^{1/t} \left(1 + t\frac{\log(m)}{\log(n)}\right)^{1/t} \max(1, |n|)^{\log(m)/\log(n)}

Taking the limit as {t \rightarrow \infty} concludes the proof. \Box

Another useful lemma we will need is the following characterisation of non-archimedian valuations over {\mathbb{Q}}:

Lemma 4 {|\cdot|} is a non-archimedian valuation on {\mathbb{Q}} if and only if {|n| \leq 1} for all {n \in \mathbb{Z}}.

Proof: The forwards direction is trivial, so suppose {|n| \leq 1} for all {n \in \mathbb{Z}}. Now consider some {x, y \in \mathbb{Q}}. By the binomial theorem, we have:

\displaystyle  \begin{aligned} |(x+y)|^n &= \left|\sum_i {n \choose i} x^i y^{n-i}\right| \\ &\leq \sum_i |x^iy^{n-i}| \\ &\leq (n+1)\max(|x|, |y|)^n \end{aligned}

Taking {n^{\text{th}}} roots of both sides (the tensor power trick again!) concludes the proof. \Box

We are now in a position to prove the archimedian side of the proof directly:

Theorem 5 If {|\cdot|} is a non-trivial archimedian valuation on {\mathbb{Q}}, then {|\cdot|} is equivalent to {|\cdot|_\infty}.

Proof: First of all, we note that {|n| \geq 1} for all {n \in \mathbb{N}}, as if this was not the case we would have {|m| < 1} for some {m}. By lemma 3, this would imply that {n \leq 1} for all {n \in \mathbb{Z}}, contradicting the assumption that {|\cdot|} is archimedian. Consider some {m, n \in \mathbb{Z}}. By lemma 3, we have:

\displaystyle  \begin{aligned} |m| &\leq \max(1, |n|)^{\log(m)/\log(n)} \\ &\leq |n|^{\log(m)/\log(n)} \end{aligned}

Hence {|m|^{1/\log(m)} \leq |n|^{1/\log(n)}}. Note that this expression does not depend on the choice of {m} or {n}, thus there exists a constant {c \in \mathbb{R}} such that {|n|^{1/\log(n)} = c} for all {n>1 \in \mathbb{Z}}. But then {|n| = | |n|_\infty | = c^{\log|n|_\infty} = |n|_\infty^{\log(c)}} for all such {n}, and we are done. \Box

Finally, a cute algebraic proof wraps up the non-archimedian part of Ostrowski’s theorem.

Theorem 6 If {|\cdot|} is a non-trivial non-archimedian valuation on {\mathbb{Q}}, then {|\cdot|} is equivalent to {|\cdot|_p} for some prime {p}.

Proof: Consider the set {\mathfrak{a} = \{ n \in \mathbb{Z} : |n| < 1 \}}. This set is non-empty as the valuation is non-trivial. We also know by lemma 4 that {|n| \leq 1} for all {n \in \mathbb{Z}} and so {\mathfrak{a}} is an ideal; in particular, it is a prime ideal. Thus {\mathfrak{a} = p\mathbb{Z}} for some prime {p}. Consider some rational {x = p^n\frac{a}{b}}, where {\gcd(p, ab) = 1}. Then we have:

\displaystyle  \begin{aligned} |x| &= |p^n|\cdot\left|\frac{b}{c}\right| = |p|^n \cdot |1| \\ &= e^{n\log|p|} \\ &= p^{-n \cdot (-\frac{\log|p|}{\log(p)})} \\ &= |x|_p^{-\frac{\log|p|}{\log(p)}} \end{aligned}


Tangent Spaces

Over the last day I read the first ten pages of Milnor’s famous notes, Topology from the Differentiable Viewpoint. As one might expect, these early pages are spent covering the definitions and basic facts that will be needed further along in the book. The most important of these is, of course, the following:

Definition 1 An topological manifold {M} is an {n}-dimensional smooth manifold if, for every {x \in M}, there exists a neighbourhood {U} of {x} diffeomorphic to a subset of {{\mathbb R}^n}.

photo 1

I find it quite interesting to contrast Milnor’s exposition with more modern takes, such as Lee’s Introduction to Smooth Manifolds. The biggest difference here for me was in his definition of the tangent space of a smooth manifold. Milnor does this extrinsically, again appealing to an embedding of {M}:

Definition 2 Consider some {x \in M}, and some smooth parameterisation {f: U \subset {\mathbb R}^n \rightarrow M} with {f(u) = x}. We can consider {M} to be a subset of {{\mathbb R}^m} for some {m}, and hence we can take the derivative of {f} in the usual way:

\displaystyle df_u: {\mathbb R}^n \rightarrow {\mathbb R}^m

Define {TM_x}, the tangent space of {M} at {x}, to be the image of {df_u}.

photo 2

It is a standard exercise to show that the tangent space is well defined (does not depend on the choice of parameterisation) and is an {n}-dimensional real vector space. This definition of a tangent space is very intuitive, and so I feel it’s good from a pedagogical perspective.

The problem for me is two-fold. Firstly, we often construct (smooth) manifolds by gluing together open sets of Euclidean spaces with appropriate maps. Obviously it is true that the result does in fact embed in some {{\mathbb R}^m}, but the proof is a bit gross. Because of this, it would be nice not to rely on this fact.

The second reason is generality. It would be nice to extend the notion of a tangent space to objects where the standard analysis tools might not be available. These would be things like locally ringed spaces, algebraic varieties and the like.

To deal with the first issue, one way of thinking about the tangent space is as (obviously) the set of tangent vectors at a point. Intuitively, we should be able to define this using only local data about {M}. We can make this precise as follows:

Definition 3 Consider some {x \in M}, and a smooth coordinate chart {\varphi: U \owns x \rightarrow R^n}. Let {\Gamma} be the set of curves {\gamma: (-1, 1) \rightarrow M} such that {\gamma(0) = x} and {\varphi \circ \gamma} is differentiable at {0} in the usual sense. We can form an equivalence relation {\sim} on {\Gamma} by defining {\gamma_1} and {\gamma_2} to be equivalent if {(\varphi \circ \gamma_1)'(0) = (\varphi \circ \gamma_2)'(0)}. Then {TM_x} can be identified with {\Gamma / \sim}.

photo 3

In other words, each element of the tangent space at {x} can be identified with the derivative of some curve through {x}. Again, proving that this is independent of the choice of coordinate chart is left to the reader. The chain rule is your friend.

This gets rid of the issue with embeddings. It is still not quite good enough, however – we are still making direct use of analysis in the definition. To do better, we have to get a lot more sneaky.

The idea is to note that the tangent space, being related to derivatives at {x}, essentially describes the first order behaviour of lines, surfaces etcetera passing through {x}. Taking a hint from algebraic geometry, these surfaces can be viewed as the zeros of real-valued functions of our manifold. Passing to the first order behaviour then amounts to killing off the non-linear parts of these functions.

Firstly, however, what do we mean by a real-valued functions of {M}?

Definition 4 Let {C^\infty(M)} be the set of functions {f: M \rightarrow {\mathbb R}} such that, for any smooth coordinate chart {\varphi: U \rightarrow R^n}, the composition {f \circ \varphi^{-1}} is smooth. {C^\infty(M)} forms an associative, unital algebra over {{\mathbb R}} in the obvious way.

To define the notion of surfaces passing through {x}, let {I = \{f \in C^\infty(M) : f(x) = 0 \}}. This set is the kernel of the evaluation homomorphism {f \mapsto f(x)}, and so is in fact an ideal of {C^\infty(M)}. Furthermore, the evaluation homomorphism is surjective onto a field (consider constant functions), so {I} is a maximal ideal.

We would now like to identify functions in {I} with the same first order behaviour at {x}. Morally, this should amount to identifying functions with the same derivative at {x}, much like we have done up till now. To get a feel for this, consider the {1}-dimensional case. As the functions in {I} are analytic, we can expand them about {x} with Taylor’s theorem:

\displaystyle f(u) = \sum_{k=1}^{\infty} a_k(u - x)^k

Now suppose that {f, g \in I} have the same first order behaviour, or derivative, at {x}. Then note that:

\displaystyle (f - g)'(x) = \sum_{k=1}^{\infty}k(a_k - b_k)(u - x)^{k-1} = a_1 - b_1

So in particular:

\displaystyle (f - g)(u) = c_2(u - x)^2 + c_3(u - x)^3 + \dots

Each of the terms on the right hand side is a product of two elements of {I}. In other words, functions in {I} with the same first order behaviour differ by some element of the ideal {I^2}. To identify them, we need merely quotient!

Definition 5 The cotangent space {TM^*_x} of {M} at {x} is defined as the quotient (vector) space {I/I^2}.

You read correctly – this is not quite the tangent space. The difference here is that while the tangent space is a set of vectors, {TM^*_x} as we have defined it is actually a set of linear functionals to {{\mathbb R}}. This is nothing but the dual of {TM_x}, so we can recover the tangent space by dualising again:

\displaystyle TM_x \simeq (TM^*_x)^*

Another way of seeing that {TM^*_x} is not the tangent space is by considering a smooth map {\varphi: M \rightarrow N} between two smooth manifolds. The derivative of this function gives rise to a map {d\varphi_x: TM_x \rightarrow TN_{\varphi(x)}} – for a nice exercise, try to define this!

On the other hand, consider some {f: N \rightarrow {\mathbb R}} in {TN^*_{\varphi(x)}}. Pulling back along {\varphi}, we get a map {f \circ \varphi: M \rightarrow {\mathbb R}} which vanishes at {x} – in other words, a member of {TM^*_x}! Hence {\varphi} gives rise to a map {d\varphi^*_x: TN^*_{\varphi(x)} \rightarrow TM^*_x}.

photo 4

The cool thing about how we’ve defined the cotangent space here is that, once we have an appropriate notion of functions on our space, the rest is just algebra. Thus this generalises nicely – the Zariski tangent space is exactly this idea applied to algebraic varieties, for instance!


Year Plans

It’s been a long while since I’ve put any of the mathematics I’ve been doing online. Now that I’m in my third year of my undergraduate degree, and finding myself with more time to do what I love, I think it’s a fine time to begin anew!

Most of what I write will probably be notes or streams of thought.

I’ve spent a lot of time recently playing with algebraic topology and homological algebra. I’m working through Massey’s introductory homology textbook, which presents singular homology using cubes rather than simplices. This makes a lot of things easier – for instance, proving that one can subdivide cubes into smaller cubes is trivial. On the other hand, a rigorous display of barycentric subdivision for simplices is a tedious affair.

One thing that’s struck me is the use of some basic analysis in the proof of excision. A critical lemmas used is a shrinking lemma of sorts:

Lemma 1 (Shrinking Lemma) Let {{\mathcal U}} be a collection of subsets of a space {X} such that the interiors cover {X}. Consider the homology groups {H_n(X, {\mathcal U})} generated by all singular cubes contained within some {{\mathcal U}_i}. Then the inclusion {H_n(X, {\mathcal U}) \rightarrow H_n(X)} is an isomorphism.

This lemma essentially tells us that we only need to consider cubes that are ‘small’ in some sense to compute the homology. The proof works by pulling back the cover to a given cube, using compactness and then subdividing the cube enough times that each subcube lies within a set of the cover. The Lebesgue number of the cover tells us how many times is enough – this is the smallest {\epsilon \in {\mathbb R}^+} such that every ball of radius {\epsilon} is covered by some set in the cover. Of course, this relies on the fact that a cube is a metric space.

Given the combinatorial roots of homology, I would not be surprised if there was some way of bypassing this analytic argument. I would love to see a proof using something like the nerve of the cover here instead. A theorem I find quite pleasant in this sort of direction is the following:

Theorem 2 (Nerve Theorem) Let {{\mathcal U}} be an open cover of a paracompact space {X} such that every finite intersection of sets in {{\mathcal U}} is contractible. Then the geometric realisation of the Čech nerve of {{\mathcal U}} is homotopy equivalent to {X}.


This is something I’d like to think more about in the future. Indeed, I love the interplay between combinatorics and topology that homology theory provides! A lot of combinatorial results like Sperner’s lemma have topological equivalents, such as (in this case) Brouwer’s fixed point theorem. The original proof of the Kneser conjecture, for instance, proceeded using homology. I’ll have to write about this stuff some time – it’s fascinating.

So what does this year look like for me, mathematically and otherwise?

First of all, I want to read some of the great texts in algebraic topology, and watch publically available lectures by the masters. This is due to a desire to seriously up the ante with my repertoire of tools this year. I plan to tackle most of the following:

  1. Topology from the Differentiable Viewpoint by Milnor.
  2. Morse Theory by Milnor.
  3. Characteristic Classes by Milnor and Stasheff.
  4. K-Theory by Atiyah.
  5. Differential Forms in Algebraic Topology by Bott and Tu.
  6. Algebraic Topology by Spanier.

These are all classic sets of lecture notes and/or books. As far as algebraic topology is concerned, they introduce a lot of the basic tools in various settings – I’m especially interested in the periodicity phenomena in K-theory, the crash course in Riemannian geometry nestled within Milnor’s notes, and learning a bit of cobordism. There’s a lot of overlap in this list, however, so I’ll probably be skimming here and there.

Another thing I’d like to get some familiarity with at some point (probably not this year) is surgery, {h}-cobordism and where we stand with the classification of {n}-manifolds. A book tying a lot of this together is The Wild World of 4-Manifolds by Scorpan, which now makes me associate manifolds with constellations.


On the flip side, I’ve been perusing Weibel’s famous textbook, An Introduction to Homological Algebra. It’s certainly made the constructions in homology crystal clear to me, and I’m looking forward to when things get a whole lot more hardcore.

ADE classifications are all part of the game, and having finished up Erdmann’s Introduction to Lie Algebras, I think I should move onwards to Humphrey’s Introduction to Lie Algebras and Representation Theory. The story of simple Lie algebras is a fascinating one of analogy and connection. Also, my linear algebra has improved a crazy amount reading these.

Realistically, I doubt I’ll finish reading all of these books. But I hope I can at least make a stab at each of them. I’m thinking of LaTeXing up some solutions to exercises and posting them here along with notes. We’ll see how that turns out!