MathOverflow: a miscellany (Part V)

Nice quote-worthy answers and questions from MathOverflow every Tuesday!


Proofs that require fundamentally new ways of thinking

I

Let me give some examples and say why I would be interested in more. Perhaps what the examples have in common is that a powerful and unexpected technique is introduced that comes to seem very natural once you are used to it.

Example 1. Euler’s proof that there are infinitely many primes.

If you haven’t seen anything like it before, the idea that you could use analysis to prove that there are infinitely many primes is completely unexpected. Once you’ve seen how it works, that’s a different matter, and you are ready to contemplate trying to do all sorts of other things by developing the method.
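(A quick sketch of the analytic idea, for readers who haven’t seen it: since every integer has a prime factorization, $\prod_{p \le N}(1 - 1/p)^{-1} \ge \sum_{n \le N} 1/n$, and the harmonic sum on the right diverges as $N \to \infty$, so there cannot be only finitely many primes; pushed slightly further, the same comparison shows that $\sum_p 1/p$ diverges.)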

Example 2. The use of complex analysis to establish the prime number theorem.

Even when you’ve seen Euler’s argument, it still takes a leap to look at the complex numbers. (I’m not saying it can’t be made to seem natural: with the help of Fourier analysis it can. Nevertheless, it is a good example of the introduction of a whole new way of thinking about certain questions.)
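(Very roughly: one studies $\zeta(s) = \sum_{n \ge 1} n^{-s} = \prod_p (1 - p^{-s})^{-1}$ as a function of a complex variable $s$, and the key analytic fact is that $\zeta(s) \ne 0$ on the line $\mathrm{Re}(s) = 1$; from this, contour-integration arguments extract the asymptotic $\pi(x) \sim x/\log x$.)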

Example 3. Variational methods.

You can pick your favourite problem here: one good one is determining the shape of a heavy chain in equilibrium.
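(In that problem one minimizes the potential energy $\int y \, ds$ of the chain over all curves of the given fixed length joining the two endpoints; the Euler–Lagrange equation for this constrained problem yields the catenary $y = a \cosh((x - x_0)/a) + c$.)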

Example 4. Erdős’s lower bound for Ramsey numbers.

One of the very first results (Shannon’s bound for the size of a separated subset of the discrete cube being another very early one) in probabilistic combinatorics.
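(The argument is a few lines once conceived: colour the edges of $K_n$ red or blue independently at random; the expected number of monochromatic copies of $K_k$ is $\binom{n}{k} 2^{1 - \binom{k}{2}}$, and if this quantity is less than 1 then some colouring has no monochromatic $K_k$ at all, giving the lower bound $R(k,k) > n$, which already works for $n$ around $2^{k/2}$.)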

Example 5. Roth’s proof that a dense set of integers contains an arithmetic progression of length 3.

Historically this was by no means the first use of Fourier analysis in number theory. But it was the first application of Fourier analysis to number theory that I personally properly understood, and that completely changed my outlook on mathematics. So I count it as an example (because there exists a plausible fictional history of mathematics where it was the first use of Fourier analysis in number theory).

Example 6. Use of homotopy/homology to prove fixed-point theorems.

Once again, if you mount a direct attack on, say, the Brouwer fixed point theorem, you probably won’t invent homology or homotopy (though you might do if you then spent a long time reflecting on your proof).
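(The standard argument, in brief: if $f : D^n \to D^n$ had no fixed point, one could define a retraction $r : D^n \to S^{n-1}$ by sending $x$ to the point where the ray from $f(x)$ through $x$ meets the boundary; but no such retraction exists, since it would force the identity on $\tilde H_{n-1}(S^{n-1}) \cong \mathbb{Z}$ to factor through $\tilde H_{n-1}(D^n) = 0$.)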

The reason these proofs interest me is that they are the kinds of arguments where it is tempting to say that human intelligence was necessary for them to have been discovered. It would probably be possible in principle, if technically difficult, to teach a computer how to apply standard techniques, the familiar argument goes, but it takes a human to invent those techniques in the first place.

Now I don’t buy that argument. I think that it is possible in principle, though technically difficult, for a computer to come up with radically new techniques. Indeed, I think I can give reasonably good Just So Stories for some of the examples above. So I’m looking for more examples. The best examples would be ones where a technique just seems to spring from nowhere — ones where you’re tempted to say, “A computer could never have come up with that.”

Edit: I agree with the first two comments below, and was slightly worried about that when I posted the question. Let me have a go at it though. The difficulty with, say, proving Fermat’s last theorem was of course partly that a new insight was needed. But that wasn’t the only difficulty at all. Indeed, in that case a succession of new insights was needed, and not just that but a knowledge of all the different already existing ingredients that had to be put together. So I suppose what I’m after is problems where essentially the only difficulty is the need for the clever and unexpected idea. I.e., I’m looking for problems that are very good challenge problems for working out how a computer might do mathematics. In particular, I want the main difficulty to be fundamental (coming up with a new idea) and not technical (having to know a lot, having to do difficult but not radically new calculations, etc.). Also, it’s not quite fair to say that the solution of an arbitrary hard problem fits the bill. For example, my impression (which could be wrong, but that doesn’t affect the general point I’m making) is that the recent breakthrough by Nets Katz and Larry Guth in which they solved the Erdős distinct distances problem was a very clever realization that techniques that were already out there could be combined to solve the problem. One could imagine a computer finding the proof by being patient enough to look at lots of different combinations of techniques until it found one that worked. Now their realization itself was amazing and probably opens up new possibilities, but there is a sense in which their breakthrough was not a good example of what I am asking for.

While I’m at it, here’s another attempt to make the question more precise. Many many new proofs are variants of old proofs. These variants are often hard to come by, but at least one starts out with the feeling that there is something out there that’s worth searching for. So that doesn’t really constitute an entirely new way of thinking. (An example close to my heart: the Polymath proof of the density Hales-Jewett theorem was a bit like that. It was a new and surprising argument, but one could see exactly how it was found since it was modelled on a proof of a related theorem. So that is a counterexample to Kevin’s assertion that any solution of a hard problem fits the bill.) I am looking for proofs that seem to come out of nowhere and seem not to be modelled on anything.

Further edit. I’m not so keen on random massive breakthroughs. So perhaps I should narrow it down further — to proofs that are easy to understand and remember once seen, but seemingly hard to come up with in the first place.

II

Grothendieck’s insight into how to deal with the problem that, whatever topology you define on varieties over finite fields, you never seem to get enough open sets. You simply have to redefine what is meant by a topology, allowing open sets not to be subsets of your space but to be covers.

I think this fits the bill of “seem very natural once you are used to it”, but it was an amazing insight, and totally fundamental in the proof of the Weil conjectures.
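(Roughly, and glossing over details: a Grothendieck topology on a category specifies, for each object $U$, which families of maps $\{U_i \to U\}$ count as “covers”, subject to axioms modelled on open covers: isomorphisms cover, covers are stable under pullback, and a cover of a cover is a cover. A presheaf $F$ is then a sheaf if sections over $U$ can be glued from compatible sections over the $U_i$. For the étale topology on a scheme, the covers are jointly surjective families of étale maps, and this is the topology whose cohomology enters the proof of the Weil conjectures.)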

III

Gromov is someone who has arguably introduced more radical thoughts into mathematics than anyone else. Examples involving groups with polynomial growth and holomorphic curves have already been cited in other answers to this question. I have two other obvious ones but there are many more.

I don’t remember where I first learned about convergence of Riemannian manifolds, but I had to laugh, because there’s no way I would ever have conceived of such a notion. To be fair, all of the groundwork for this was laid out in Cheeger’s thesis, but it was Gromov who reformulated everything as a convergence theorem and recognized its power.

Another time Gromov made me laugh was when I was reading what little I could understand of his book Partial Differential Relations. This book is probably full of radical ideas that I don’t understand. The one I did understand was his approach to solving the linearized isometric embedding equation. His radical, absurd, but elementary idea was that if the system is sufficiently underdetermined, then the linear partial differential operator could be inverted by another linear partial differential operator. Both the statement and the proof are for me the funniest in mathematics. Most of us view solving PDEs as something that requires hard work, involving analysis and estimates, and Gromov manages to do it using only elementary linear algebra. This then allows him to establish the existence of isometric embeddings of Riemannian manifolds in a wide variety of settings.

IV

My favorite example from algebraic topology is René Thom’s work on cobordism theory. The problem of classifying manifolds up to cobordism looks totally intractable at first glance. In low dimensions (0, 1, 2), it is easy, because manifolds of these dimensions are completely known. With hard manual labor, one can maybe treat dimensions 3 and 4. But in higher dimensions, there is no chance to proceed by geometric methods.

Thom came up with a geometric construction (generalizing earlier work by Pontrjagin), which is at the same time easy to understand and ingenious. Embed the manifold into a sphere, collapse everything outside a tubular neighborhood to a point and use the Gauss map of the normal bundle… What this construction does is to translate the geometric problem into a homotopy problem, which looks totally unrelated at first sight.
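(In modern language, the outcome of this construction is an isomorphism between the group of $n$-manifolds up to cobordism and a stable homotopy group of a Thom space: for unoriented cobordism, $\mathfrak{N}_n \cong \pi_{n+k}(MO(k))$ for $k$ large, where $MO(k)$ is the Thom space of the universal $k$-plane bundle.)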

The homotopy problem is still difficult, but thanks to work by Serre, Cartan, Steenrod, Borel, Eilenberg and others, Thom had enough heavy guns at hand to get fairly complete results.

Thom’s work set off an explosion of differential topology, leading to Hirzebruch’s signature theorem, the Hirzebruch–Riemann–Roch theorem, Atiyah–Singer, the Milnor–Kervaire classification of exotic spheres, and onward to Madsen–Weiss’s work on mapping class groups.

V

The method of forcing certainly fits here. Before, set theorists expected that independence results would be obtained by building non-standard, ill-founded models, and model theoretic methods would be key to achieve this. Cohen’s method begins with a transitive model and builds another transitive one, and the construction is very different from all the techniques being tried before.

This was completely unexpected. Of course, in hindsight, we see that there are similar approaches in recursion theory and elsewhere happening before or at the same time.

But it was the fact that nobody could imagine you would be able to obtain transitive models that mostly had us stuck.

VI

Sometimes mathematics is not only about the methods of the proof; it is about the statement of the theorem. E.g., it is hard to imagine a theorem-searching algorithm ever finding a proof of the results in Shannon’s 1948 Mathematical Theory of Communication, without that algorithm first “imagining” (by some unspecified process) that there could BE a theory of communication.

Even so celebrated a mathematician as J. L. Doob at first had trouble grasping that Shannon’s reasoning was mathematical in nature, writing in his AMS review (MR0026286):

[Shannon’s] discussion is suggestive throughout, rather than mathematical, and it is not always clear that the author’s mathematical intentions are honorable.

The decision of which mathematical intentions are to be accepted as “honorable” (in Doob’s phrase) is perhaps very difficult to formalize.

[added reference]

One finds this same idea expressed in von Neumann’s 1948 essay The Mathematician:

Some of the best inspirations of modern mathematics (I believe, the best ones) clearly originated in the natural sciences. … As any mathematical discipline travels far from its empirical source, or still more, if it is a second or third generation only indirectly inspired by ideas coming from “reality”, it is beset by very grave dangers. It becomes more and more purely aestheticizing, more and more l’art pour l’art. … Whenever this stage is reached, the only remedy seems to me to be the rejuvenating return to the source: the reinjection of more or less directly empirical ideas.

One encounters this theme of inspiration from reality over-and-over in von Neumann’s own work. How could a computer conceive theorems in game theory … without having empirically played games? How could a computer conceive the theory of shock waves … without having empirically encountered the intimate union of dynamics and thermodynamics that makes shock wave theory possible? How could a computer conceive theorems relating to computational complexity … without having empirically grappled with complex computations?

The point is straight from Wittgenstein and E. O. Wilson: in order to conceive mathematical theorems that are interesting to humans, a computer would have to live a life similar to an ordinary human life, to serve as a source of inspiration.

VII

It seems that certain problems induce this sort of new thinking (cf. my article “What is good mathematics?”). You mentioned the Fourier-analytic proof of Roth’s theorem; but in fact many of the proofs of Roth’s theorem (or Szemerédi’s theorem) seem to qualify, starting with Furstenberg’s amazing realisation that this problem in combinatorial number theory was equivalent to one in ergodic theory, and that the structural theory of the latter could then be used to attack the former. Or the Ruzsa–Szemerédi observation (made somewhat implicitly at the time) that Roth’s theorem follows from a result in graph theory (the triangle removal lemma) which, in some ways, was “easier” to prove than the result that it implied, despite (or perhaps because of) the fact that it “forgot” most of the structure of the problem. And in this regard, I can’t resist mentioning Ben Green’s brilliant observation (inspired, I believe, by some earlier work of Ramaré and Ruzsa) that for the purposes of finding arithmetic progressions, the primes should not be studied directly, but instead should be viewed primarily [pun not intended] as a generic dense subset of a larger set of almost primes, for which much more is known, thanks to sieve theory…

Another problem that seems to generate radically new thinking every few years is the Kakeya problem. Originally a problem in geometric measure theory, the work of Bourgain and Wolff in the early 90s showed that the combinatorial incidence geometry viewpoint could lead to substantial progress. When this stalled, Bourgain (inspired by your own work) introduced the additive combinatorics viewpoint, re-interpreting line segments as arithmetic progressions. Meanwhile, Wolff created the finite field model of the Kakeya problem, which among other things led to the sum-product theorem and many further developments that would not have been possible without this viewpoint. In particular, this finite field version enabled Dvir to introduce the polynomial method, which had been applied to some other combinatorial problems, but whose application to the finite field Kakeya problem was hugely shocking. (Actually, Dvir’s argument is a great example of “new thinking” being the key stumbling block. Five years earlier, Gerd Mockenhaupt and I managed to stumble upon half of Dvir’s argument, showing that a Kakeya set in finite fields could not be contained in a low-degree algebraic variety. If we had known enough about the polynomial method to make the realisation that the exact same argument also showed that a Kakeya set could not have been contained in a high-degree algebraic variety either, we would have come extremely close to recovering Dvir’s result; but our thinking was not primed in this direction.) Meanwhile, Bennett, Carbery, and I discovered that heat flow methods, of all things, could be applied to solve a variant of the Euclidean Kakeya problem (though this method did appear in the literature on other analytic problems, and we viewed it as the continuous version of the discrete induction-on-scales strategy of Bourgain and Wolff). Most recently, there is the work of Guth, who broke through the conventional wisdom that Dvir’s polynomial argument was not generalisable to the Euclidean case by making the crucial observation that algebraic topology (such as the ham sandwich theorem) served as the continuous generalisation of the discrete polynomial method, leading among other things to the recent result of Guth and Katz you mentioned earlier.
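(For readers who have not seen it, Dvir’s argument really is startlingly short. If $E \subseteq \mathbb{F}_q^n$ contains a line in every direction but is too small, say $|E| < \binom{q-1+n}{n}$, then linear algebra produces a nonzero polynomial $P$ of degree at most $q-1$ vanishing on $E$. Restricting $P$ to any line contained in $E$ gives a univariate polynomial of degree at most $q-1$ with $q$ roots, hence identically zero; it follows that the top-degree homogeneous part of $P$ vanishes in every direction, i.e. everywhere, contradicting $P \ne 0$. So Kakeya sets in $\mathbb{F}_q^n$ have size at least $\binom{q-1+n}{n} \ge q^n/n!$.)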

EDIT: Another example is the recent establishment of universality for eigenvalue spacings for Wigner matrices. Prior to this work, most of the rigorous literature on eigenvalue spacings relied crucially on explicit formulae for the joint eigenvalue distribution, which were only tractable in the case of highly invariant ensembles such as GUE, although there was a key paper of Johansson extending this analysis to a significantly wider class of ensembles, namely the sum of GUE with an arbitrary independent random (or deterministic) matrix. To make progress, one had to go beyond the explicit formula paradigm and find some way to compare the distribution of a general ensemble with that of a special ensemble such as GUE. We now have two basic ways to do this: the local relaxation flow method of Erdős, Schlein, and Yau, and the four moment theorem method of Van Vu and myself, both based on deforming a general ensemble into a special ensemble and controlling the effect on the spectral statistics via this deformation (though the two deformations we use are very different, and in fact complement each other nicely). Again, both arguments have precedents in earlier literature (for instance, our argument was heavily inspired by Lindeberg’s classic proof of the central limit theorem), but as far as I know it had not been thought to apply them to the universality problem before.

VIII

Technically, the following are not proofs, or even theorems, but I think they count as insights that have the quality that it’s hard to imagine computers coming up with them. First, there’s:

Mathematics can be formalized.

Along the same lines, there’s:

Computability can be formalized.

If you insist on examples of proofs then maybe I’d be forced to cite the proof of Gödel’s incompleteness theorem or of the undecidability of the halting problem, but to me the most difficult step in these achievements was the initial daring idea that one could even formulate a mathematically satisfactory definition of something as amorphous as “mathematics” or “computability.” For example, one might argue that the key step in Turing’s proof was diagonalization, but in fact diagonalization was a major reason that Gödel thought one couldn’t come up with an “absolute” definition of computability.
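(To recall how short the diagonal step itself is: if some program $H$ could decide, for every program $M$ and input $x$, whether $M$ halts on $x$, one could build a program $D$ that, on input $M$, runs $H(M, M)$ and then does the opposite, halting if $M$ would loop on itself and looping if $M$ would halt; feeding $D$ its own description yields a contradiction. The hard part, as the answer says, was not this argument but the conviction that “program” could be given an exact mathematical meaning at all.)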

Nowadays we are so used to thinking of mathematics as something that can be put on a uniform axiomatic foundation, and of computers as a part of the landscape, that we can forget how radical these insights were. In fact, I might argue that your entire question presupposes them. Would computers have come up with these insights if humans had not imagined that computers were possible and built them in the first place? Less facetiously, the idea that mathematics is a formally defined space in which a machine can search systematically clearly presupposes that mathematics can be formalized.

More generally, I’m wondering if you should expand your question to include concepts (or definitions) and not just proofs?

Edit. Just in case it wasn’t clear, I believe that the above insights have fundamentally changed mathematicians’ conception of what mathematics is, and as such I would argue that they are stronger examples of what you asked for than any specific proof of a specific theorem can be.
