MathOverflow: a miscellany (Part IV)

Nice quote-worthy answers and questions from MathOverflow every Tuesday!

Why is differentiating mechanics and integrating art?

It is often said that “Differentiation is mechanics, integration is art.” We have more or less simple rules in one direction but not in the other (e.g. product rule/simple <-> integration by parts/u-substitution/often tricky).

There are all kinds of anecdotes alluding to this fact (see e.g. this nice one from Feynman). Another consequence of this is that differentiation is well automatable within CAS but integration is often not.

We know that there is a deep symmetry based on the Fundamental theorem of calculus, yet there seems to be another fundamental structural asymmetry. What is going on here…and why?

The main objection to the question is that asymmetry between two inverse operations is more the rule than the exception in math so they are not very surprised by this behaviour.

There is no doubt about that – but, and that is a big but, there is always a good reason for that kind of behaviour! E.g. multiplying prime numbers is obviously easier than factoring the result since you have to test for the factors doing the latter. Here it is understandable how you define the original operation and its inverse.

With symbolic differentiation and integration the case doesn’t seem to be that clear cut – this is why there are so many good discussions taking place in this thread (which by the way please me very much). It is this Why at the bottom of things I am trying to understand.


One relevant thing here is that you are referring to differentiating and integrating within the class of so-called elementary functions, which are built recursively from polynomials and complex exponential and logarithmic functions and taking their closure under the arithmetic operations and composition. Here one can argue by recursion to show that the derivative of an elementary function is elementary, but the antiderivatives might not be elementary. This should surprise one no more than the fact than the square of a rational number is rational, but the square root of a rational number might be irrational. (The analogy isn’t completely idle, as shown by differential Galois theory.)

In other words, the symmetry you refer to is really based on much wider classes of functions (e.g. continuous and continuously differentiable functions), far beyond the purview of the class of elementary functions.

But let’s put that aside. The question might be: is there a mechanical procedure which will decide when an elementary function has an elementary antiderivative (and if it does, exhibit that antiderivative)? There is an almost-answer to this, the so-called Risch algorithm, which I believe is a basis for many symbolic integration packages. But see particularly the issues mentioned in the section “Decidability”.

There is another interesting asymmetry: in first-order logic, derivatives are definable in the sense that given some expansion of the structure of real numbers, say for example the real numbers as an exponential field, the derivative of a definable function is again definable by a first-order formula. But in general there is no purely first-order construction of for example the Riemann integral (involving quantification over finer and finer meshes). I seem to recall that there are similar difficulties in getting a completely satisfactory notion of integration for recursively defined functions on the surreals, due in part to the incompleteness (i.e., the many holes) in the surreal number line.


I want to try a different way of answering the question of why differentiation is somehow the “primary” operation and anti-differentiation the inverting operation. I’ll try to make it as elementary as possible.

First let me say what I don’t count as an answer. (This is not supposed to be a new contribution to the discussion — just making my starting point explicit.) It’s not enough to point out that differentiation obeys the product and chain rules and explicit differentiability of 1/x (thereby giving the quotient rule as well) and that anti-differentiation doesn’t. Somehow one wants an answer that explains why we should have expected this in advance. I was going to say something about differentiation tending to simplify functions, but then realized that that’s not really true: it may be true for polynomials but there are lots of functions for which it’s false.

The small suggestion I wanted to make was to discretize the question and think about summation versus taking difference functions. Here the situation is slightly confusing because expressions like Nn=1f(n)tend to come up more often than expressions like f(n)f(n1). But let’s forget that and think about what it is that we have to do if we want to work out Nn=1f(n) explicitly. Usually we need to guess a function g and prove that g(n)g(n1)=f(n) for every n, from which it follows by induction that g(N)=Nn=1f(n)+g(0). This (the finding of the function g) is a discrete analogue of anti-differentiating.

Looked at this way, to work out the sum we have to solve the functional equation g(n)g(n1)=f(n) (where we are given f and are required to solve for g). By contrast, when we work out the difference function we are solving the similar looking, but much easier, functional equation g(n)=f(n)f(n1). It’s much easier because the unknown function is involved only once: indeed, in a sense there’s nothing to solve at all, but experience shows that we can usually simplify the right-hand side. Of course, with the first equation we can’t simplify the left-hand side in a similar way because we don’t know what g is.

It’s a bit like the difference between solving the equation x2+x=10 and solving the equation x=102+10 (that is, the difference between algebra and arithmetic). So here is a real sense, in a closely analogous situation, where one operation is direct and the other is indirect and involves solving for something.


Differentiation is inherently a (micro-)local operation. Integration is inherently a global one.

EDIT: the reason that locality helps with symbolic differentiation of elementary functions (and is not present to help with symbolic integration of the same functions) is that the basic arithmetic operations used to build elementary functions are simpler locally than they are globally. In particular, multiplication and division become linear in the infinitesimal variables,


leading of course to the product and quotient rules which are two of the primary reasons why symbolic differentiation is so computable.

Note that properties such as holomorphicity, mentioned in the comments, are not quite as local as differentiation, because in order to be holomorphic at a point, one must not only be complex differentiable at a point, but also complex differentiable on a neighbourhood around that point. (In the jargon ofmicrolocal analysis, it is merely a local property rather than a microlocal one.)

Finally, the reason why the inverse of a local operation (differentiation) is global is because differentiation is not locally injective (constants have zero derivative). In order to eliminate this lack of injectivity, one needs to impose a global condition, such as a vanishing at one endpoint of the domain.

SECOND EDIT: Another way to see the relationship between locality and computational difficulty is to adopt a computational complexity perspective. A Newton quotient, being local, only requires O(1) operations to compute. On the other hand, a Riemann sum, being global, requires O(N) operations to compute, where N is the size of the partition. This helps explain why the former operation preserves the class of elementary (or bounded complexity) functions, while the latter does not.

(This is only if one works in the category of exact calculation. If one is instead interested in numerical calculation and is willing to tolerate small errors, then the situation becomes reversed: numerical integration, being more stable than numerical differentiation, usually has lower complexity thanks to tools such as quadrature. Here, one can turn the global nature of integration to one’s advantage, by allowing one to largely ignore small-scale structure, assuming of course that the integrand enjoys some regularity.)


Todd Trimble’s (fantastic) answer is the clear limiting factor as to why integration is harder in this setting. But I think this leaves the OP scratching his head: why do we take the point of view that differentiation is the basic operation and integration is the inverse? The answer is that there is a good reason it is harder but that people still take this point of view.

One could attempt to take the opposite point of view and assume integration is the “basic” operation and that differentiation is the opposite of integration; the basic question is then: given a function g, when do we have an f such that g(x)=C+x0f(t)dt? The answer is: whenever g is absolutely continuous. This property is quite a bit stronger than regular continuity and also quite a bit more difficult to check (unless gwere Lipschitz, say). Continuity allows you to check each point, but for absolute continuity one essentially has to look at all points in an interval simultaneously. Thus transferring theorems about integrals to theorems about derivatives is more delicate.

Even so, this is in many ways the “modern” view of the derivative in analysis, e.g. Sobolev spaces, weak solutions to differential equations, and so on. The derivative is such a poorly behaving operator that it is actually more convenient to take this approach.


A counting (or entropy) argument can also be used to heuristically indicate why inverting differentiation should not be easy. One can empirically verify that if one differentiates an elementary function of complexity N (i.e. it takes N mathematical symbols in order to describe it), one usually obtains an elementary function of complexity greater than N. (Polynomials are an exception to this rule, but they are virtually the only such exception. Not coincidentally, polynomials are one of the rare subclasses of elementary functions for which integration is easy.) If differentiation was invertible within the class of elementary functions, this would suggest that if one integrated a typical elementary function of complexity at most N, one should get an elementary function of complexity strictly less than N. (This is unfortunately not a rigorous implication, because the notion of “typical” need not be preserved by differentiation or integration, but let us suspend this issue for the sake of the heuristic.) But there are significantly more functions in the former class than the latter (the number of functions of a given complexity grows exponentially in that complexity), a contradiction.

Note though that the above argument is quite far from rigorous. There are certainly many operations (e.g. inverting a large square matrix) such that both the operation and its inverse (which, in the case of matrix inversion, is itself) map bounded complexity objects to bounded complexity objects, and typically increase complexity. However, it does suggest that in the absence of any particular reason to believe that inversion is easy, the default assumption should be that inversion is hard.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s