If integration is summing many small piles, we have to figure out how big the piles are. Their height is usually given by a mathematical function, and our first example will be the same as in the Feynman trick article.
\[f(x) = \frac{x - 1}{\ln{x}}\]
This is to be integrated from zero to one, i.e. we want to know the size of the shaded area in the plot below. You can think of each column of shaded pixels as one pile, and we sum the sizes of all of them to get the total area. (Of course, this is an SVG image, so there are no actual columns of pixels; put differently, the more we zoom in, the thinner the columns become, but the more of them there are.) This is why we need integration: it handles the limiting case of infinitely many, infinitely thin columns.
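As a quick numerical sanity check (a minimal sketch using NumPy, not code from this article), we can sum a finite number of thin columns and watch the total settle as the columns get thinner:

```python
import numpy as np

def f(x):
    # The integrand from above; well behaved on (0, 1), with removable
    # singularities at the endpoints (limits 0 and 1 respectively).
    return (x - 1) / np.log(x)

for n in (10, 100, 1000, 100_000):
    # Midpoints of n equally wide columns on (0, 1); using midpoints
    # conveniently avoids evaluating f at exactly 0 or 1.
    x = (np.arange(n) + 0.5) / n
    area = np.sum(f(x)) * (1 / n)  # column height times column width, summed
    print(n, area)

# The totals approach ln(2) ≈ 0.6931, the exact value of the integral.
```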
We could imagine drawing six random numbers between zero and one, and plotting piles of the corresponding height at those locations. Since there are six piles, their width is one sixth of the width of the area we are integrating.
Even though some of these piles overlap by chance, and even though there are some random gaps between them, the sum of their areas (0.66) comes very close to the actual shaded area determined analytically (0.69). If we draw more piles, we have to make them correspondingly thinner, but the agreement between their sum and the total size of the area improves.
These are 100× as many piles, and they’re 1/100th as thick to compensate. Their total area is 0.70 – very close to 0.69. If we draw even more piles, we’ll get even closer.
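In code, the random-pile experiment might look like the following sketch (again assuming NumPy; the seed is arbitrary and the exact estimates will vary from run to run):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return (x - 1) / np.log(x)

for n in (6, 600, 60_000):
    x = rng.uniform(0.0, 1.0, size=n)  # n random pile locations in (0, 1)
    estimate = np.sum(f(x)) / n        # each pile is f(x) tall and 1/n wide
    print(n, round(estimate, 3))

# With only 6 piles the estimate wobbles around the true value;
# with 600 or more it reliably lands close to ln(2) ≈ 0.693.
```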
This illustrates a neat correspondence between integrals and expected values. In the simple case, we can frame it mathematically as
\[\int_a^b f(x) \, \mathrm{d}x = (b - a) \, E\bigl(f(X)\bigr), \qquad X \sim \mathrm{Uniform}(a, b)\]
In words, this says that integrating the function \(f\) between \(a\) and \(b\) is the same as taking the expected value of \(f\) at uniformly distributed random points between \(a\) and \(b\), multiplied by the length of the interval. In our example the interval is \([0, 1]\), so that factor is simply one and the integral equals the expected value directly.
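To make the scaling factor concrete (a hedged sketch; the integrand \(x^2\) and the interval \([1, 3]\) are my own arbitrary choices, not from the article), a Monte Carlo estimate over \([a, b]\) is just the sample mean of \(f\) multiplied by \(b - a\):

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_integral(f, a, b, n=100_000):
    # E[f(X)] for X ~ Uniform(a, b), scaled by the interval length (b - a).
    x = rng.uniform(a, b, size=n)
    return (b - a) * np.mean(f(x))

# Example: the integral of x^2 over [1, 3] is (27 - 1)/3 ≈ 8.667.
print(mc_integral(lambda x: x**2, 1.0, 3.0))
```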