In the past couple of weeks, I’ve been posting about seemingly simple mathematical problems that defy intuition, and where the answers we find on the internet turn out to be shallow or hard to parse. For a taste, you might enjoy the articles on Gödel’s beavers or on infinite decimals. Today, let’s continue by asking a simple question: how many dimensions does a line have?

A trained mathematician might blurt out an answer involving vector spaces or open set coverings, but there’s no fun in that. Instead, let’s take the scenic route.

The “container” dimension

What does it mean for a space to have a certain number of dimensions? Informally, we could say that a dimension is an independent axis along which we can position a geometric object. In one-dimensional space, a point can only be moved along a single path. In 2D, we typically talk of two orthogonal axes, x and y. In three dimensions, we have x, y, and z. There’s more nuance to certain exotic or stripped-down (topological) spaces, but we don’t need to go into any of that.

The definition lends itself to a simple, common-sense way to classify the dimensionality of geometric shapes: we can look at the minimum number of spatial dimensions required to contain the object in question. A pencil sketch fits on a piece of paper, so it’s two-dimensional; a rock in your hand is 3D.

The simplest way to define the dimensionality of a shape.

Yet, this common-sense definition is unsatisfying if we consider that a lower-dimensional object might end up straddling a higher-dimensional space. If a line segment is rotated or bent, does that make it 2D? Or is the object forever one-dimensional, somehow retaining the memory of its original orientation and curvature?

One dimension or two?

We could argue either way, but no matter which option we choose, the answer doesn’t feel particularly principled. This tells us to look for a more substantive approach somewhere else.
The “degrees-of-freedom” dimension

To make progress, we can flip the script: instead of trying to contain the object within an n-dimensional space, we can analyze the dimensionality of the space within (or “under”) the object itself. Informally, the space under the earlier squiggles is one-dimensional. It’s a conga line of points; you can only move toward the front or the back. In contrast, the space under a disk (a filled circle) accommodates two-dimensional geometry:

Two types of two-dimensionality.

A simple and slightly less hand-wavy geometric test to distinguish between these cases is to pick an origin point within the shape and see how many degrees of freedom we have:

The degrees-of-freedom model of dimensionality.

Equivalently, we can ask how many different coordinates are needed to provide unambiguous “directions” from some chosen point A to B. If the points lie on a curve, a single value should do. On the surface of a filled square, we seemingly need two offsets: horizontal and vertical.

This might sound clever — and indeed, it’s clever enough to be sometimes taught in school. But then, consider the following shape:

The ambiguity of the DOF approach.

Depending on the chosen origin, we get different results. Across the limbs, we rack up an infinite number of 1-DOF readings; but there’s also a single 2-DOF reading at the center of the shape. In other words, we need to amend our definition to resolve the ambiguity. If we let the peak value prevail, the shape must be declared two-dimensional, despite not really filling a 2D space. If we use the average value, we get a more logical result: it’s infinitely many 1-DOF readings versus a single 2-DOF reading, so the average is exactly 1D. We reach the same conclusion if we categorically exclude singularities. The day is saved… or so one would hope.

Space-filling curveballs

In the previous section, we tried to draw a fundamental distinction between one-dimensional curves and shapes that fill higher-dimensional spaces.
The good news is that the resulting model is pretty neat. The bad news is that the distinction isn’t as fundamental as we hope. Consider the Hilbert curve:

The construction of the Hilbert curve.

The shape is constructed iteratively. We start by dividing the unit square into quadrants, then connect the centers of each quadrant to form a horseshoe-like curve (top, left). Next, we tile four scaled copies of the pattern, with the bottom two images rotated 90° CW and CCW (top, center). After that, we add three line segments that connect the tiled copies (top, right) back into a single curve. Finally, we use this new curve in lieu of the original horseshoe pattern as we repeat the tile-and-connect process (bottom row).

Each iteration halves the spacing between the segments of the curve (and between the curve and the perimeter of the unit square). It follows that if we iterate forever, the gaps are reduced to zero and the curve passes through each and every point within its build envelope.

By construction, the shape is a one-dimensional curve; any point along it can be unambiguously described by a single coordinate. By its final appearance, however, the curve is indistinguishable from a filled square: it wholly covers a two-dimensional region without leaving any gaps. In fact, the Hilbert curve forms a wonderfully counterintuitive mapping from 1D to 2D. The approach can be extended to higher dimensions, hinting at a peculiar, wiggly equivalence between 1- and n-dimensional spaces.

So… how do we develop a dimensionality model that seamlessly accounts for that kind of an object — or for other unusual, fractal shapes?

Enter Mr. Minkowski

There are several ways to answer this question, but by far the most intuitive is the approach outlined by Hermann Minkowski. The method is quite brilliant, so it’s a shame that it’s typically presented on the internet without explaining why it works.
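The curve’s defining property — a single coordinate that walks through every cell of the square — is easy to check numerically. Here’s a sketch in Python using a standard bit-manipulation mapping from a 1D index to 2D grid coordinates (this particular algorithm isn’t from the article; it’s one common way to compute finite Hilbert-curve iterations):

```python
def hilbert_d2xy(order, d):
    """Map a 1D index d to (x, y) on the 2^order x 2^order Hilbert curve."""
    x = y = 0
    s = 1          # size of the current sub-square
    t = d
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                # rotate/flip the quadrant as needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        t //= 4
        s *= 2
    return x, y

# At order 3, the 64 indices cover every cell of an 8x8 grid exactly once,
# and consecutive indices always land in adjacent cells: one coordinate d
# is enough to address the whole square.
points = [hilbert_d2xy(3, d) for d in range(64)]
assert len(set(points)) == 64
assert all(abs(x1 - x0) + abs(y1 - y0) == 1
           for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Raising `order` refines the curve exactly as the tile-and-connect construction does: each step quadruples the number of cells while keeping the path unbroken.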
To get started, imagine a two-dimensional, square measurement grid spanning s = 10 cells horizontally (and the same number vertically). On that grid, we place the shape to be measured, and then count the number of cells (“boxes”) that need to be filled to completely cover the shape. For a simple line segment, we might get:

Covering a one-dimensional shape.

For now, pay no mind to errors that might be introduced if the grid is too coarse; let’s assume that the resolution is high enough to capture the shape with reasonable fidelity. With this assumption in mind, if we were to switch to a finer s = 20 grid, the number of filled cells would obviously double. We can generalize this by saying that for any one-dimensional shape, at a sufficient grid resolution, the required number of boxes (N) is:

\(N_\textrm{(1D)} \approx C \cdot s^1\)

Again, N is the box count and s is the “resolution” of the grid. As for the new constant C, it’s just a property of the measured object and its placement on the grid. In the example above, C = 0.6. For a different line segment, we would get another number — but it’s always just a constant, independent of s.

Now, let’s try to cover a two-dimensional filled square:

Box-counting a filled square.

In this case, we’re essentially using boxes to measure area, and area is proportional to the square of edge length. We need N = 36 boxes for the pictured s = 10 grid; if we double the resolution to s = 20, the required cover increases four-fold. Once again, we can generalize this by saying that the cover for a two-dimensional shape is always:

\(N_\textrm{(2D)} \approx C \cdot s^2\)

As before, C is an object-specific constant; for the square above, it happens to be 0.36. We’d get a different number for a disk, but the area of a disk still scales with the square of its size. The same relationship holds true for higher-dimensional objects: the volume of a cube is proportional to the cube of the length of its sides, and so forth.
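The scaling behavior is easy to reproduce in code. The sketch below (mine, not the article’s; the specific segment and square coordinates are arbitrary choices) densely samples a line segment and a filled square with side 0.6 — matching the article’s C = 0.36 — and counts the grid cells each one touches at several resolutions:

```python
import numpy as np

def box_count(points, s):
    """Count the cells of an s x s grid over the unit square that are
    touched by a densely sampled shape."""
    return len({(int(x * s), int(y * s)) for x, y in points})

# A line segment, sampled densely along its length: the box count
# roughly doubles each time the resolution doubles (N ~ C * s^1).
t = np.linspace(0.0, 1.0, 50_000)
line = np.column_stack([0.05 + 0.83 * t, 0.12 + 0.59 * t])

# A filled square with side 0.6, sampled on a dense point lattice:
# the box count quadruples instead (N ~ C * s^2, with C = 0.36).
u = np.linspace(0.2, 0.8, 500, endpoint=False)
square = np.array([(x, y) for x in u for y in u])

for s in (10, 20, 40):
    print(s, box_count(line, s), box_count(square, s))
```

For the square, the counts come out to exactly 36, 144, and 576 — i.e., 0.36 · s²; for the line, each doubling of s roughly doubles the count, give or take the coarse-grid errors mentioned above.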
If d is the “space-filling” dimension of the shape in question, the number of boxes should be:

\(N \approx C \cdot s^d\)

If we’re given the box count for an arbitrary shape and want to solve for d, we first gotta rearrange the equation: \(s^d \approx N / C\). Next, we need an operation that’s the inverse of exponentiation. Specifically, we need a logarithm, which gives us the exponent for a known base:

\(d \approx \log_{s}(N / C)\)

For exponentiation, we have a number of familiar rules that let us simplify equations; for example, we know that \(x^a \cdot x^b = x^{a+b}\). There are very similar, albeit oft-forgotten, rules for logarithms; in particular, \(\log_a(b)\) can be rewritten as \(\log(b) / \log(a)\). This allows us to tidy up the earlier equation, getting rid of the non-standard logarithm base of s:

\(d \approx { \log(N / C) \over \log(s) }\)

Further, we can split out the division by C because \(\log(a / b) = \log(a) - \log(b)\):

\(d \approx { \log(N/C) \over \log(s) } = {\log(N) \over \log(s)} - {\log(C) \over \log(s)}\)

We have previously remarked that using a coarse grid introduces measurement errors; because of this, we want the number of cells to be as high as possible. In mathematical terms, we want s to “tend to infinity”. Under such an assumption, the second term of the equation — a fraction with a finite constant C in the numerator and an infinity-bound denominator — becomes infinitesimal. In other words, in the hypothetical s → ∞ endgame, the term can be ignored. The position and the rough outline of the measured object no longer matter; the expression simplifies to:

\(d_\textrm{box} = \lim_{s \rightarrow \infty} {\log(N) \over \log(s)}\)

This is the definition of the Minkowski dimension. Some sources use the size of a single box (ε) instead of the resolution of the measurement grid (s). In that case, the formula can be restated as:

\(d_\textrm{box} = \lim_{\varepsilon \rightarrow 0} {\log(N) \over \log(1 / \varepsilon)}\)

So, what do you do with that?
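Since we can’t take s to infinity on a computer, a common practical trick is to measure N at several resolutions and fit the slope of log(N) against log(s); the fit’s intercept absorbs the log(C) term that the limit would otherwise have to kill off. A sketch, reusing the same sampled shapes as before (again my own illustration, not code from the article):

```python
import numpy as np

def box_count(points, s):
    """Cells of an s x s grid over the unit square touched by the shape."""
    return len({(int(x * s), int(y * s)) for x, y in points})

def minkowski_dim(points, resolutions):
    """Estimate d_box as the slope of log(N) versus log(s)."""
    log_s = np.log(resolutions)
    log_n = np.log([box_count(points, s) for s in resolutions])
    slope, _intercept = np.polyfit(log_s, log_n, 1)
    return slope

t = np.linspace(0.0, 1.0, 200_000)
line = np.column_stack([0.05 + 0.83 * t, 0.12 + 0.59 * t])
u = np.linspace(0.2, 0.8, 800, endpoint=False)
square = np.array([(x, y) for x in u for y in u])

res = [20, 40, 80, 160]
print(minkowski_dim(line, res))    # close to 1
print(minkowski_dim(square, res))  # close to 2
```

The line lands near 1 and the square near 2 — with no need to know C, and no ambiguity about which origin point we picked.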
Well, for many shapes — even bizarre ones — we can intuitively reason about the relation between grid size and N. For example, for the Hilbert curve, the box-counting dimension is pretty evidently equal to 2. In contrast to the DOF approach, we don’t get different results depending on how we think about the shape.

In other instances, we can use a computer to approximate the value. The machine can simply count the boxes using an increasingly fine grid until the readings converge to a stable value. Heck, this technique can even estimate the dimensionality of complex real-world shapes, such as plant leaves or coastlines.

But here’s the kicker: for many fractal shapes, the box-counting dimension is not a whole number! For example, the Sierpiński triangle has \(d_\textrm{box} \approx 1.6\); each doubling of the grid resolution triples the box count, so the dimension works out to log(3)/log(2) ≈ 1.585. The result makes sense: the fractal is a Swiss-cheese triangle that clearly fills space better than a straight line, but worse than any triangle without a pattern of infinitely many holes.

The first steps of constructing the Sierpiński triangle.

In fact, these measurements give us a proper hierarchy; we can sort arbitrarily complex objects based on their space-filling ability. It’s quite beautiful: we started with mushy intuition, and ended up with a new way to think about geometry.

If you liked the content, please subscribe; there’s no better way to stay in touch with the writers you like.
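To see a fractional dimension fall out of the box-counting machinery, we can generate Sierpiński-triangle points with the “chaos game” — a standard trick (not from the article) that repeatedly jumps halfway toward a randomly chosen vertex — and fit the log-log slope as before:

```python
import numpy as np

def box_count(points, s):
    """Cells of an s x s grid over the unit square touched by the shape."""
    return len({(int(x * s), int(y * s)) for x, y in points})

# Chaos game: the orbit of repeated half-jumps toward random vertices
# converges onto the Sierpinski triangle.
rng = np.random.default_rng(0)
verts = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
p = np.array([0.1, 0.1])
pts = []
for _ in range(400_000):
    p = (p + verts[rng.integers(3)]) / 2
    pts.append((p[0], p[1]))
pts = pts[100:]  # discard the burn-in before the orbit settles on the fractal

resolutions = np.array([16, 32, 64, 128])
counts = [box_count(pts, s) for s in resolutions]
slope, _intercept = np.polyfit(np.log(resolutions), np.log(counts), 1)
print(slope)  # should land near log(3) / log(2) ~ 1.585
```

The estimate hovers around 1.6 — more than a line, less than a filled triangle, exactly as the Swiss-cheese intuition suggests.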