Counterintuitive properties of high-dimensional spaces

I have been enjoying 3Blue1Brown’s YouTube videos for quite a while. He does a terrific job of using visualization and animation to convey otherwise abstract and obscure mathematical ideas. I am inspired by his recent, stunning video – The most beautiful formula not enough people understand – that discusses the volumes of high-dimensional balls. His sense of humor (for example, when introducing Donald Knuth around 39:25) is an added bonus to the entertainment of the show.

This post extends some of the ideas from the video regarding how high-dimensional geometry is counterintuitive. This makes visualization hard. However, understanding these phenomena is crucial to statistics and machine learning, where high-dimensional vectors prevail. Some of the phenomena are tied to the concentration of measure.

For starters, let us summarize the surface areas ($A_n$) and ball volumes ($V_n$) in $\mathbb{R}^n$ in the following table.

\[\begin{array}{c|cccccccc} d & 0 & 1 & 2 & 3 & \cdots & n-2 & n-1 & n \\ \hline A_d(r) & 0 & 2 & \color{red} 2\pi r & 4\pi r^2 & \cdots & & & \color{orange} \dfrac{n\pi^{\frac{n}{2}}}{\Gamma(\frac{n}{2}+1)}r^{n-1} \\ V_d(r) & 1 & 2r & \pi r^2 & \dfrac{4}{3}\pi r^3 & \cdots & \color{green} \dfrac{\pi^{\frac{n}{2}-1}}{\Gamma(\frac{n}{2})}r^{n-2} & & \\ \end{array}\]

A word about notation. We use $A_n$ to denote the surface area of an $n$-dimensional ball. I do not use the word “sphere” in this blog post to avoid confusion. Formally, a sphere living in $\mathbb{R}^n$ is called an $(n-1)$-dimensional sphere, or $(n-1)$-sphere, defined by the equation $x_1^2+\cdots+x_n^2=r^2$. The area of an $(n-1)$-sphere is generally denoted by $S_{n-1}$. It is the dimension concept $n-1$ versus $n$ that easily causes confusion. In this blog post, I call the sphere “the surface of an $n$-dimensional ball” instead. Its area is called the surface area and is denoted by $A_n$. One can do the equivalent substitution $S_{n-1} = A_n$.

We only need two formulas (together with the initial conditions at $n=0$ and $1$) to establish the above table. The first formula essentially says that the volume is an integration over the surface area, which is straightforward:

\[A_n(r) = \frac{d}{dr} V_n(r).\]

The second formula comes from Archimedes. He found that the surface area of a 3-dimensional ball, $A_3(r)$, is the same as the side area of the cylinder circumscribing the ball (that is, the cylinder has a height $2r$ and base radius $r$). In other words, $A_3(r) = 2r \times 2\pi r = V_1(r) \times A_2(r)$. He proved this by projecting the surface of the 3-dimensional ball onto the side surface of the cylinder. See 3Blue1Brown’s video (starting from 21:35) for a visual explanation of the proof. Generalizing to higher dimensions, here is the critical formula:

\[{\color{orange} A_n(r)} = {\color{green} V_{n-2}(r)} \times {\color{red} A_2(r)}.\]

In fact, I do not see how the 3-dimensional intuition generalizes to high dimensions, but the formula is correct regardless and it can be proved from other angles. It is the beautiful animation from the video that makes the formula memorable.

With the above two formulas, we can show that

\[\boxed{A_n(r) = \frac{n\pi^{\frac{n}{2}}}{\Gamma(\frac{n}{2}+1)}r^{n-1} \quad\text{and}\quad V_n(r) = \frac{\pi^{\frac{n}{2}}}{\Gamma(\frac{n}{2}+1)}r^{n}.}\]
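The boxed formulas can be sanity-checked numerically. Substituting the second formula into the first gives $V_n(r) = \int_0^r V_{n-2}(\rho) \, 2\pi\rho \, d\rho$, so the coefficient $c_n = V_n(1)$ satisfies $c_n = \frac{2\pi}{n} c_{n-2}$ with $c_0 = 1$ and $c_1 = 2$. The following is a minimal Python sketch, with function names of my own choosing, that compares this recurrence with the closed form and checks the generalized Archimedes relation.

```python
import math

def unit_ball_volume(n: int) -> float:
    """Closed form V_n(1) = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def ball_volume(n: int, r: float) -> float:
    """V_n(r) = V_n(1) * r^n."""
    return unit_ball_volume(n) * r ** n

def surface_area(n: int, r: float) -> float:
    """A_n(r) = d/dr V_n(r) = n * V_n(1) * r^(n-1)."""
    return n * unit_ball_volume(n) * r ** (n - 1)

def unit_ball_volume_recursive(n: int) -> float:
    """c_n = (2*pi/n) * c_(n-2), starting from c_0 = 1 and c_1 = 2."""
    if n == 0:
        return 1.0
    if n == 1:
        return 2.0
    return 2 * math.pi / n * unit_ball_volume_recursive(n - 2)

r = 1.5  # arbitrary radius used for the check
for n in range(2, 11):
    closed = unit_ball_volume(n)
    recursive = unit_ball_volume_recursive(n)
    # Generalized Archimedes relation: A_n(r) = V_(n-2)(r) * A_2(r)
    archimedes_gap = surface_area(n, r) - ball_volume(n - 2, r) * surface_area(2, r)
    print(f"n={n:2d}  closed={closed:.6f}  recursive={recursive:.6f}  gap={archimedes_gap:.1e}")
```

The printed gap should be zero up to floating-point rounding for every $n$.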

Many counterintuitive phenomena of high-dimensional spaces start from this result.

High-dimensional balls are infinitesimal. The volume of the unit ball (in fact, any ball of a fixed radius $r$) vanishes as the dimension grows, because

\[V_n(1) = \frac{\pi^{\frac{n}{2}}}{\Gamma(\frac{n}{2}+1)} \to 0 \text{ as } n \to \infty.\]

When $n$ is an even integer, the gamma function is the factorial and hence $V_n(1) = \frac{\pi^{n/2}}{(n/2)!}$. This is the “most beautiful formula not enough people understand” in 3Blue1Brown’s mind. We shall see how it plays a role in the visualization of a high-dimensional cube.
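To see how quickly the unit-ball volume vanishes, here is a small Python sketch. It works with the logarithm of the formula (via `math.lgamma`) to avoid overflow in the gamma function; the helper name is my own.

```python
import math

def log_unit_ball_volume(n: int) -> float:
    """log V_n(1) = (n/2) * log(pi) - log Gamma(n/2 + 1)."""
    return (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)

# Unit-ball volumes for dimensions 1..100
volumes = {n: math.exp(log_unit_ball_volume(n)) for n in range(1, 101)}

for n in (1, 2, 3, 5, 10, 20, 50, 100):
    print(f"n={n:3d}  V_n(1)={volumes[n]:.3e}")

# Dimension at which the unit-ball volume is largest
peak = max(volumes, key=volumes.get)
print("largest unit-ball volume at n =", peak)
```

The volume first increases, peaks in a low dimension, and then decays faster than exponentially because the gamma function in the denominator eventually dominates $\pi^{n/2}$.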

High-dimensional cubes are spiky. Imagine a box of side-length 2 in $\mathbb{R}^2$. A ball of radius 1 is placed at each of the four corners of the box. An additional ball at the center of the box, tangent to all the outer balls, has a radius $\sqrt{2}-1$.

Similarly, in $\mathbb{R}^3$, eight balls of radius 1 are placed at the eight corners of a cube with side-length 2. An additional ball at the center of the cube, tangent to all the outer balls, has a radius $\sqrt{3}-1$.

Following this logic, in $\mathbb{R}^4$, the centered ball has a radius $\sqrt{4}-1=1$, which means that it touches not only the corner balls, but also the faces of the cube. Then, in $\mathbb{R}^5$, the radius $\sqrt{5}-1>1$, suggesting that the centered ball swells outside of the cube. Our visual intuition starts to fail.
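The pattern is easy to tabulate. The sketch below (helper names are my own) computes the radius $\sqrt{n}-1$ of the centered ball and compares it with the distance from the center to a face of the cube, which is always 1.

```python
import math

HALF_SIDE = 1.0  # distance from the center to each face of the side-2 cube

def center_ball_radius(n: int) -> float:
    """Center-to-corner distance sqrt(n) minus the corner-ball radius 1."""
    return math.sqrt(n) - 1.0

for n in range(2, 11):
    radius = center_ball_radius(n)
    if math.isclose(radius, HALF_SIDE):
        status = "touches the faces of the cube"
    elif radius < HALF_SIDE:
        status = "stays inside the cube"
    else:
        status = "pokes outside the cube"
    print(f"n={n:2d}  radius={radius:.4f}  {status}")
```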

How should we picture the $n$-dimensional cube and the centered ball?

The following images, respectively, visualize the 2-dimensional setup, the 3-dimensional setup, the spiky characteristic of the $n$-dimensional cube, and the full visualization of the cube, which has $2^n$ corners, together with $2^n$ outer balls at the corners.

Snapshots from 3Blue1Brown's video.

I fully appreciate the spiky visualization of the $n$-dimensional cube. It is spiky because the distance from the center to a corner keeps increasing, while the distance to a face is fixed. In the fourth image, the centered ball, indicated by the big green circle, swells so much that it contains the cube along most of the directions. Numerically, we shall be convinced that this is indeed the case, as the volume of the cube is $2^n$, while the volume of the centered ball is

\[V_n(\sqrt{n}-1) = \frac{\pi^{\frac{n}{2}}(\sqrt{n}-1)^n}{\Gamma(\frac{n}{2}+1)}.\]

To understand how big it is, we perform a Taylor approximation on its logarithm and obtain

\[\log \frac{\pi^{\frac{n}{2}}(\sqrt{n}-1)^n}{\Gamma(\frac{n}{2}+1)} = \frac{1+\log2\pi}{2} n - \sqrt{n} -\frac{1}{2}\log n - \frac{1+\log\pi}{2} + O\left(\frac{1}{\sqrt{n}}\right) > \log 2^n \quad\text{for large } n.\]

There are more interesting things to say pertaining to the fourth image. Let us keep increasing $n$. Recall that $n$-dimensional unit balls have a vanishing volume. Hence, the corner balls will be smaller and smaller as $n$ increases. Meanwhile, the cube will be more and more skinny with a shrinking heart. That is, the middle white circle, with a fixed radius 1, shrinks. From the above formula, we see that the ratio between the volume of the centered ball and that of the cube increases exponentially as $e^{\frac{1+\log2\pi}{2} n} / 2^n \approx 2.07^n$, ignoring lower-order multiplicative terms.
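The expansion and the growth rate can be checked numerically. The following sketch (function names are my own) evaluates the exact log-volume with `math.lgamma`, compares it with the expansion after dropping the $O(1/\sqrt{n})$ remainder, and finds the first dimension in which the centered ball is larger in volume than the cube.

```python
import math

def log_center_ball_volume(n: int) -> float:
    """Exact log V_n(sqrt(n) - 1), valid for n >= 2."""
    radius = math.sqrt(n) - 1.0
    return (n / 2) * math.log(math.pi) + n * math.log(radius) - math.lgamma(n / 2 + 1)

def log_center_ball_volume_asymptotic(n: int) -> float:
    """The expansion from the text without the O(1/sqrt(n)) remainder."""
    a = (1 + math.log(2 * math.pi)) / 2
    b = (1 + math.log(math.pi)) / 2
    return a * n - math.sqrt(n) - 0.5 * math.log(n) - b

for n in (10, 100, 1000, 10000):
    exact = log_center_ball_volume(n)
    approx = log_center_ball_volume_asymptotic(n)
    print(f"n={n:6d}  exact={exact:12.4f}  approx={approx:12.4f}  diff={exact - approx:+.4f}")

# Per-dimension growth factor of (ball volume) / (cube volume), leading order only
growth_base = math.exp((1 + math.log(2 * math.pi)) / 2) / 2
print(f"growth base = {growth_base:.4f}")

# First dimension where the centered ball has more volume than the cube
first_n = next(n for n in range(2, 1000) if log_center_ball_volume(n) > n * math.log(2))
print("centered ball first exceeds the cube in volume at n =", first_n)
```

The difference between the exact and approximate values should shrink roughly like $1/\sqrt{n}$, in line with the remainder term.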

“The mass of an orange is mostly on the skin.” While you can take this statement literally (with a slight exaggeration) when complaining about the thick skins of the oranges you buy, the metaphor here reflects a fact about a high-dimensional ball: its volume mostly lies near the surface.

This is not hard to understand, because the surface-area-to-volume ratio $\frac{A_n(r)}{V_n(r)} = \frac{n}{r}$, which grows proportionally with respect to $n$. For a ball with radius $r$, a thin layer near the surface (say, $\delta r$) can constitute a large portion of the volume when $n$ is large:

\[\frac{V_n(r)-V_n(r-\delta r)}{V_n(r)} \approx \frac{A_n(r) \delta r}{V_n(r)} = n \frac{\delta r}{r}.\]

Here, the approximation sign results from using finite difference to approximate derivatives, which is relatively accurate when $\frac{\delta r}{r} \ll 1$.

The mathematically precise statement is the following: Given any $\epsilon \in (0,1)$,

\[V_n(r)-V_n(r-\delta r) \ge \epsilon V_n(r) \quad\text{holds as long as}\quad \delta r \ge [1 - (1 - \epsilon)^{\frac{1}{n}}] r.\]

For example, in $n=100$ dimensions, even an aggressive $\epsilon = 99\%$ can be satisfied with $\delta r \approx 0.045 r$.
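Because $V_n(r)$ is proportional to $r^n$, the shell fraction is simply $1 - (1 - \frac{\delta r}{r})^n$, which makes the statement easy to tabulate. A minimal sketch, with helper names of my own:

```python
import math

def shell_thickness_fraction(n: int, eps: float) -> float:
    """Smallest delta_r / r whose outer shell holds a fraction eps of V_n(r)."""
    return 1.0 - (1.0 - eps) ** (1.0 / n)

def shell_volume_fraction(n: int, thickness: float) -> float:
    """Fraction of V_n(r) within relative distance `thickness` of the surface."""
    return 1.0 - (1.0 - thickness) ** n

for n in (3, 10, 100, 1000):
    t = shell_thickness_fraction(n, 0.99)
    print(f"n={n:5d}  delta_r/r for 99% of the volume = {t:.4f}")

# First-order estimate n * delta_r / r versus the exact shell fraction
n, t = 100, 0.001
print(f"exact={shell_volume_fraction(n, t):.4f}  first-order={n * t:.4f}")
```

In 3 dimensions the 99% shell has to cover most of the radius, whereas in 1000 dimensions a shell well under 1% of the radius suffices.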

Random high-dimensional vectors are nearly orthogonal. This near-orthogonality lemma will later give us a further, equally counterintuitive, understanding of the concentration of volumes. To be precise, two independent Gaussian random vectors are nearly orthogonal when $n$ is large. Let $x,y \sim \mathcal{N}(0,I_n)$ independently. Then, $u=\frac{x}{\|x\|}$ and $v=\frac{y}{\|y\|}$ are two independent, uniformly random points on the surface of the unit ball. The angle between $x$ and $y$ is the same as the angle between $u$ and $v$. Formally, as $n \to \infty$, we have the convergence in distribution

\[\sqrt{n} \langle u,v \rangle \to \mathcal{N}(0,1).\]

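In other words, the inner product $\langle u,v \rangle$ is concentrated around $0$ with fluctuations of order $\frac{1}{\sqrt{n}}$; that is, the angle between $u$ and $v$ is concentrated around $\frac{\pi}{2}$.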

Proof. When $x$ follows the $n$-dimensional standard normal distribution, its elements $x_i$ are independent standard normal variables. By the law of large numbers, $\|x\|^2/n \to 1$ and hence $\|x\|/\sqrt{n} \to 1$, because each $x_i^2$ is chi-square with mean $1$. A similar argument applies to $y$. Moreover, by the central limit theorem, $\langle x,y \rangle / \sqrt{n} \to \mathcal{N}(0,1)$, because the $x_iy_i$’s are iid with mean $0$ and variance $1$. Combining the limits (Slutsky’s theorem), $\sqrt{n} \langle \frac{x}{\|x\|}, \frac{y}{\|y\|} \rangle = \frac{\langle x,y \rangle / \sqrt{n}}{(\|x\|/\sqrt{n})(\|y\|/\sqrt{n})} \to \mathcal{N}(0,1)$. $\square$
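A quick simulation illustrates the lemma. The sketch below uses only the Python standard library; the dimension, the number of trials, and the seed are arbitrary choices of mine. It draws pairs of independent Gaussian vectors and looks at $\sqrt{n}\langle u,v \rangle$, whose sample mean and standard deviation should be close to 0 and 1.

```python
import math
import random
import statistics

def cosine_of_random_pair(n: int, rng: random.Random) -> float:
    """Cosine of the angle between two independent N(0, I_n) vectors."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def simulate(n: int, trials: int, seed: int = 0):
    """Return `trials` samples of sqrt(n) * <u, v>."""
    rng = random.Random(seed)
    return [math.sqrt(n) * cosine_of_random_pair(n, rng) for _ in range(trials)]

n = 200
samples = simulate(n, trials=2000)
sample_mean = statistics.mean(samples)
sample_std = statistics.pstdev(samples)
# Share of pairs whose cosine is below 0.2 in absolute value
near_orthogonal = sum(abs(s) / math.sqrt(n) < 0.2 for s in samples) / len(samples)

print(f"mean={sample_mean:+.3f}  std={sample_std:.3f}  share with |cos| < 0.2: {near_orthogonal:.3f}")
```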

Most of the volume of a high-dimensional ball is near the equatorial plane. We have seen that the volume of an $n$-dimensional ball mostly lies near the surface. Now, we show that the volume is also concentrated near the equatorial plane. Combining the two, we can say that the volume mostly lies near the equator.

The intuition comes from the above near-orthogonality lemma. Pick a random point on the surface of the ball; call it the pole. Because other points are nearly orthogonal to the pole, they lie near the equatorial plane.

Another intuition comes from the volume formula. The volume of the ball is $V_n(r)$, while the volume of the equatorial cross-section, an $(n-1)$-dimensional ball, is $V_{n-1}(r)$. Their inverse ratio

\[\frac{V_{n-1}(r)}{V_n(r)} = \frac{\Gamma(\frac{n}{2}+1)}{\Gamma(\frac{n}{2}+\frac{1}{2})} \frac{1}{\sqrt{\pi}} \frac{1}{r} \sim \frac{\sqrt{n}}{\sqrt{2\pi}r}\]

is approximately proportional to $\sqrt{n}$. We only need a thin layer around the equatorial plane to reach a substantial portion of $V_n(r)$. This argument is very similar to the one used to argue that the volume of a ball mostly lies on the surface.
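The $\sqrt{n}$ growth of the ratio is easy to confirm numerically. Here is a minimal sketch (the function name is my own) that evaluates the gamma-function ratio through `math.lgamma` so that it stays stable for large $n$.

```python
import math

def cross_section_ratio(n: int, r: float = 1.0) -> float:
    """V_(n-1)(r) / V_n(r) = Gamma(n/2 + 1) / Gamma(n/2 + 1/2) / (sqrt(pi) * r)."""
    log_gamma_ratio = math.lgamma(n / 2 + 1) - math.lgamma(n / 2 + 0.5)
    return math.exp(log_gamma_ratio) / (math.sqrt(math.pi) * r)

for n in (2, 3, 10, 100, 400, 1600):
    ratio = cross_section_ratio(n)
    print(f"n={n:5d}  ratio={ratio:9.4f}  ratio/sqrt(n)={ratio / math.sqrt(n):.4f}")
```

Quadrupling $n$ roughly doubles the ratio, and the last column settles to a constant.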

Both intuitions point to an interesting fact: the statement works for any plane passing through the origin. That is, the ball volume is concentrated not only near the equatorial plane, but also any meridian plane!

The precise mathematical statement is the following. Pick any direction in $\mathbb{R}^n$. The volume of the $n$-dimensional ball is an integration of the volumes of the $(n-1)$-dimensional balls along the direction. That is,

\[V_n(r) = \int_{-r}^r V_{n-1}(\sqrt{r^2-x^2})\,dx.\]

For any $s \in [0,r]$, we have

\[\int_{-s}^s V_{n-1}(\sqrt{r^2-x^2})\,dx \ge \rho_s V_n(r) \quad\text{where}\quad \rho_s = 1-\frac{6re^{-\frac{(n-1)s^2}{2r^2}}}{5s\sqrt{n-1}}.\]

This inequality is called a concentration bound. Such bounds are generally hard to make tight, but the exponential in $\rho_s$ suggests how quickly the remaining volume vanishes when $s$ increases (i.e., how quickly $\rho_s$ increases to 1). To see how much concentration the band $[-s,s]$ gathers, we have, as an example, $\rho_s = 91.67\%$ when $s = 0.2 r$ and $n=100$.
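To get a feel for how loose the bound is, one can compare $\rho_s$ with the exact fraction of the volume inside the band. The sketch below is only an illustration under my own choices: it writes everything in terms of $s/r$ and uses a simple midpoint rule for the integral of $(1-t^2)^{\frac{n-1}{2}}$.

```python
import math

def rho(s_over_r: float, n: int) -> float:
    """Lower bound rho_s from the text, expressed through s/r."""
    m = n - 1
    return 1.0 - 6.0 * math.exp(-m * s_over_r ** 2 / 2.0) / (5.0 * s_over_r * math.sqrt(m))

def slab_fraction(s_over_r: float, n: int, steps: int = 20000) -> float:
    """Fraction of V_n(r) with |x| <= s, via midpoint-rule integration."""
    def integral(upper: float) -> float:
        h = upper / steps
        return h * sum((1.0 - ((k + 0.5) * h) ** 2) ** ((n - 1) / 2.0) for k in range(steps))
    # By symmetry, the ratio of the two half-range integrals is the band fraction
    return integral(s_over_r) / integral(1.0)

n = 100
for s_over_r in (0.1, 0.2, 0.3, 0.4):
    bound = rho(s_over_r, n)
    exact = slab_fraction(s_over_r, n)
    print(f"s/r={s_over_r:.1f}  bound={bound:+.4f}  exact fraction={exact:.4f}")
```

For small $s$ the bound can be negative and therefore uninformative, while the exact fraction is always between 0 and 1.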

Proof. Because $V_n(r)$ is proportional to $r^n$, we have

\[V_n(r) = \int_{-r}^r V_{n-1}(\sqrt{r^2-x^2})\,dx = V_{n-1}(r) \int_{-r}^r \left(1-\frac{x^2}{r^2}\right)^{\frac{n-1}{2}}\,dx.\]

Let us perform a change of variable $x=rt$. On the one hand, we have

\[\int_{s}^r \left(1-\frac{x^2}{r^2}\right)^{\frac{n-1}{2}}\,dx = r \int_{s/r}^1 (1-t^2)^{\frac{n-1}{2}}\,dt \le r \int_{s/r}^{\infty} e^{-t^2(\frac{n-1}{2})}\,dt \le r \int_{s/r}^{\infty} \frac{rt}{s} e^{-t^2(\frac{n-1}{2})}\,dt = \frac{r^2}{s(n-1)} e^{-\frac{(n-1)s^2}{2r^2}},\]

where the first inequality uses $1-u \le e^{-u}$ for positive $u\le1$ and the second inequality uses $\frac{rt}{s} \ge 1$ on the integration interval. On the other hand, we have

\[\int_0^r \left(1-\frac{x^2}{r^2}\right)^{\frac{n-1}{2}}\,dx = r \int_0^1 (1-t^2)^{\frac{n-1}{2}}\,dt \ge r \int_0^{\frac{1}{\sqrt{n-1}}} \left(1-t^2\frac{n-1}{2}\right)\,dt = \frac{5r}{6\sqrt{n-1}},\]

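where the inequality restricts the integral of the nonnegative integrand to $[0, \frac{1}{\sqrt{n-1}}]$ and uses Bernoulli’s inequality $(1-u)^m \ge 1-um$ for $u \in [0,1]$ and $m \ge 1$ (here $u = t^2$ and $m = \frac{n-1}{2}$, so we need $n \ge 3$). Overall,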

\[\int_{s}^r \left(1-\frac{x^2}{r^2}\right)^{\frac{n-1}{2}}\,dx \le (1-\rho_s) \int_0^r \left(1-\frac{x^2}{r^2}\right)^{\frac{n-1}{2}}\,dx.\]

Hence, by symmetry, we conclude the proof. $\square$

Remark. Let us add a third intuition to the concentration, inspired by the proof. With a change of variable $x = r\sin\theta$ for $\theta\in[-\frac{\pi}{2},\frac{\pi}{2}]$, we see that

\[V_n(r) = V_{n-1}(r) \cdot r \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} (\cos\theta)^n \, d\theta.\]

The integrand $(\cos\theta)^n$ is concentrated near $\theta=0$ when $n$ is large. Hence, a substantial portion of the integral comes from the integration over a narrow band $\theta \in [-\psi, \psi]$ for a small positive $\psi$. This band dictates the thickness around the equatorial plane.
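How narrow is the band? The following sketch, with hypothetical helper names and a simple midpoint rule plus bisection, finds the half-width $\psi$ that captures 95% of the integral of $(\cos\theta)^n$. The product $\psi\sqrt{n}$ should stay roughly constant, meaning that the band shrinks like $1/\sqrt{n}$.

```python
import math

def cos_power_integral(n: int, psi: float, steps: int = 4000) -> float:
    """Midpoint-rule estimate of the integral of cos(theta)^n over [0, psi]."""
    h = psi / steps
    return h * sum(math.cos((k + 0.5) * h) ** n for k in range(steps))

def band_half_width(n: int, fraction: float = 0.95) -> float:
    """Smallest psi such that [-psi, psi] carries `fraction` of the integral."""
    total = cos_power_integral(n, math.pi / 2)
    lo, hi = 0.0, math.pi / 2
    for _ in range(40):  # bisection on the monotone partial integral
        mid = (lo + hi) / 2
        if cos_power_integral(n, mid) < fraction * total:
            lo = mid
        else:
            hi = mid
    return hi

widths = {n: band_half_width(n) for n in (10, 100, 1000)}
for n, psi in widths.items():
    print(f"n={n:5d}  psi={psi:.4f} rad  psi*sqrt(n)={psi * math.sqrt(n):.3f}")
```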

Most of the surface area of a high-dimensional ball is near the equator. Not surprisingly, the concentration results extend to the surface area, and the intuitions similarly follow those for the ball volumes.

Intuition 1 (near-orthogonality lemma): Pick a random point on the surface and call it the pole. Because other points on the surface are nearly orthogonal to the pole, they lie near the equator. Again, the equator is arbitrary because the pole is arbitrary.

Intuition 2 (surface area formula): The inverse ratio between the surface area of an $n$-dimensional ball, $A_n(r)$, and the surface area of the equator, $A_{n-1}(r)$, is

\[\frac{A_{n-1}(r)}{A_n(r)} = \frac{n-1}{n} \frac{\Gamma(\frac{n}{2}+1)}{\Gamma(\frac{n}{2}+\frac{1}{2})} \frac{1}{\sqrt{\pi}} \frac{1}{r} \sim \frac{\sqrt{n}}{\sqrt{2\pi}r},\]

which grows approximately as $\sqrt{n}$. We only need a thin band around the equator to reach a substantial portion of $A_n(r)$.

Intuition 3 (concentration of $(\cos\theta)^{n-2}$): Pick any direction in $\mathbb{R}^n$. The surface area of the $n$-dimensional ball is an integration of the surface areas of the $(n-1)$-dimensional balls along the direction, weighted by the slant factor $\frac{r}{\sqrt{r^2-x^2}}$ that converts a step $dx$ along the direction into arc length on the surface. That is,

\[A_n(r) = \int_{-r}^r A_{n-1}(\sqrt{r^2-x^2}) \frac{r}{\sqrt{r^2-x^2}} \, dx.\]

For $n=3$, the integrand is the constant $2\pi r$, which recovers Archimedes’ result $A_3(r) = 2r \times 2\pi r$. Because $A_{n-1}(r)$ is proportional to $r^{n-2}$, we have

\[A_n(r) = A_{n-1}(r) \int_{-r}^r \left( 1 - \frac{x^2}{r^2} \right)^{\frac{n-3}{2}} \, dx.\]

With a change of variable $x = r\sin\theta$ for $\theta \in [-\frac{\pi}{2}, \frac{\pi}{2}]$, we see that

\[A_n(r) = A_{n-1}(r) \cdot r \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} (\cos\theta)^{n-2} \, d\theta.\]

The integrand $(\cos\theta)^{n-2}$ is concentrated near $\theta=0$ when $n$ is large. Hence, a substantial portion of the integral comes from the integration over a narrow band $\theta\in[-\psi,\psi]$ for a small positive $\psi$. This band dictates the thickness around the equator.

We can also establish a concentration bound like the one for the volume case, with the exponent $\frac{n-1}{2}$ replaced by $\frac{n-3}{2}$. For any $s \in [0,r]$ and $n \ge 5$, we have

\[\int_{-s}^s A_{n-1}(\sqrt{r^2-x^2}) \frac{r}{\sqrt{r^2-x^2}} \, dx \ge \tau_s A_n(r) \quad\text{where}\quad \tau_s = 1-\frac{6re^{-\frac{(n-3)s^2}{2r^2}}}{5s\sqrt{n-3}}.\]

The proof is similar and I omit it here.
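The concentration of the surface area can also be observed by simulation. Recall that a normalized Gaussian vector is a uniformly random point on the surface of the unit ball. The sketch below (dimension, band width, number of trials, and seed are arbitrary choices of mine) estimates the fraction of the surface lying within the band $|x_1| \le s$.

```python
import math
import random

def equatorial_band_fraction(n: int, s_over_r: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the surface fraction with |x_1| <= s."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in g))
        # g / norm is uniform on the unit sphere; test its first coordinate
        if abs(g[0]) / norm <= s_over_r:
            hits += 1
    return hits / trials

fraction = equatorial_band_fraction(n=100, s_over_r=0.2, trials=4000)
print(f"estimated surface fraction with |x_1| <= 0.2 r in 100 dimensions: {fraction:.3f}")
```

Since the direction $x_1$ is arbitrary, the same estimate applies to the band around any other equator.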
