The exponential of ... everything?

How far can we use the Taylor series to define the exponential?

I love mathematics because a majority of it, at least the part that is used in science and engineering, is self-consistent: different mathematical concepts can be traced back to the same formula. This post is concerned with the exponential function, which is linked to the following Taylor series for any real number $x$:

\[\boxed{e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \qquad x \in \mathbb{R}.}\]

For years, I have been using the above Taylor series to understand the exponential of many mathematical objects beyond scalars: matrices, operators, etc. Whenever I forget a nice property of the exponential function, I can always use the Taylor series to derive it again.

This strategy served me well for a long time, until I recently came across this YouTube video, which reminded me that the exponential of a vector is something quite different. Unfortunately, I cannot fully reason about the exponential of a vector by using the Taylor series alone, except in special cases. Nevertheless, this blog post complements the view of the YouTube video. I challenge you to think about how far the Taylor series definition can go.


The exponential of a scalar

We shall recall that the Taylor series of $e^x$ is convergent for any real $x$. That is, we can use a truncation of it to approximate $e^x$ for any given $x$, no matter how large $x$ is (of course, the larger $x$, the more terms are needed for a reasonable approximation).
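A quick numerical sanity check of this claim (a Python sketch; the helper `exp_taylor` is mine, not a standard function): the partial sums match `math.exp` once enough terms are included, and a larger $x$ indeed demands more terms.

```python
import math

def exp_taylor(x, terms):
    """Partial sum of the Taylor series of e^x with the given number of terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term           # add x^n / n!
        term *= x / (n + 1)     # next term: x^(n+1) / (n+1)!
    return total

print(exp_taylor(1.0, 10))      # already close to e for small x
print(exp_taylor(20.0, 30))     # noticeably off for large x ...
print(exp_taylor(20.0, 80))     # ... until many more terms are used
print(math.exp(20.0))
```

The ratio between consecutive terms is $x/(n+1)$, so the terms start shrinking only once $n$ exceeds $x$; this is why large $x$ needs a longer truncation.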

The Taylor series is also applicable to a complex $z$:

\[\boxed{e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \qquad z \in \mathbb{C}.}\]

Again, the series is convergent for any complex number. A notable special case, where $z$ is purely imaginary, leads to the famous formula by Euler:

\[\boxed{e^{i\theta} = \cos\theta + i \sin\theta \qquad \theta \in \mathbb{R}.}\]

A simple way to prove Euler’s formula is to separate the odd and even terms of the Taylor series of $e^{i\theta}$. The even terms sum up to $\cos\theta$, while the odd terms sum up to $i\sin\theta$:

\[e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} = \underbrace{1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} + \cdots}_{\cos\theta} + \underbrace{i\theta - \frac{i\theta^3}{3!} + \frac{i\theta^5}{5!} + \cdots}_{i\sin\theta}.\]
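The even/odd grouping above is easy to verify numerically. A short Python sketch (the helper name `euler_split` is mine):

```python
import cmath
import math

def euler_split(theta, terms=30):
    """Sum the even and odd terms of the series for e^{i*theta} separately."""
    even = sum((-1) ** k * theta ** (2 * k) / math.factorial(2 * k)
               for k in range(terms))
    odd = sum((-1) ** k * theta ** (2 * k + 1) / math.factorial(2 * k + 1)
              for k in range(terms))
    return even, odd

theta = 1.2
even, odd = euler_split(theta)
assert abs(even - math.cos(theta)) < 1e-12           # even terms -> cos(theta)
assert abs(odd - math.sin(theta)) < 1e-12            # odd terms -> sin(theta)
assert abs(complex(even, odd) - cmath.exp(1j * theta)) < 1e-12
```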

The exponential of a matrix

The exponential of an $N \times N$ matrix $A$ (either real or complex) is naturally defined by using the Taylor series:

\[\boxed{e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!} = I_{N \times N} + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \qquad A \in F^{N \times N}, \quad F \in \{\mathbb{R}, \mathbb{C}\}.}\]

Here, $F$ denotes a field, which is typically taken as $\mathbb{R}$ or $\mathbb{C}$. In this definition, we identify the exponential as a mapping from $F^{N \times N}$ to $F^{N \times N}$:

\[\exp : \mathbb{R}^{N \times N} \to \mathbb{R}^{N \times N} \qquad\text{or}\qquad \exp : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}.\]

To highlight the algebraic structures, the mapping is more often written by using two algebraic notations: for a ring $R$, $M_N(R)$ denotes the matrix ring that consists of all $N \times N$ matrices with entries in $R$; and for a field $F$, $\text{GL}(N,F)$ denotes the general linear group of degree $N$ — the set of $N \times N$ invertible matrices with entries in $F$. Instantiating $R$ and $F$ with either $\mathbb{R}$ or $\mathbb{C}$, we denote the exponential mapping as:

\[\exp : M_N(\mathbb{R}) \to \text{GL}(N, \mathbb{R}) \qquad\text{or}\qquad \exp : M_N(\mathbb{C}) \to \text{GL}(N, \mathbb{C}).\]

This notation signifies that the output is always nonsingular, no matter the input. Additionally, note that the mapping is surjective in the $\mathbb{C}$ case but not in the $\mathbb{R}$ case.
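The nonsingularity of $e^A$ follows from the identity $\det(e^A) = e^{\operatorname{tr} A}$, which is never zero. A NumPy sketch illustrating both the Taylor definition and this identity (`expm_taylor` is my own truncation, adequate for moderate $\|A\|$):

```python
import numpy as np

def expm_taylor(A, terms=60):
    """Matrix exponential via a truncated Taylor series (for moderate ||A||)."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n     # running term A^n / n!
        E = E + term
    return E

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
E = expm_taylor(A)

# e^A is always invertible: det(e^A) = e^{tr A}, which is never zero.
assert np.isclose(np.linalg.det(E), np.exp(np.trace(A)))
```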


From a computational perspective, an equivalent definition that I use more often is based on the Jordan normal form of $A$. Write

\[Z^{-1}AZ = J = \text{diag}(J_1, J_2, \ldots, J_p),\]

where $Z$ is invertible and each $J_k$ is an $m_k \times m_k$ Jordan block corresponding to an eigenvalue $\lambda_k$:

\[J_k = J_k(\lambda_k) = \begin{bmatrix} \lambda_k & 1 \\ & \lambda_k & \ddots \\ & & \ddots & 1 \\ & & & \lambda_k \end{bmatrix}.\]

Then, $e^A$ can be equivalently defined as

\[\boxed{e^A = Z \cdot e^J \cdot Z^{-1} = Z \cdot \text{diag}(e^{J_k}) \cdot Z^{-1},}\]

where

\[e^{J_k} = e^{\lambda_k} \begin{bmatrix} 1 & 1 & \cdots & \frac{1}{(m_k-1)!} \\ & 1 & \ddots & \vdots \\ & & \ddots & 1 \\ & & & 1 \end{bmatrix}\]

directly follows from the Taylor series definition.

When $A$ is real symmetric or complex Hermitian, $A$ is unitarily diagonalizable. That is, $Z$ is unitary and $J$ is diagonal (with real diagonal entries!). In this case, the definition and computation of $e^A$ are much simpler because $e^J = \text{diag}(e^{\lambda_k})$ and $Z^{-1} = Z^*$.
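For the symmetric case, this recipe is only a few lines of NumPy (a sketch; variable names are mine). We compute $Z \, \text{diag}(e^{\lambda_k}) \, Z^*$ from the eigendecomposition and cross-check it against a plain Taylor truncation:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                        # real symmetric matrix

w, Z = np.linalg.eigh(A)                 # A = Z diag(w) Z^T, with real w
E = Z @ np.diag(np.exp(w)) @ Z.T         # e^A = Z diag(e^{lambda_k}) Z^*

# cross-check against the Taylor-series definition
T, term = np.eye(5), np.eye(5)
for n in range(1, 60):
    term = term @ A / n
    T = T + term
assert np.allclose(E, T)
```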

For a broader account of matrix functions (not limited to the exponential), readers are referred to Higham’s blog post and his book Functions of Matrices: Theory and Computation.


One may recall that $e^A$ appears in the solution to the first-order ordinary differential equation $A \mathbf{x}(t) = \frac{d}{dt} \mathbf{x}(t)$. A famous example of this equation is the Schrödinger equation

\[H \ket{\psi(t)} = i \hbar \frac{d}{dt} \ket{\psi(t)},\]

where $H$ is the Hamiltonian matrix. When $H$ is time-independent, the solution is simply

\[\ket{\psi(t)} = e^{-i H t / \hbar} \ket{\psi(0)}.\]

It is interesting to note that $e^{-i H t / \hbar}$ is unitary, akin to the fact that Euler's formula expresses $e^{i\theta}$ as a complex number with modulus $1$. Relatedly, all gates in quantum computing are unitary.
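We can verify this unitarity numerically for a random Hermitian $H$ (setting $\hbar = 1$; a NumPy sketch using the eigendecomposition route described above):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (M + M.conj().T) / 2                 # Hermitian "Hamiltonian", with hbar = 1
t = 0.8

w, Z = np.linalg.eigh(H)                 # real eigenvalues, unitary Z
U = Z @ np.diag(np.exp(-1j * w * t)) @ Z.conj().T   # U = e^{-iHt}

# U is unitary, so it preserves the norm of any state vector.
assert np.allclose(U @ U.conj().T, np.eye(3))
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(np.linalg.norm(U @ psi), np.linalg.norm(psi))
```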


An operation that appears frequently in quantum computing is the exponentiation of the sum of two matrices: $e^{A+B}$. Unlike the scalar case, $e^{A+B} = e^A e^B$ does not hold in general, unless $A$ and $B$ commute. Based on the Taylor series, we have

\[e^A e^B = \left( I + A + \frac{1}{2}A^2 + \cdots \right) \left( I + B + \frac{1}{2}B^2 + \cdots \right) = I + A + B + AB + \frac{1}{2} A^2 + \frac{1}{2} B^2 + \cdots\]

and

\[e^{A+B+C} = I + A + B + C + \frac{1}{2}(A^2+BA+AB+B^2) + \frac{1}{2}(CA+CB+AC+BC+C^2) + \cdots.\]

Assuming that $C$ is a polynomial in $A$ and $B$ in which every term has degree at least 2, equating the degree-2 terms of $e^A e^B$ and $e^{A+B+C}$ yields

\[C = \frac{1}{2}(AB-BA) + \text{higher-degree terms}.\]
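This leading term can be probed numerically: if we keep only $\frac{1}{2}(AB-BA)$ and drop the higher-degree terms, the mismatch between $e^A e^B$ and $e^{A+B+C}$ should be third order in the size of the matrices. A NumPy sketch (the scaling loop and helper are mine):

```python
import numpy as np

def expm_taylor(M, terms=40):
    """Matrix exponential via a truncated Taylor series (for small ||M||)."""
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        E = E + term
    return E

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
comm = A @ B - B @ A

for d in (0.1, 0.01):
    # keep only the degree-2 part of C, i.e. (1/2)[dA, dB]
    approx = expm_taylor(d * (A + B) + 0.5 * d**2 * comm)
    err = np.linalg.norm(expm_taylor(d * A) @ expm_taylor(d * B) - approx)
    print(d, err)  # the error shrinks roughly like d^3
```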

The Baker–Campbell–Hausdorff formula gives the full expression of $C$. A straightforward consequence of this is that

\[\boxed{e^{\delta(A+B)} = e^{\delta A} e^{\delta B} + O(\delta^2),}\]

because when $\delta$ is small, the correction term $C$ built from $\delta A$ and $\delta B$ starts at degree 2 and is therefore $O(\delta^2)$, so \(e^{\delta A} e^{\delta B} - e^{\delta(A+B)} = O(\|C\|) = O(\delta^2)\). This splitting is often called Trotterization. Taking $\delta = \frac{1}{n}$ and raising both sides to the $n$-th power, we obtain the Trotter product formula:

\[\boxed{e^{A+B} = \left(e^{\frac{A+B}{n}}\right)^n = \left( e^{\frac{A}{n}} e^{\frac{B}{n}} + O(n^{-2}) \right)^n \quad\Longrightarrow\quad e^{A+B} = \lim_{n\to\infty} \left( e^{\frac{A}{n}} e^{\frac{B}{n}} \right)^n.}\]
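The Trotter product formula is easy to see in action (a NumPy sketch; `expm_taylor` is my own helper): the splitting error decays roughly like $1/n$.

```python
import numpy as np

def expm_taylor(M, terms=60):
    """Matrix exponential via a truncated Taylor series (for moderate ||M||)."""
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ M / n
        E = E + term
    return E

rng = np.random.default_rng(3)
A = 0.5 * rng.standard_normal((3, 3))
B = 0.5 * rng.standard_normal((3, 3))
exact = expm_taylor(A + B)

for n in (1, 10, 100, 1000):
    step = expm_taylor(A / n) @ expm_taylor(B / n)
    trotter = np.linalg.matrix_power(step, n)
    print(n, np.linalg.norm(trotter - exact))  # error decays roughly like 1/n
```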

The exponential of the differential operator

Because a matrix is a finite-dimensional operator, one naturally asks whether the matrix exponential can be extended to the exponential of an operator, possibly an infinite-dimensional one. The answer is affirmative. Moreover, the extension can be derived by using the Taylor series.

We are interested in the differential operator $\frac{d}{dx}$. The exponential of it is written as $e^{\frac{d}{dx}}$. More generally, we insert a variable $s$ to make it more interesting: $e^{s\frac{d}{dx}}$.

Remarkably, $e^{s\frac{d}{dx}}$ is a shift operator. When we apply it to $f$, the result $f(x+s)$ is a shift of $f(x)$ by $s$ units:

\[\boxed{(e^{s\frac{d}{dx}}) f(x) = f(x+s) \qquad f : \mathbb{R} \to \mathbb{R}.}\]

We can derive this nice property by using the Taylor series:

\[e^{s\frac{d}{dx}} = \sum_{n=0}^{\infty} \frac{s^n}{n!} \frac{d^n}{dx^n} = 1 + s\frac{d}{dx} + \frac{s^2}{2!} \frac{d^2}{dx^2} + \frac{s^3}{3!} \frac{d^3}{dx^3} + \cdots.\]

For a smooth $f$, we have

\[(e^{s\frac{d}{dx}})f(x) = f(x) + sf'(x) + \frac{s^2}{2!} f''(x) + \frac{s^3}{3!} f'''(x) + \cdots,\]

which is nothing but the Taylor expansion of $f(x+s)$ around $x$, thus completing the proof.
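This derivation translates directly into code. A Python sketch using $f = \sin$, whose derivatives conveniently cycle through $\sin, \cos, -\sin, -\cos$ (the helper `shift_by_series` is mine):

```python
import math

def shift_by_series(derivs, x, s):
    """Apply a truncation of e^{s d/dx} to f, given f and its derivatives at x."""
    return sum(s ** n / math.factorial(n) * f(x) for n, f in enumerate(derivs))

# derivatives of sin cycle with period 4: sin, cos, -sin, -cos, ...
cycle = [math.sin, math.cos,
         lambda x: -math.sin(x), lambda x: -math.cos(x)]
derivs = [cycle[n % 4] for n in range(25)]

x, s = 0.3, 1.1
print(shift_by_series(derivs, x, s))   # approximately sin(x + s): the operator shifts f
print(math.sin(x + s))
```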


The exponential of a vector

So far, we have used the Taylor series to understand the exponential of scalars, matrices, and differential operators. Unfortunately, this strategy breaks down when we encounter the exponential of vectors.

For a vector $\mathbf{v} \in \mathbb{R}^N$, what does $e^{\mathbf{v}}$ mean?

The vector case appears in the context of differential geometry; the relevant concept there is the exponential map. Let $M$ be a differentiable manifold and let $\mathbf{v} \in T_{\mathbf{p}}M$ be a tangent vector to the manifold at a point $\mathbf{p}$. There is a unique geodesic $\gamma_{\mathbf{v}} : [0,1] \to M$ satisfying $\gamma_{\mathbf{v}}(0) = \mathbf{p}$ with initial tangent vector $\gamma_{\mathbf{v}}'(0) = \mathbf{v}$. The exponential map of $\mathbf{v}$ with respect to $\mathbf{p}$ is defined as $\exp_{\mathbf{p}}(\mathbf{v}) := \gamma_{\mathbf{v}}(1)$.

One sees that the notation $e^{\mathbf{v}}$ is imprecise in this context: the vector $\mathbf{v}$ to be exponentiated must be a tangent vector and the exponentiation depends additionally on a point $\mathbf{p}$ on the manifold. The notation $\exp_{\mathbf{p}}(\mathbf{v})$ is the right one to use. In plain language, an ant is walking on a surface by following the geodesic path starting from $\mathbf{p}$ with initial velocity $\mathbf{v}$. It lands on $\exp_{\mathbf{p}}(\mathbf{v})$ after 1 unit of time.

Let us look at a few examples. The first example considers the manifold to be the unit circle in $\mathbb{R}^2$. Let $\mathbf{p} = (1,0)$ be the right-most point on the circle and let the tangent $\mathbf{v} = (0,\theta)$ point upward ($\theta>0$). Intuitively, the geodesic $\gamma_{\mathbf{v}}(t)$ starts from $\mathbf{p}$ and traces the circle counter-clockwise. The ant walks along the circle with speed $\theta$. Hence, after 1 time unit, the ant traverses $\theta$ radians and lands on $(\cos\theta,\sin\theta)$. That is,

\[\boxed{\exp_{(1,0)}((0,\theta)) = (\cos\theta, \sin\theta).}\]

The second example considers the unit circle on the complex plane $\mathbb{C}$. Because the complex plane is diffeomorphic to $\mathbb{R}^2$, the result from the first example is straightforwardly translated to:

\[\boxed{\exp_1(i\theta) = \cos\theta + i\sin\theta.}\]

Look how similar the above equation is to Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$ discussed earlier!

The third example considers the manifold to be the $(N-1)$-sphere $S^{N-1} \subset \mathbb{R}^N$. To make the discussion more interesting, we do not restrict the sphere to have a unit radius. For a point $\mathbf{p}$ on the sphere with a tangent vector $\mathbf{v}$, we can show that the curve

\[\gamma_{\mathbf{v}}(t) = \cos\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}t\right) \mathbf{p} + \frac{\|\mathbf{p}\|}{\|\mathbf{v}\|} \sin\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}t\right) \mathbf{v}\]

stays on the sphere (because \(\|\gamma_{\mathbf{v}}(t)\| = \|\mathbf{p}\|\) for all $t$), is a geodesic (because $\gamma_{\mathbf{v}}''$ is parallel to the surface normal $\mathbf{n} = \gamma_{\mathbf{v}}$ at any time $t$), and satisfies $\gamma_{\mathbf{v}}(0) = \mathbf{p}$ and $\gamma_{\mathbf{v}}'(0) = \mathbf{v}$. Hence, the exponential map of $\mathbf{v}$ is

\[\boxed{\exp_{\textbf{p}}(\textbf{v}) = \gamma_{\mathbf{v}}(1) = \cos\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}\right) \mathbf{p} + \frac{\|\mathbf{p}\|}{\|\mathbf{v}\|} \sin\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}\right) \mathbf{v}.}\]

Clearly, the third example generalizes the first and second examples to high dimensions by letting the ant walk over the great circle defined by the intersection of the sphere and the plane that is defined by $\mathbf{p}$ and $\mathbf{v}$. In 1 time unit, the ant walks \(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}\) radians (equivalent to arc length \(\|\mathbf{v}\|\)).
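A quick numerical check of the sphere formula (a NumPy sketch; the function name `exp_sphere` is mine). We verify that the result stays on the sphere, and that the two-dimensional case reproduces the first example:

```python
import numpy as np

def exp_sphere(p, v):
    """Exponential map on the sphere of radius ||p||; v must be tangent (v . p = 0)."""
    np_, nv = np.linalg.norm(p), np.linalg.norm(v)
    return np.cos(nv / np_) * p + (np_ / nv) * np.sin(nv / np_) * v

# a point on the sphere of radius 2 in R^3, with a tangent vector
p = np.array([0.0, 0.0, 2.0])
v = np.array([1.0, 1.0, 0.0])            # tangent: v . p = 0
q = exp_sphere(p, v)
assert np.isclose(np.linalg.norm(q), np.linalg.norm(p))   # stays on the sphere

# 2-D sanity check against the first example: exp_{(1,0)}((0, theta)) = (cos, sin)
theta = 0.9
q2 = exp_sphere(np.array([1.0, 0.0]), np.array([0.0, theta]))
assert np.allclose(q2, [np.cos(theta), np.sin(theta)])
```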

The fourth example considers the manifold to be the right branch of the rectangular hyperbola $x^2 - y^2 = p^2$, parameterized by $\mathbf{p} = (p\cosh\theta, p\sinh\theta)$ with $p>0$. The tangent at $\mathbf{p}$ is defined as $\mathbf{v} = (v\sinh\theta, v\cosh\theta)$ with $v>0$. It is not hard to see that

\[\gamma_{\mathbf{v}}(t) = \cosh\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}t\right) \mathbf{p} + \frac{\|\mathbf{p}\|}{\|\mathbf{v}\|} \sinh\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}t\right) \mathbf{v}\]

resides on the hyperbola (because $\gamma_{\mathbf{v},1}^2 - \gamma_{\mathbf{v},2}^2 = p^2$) and it satisfies $\gamma_{\mathbf{v}}(0) = \mathbf{p}$ and $\gamma_{\mathbf{v}}'(0) = \mathbf{v}$. Therefore, the exponential map of $\mathbf{v}$ is

\[\boxed{\exp_{\textbf{p}}(\textbf{v}) = \gamma_{\mathbf{v}}(1) = \cosh\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}\right) \mathbf{p} + \frac{\|\mathbf{p}\|}{\|\mathbf{v}\|} \sinh\left(\frac{\|\mathbf{v}\|}{\|\mathbf{p}\|}\right) \mathbf{v}.}\]
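The hyperbola formula can be checked the same way (a NumPy sketch; names are mine). With this parameterization, the ratio $\|\mathbf{v}\|/\|\mathbf{p}\|$ equals $v/p$, and the curve satisfies $x^2 - y^2 = p^2$ at every $t$:

```python
import numpy as np

p_, v_, theta = 2.0, 0.7, 0.4
P = np.array([p_ * np.cosh(theta), p_ * np.sinh(theta)])   # point on x^2 - y^2 = p^2
V = np.array([v_ * np.sinh(theta), v_ * np.cosh(theta)])   # tangent at P
nP, nV = np.linalg.norm(P), np.linalg.norm(V)              # note: nV / nP = v / p

def gamma(t):
    """The curve cosh(||v||/||p|| t) p + (||p||/||v||) sinh(||v||/||p|| t) v."""
    return np.cosh(nV / nP * t) * P + (nP / nV) * np.sinh(nV / nP * t) * V

assert np.allclose(gamma(0.0), P)                 # starts at p
for t in (0.5, 1.0, 2.0):
    x, y = gamma(t)
    assert np.isclose(x**2 - y**2, p_**2)         # stays on the hyperbola
```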

Is the exponential map connected to the Taylor series of the exponential function? Yes, but only for special cases such as the examples above.

To this end, we look into the Clifford algebra, which allows writing $e^{\mathbf{v}}$ legitimately. We shall understand $\mathbf{v}$ as a vector in an abstract vector space, rather than instantiating it as a Euclidean vector. The vector space is equipped with a basis, and what distinguishes a Clifford algebra is that each basis vector squares to either $+1$ or $-1$. We use the notation $\text{Cl}_{p,q}(\mathbb{R})$ to denote the Clifford algebra over a real vector space with $p+q$ basis vectors, where $p$ of them square to $+1$ and $q$ of them square to $-1$. When all basis vectors square to $+1$, the square of an arbitrary vector $\mathbf{v}$ is positive and hence we have \(\mathbf{v}^2 = +\|\mathbf{v}\|^2\); when all basis vectors square to $-1$, the square of $\mathbf{v}$ is negative and we have \(\mathbf{v}^2 = -\|\mathbf{v}\|^2\).

Now we write down the Taylor series and group the even terms and odd terms separately:

\[\begin{align*} e^{\textbf{v}} &= 1 + \textbf{v} + \frac{\textbf{v}^2}{2!} + \frac{\textbf{v}^3}{3!} + \frac{\textbf{v}^4}{4!} + \frac{\textbf{v}^5}{5!} + \cdots \\ &= \underbrace{1 + \frac{\textbf{v}^2}{2!} + \frac{\textbf{v}^4}{4!} + \cdots}_{\text{even terms}} + \underbrace{\textbf{v} + \frac{\textbf{v}^2}{3!}\textbf{v} + \frac{\textbf{v}^4}{5!}\textbf{v} + \cdots}_{\text{odd terms}}. \end{align*}\]

Consider two cases.

Case 1: When all basis vectors square to $+1$, an example is $\text{Cl}_{3,0}(\mathbb{R})$, which is called the Pauli algebra. One can get intuition for this algebra from the fact that each Pauli matrix squares to the identity matrix. In this case, the Taylor series becomes

\[e^{\textbf{v}} = \underbrace{1 + \frac{\|\textbf{v}\|^2}{2!} + \frac{\|\textbf{v}\|^4}{4!} + \cdots}_{\text{even terms}} + \underbrace{\textbf{v} + \frac{\|\textbf{v}\|^2}{3!}\textbf{v} + \frac{\|\textbf{v}\|^4}{5!}\textbf{v} + \cdots}_{\text{odd terms}}\]

which can be readily summarized as

\[\boxed{e^{\mathbf{v}} = \cosh(\|\textbf{v}\|) + \sinh(\|\textbf{v}\|) \frac{\textbf{v}}{\|\textbf{v}\|}.}\]

This identity is analogous to the fourth example (hyperbola) above when the hyperbola is unit (that is, \(\|\textbf{p}\|=1\)). In other words, the exponential map for a hyperbola finds a connection with the Taylor series through the Clifford algebra $\text{Cl}_{3,0}(\mathbb{R})$.

Case 2: When all basis vectors square to $-1$, an example is $\text{Cl}_{0,2}(\mathbb{R})$, the quaternions. One can get intuition for this algebra from the fact that each imaginary quaternion unit squares to $-1$. In this case, the Taylor series becomes

\[e^{\textbf{v}} = \underbrace{1 - \frac{\|\textbf{v}\|^2}{2!} + \frac{\|\textbf{v}\|^4}{4!} + \cdots}_{\text{even terms}} + \underbrace{\textbf{v} - \frac{\|\textbf{v}\|^2}{3!}\textbf{v} + \frac{\|\textbf{v}\|^4}{5!}\textbf{v} + \cdots}_{\text{odd terms}}\]

which can be readily summarized as

\[\boxed{e^{\mathbf{v}} = \cos(\|\textbf{v}\|) + \sin(\|\textbf{v}\|) \frac{\textbf{v}}{\|\textbf{v}\|}.}\]

This identity is analogous to the third example (sphere) above, when the sphere is unit (that is, \(\|\textbf{p}\|=1\)). In other words, the exponential map for a sphere finds a connection with the Taylor series through the Clifford algebra $\text{Cl}_{0,2}(\mathbb{R})$.
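Both cases can be tested concretely by representing the basis vectors as matrices. Below is a NumPy sketch of Case 2, using the standard $2 \times 2$ complex matrices for the quaternion units $i$ and $j$ as the two basis vectors of $\text{Cl}_{0,2}(\mathbb{R})$ (helper names are mine):

```python
import numpy as np

# 2x2 complex representations of the two Cl_{0,2}(R) basis vectors
# (the quaternion units i and j); each squares to -I.
E1 = np.array([[1j, 0], [0, -1j]])
E2 = np.array([[0, 1], [-1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def expm_taylor(M, terms=40):
    """Matrix exponential via a truncated Taylor series (for small ||M||)."""
    E, term = I2.copy(), I2.copy()
    for n in range(1, terms):
        term = term @ M / n
        E = E + term
    return E

a, b = 0.8, -0.5
V = a * E1 + b * E2                      # the vector v = a e1 + b e2
nv = np.hypot(a, b)                      # ||v||

assert np.allclose(V @ V, -nv**2 * I2)   # v^2 = -||v||^2, as claimed
lhs = expm_taylor(V)                               # e^v via the Taylor series
rhs = np.cos(nv) * I2 + (np.sin(nv) / nv) * V      # cos||v|| + sin||v|| v/||v||
assert np.allclose(lhs, rhs)
```

The analogous check for Case 1 only requires basis matrices that square to $+I$, such as the Pauli matrices.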


In general, there is not a straightforward connection between the exponential map and the Taylor series of the exponential function. The four manifold examples we gave earlier are too special. In general, one may not have closed-form expressions for the manifolds, the geodesics, or the exponential maps. How $\exp_{\textbf{p}}(\textbf{v}) = \gamma_{\textbf{v}}(1)$ relates to the Taylor series remains a fascinating question.



