Some facts about random orthonormal vectors and matrices
without prerequisite knowledge of the Haar measure
Given a symmetric matrix $A \in \mathbb{R}^{n \times n}$, we are interested in the quadratic forms $x^TAx$ and $X^TAX$, where $x \in \mathbb{R}^n$ is a random vector with a unit 2-norm (i.e., $x^Tx = 1$) and $X \in \mathbb{R}^{n \times n}$ is a random orthonormal matrix (i.e., $X^TX=XX^T=I_n$). More specifically, $x$ is uniformly random on the unit sphere $S^{n-1}$ and $X$ is distributed according to the Haar measure over the group of orthonormal matrices.
We first establish the properties for the random vector $x$. Let the elements of $x$ be $x_i$.
0.1. $\mathbb{E}[x_i^2] = \frac{1}{n}$.
0.2. $\mathbb{E}[x_i^2x_j^2] = \frac{1}{n(n+2)}$ for $i \ne j$.
0.3. $\mathbb{E}[x_i^4] = \frac{3}{n(n+2)}$.
0.4. $\mathbb{E}[xx^T] = \frac{1}{n}I_n$.
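Properties 0.1–0.3 are easy to check numerically. Below is a minimal Monte Carlo sketch: it samples $x$ uniformly on the sphere by normalizing a standard Gaussian vector (the dimension and sample count are arbitrary illustrative choices).

```python
import math
import random

# Monte Carlo check of 0.1-0.3; n and the sample count are illustrative.
random.seed(0)
n, trials = 4, 100_000

m2 = m4 = cross = 0.0
for _ in range(trials):
    z = [random.gauss(0.0, 1.0) for _ in range(n)]   # z ~ N(0, I_n)
    norm = math.sqrt(sum(v * v for v in z))
    x = [v / norm for v in z]                        # uniform on S^{n-1}
    m2 += x[0] ** 2                                  # estimates E[x_i^2]
    m4 += x[0] ** 4                                  # estimates E[x_i^4]
    cross += x[0] ** 2 * x[1] ** 2                   # estimates E[x_i^2 x_j^2]

m2, m4, cross = m2 / trials, m4 / trials, cross / trials
print(m2, 1 / n)                 # 0.1: close to 1/n
print(m4, 3 / (n * (n + 2)))     # 0.3: close to 3/(n(n+2))
print(cross, 1 / (n * (n + 2)))  # 0.2: close to 1/(n(n+2))
```

The estimates should agree with the stated constants up to Monte Carlo error of order $1/\sqrt{\text{trials}}$.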
Now, for the quadratic form $x^TAx$, we have the following property:
1.1. $\mathbb{E}[x^TAx] = \frac{1}{n}\text{tr}(A)$.
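A quick Monte Carlo sketch of 1.1, using one arbitrary symmetric $A$ (the matrix below is an illustrative choice, not from the text):

```python
import math
import random

# Monte Carlo check of 1.1 for an arbitrary symmetric A with tr(A) = 6.
random.seed(1)
n, trials = 3, 100_000
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 1.0]]

acc = 0.0
for _ in range(trials):
    z = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in z))
    x = [v / norm for v in z]                       # uniform on S^{n-1}
    acc += sum(x[a] * A[a][b] * x[b] for a in range(n) for b in range(n))

est = acc / trials
print(est, 6 / n)   # estimate vs tr(A)/n = 2
```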
To obtain the second moment of $x^TAx$, we define
\[B = X^TAX.\]Since each column of a Haar-distributed $X$ is uniformly distributed on the unit sphere, each diagonal entry $B_{ii}$ has the same distribution as $x^TAx$ (but note that the different $B_{ii}$'s are not independent). Thus, the second moment of $x^TAx$ is the second moment of $B_{ii}$. We have the following properties:
2.1. $\mathbb{E}[B_{ii}] = \frac{1}{n}\text{tr}(A)$.
2.2. $\mathbb{E}[B_{ij}]=0$ for $i \ne j$.
2.3. $\mathbb{E}[\sum_{i=1}^n B_{ii}] = \text{tr}(A)$.
2.4. $\mathbb{E}[\sum_{i,j=1}^n B_{ij}^2] = \text{tr}(A^2)$.
2.5. $\displaystyle\mathbb{E}[B_{ii}^2] = \frac{2\text{tr}(A^2)+(\text{tr}A)^2}{n(n+2)}$.
2.6. $\displaystyle\mathbb{E}[B_{ij}^2] = \frac{n\text{tr}(A^2)-(\text{tr}A)^2}{n(n-1)(n+2)}$ for $i \ne j$.
2.7. $\displaystyle\mathbb{E}[B_{ii}B_{jj}] = \frac{(n+1)(\text{tr}A)^2-2\text{tr}(A^2)}{n(n-1)(n+2)}$ for $i \ne j$.
2.8. $\displaystyle\text{Var}(B_{ii}) = \frac{2n\text{tr}(A^2)-2(\text{tr}A)^2}{n^2(n+2)}$.
2.9. $\displaystyle\text{Cov}(B_{ii}, B_{jj}) = \frac{2(\text{tr}A)^2-2n\text{tr}(A^2)}{n^2(n-1)(n+2)}$ for $i \ne j$.
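The list above can also be checked by simulation. The sketch below samples a Haar-distributed orthonormal $X$ by Gram-Schmidt on i.i.d. Gaussian columns (equivalently, the $Q$ factor of a Gaussian matrix with the sign convention fixed), and estimates 2.1 and 2.5–2.7 for one arbitrary symmetric $A$ with $\text{tr}A = 6$ and $\text{tr}(A^2) = 18$ (an illustrative choice):

```python
import math
import random

# Monte Carlo check of 2.1 and 2.5-2.7; A and the sample size are illustrative.
random.seed(2)
n, trials = 3, 40_000
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 1.0]]
trA, trA2 = 6.0, 18.0

def haar_orthonormal(n):
    # Gram-Schmidt on i.i.d. Gaussian columns yields a Haar-distributed X.
    cols = []
    for _ in range(n):
        v = [random.gauss(0.0, 1.0) for _ in range(n)]
        for u in cols:
            d = sum(ui * vi for ui, vi in zip(u, v))
            v = [vi - d * ui for ui, vi in zip(u, v)]
        s = math.sqrt(sum(vi * vi for vi in v))
        cols.append([vi / s for vi in v])
    return cols  # cols[j] is the j-th column of X

e11 = e11sq = e12sq = e1122 = 0.0
for _ in range(trials):
    X = haar_orthonormal(n)
    def b(i, j):  # B_ij = (i-th column)^T A (j-th column)
        return sum(X[i][a] * A[a][c] * X[j][c]
                   for a in range(n) for c in range(n))
    b11, b22, b12 = b(0, 0), b(1, 1), b(0, 1)
    e11 += b11
    e11sq += b11 * b11
    e12sq += b12 * b12
    e1122 += b11 * b22

e11, e11sq, e12sq, e1122 = (v / trials for v in (e11, e11sq, e12sq, e1122))
print(e11, trA / n)                                                    # 2.1
print(e11sq, (2 * trA2 + trA**2) / (n * (n + 2)))                      # 2.5
print(e12sq, (n * trA2 - trA**2) / (n * (n - 1) * (n + 2)))            # 2.6
print(e1122, ((n + 1) * trA**2 - 2 * trA2) / (n * (n - 1) * (n + 2)))  # 2.7
```

For this $A$, the formulas give $\mathbb{E}[B_{11}] = 2$, $\mathbb{E}[B_{11}^2] = 4.8$, $\mathbb{E}[B_{12}^2] = 0.6$, and $\mathbb{E}[B_{11}B_{22}] = 3.6$.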
Proof of 0.1. By symmetry, the $x_i^2$ are identically distributed, so the expectations $\mathbb{E}[x_i^2]$ are equal for all $i$. Since $\sum_{i=1}^n x_i^2 = 1$, we obtain $\mathbb{E}[x_i^2] = \frac{1}{n}$. $\square$
Proofs of 0.2 and 0.3 together. Squaring $\sum_{i=1}^n x_i^2 = 1$ and taking expectation, we have
\[n \mathbb{E}[x_i^4] + n(n-1) \mathbb{E}_{i \ne j}[x_i^2x_j^2] = 1.\]Meanwhile, let $z \sim \mathcal{N}(0,I_n)$. Since an orthogonal transformation of a standard Gaussian is again standard Gaussian, $(z_i,z_j)$ and $(\frac{z_i+z_j}{\sqrt{2}}, \frac{z_i-z_j}{\sqrt{2}})$ are identically distributed. Then, for $x = z / \|z\|$, which is uniform on $S^{n-1}$, we also have that $(x_i,x_j)$ and $(\frac{x_i+x_j}{\sqrt{2}}, \frac{x_i-x_j}{\sqrt{2}})$ are identically distributed. Therefore,
\[\mathbb{E}_{i \ne j}[x_i^2x_j^2] = \mathbb{E}_{i \ne j}\left[\frac{(x_i+x_j)^2(x_i-x_j)^2}{4}\right] = \frac{1}{2}\mathbb{E}[x_i^4] - \frac{1}{2}\mathbb{E}_{i \ne j}[x_i^2x_j^2].\]Combining the above two displayed equations, we solve that $\mathbb{E}[x_i^4] = \frac{3}{n(n+2)}$ and $\mathbb{E}_{i \ne j}[x_i^2x_j^2] = \frac{1}{n(n+2)}$. $\square$
Proof of 0.4. Using the result of 0.1, $\mathbb{E}[x_i^2] = \frac{1}{n}$. On the other hand, when $i \ne j$, $x_ix_j$ has the same probability/density as $-x_ix_j$. Therefore, $\mathbb{E}[x_ix_j] = 0$. $\square$
Proof of 1.1. This is simply because $\mathbb{E}[x^TAx] = \mathbb{E}[\text{tr}(Axx^T)] = \text{tr}(A\mathbb{E}[xx^T])$. Using the result of 0.4, we conclude the proof. $\square$
Proof of 2.1. This is the same as 1.1. $\square$
Proof of 2.2. Write $B_{ij}$ as the bilinear form $B_{ij} = y^TAz$, where $y$ and $z$ are the $i$-th and $j$-th columns of $X$. By the symmetry of the Haar measure, $(y, z)$ and $(-y, z)$ are identically distributed, so $y^TAz$ and $(-y)^TAz$ have the same distribution. Hence, $\mathbb{E}[y^TAz] = 0$. $\square$
Proof of 2.3. This simply follows from 2.1. $\square$
Proof of 2.4. Note that $\sum_{i,j=1}^n B_{ij}^2$ is the squared Frobenius norm of $B$. Since $B$ is symmetric, $\sum_{i,j=1}^n B_{ij}^2 = \text{tr}(B^2) = \text{tr}(X^TA^2X)$. Therefore, $\mathbb{E}[\sum_{i,j=1}^n B_{ij}^2] = \mathbb{E}[\text{tr}(X^TA^2X)] = \text{tr}(A^2\mathbb{E}[XX^T])$. The proof is concluded by noting that the random variable $XX^T$ is the deterministic constant $I_n$. $\square$
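In fact, the identities behind 2.3 and 2.4 hold deterministically for every orthonormal $X$, not only in expectation. A quick sketch with one random $X$ and one arbitrary symmetric $A$ (both illustrative choices):

```python
import math
import random

# For any orthonormal X: sum_i B_ii = tr(A) and sum_{i,j} B_ij^2 = tr(A^2),
# exactly (up to floating-point error). A and the sampling are illustrative.
random.seed(3)
n = 4
A = [[float(min(i, j) + 1) for j in range(n)] for i in range(n)]  # symmetric

# Haar-random X via Gram-Schmidt on i.i.d. Gaussian columns.
cols = []
for _ in range(n):
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    for u in cols:
        d = sum(ui * vi for ui, vi in zip(u, v))
        v = [vi - d * ui for ui, vi in zip(u, v)]
    s = math.sqrt(sum(vi * vi for vi in v))
    cols.append([vi / s for vi in v])

def b(i, j):  # B_ij = (i-th column)^T A (j-th column)
    return sum(cols[i][a] * A[a][c] * cols[j][c]
               for a in range(n) for c in range(n))

trA = sum(A[i][i] for i in range(n))
trA2 = sum(A[i][j] * A[j][i] for i in range(n) for j in range(n))
diag_sum = sum(b(i, i) for i in range(n))                        # = tr(A)
frob_sq = sum(b(i, j) ** 2 for i in range(n) for j in range(n))  # = tr(A^2)
print(diag_sum, trA)
print(frob_sq, trA2)
```

Both pairs agree to machine precision, with no averaging needed.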
Proof of 2.5. We use $x^TAx$ in place of $B_{ii}$ to simplify the indices. Write
\[\begin{align*} \mathbb{E}[(x^TAx)^2] &= \mathbb{E}\left[\sum_{a,b,c,d=1}^n A_{ab}A_{cd}x_ax_bx_cx_d\right]\\ &= \sum_{a=1}^n A_{aa}^2 \mathbb{E}[x_a^4] + \sum_{a \ne b} A_{ab}A_{ab} \mathbb{E}[x_a^2x_b^2] + \sum_{a \ne b} A_{ab}A_{ba} \mathbb{E}[x_a^2x_b^2] + \sum_{a \ne c} A_{aa}A_{cc} \mathbb{E}[x_a^2x_c^2], \end{align*}\]because the terms that include odd degrees of $x_a$ vanish by symmetry. By noting that $\sum_{a \ne b} A_{ab}A_{ab} = \text{tr}(A^2) - \sum_{a=1}^n A_{aa}^2$ and that $\sum_{a \ne c} A_{aa}A_{cc} = (\text{tr}A)^2 - \sum_{a=1}^n A_{aa}^2$, we further write
\[\mathbb{E}[(x^TAx)^2] = \Big( 2\text{tr}(A^2) + (\text{tr}A)^2 \Big) \Big( \mathbb{E}_{a \ne b}[x_a^2x_b^2] \Big) + \left( \sum_{a=1}^n A_{aa}^2 \right) \Big( \mathbb{E}[x_a^4]-3\mathbb{E}_{a \ne b}[x_a^2x_b^2] \Big).\]Using the results of 0.2 and 0.3, we conclude the proof. $\square$
Proof of 2.6. By symmetry, all $B_{ii}^2$ share one expectation and all $B_{ij}^2$ ($i \ne j$) share another, so 2.4 gives $n \mathbb{E}[B_{ii}^2] + n(n-1) \mathbb{E}[B_{ij}^2] = \text{tr}(A^2)$. Using the result of 2.5, we conclude the proof. $\square$
Proof of 2.7. Since $\sum_{i=1}^n B_{ii} = \text{tr}(B) = \text{tr}(A)$ deterministically, squaring this identity and taking expectation gives, by symmetry, $n \mathbb{E}[B_{ii}^2] + n(n-1) \mathbb{E}[B_{ii}B_{jj}] = (\text{tr} A)^2$. Using the result of 2.5, we conclude the proof. $\square$
Proof of 2.8. Based on the definition of variance, we have $\text{Var}(B_{ii}) = \mathbb{E}[B_{ii}^2] - (\mathbb{E}[B_{ii}])^2$. Using the results of 2.5 and 2.1, we conclude the proof. $\square$
Proof of 2.9. Based on the definition of covariance, we have $\text{Cov}(B_{ii}, B_{jj}) = \mathbb{E}[B_{ii}B_{jj}] - \mathbb{E}[B_{ii}]\mathbb{E}[B_{jj}]$. Using the results of 2.7 and 2.1, we conclude the proof. $\square$
Remark. For a sanity check, let $A = \lambda I$. Then, $B = \lambda I$ is deterministic. Hence, 2.1 should be $\lambda$; 2.5 and 2.7 should be $\lambda^2$; 2.2, 2.6, 2.8, and 2.9 should be $0$; 2.3 should be $\lambda n$; and 2.4 should be $\lambda^2 n$.
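The remark's first claim can be verified directly: for $A = \lambda I$, $B = \lambda X^TX = \lambda I$ exactly, whatever the orthonormal $X$. A minimal sketch ($\lambda = 2.5$ is an arbitrary choice):

```python
import math
import random

# For A = lam * I, B = X^T A X = lam * I exactly; lam and n are illustrative.
random.seed(4)
n, lam = 3, 2.5

# Haar-random X via Gram-Schmidt on i.i.d. Gaussian columns.
cols = []
for _ in range(n):
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    for u in cols:
        d = sum(ui * vi for ui, vi in zip(u, v))
        v = [vi - d * ui for ui, vi in zip(u, v)]
    s = math.sqrt(sum(vi * vi for vi in v))
    cols.append([vi / s for vi in v])

# B_ij = lam * <col_i, col_j> = lam * (1 if i == j else 0).
max_err = max(
    abs(lam * sum(cols[i][a] * cols[j][a] for a in range(n))
        - (lam if i == j else 0.0))
    for i in range(n) for j in range(n)
)
print(max_err)  # numerically zero
```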