First six simulated draws of (z1, z2), i.e. head(z):
             z1          z2
[1,] -0.4255127 -1.91351696
[2,] -0.2829203  0.05775194
[3,] -0.8986773 -0.48302150
[4,]  0.7065184 -2.37167063
[5,]  2.0916699  0.35882645
[6,]  1.6356643 -0.54297591
University of Oxford
If you hear a “prominent” economist using the word “equilibrium,” or “normal distribution,” do not argue with him; just ignore him, or try to put a rat down his shirt.
– Nassim Nicholas Taleb
The vector that stacks the means of each constituent random variable in the random vector \(\mathbf{X}\). \[ \mathbb{E}(\boldsymbol{X}) = \mathbb{E}\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} \equiv \begin{bmatrix} \mathbb{E}(X_1) \\ \mathbb{E}(X_2) \\ \vdots \\ \mathbb{E}(X_p) \end{bmatrix} \]
\[ \text{Var}(\mathbf{X}) = \text{Var}\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = \begin{bmatrix} \text{Var}(X_1) & \text{Cov}(X_1, X_2) & \cdots & \text{Cov}(X_1, X_p) \\ \text{Cov}(X_2, X_1) & \text{Var}(X_2) & \cdots & \text{Cov}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots\\ \text{Cov}(X_p, X_1) & \text{Cov}(X_p, X_2) & \cdots& \text{Var}(X_p) \end{bmatrix} \]
\[ \text{Cor}(\mathbf{X}) = \text{Cor}\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = \begin{bmatrix} 1 & \text{Cor}(X_1, X_2) & \cdots & \text{Cor}(X_1, X_p) \\ \text{Cor}(X_2, X_1) & 1 & \cdots & \text{Cor}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots\\ \text{Cor}(X_p, X_1) & \text{Cor}(X_p, X_2) & \cdots& 1 \end{bmatrix} \]
Start with independent standard normal RVs: \(Z_1\) and \(Z_2\). \[ \begin{align*} \mathbb{E}\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} &\equiv \begin{bmatrix}\mathbb{E}(Z_1) \\ \mathbb{E}(Z_2) \end{bmatrix} = \begin{bmatrix} 0 \\ 0\end{bmatrix}\\ \\ \text{Var} \begin{bmatrix} Z_1 \\ Z_2\end{bmatrix} &\equiv \begin{bmatrix} \text{Var}(Z_1) & \text{Cov}(Z_1, Z_2) \\ \text{Cov}(Z_2, Z_1) & \text{Var}(Z_2) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\\ \\ \text{Cor} \begin{bmatrix} Z_1 \\ Z_2\end{bmatrix} &\equiv \begin{bmatrix} 1 & \text{Cor}(Z_1, Z_2) \\ \text{Cor}(Z_2, Z_1) & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \end{align*} \]
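As a quick sanity check of these population values, we can simulate a large number of draws and compare the sample moments to them. This is an illustrative sketch; the seed and sample size are arbitrary, so the slides' own z will contain different numbers.

```r
# Sketch: simulate iid N(0,1) draws and check the mean vector,
# variance matrix, and correlation matrix against their population values.
set.seed(1234)
n <- 1e5
z <- cbind(z1 = rnorm(n), z2 = rnorm(n))

colMeans(z)  # both entries close to 0
var(z)       # close to the 2x2 identity matrix
cor(z)       # also close to the identity: Z1 and Z2 are uncorrelated
```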
Marginal density plots of z1 and z2: these are the distributions of \(Z_1\) and \(Z_2\) considered separately.
Alternatively, use kernel density estimation:
Two-dimensional kernel density estimator viewed from above:
Colors are brighter in regions of higher density.
# Shift means from (0, 0) to (1, -1)
library(ggplot2)    # plotting
library(tibble)     # as_tibble()
library(patchwork)  # combine plots with +
x <- cbind(x1 = z[, 1] + 1, x2 = z[, 2] - 1)
x_marginals <- ggplot(as_tibble(x)) +
  geom_density(aes(x = x1), fill = 'black', alpha = 0.5) +
  geom_density(aes(x = x2), fill = 'orange', alpha = 0.5) +
  xlab('')
x_joint <- ggplot(as_tibble(x)) +
  geom_density2d_filled(aes(x = x1, y = x2)) +
  coord_fixed()
x_marginals + x_joint
Make marginal and joint density plots of mom.iq and kid.score. Do they appear to be normally distributed?

There is an R function called cov2cor() but there isn't one called cor2cov(). Why?

Circular contours become elliptical contours:
Construct \(X_1\) and \(X_2\) as linear combinations of \((Z_1, Z_2)\)
Now the ellipses are tilted rather than axis-aligned
Suppose that \(Z_1, Z_2 \sim \text{ iid N}(0,1)\) and \[ \boldsymbol{X} = \mathbf{A}\boldsymbol{Z}, \quad \boldsymbol{X}= \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \quad \mathbf{A} = \begin{bmatrix} a & b \\ c & d\end{bmatrix}, \quad \boldsymbol{Z} = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}. \] Calculate \(\text{Var}(X_1)\), \(\text{Var}(X_2)\), and \(\text{Cov}(X_1, X_2)\) in terms of the constants \(a, b, c, d\). Using these calculations, show that the variance-covariance matrix of \(\boldsymbol{X}\) equals \(\mathbf{A} \mathbf{A}'\). Use this result to work out the variance-covariance matrix of my example from above with \(a = 2, b = 1, c = 1, d = 4\) and check that it agrees with the simulations.
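Before doing the algebra, you can sanity-check the claim \(\text{Var}(\boldsymbol{X}) = \mathbf{A}\mathbf{A}'\) by simulation. This is an illustrative sketch, not the requested derivation; the seed and sample size are arbitrary.

```r
# Sketch: verify Var(X) = A A' by simulation for a = 2, b = 1, c = 1, d = 4.
set.seed(1693)
A <- matrix(c(2, 1,
              1, 4), 2, 2, byrow = TRUE)
z <- cbind(rnorm(1e5), rnorm(1e5))  # each row is a draw of (Z1, Z2)
x <- z %*% t(A)                     # each row is A %*% (Z1, Z2)

A %*% t(A)  # population variance matrix
var(x)      # sample variance matrix: should be close to A A'
```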
Find \((a, b, c, d)\) so \(\text{Var}(X_1) =\text{Var}(X_2) = 1\) and \(\text{Cor}(X_1, X_2) = 0.5\) where \[ \begin{align*} X_1 &= a Z_1 + b Z_2 \\ X_2 &= c Z_1 + d Z_2. \end{align*} \]
[,1] [,2]
[1,] 1.0 0.0000000
[2,] 0.5 0.8660254
x1 x2
x1 1.0063783 0.5035995
x2 0.5035995 1.0047603
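The output above is consistent with a construction along the following lines. This is a sketch: the exact code and seed behind the slides' output aren't shown, so the sample variance matrix will differ slightly.

```r
# Sketch: build correlated draws with unit variances and Cor = 0.5
# via the Cholesky factor of the target variance matrix.
set.seed(4321)
z <- cbind(rnorm(1e5), rnorm(1e5))  # iid N(0,1) columns
Sigma <- matrix(c(1, 0.5,
                  0.5, 1), 2, 2)
L <- t(chol(Sigma))  # lower triangular: rows (1, 0) and (0.5, 0.8660254)
L
x <- z %*% t(L)      # each row is L %*% (Z1, Z2)
colnames(x) <- c('x1', 'x2')
var(x)               # close to Sigma
```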
Transform z from above into x such that x1 has variance one, x2 has variance four, and the correlation between them equals 0.4. Make a density plot of your result.

Every variance-covariance matrix is symmetric, but not every symmetric matrix is a variance-covariance matrix:
A symmetric matrix that is not a valid variance-covariance matrix:
     [,1] [,2]
[1,]    4   16
[2,]   16    9

Its "correlation matrix" according to cov2cor():
         [,1]     [,2]
[1,] 1.000000 2.666667
[2,] 2.666667 1.000000
Correlations shouldn’t come out to be larger than one!
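Another way to see the problem (a sketch using eigen(), which is not otherwise used in these notes): a genuine variance-covariance matrix is positive semi-definite, so all of its eigenvalues are nonnegative, but this matrix has a negative eigenvalue.

```r
# Sketch: the symmetric matrix above has a negative eigenvalue,
# so it cannot be a variance-covariance matrix.
M <- matrix(c(4, 16,
              16, 9), 2, 2)
eigen(M)$values  # one eigenvalue is negative
cov2cor(M)       # hence "correlations" larger than one
```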
chol() returns \(\mathbf{R} = \mathbf{L}'\), i.e. \(\boldsymbol{\Sigma} = \mathbf{R}'\mathbf{R}\).

To check whether a \(3 \times 3\) matrix M is p.d. without calling chol(), proceed as follows. First check that M[1,1] is positive. Next use det() to check that the determinant of M[1:2,1:2] is positive. Finally check that det(M) is positive. If M passes all three tests, it's p.d. The same procedure works for any symmetric matrix: check that each leading principal minor, i.e. the determinant of each top-left \(k \times k\) submatrix, is positive.

Only one of these matrices is p.d. Which one? \[
\mathbf{A} =\begin{bmatrix}
1 & 2 & 3\\
2 & 2 & 1\\
3 & 1 & 3
\end{bmatrix}, \quad
\mathbf{B} = \begin{bmatrix}
3 & 2 & 1 \\
2 & 3 & 1 \\
1 & 1 & 3
\end{bmatrix}
\]

Use chol() to make 100,000 draws from a multivariate normal distribution with this variance matrix. Check your work with var().

Install the package mvtnorm and consult ?rmvnorm. Then repeat the preceding exercise "the easy way," without using chol(). Check your work.
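The leading-principal-minor test described above can be packaged as a small helper function. This is an illustrative sketch: is_pd() is not a built-in R function, and it assumes its argument is symmetric.

```r
# Sketch: test positive definiteness of a symmetric matrix by checking
# that every leading principal minor (the determinant of the top-left
# k-by-k submatrix) is positive.
is_pd <- function(M) {
  all(sapply(seq_len(nrow(M)),
             function(k) det(M[1:k, 1:k, drop = FALSE]) > 0))
}

is_pd(diag(2))  # TRUE: the identity matrix is p.d.
```

Try it on \(\mathbf{A}\) and \(\mathbf{B}\) to confirm your answer to the exercise.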