15  Joint Normal Distributions

Figure 15.1: Bivariate Normal distribution with means (0, 0), standard deviations (1, 1) and correlation 0.7.

Figure 15.2: Bivariate Normal distribution with two conditional slices highlighted.

Figure 15.3: Two conditional Normal distributions.
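
The conditional slices in these figures follow a standard Bivariate Normal result, stated here for reference. The symbols \(X\), \(Y\), \(\mu_X\), \(\mu_Y\), \(\sigma_X\), \(\sigma_Y\), and \(\rho\) are generic notation assumed for this summary and do not appear in the figures. If \((X, Y)\) has a Bivariate Normal distribution with those means, standard deviations, and correlation, then the conditional distribution of \(Y\) given \(X = x\) is Normal with

\[
\text{E}(Y | X = x) = \mu_Y + \rho \, \sigma_Y \, \frac{x - \mu_X}{\sigma_X},
\qquad
\text{SD}(Y | X = x) = \sigma_Y \sqrt{1 - \rho^2}.
\]

The conditional mean changes linearly with \(x\), while the conditional standard deviation is the same for every \(x\) and is smaller than \(\sigma_Y\) whenever \(\rho \neq 0\).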

Example 15.1 Suppose that SAT Math (\(M\)) and Reading (\(R\)) scores of Cal Poly students have a Bivariate Normal distribution. Math scores have mean 640 and SD 80, Reading scores have mean 610 and SD 70, and the correlation between Math and Reading scores is 0.7.

  1. Identify the distribution of Math scores. Find the probability that a student has a Math score above 700.




  2. Compute and interpret \(\text{E}(M|R = 700)\).




  3. Compute and interpret \(\text{SD}(M|R = 700)\).




  4. Identify the conditional distribution of Math scores given the Reading score is 700. Find the probability that a student has a higher Math than Reading score if the student scores 700 on Reading.




  5. Compute and interpret \(\text{E}(M|R = 550)\).




  6. Compute and interpret \(\text{SD}(M|R = 550)\).




  7. Identify the conditional distribution of Math scores given the Reading score is 550. Find the probability that a student has a higher Math than Reading score if the student scores 550 on Reading.




  8. Describe how you could simulate a single \((M, R)\) pair.




  9. Find and interpret \(\text{E}(M|R)\).
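
One way to check the numerical parts of Example 15.1 is with `pnorm`. The sketch below is only an illustration under the stated Bivariate Normal assumptions; the variable names and the helper functions `cond_mean_M` and `p_math_higher` are hypothetical and not part of the example.

```r
# Parameters stated in Example 15.1
mu_M = 640; sigma_M = 80
mu_R = 610; sigma_R = 70
rho = 0.7

# Part 1: the marginal distribution of M is Normal(640, 80)
p_math_above_700 = 1 - pnorm(700, mu_M, sigma_M)

# Conditional mean of M given R = r (hypothetical helper)
cond_mean_M = function(r) mu_M + rho * sigma_M * (r - mu_R) / sigma_R

# Conditional SD of M given R = r; it does not depend on r
cond_sd_M = sigma_M * sqrt(1 - rho ^ 2)

# P(M > R | R = r), using the conditional Normal distribution (hypothetical helper)
p_math_higher = function(r) 1 - pnorm(r, cond_mean_M(r), cond_sd_M)

p_math_above_700
cond_mean_M(700)    # 712
cond_sd_M           # about 57.1
p_math_higher(700)
cond_mean_M(550)    # 592
p_math_higher(550)
```

The conditional means are pulled toward the overall Math mean: a Reading score of 700 is about 1.29 SDs above the Reading mean, but the conditional mean Math score of 712 is only 0.9 SDs above the Math mean.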




Figure 15.4: A Bivariate Normal distribution with some conditional distributions and conditional expected values highlighted.
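
The conditional expected values highlighted in the figure lie on a line. As a worked sketch for the last part of Example 15.1, substituting the example's parameters into the standard Bivariate Normal conditional mean formula gives

\[
\text{E}(M | R) = 640 + 0.7 \cdot 80 \cdot \frac{R - 610}{70} = 640 + 0.8 \, (R - 610) = 152 + 0.8 \, R.
\]

Under this reading, \(\text{E}(M|R)\) is a random variable, a linear function of \(R\): each additional Reading point corresponds to a conditional mean Math score that is 0.8 points higher. As a check, plugging in the mean Reading score gives \(152 + 0.8 \cdot 610 = 640\), the mean Math score.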
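N_rep = 10000

# Simulate Reading scores from their marginal Normal(610, 70) distribution
R = rnorm(N_rep, 610, 70)
# Simulate each Math score from its conditional Normal distribution given the simulated R
M = rnorm(N_rep, 640 + 0.7 * 80 * (R - 610) / 70, 80 * sqrt(1 - 0.7 ^ 2))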

plot(R, M)

library(ggplot2)

ggplot(data.frame(R, M), aes(x = R, y = M)) +
  stat_density_2d(aes(fill = after_stat(level)), geom = "polygon", colour = "white")

cor(R, M)
[1] 0.6973201
mean(R)
[1] 609.4482
sd(R)
[1] 70.43551
mean(M)
[1] 639.8739
sd(M)
[1] 80.14485
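
The simulated pairs can also be checked against the conditional distribution itself. The following is a rough sketch, not part of the original analysis: it repeats the simulation with an arbitrary seed (an assumption added for reproducibility, so its results will not match the output printed above), fits a least squares line, and estimates a conditional probability from the students whose simulated Reading score is near 700. The 5-point window is an arbitrary choice.

```r
set.seed(15)  # arbitrary seed, assumed here only for reproducibility

N_rep = 10000
R = rnorm(N_rep, 610, 70)
M = rnorm(N_rep, 640 + 0.7 * 80 * (R - 610) / 70, 80 * sqrt(1 - 0.7 ^ 2))

# The least squares line should be close to E(M | R) = 152 + 0.8 R
fit = lm(M ~ R)
coef(fit)

# The residual SD should be close to 80 * sqrt(1 - 0.7^2), about 57.1
sigma(fit)

# Approximate P(M > 700 | R = 700) using pairs with R within 5 points of 700
near_700 = abs(R - 700) < 5
sum(near_700)
mean(M[near_700] > 700)
```

Because only a few hundred of the simulated pairs fall in the window around 700, the last estimate is much noisier than the overall summaries.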