1.5 Exercises: Chapter 1

  1. The court case: the blue or green cab

A cab was involved in a hit-and-run accident at night. There are two cab companies in the town: blue and green. The former has 150 cabs, and the latter 850 cabs. A witness said that a blue cab was involved in the accident; the court tested the witness's reliability under the same circumstances and found that the witness correctly identified the color of the cab 80% of the time. What is the probability that the cab involved in the accident was blue given that the witness said it was blue?

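Solution

Let \(WB\) and \(WG\) denote the events that the witness says the cab was blue and green, respectively, and let \(B\) and \(G\) denote the events that the cab is blue and green, respectively. The reliability test gives \(P(WB|B)=P(WG|G)=0.8\), so \(P(WB|G)=1-P(WB|B)=0.2\), and the fleet sizes give \(P(B)=150/1000=0.15\). We need to calculate \(P(B|WB)\):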

\[\begin{align} P(B|WB)&=\frac{P(B,WB)}{P(WB)}\\ &=\frac{P(WB|B)\times P(B)}{P(WB|B)\times P(B)+(1-P(WB|B))\times (1-P(B))}\\ &=\frac{0.8\times 0.15}{0.8\times 0.15+0.2\times 0.85}\\ &=0.41 \end{align}\]
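The arithmetic above can be checked with a few lines of R. This is only a minimal sketch: the variable names (`p_blue`, `p_correct`, `post_blue`) are illustrative and not part of the original solution, and it assumes, as the derivation does, that the witness is equally reliable for both colors.

```r
# Inputs taken from the exercise statement
p_blue <- 150 / (150 + 850)  # prior P(B): share of blue cabs in town
p_correct <- 0.8             # assumed P(WB | B) = P(WG | G)

# Bayes' rule: P(B | WB) = P(WB | B) P(B) / P(WB)
num <- p_correct * p_blue
den <- num + (1 - p_correct) * (1 - p_blue)  # P(WB) by total probability
post_blue <- num / den

round(post_blue, 2)
```

Although the witness is right 80% of the time, the low prior share of blue cabs keeps the posterior probability below one half.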

  1. The Monty Hall problem

What is the probability of winning the car in the Monty Hall problem by switching doors when there are four doors, with three goats and one car? Solve this problem analytically and computationally. What if there are 100 doors, 99 goats and one car?

Solution

Let \(P_i\) denote the event that the contestant picks door \(i\), \(H_i\) the event that the host opens door \(i\), and \(C_i\) the event that the car is behind door \(i\). Assume that the contestant picked door 1 and the host opened door 3; the contestant is then interested in \(P(C_i|H_3,P_1), i = 2 \ \text{or} \ 4\). The host never opens the contestant's door or the door hiding the car, and otherwise chooses at random, so \(P(H_3|C_3,P_1)=0\), \(P(H_3|C_2,P_1)=P(H_3|C_4,P_1)=1/2\) and \(P(H_3|C_1,P_1)=1/3\). Then, using equation (1.2)

\[\begin{align} P(C_i|H_3,P_1)&= \frac{P(C_i,H_3,P_1)}{P(H_3,P_1)}\\ &= \frac{P(H_3|C_i,P_1)P(C_i|P_1)P(P_1)}{P(H_3|P_1)\times P(P_1)}\\ &= \frac{P(H_3|C_i,P_1)P(C_i)}{P(H_3|P_1)}\\ &=\frac{1/2\times 1/4}{1/3}\\ &=\frac{3}{8}, \end{align}\] where the third equality uses the fact that \(C_i\) and \(P_1\) are independent events, and \(P(H_3|P_1)=\sum_{j=1}^{4}P(H_3|C_j,P_1)P(C_j)=\frac{1}{4}\left(\frac{1}{3}+\frac{1}{2}+0+\frac{1}{2}\right)=\frac{1}{3}\) by the law of total probability.

Therefore, changing the initial decision increases the probability of getting the car from 1/4 to 3/8!
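The same argument extends to \(n\) doors. The following is a sketch under one stated assumption: the host opens exactly one goat door, chosen at random, and the contestant then switches uniformly at random among the \(n-2\) remaining closed doors. In that case \(P(H_k|C_i,P_1)=1/(n-2)\) for a door \(i\) that is neither picked nor opened, \(P(C_i)=1/n\), and \(P(H_k|P_1)=1/(n-1)\) by symmetry, so

\[\begin{align} P(C_i|H_k,P_1)&=\frac{\frac{1}{n-2}\times\frac{1}{n}}{\frac{1}{n-1}}\\ &=\frac{n-1}{n(n-2)}. \end{align}\]

Setting \(n=4\) recovers \(3/8\), and \(n=100\) gives \(99/9800\approx 0.0101\), slightly above the \(1/100\) obtained by keeping the initial door. If the host instead opened all but one of the other doors, the answer would be different; that variant is not covered by this sketch.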

set.seed(0101) # Set simulation seed
S <- 100000 # Simulations
Game <- function(opt = 3){
  # opt: number of options. opt > 2, it is 3 in the original game
  opts <- 1:opt 
  car <- sample(opts, 1) # car location
  guess1 <- sample(opts, 1) # Initial guess pick
  
  if(opt == 3 && car != guess1) {
    host <- opts[-c(car, guess1)]
    } else {
    host <- sample(opts[-c(car, guess1)], 1)
    }
  
  win1 <- guess1 == car # Win given no change
  
  if(opt == 3) {
    guess2 <- opts[-c(host, guess1)]
  } else {
    guess2 <- sample(opts[-c(host, guess1)], 1)
  } 
  
  win2 <- guess2 == car # Win given change

  return(c(win1, win2))
}

Prob <- rowMeans(replicate(S, Game(opt = 4))) #Win probabilities
paste("Winning probabilities no changing door is", Prob[1], sep = " ")
## [1] "Winning probabilities no changing door is 0.25151"
paste("Winning probabilities changing door is", Prob[2], sep = " ")
## [1] "Winning probabilities changing door is 0.37267"
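
For the 100-door question, the same experiment can be repeated with a self-contained function. The sketch below is illustrative only: `sim_switch` is a hypothetical helper, not part of the original solution, and it assumes that the host opens exactly one goat door and that the contestant switches at random among the remaining closed doors. It indexes with `sample.int()` so that a single remaining door is handled without a special case.

```r
set.seed(10101)  # Seed chosen for this sketch only

# Hypothetical helper: estimated probability of winning by switching
# with n doors (n >= 3), assuming the host opens one goat door at random.
sim_switch <- function(n, S = 20000) {
  wins <- replicate(S, {
    car <- sample.int(n, 1)     # car location
    guess1 <- sample.int(n, 1)  # initial pick
    # Doors the host may open: neither the car nor the initial pick
    closed <- setdiff(seq_len(n), c(car, guess1))
    host <- closed[sample.int(length(closed), 1)]
    # Doors available when switching: neither opened nor initially picked
    remaining <- setdiff(seq_len(n), c(host, guess1))
    guess2 <- remaining[sample.int(length(remaining), 1)]
    guess2 == car
  })
  mean(wins)
}

p4 <- sim_switch(4)      # should be close to the analytic 3/8
p100 <- sim_switch(100)  # 100 doors, 99 goats and one car
c(four_doors = p4, hundred_doors = p100)
```

With 100 doors the gain from switching is small under this assumption, because only one goat door is revealed.
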
  1. Solve the health insurance example using a Gamma prior in the rate parametrization, that is, \(\pi(\lambda)=\frac{\beta_0^{\alpha_0}}{\Gamma(\alpha_0)}\lambda^{\alpha_0-1}\exp\left\{-\lambda\beta_0\right\}\).

  2. A preliminary survey regarding detergent preferences found that 20, 30 and 50 people prefer brands A, B and C, respectively. What is the posterior probability of having 25, 40, 35 preferences for brands A, B and C, respectively? Solve this exercise using a vague prior and Empirical Bayes.

  3. A mayoral election poll with two candidates shows that candidate A will get 350 votes, and candidate B will get 300. What is the probability that candidate A gets more than 50% of the votes? Solve this exercise using a vague prior and Empirical Bayes.

  4. Show that, given the absolute error loss function \(L({\theta},a)=|{\theta}-a|\), the Bayes estimate \({\delta}(\mathbf{y})\) is the posterior median.

Solution

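The posterior expected loss is \(\int_{{\Theta}} |{\theta}-a|\pi({\theta}|\mathbf{y})d{\theta}=\int_{-\infty}^a (a-{\theta})\pi({\theta}|\mathbf{y})d{\theta}+\int_{a}^{\infty} ({\theta}-a)\pi({\theta}|\mathbf{y})d{\theta}\). Differentiating with respect to \(a\) using the Leibniz rule (the boundary terms vanish because both integrands are zero at \({\theta}=a\)) gives \(\int_{-\infty}^a \pi({\theta}|\mathbf{y})d{\theta}-\int_{a}^{\infty} \pi({\theta}|\mathbf{y})d{\theta}\). Setting this derivative equal to zero,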

\[\begin{equation} \int_{-\infty}^a \pi({\theta}|\mathbf{y})d{\theta}=\int_{a}^{\infty} \pi({\theta}|\mathbf{y})d{\theta}, \end{equation}\]

then,

\[\begin{equation} 2\int_{-\infty}^a \pi({\theta}|\mathbf{y})d{\theta}=\int_{-\infty}^{\infty} \pi({\theta}|\mathbf{y})d{\theta}=1, \end{equation}\]

that is, the minimizing action satisfies \(P({\theta}\leq a|\mathbf{y})=1/2\), so \({\delta}(\mathbf{y})\) is the posterior median.
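
This result can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the proof: it uses draws from a Gamma distribution as a stand-in for an arbitrary posterior (the shape and rate values are made up for illustration), approximates the posterior expected absolute loss by a sample average, and minimizes it with `optimize()`.

```r
set.seed(10101)  # Seed chosen for this sketch only

# Hypothetical posterior draws; any continuous posterior would do
theta <- rgamma(10000, shape = 3, rate = 2)

# Monte Carlo approximation of the posterior expected absolute loss
exp_abs_loss <- function(a) mean(abs(theta - a))

# Minimize over the range of the draws
opt <- optimize(exp_abs_loss, interval = range(theta))

# The minimizer should agree with the sample median up to numerical tolerance
c(minimizer = opt$minimum, median = median(theta))
```

Replacing the absolute loss with a squared loss in `exp_abs_loss` would move the minimizer to the sample mean instead.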