1.5 Exercises: Chapter 1
- The court case: the blue or green cab
A cab was involved in a hit-and-run accident at night. There are two cab companies in the town: blue and green. The former has 150 cabs, and the latter has 850 cabs. A witness said that a blue cab was involved in the accident; the court tested the witness's reliability under the same circumstances and found that the witness correctly identified the color of the cab 80% of the time. What is the probability that the cab involved in the accident was blue, given that the witness said it was blue?
Solution
Set \(WB\) and \(WG\) equal to the events that the witness said the cab was blue and green, respectively. Set \(B\) and \(G\) equal to the events that the cab is blue and green, respectively. The witness is assumed to be equally reliable for both colors, so that \(P(WB|G)=1-P(WB|B)=0.2\). We need to calculate \(P(B|WB)\):
\[\begin{align} P(B|WB)&=\frac{P(B,WB)}{P(WB)}\\ &=\frac{P(WB|B)\times P(B)}{P(WB|B)\times P(B)+(1-P(WB|B))\times (1-P(B))}\\ &=\frac{0.8\times 0.15}{0.8\times 0.15+0.2\times 0.85}\\ &=0.41 \end{align}\]
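The calculation above can be checked numerically. The following is a minimal sketch in R, assuming (as in the solution) that the witness is equally reliable for both colors; the variable names, the seed and the number of draws are illustrative choices, not part of the exercise.

```r
# Prior and witness reliability taken from the exercise statement
p_blue    <- 150 / (150 + 850)  # P(B)
p_correct <- 0.8                # P(WB | B) = P(WG | G), assumed symmetric

# Bayes' rule
p_wb      <- p_correct * p_blue + (1 - p_correct) * (1 - p_blue)  # P(WB)
post_blue <- p_correct * p_blue / p_wb                            # P(B | WB)
round(post_blue, 4)

# Monte Carlo check (seed and number of draws are arbitrary choices)
set.seed(10101)
n         <- 100000
is_blue   <- runif(n) < p_blue      # true color of the cab
correct   <- runif(n) < p_correct   # whether the witness is right
says_blue <- ifelse(correct, is_blue, !is_blue)
mean(is_blue[says_blue])            # share of blue cabs among "blue" reports
```

The simulated frequency should be close to the analytical value \(0.12/0.29\), up to Monte Carlo error.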
- The Monty Hall problem
What is the probability of winning a car in the Monty Hall problem switching the decision if there are four doors, where there are three goats and one car? Solve this problem analytically and computationally. What if there are 100 doors, 99 goats and one car?
Solution
Let \(P_i\) denote the event that the contestant picks door No. \(i\), \(H_i\) the event that the host opens door No. \(i\), and \(C_i\) the event that the car is behind door No. \(i\). Assume that the contestant picked door No. 1 and the host opened door No. 3; the contestant is then interested in \(P(C_i|H_3,P_1)\), \(i = 2 \ \text{or} \ 4\). The host never opens the contestant's door or the door hiding the car, and otherwise chooses at random, so \(P(H_3|C_3,P_1)=0\), \(P(H_3|C_2,P_1)=P(H_3|C_4,P_1)=1/2\) and \(P(H_3|C_1,P_1)=1/3\). Then, using equation (1.2)
\[\begin{align} P(C_i|H_3,P_1)&= \frac{P(C_i,H_3,P_1)}{P(H_3,P_1)}\\ &= \frac{P(H_3|C_i,P_1)P(C_i|P_1)P(P_1)}{P(H_3|P_1)\times P(P_1)}\\ &= \frac{P(H_3|C_i,P_1)P(C_i)}{P(H_3|P_1)}\\ &=\frac{1/2\times 1/4}{1/3}\\ &=\frac{3}{8}, \end{align}\] where the third equality uses the fact that \(C_i\) and \(P_1\) are independent events, and \(P(H_3|P_1)=\sum_{j=1}^{4}P(H_3|C_j,P_1)P(C_j)=\frac{1}{4}\left(\frac{1}{3}+\frac{1}{2}+0+\frac{1}{2}\right)=\frac{1}{3}\) by the law of total probability.
Therefore, changing the initial decision increases the probability of getting the car from 1/4 to 3/8!
set.seed(0101) # Set simulation seed
S <- 100000 # Simulations
Game <- function(opt = 3){
  # opt: number of options. opt > 2, it is 3 in the original game
  opts <- 1:opt
  car <- sample(opts, 1) # Car location
  guess1 <- sample(opts, 1) # Initial guess pick
  if(opt == 3 && car != guess1) {
    host <- opts[-c(car, guess1)]
  } else {
    host <- sample(opts[-c(car, guess1)], 1)
  }
  win1 <- guess1 == car # Win given no change
  if(opt == 3) {
    guess2 <- opts[-c(host, guess1)]
  } else {
    guess2 <- sample(opts[-c(host, guess1)], 1)
  }
  win2 <- guess2 == car # Win given change
  return(c(win1, win2))
}
Prob <- rowMeans(replicate(S, Game(opt = 4))) # Win probabilities
paste("Winning probabilities no changing door is", Prob[1], sep = " ")
## [1] "Winning probabilities no changing door is 0.25151"
paste("Winning probabilities changing door is", Prob[2], sep = " ")
## [1] "Winning probabilities changing door is 0.37267"
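For the 100-door question, the same Bayes' rule argument can be written for a general number of doors \(n\). The sketch below assumes the same rules as the four-door solution: the host opens exactly one goat door, and the contestant switches to one of the remaining closed doors. Under that assumption \(P(H_h|C_i,P_1)=1/(n-2)\), \(P(C_i)=1/n\) and \(P(H_h|P_1)=1/(n-1)\), so the probability of winning after switching is \((n-1)/(n(n-2))\). The function names `p_switch` and `p_stay` are illustrative.

```r
# Probability of winning after switching to one specific remaining door,
# assuming n doors, one car, and a host who opens exactly one goat door.
p_switch <- function(n) {
  stopifnot(n >= 3)
  lik   <- 1 / (n - 2)  # P(host opens door h | car behind door i, pick 1)
  prior <- 1 / n        # P(car behind door i)
  marg  <- 1 / (n - 1)  # P(host opens door h | pick 1), by symmetry
  lik * prior / marg    # equals (n - 1) / (n * (n - 2))
}

# Probability of winning when keeping the initial pick
p_stay <- function(n) 1 / n

doors <- c(3, 4, 100)
probs <- data.frame(doors = doors,
                    stay = p_stay(doors),
                    change = p_switch(doors))
print(probs)
```

With 100 doors the gain from switching is small, from \(1/100\) to \(99/9800\), because a single opened door removes very little uncertainty. A host who opened 98 goat doors would lead to a different answer.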
- Solve the health insurance example using a Gamma prior in the rate parametrization, that is, \(\pi(\lambda)=\frac{\beta_0^{\alpha_0}}{\Gamma(\alpha_0)}\lambda^{\alpha_0-1}\exp\left\{-\lambda\beta_0\right\}\).
- A preliminary survey regarding detergent preferences found that 20, 30 and 50 people prefer brands A, B and C, respectively. What is the posterior probability of having 25, 40 and 35 preferences for brands A, B and C, respectively? Solve this exercise using a vague prior and Empirical Bayes.
- A mayoral election poll with two candidates shows that candidate A will get 350 votes, and candidate B will get 300. What is the probability that candidate A gets more than 50% of the votes? Solve this exercise using a vague prior and Empirical Bayes.
- Show that, given the loss function \(L({\theta},a)=|{\theta}-a|\), \({\delta}(\mathbf{y})\) is the posterior median.
Solution
\(\int_{{\Theta}} |{\theta}-a|\pi({\theta}|\mathbf{y})d{\theta}=\int_{-\infty}^a (a-{\theta})\pi({\theta}|\mathbf{y})d{\theta}+\int_{a}^{\infty} ({\theta}-a)\pi({\theta}|\mathbf{y})d{\theta}\). Differentiating with respect to \(a\) (the boundary terms vanish because both integrands are zero at \({\theta}=a\)), and equating to zero,
\[\begin{equation} \int_{-\infty}^a \pi({\theta}|\mathbf{y})d{\theta}=\int_{a}^{\infty} \pi({\theta}|\mathbf{y})d{\theta}, \end{equation}\]
then,
\[\begin{equation} 2\int_{-\infty}^a \pi({\theta}|\mathbf{y})d{\theta}=\int_{-\infty}^{\infty} \pi({\theta}|\mathbf{y})d{\theta}=1, \end{equation}\]
that is, \({\delta}(\mathbf{y})\) is the median.
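This result can be illustrated numerically. The sketch below is only a check under stated assumptions: it uses draws from a Gamma distribution as a stand-in for posterior draws (the distribution, its parameters and the seed are arbitrary choices), approximates the expected absolute loss by a sample average, and minimizes it over \(a\) with `optimize`.

```r
set.seed(10101)
# Stand-in "posterior" draws; the Gamma(3, 2) choice is purely illustrative
theta <- rgamma(10000, shape = 3, rate = 2)

# Monte Carlo approximation of the expected loss E|theta - a|
exp_loss <- function(a) mean(abs(theta - a))

# Minimize the expected absolute loss over the range of the draws
a_hat <- optimize(exp_loss, interval = range(theta))$minimum

# The minimizer should agree with the sample median, not the sample mean
c(minimizer = a_hat, median = median(theta), mean = mean(theta))
```

Because the Gamma distribution is skewed, the median and the mean differ, which makes it visible that the minimizer tracks the median.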