4 Day 4 (January 30)

4.1 Announcements

  • Please upload a photo of the signature sheet to Canvas to get 2 magic extra credit points.

  • Please read (and re-read) Ch. 2 and 3 in the BBM2L book.

  • See the table of distributions in the book.

  • Activity 2 is posted

4.2 Example with differential equations

  • Motivating data example

    url <- "https://www.dropbox.com/scl/fi/uokqouco07c44cder5vse/Butler-et-al.-Table-1.csv?rlkey=z37llt9dgvs59dw0ee8hnu7uq&dl=1"
    df1 <- read.csv(url)
    plot(df1$Winter, df1$N, typ = "b", xlab = "Year", ylab = "Population Size", 
    ylim = c(0, 300), lwd = 1.5, pch = "*")

    • Statistical model \[y_t\sim\text{Poisson}(\lambda(t))\] where \[\lambda(t)=\lambda_{0}e^{\gamma(t-t_0)}\] is the solution of the differential equation \(\frac{d\lambda}{dt}=\gamma\lambda\) with initial condition \(\lambda(t_0)=\lambda_0\)
      • Assume \(\lambda_0\) is known and equal to the population size in 1950 (\(\lambda_0 = 31\))
      • We want to obtain the distribution of \(\gamma\) given the data \(\mathbf{y}\)
      • What else do we need to fully specify the model? (See below.)
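      • One missing piece is a prior for \(\gamma\). Reading the prior off the sampling code below, where \(\gamma\) is drawn with runif(1, 0, 0.1), one way to write out the full model is \[y_t\mid\gamma\sim\text{Poisson}\left(\lambda_{0}e^{\gamma(t-t_0)}\right)\] \[\gamma\sim\text{Uniform}(0,\,0.1)\] with \(\lambda_0 = 31\) and \(t_0 = 1950\) treated as known.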
K <- 10^6                   # Number of samples to try
samples <- matrix(NA, K, 2) # Save samples (column 1: discrepancy, column 2: gamma)
lambda0 <- df1$N[1]         # Lambda0 (population size in 1950)
y <- df1$N[-1]              # Data
t <- df1$Winter[-1] - 1950  # Time since 1950

for(i in 1:K){
  
  gamma <- runif(1, 0, 0.1)             # Draw gamma from the prior
  lambda <- lambda0*exp(gamma*t)        # Expected population size at each time
  y.sample <- rpois(length(t), lambda)  # Simulate data from the model
  
  samples[i,1] <- mean(abs(y - y.sample)) # Discrepancy between simulated and observed data
  samples[i,2] <- gamma                   # Keep the draw of gamma
}

par(mfrow=c(1,2))
hist(samples[,2], col="grey", main="Prior distribution",
     xlab=expression(gamma), ylab=expression("["*gamma*"]"),
     ylim=c(0,100), freq=FALSE, breaks=seq(0,0.1,by=0.005))
hist(samples[which(samples[,1]<30),2], ylim=c(0,100), xlim=range(samples[,2]),
     col="grey", main="Posterior distribution",
     xlab=expression(gamma*"|"*bold(y)), ylab=expression("["*gamma*"|"*bold(y)*"]"),
     freq=FALSE, breaks=seq(0,0.1,by=0.005))
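
  • A minimal sketch (not from the lecture) of how the accepted draws of \(\gamma\) could be summarized, using the same tolerance of 30 as in the histogram above

    gamma.post <- samples[which(samples[,1] < 30), 2]  # Draws of gamma kept by the rejection step
    mean(gamma.post)                                   # Approximate posterior mean of gamma
    quantile(gamma.post, c(0.025, 0.975))              # Approximate 95% credible interval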

4.3 Building our first statistical model

  • The backstory
  • Building a statistical model using a likelihood-based (classical) approach (see the sketch after this list)
    • Specify (write out) the likelihood
    • Select an approach to estimate unknown parameters (e.g., maximum likelihood)
    • Quantify uncertainty in unknown parameters (e.g., using a normal approximation)
  • Building a statistical model using a Bayesian approach
    • Specify (write out) the likelihood/data model
    • Specify the parameter model (or prior) including hyper-parameters
    • Select an approach to obtain the posterior distribution
      • Analytically (i.e., pencil and paper)
      • Simulation-based (e.g., Metropolis-Hastings, MCMC, importance sampling, ABC, etc.)
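
  • A minimal sketch (not from the lecture) of the likelihood-based steps applied to the Poisson exponential-growth example above, using optim() to maximize the likelihood and a normal approximation from the Hessian to quantify uncertainty

    # Negative log-likelihood of gamma for y_t ~ Poisson(lambda0*exp(gamma*t))
    nll <- function(gamma, y, t, lambda0){
      -sum(dpois(y, lambda0*exp(gamma*t), log = TRUE))
    }

    # Maximize the likelihood (one-dimensional problem, so Brent's method with bounds)
    fit <- optim(par = 0.05, fn = nll, y = y, t = t, lambda0 = lambda0,
                 method = "Brent", lower = 0, upper = 0.1, hessian = TRUE)

    gamma.mle <- fit$par            # Maximum likelihood estimate of gamma
    se <- sqrt(1/fit$hessian[1,1])  # Standard error from the observed information
    gamma.mle + c(-1.96, 1.96)*se   # Approximate 95% confidence interval (normal approximation)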

4.4 Numerical Integration

  • Why do we need integrals to do Bayesian statistics?

    • Example using Bayes' theorem to estimate the prevalence rate of rabies (a sketch of the integral involved is shown below)
    • Why it is important to keep track of what we are calculating (i.e., clarity in what is being estimated)
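
    • As an illustration of where an integral shows up (the numbers here are hypothetical, not the rabies data from class): with a Binomial data model for the number of positive animals and a Uniform(0, 1) prior on the prevalence \(\theta\), the denominator of Bayes' theorem is \([y]=\int_0^1 [y\mid\theta]\,[\theta]\,d\theta\), which can be computed numerically

      n.sampled <- 20; n.pos <- 4  # Hypothetical data: 4 positives out of 20 sampled animals

      joint <- function(theta){dbinom(n.pos, n.sampled, theta)*dunif(theta, 0, 1)}  # [y|theta][theta]

      py <- integrate(joint, lower = 0, upper = 1)$value  # [y], the normalizing constant
      post <- function(theta){joint(theta)/py}            # [theta|y]
      integrate(post, lower = 0, upper = 1)$value         # Check that the posterior integrates to 1
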
  • Numerical approximation vs. analytical solutions

  • Definition of a definite integral \[\int_{a}^b f(z)dz = \lim_{Q\to\infty} \sum_{q=1}^{Q}\Delta q\, f(z_q)\] where \(\Delta q =\frac{b-a}{Q}\) and \(z_q = a + \frac{2q - 1}{2}\Delta q\).

  • Riemann approximation (midpoint rule)\[\int_{a}^b f(z)dz \approx \sum_{q=1}^{Q}\Delta q f(z_q)\] where \(\Delta q =\frac{b-a}{Q}\) and \(z_q = a + \frac{2q - 1}{2}\Delta q\).
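
  • A minimal sketch (not from the lecture) of the midpoint rule in R, applied to the standard normal density on \((-4, 4)\) so it can be compared with integrate() below

    # Midpoint (Riemann) approximation of the integral of f over (a, b) using Q rectangles
    midpoint <- function(f, a, b, Q){
      dq <- (b - a)/Q               # Width of each rectangle
      zq <- a + (2*(1:Q) - 1)/2*dq  # Midpoint of each rectangle
      sum(dq*f(zq))                 # Add up the rectangle areas
    }

    midpoint(function(z){dnorm(z, 0, 1)}, a = -4, b = 4, Q = 10)
    midpoint(function(z){dnorm(z, 0, 1)}, a = -4, b = 4, Q = 1000)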

  • Using a similar approach in R (adaptive quadrature via integrate())

    fn <- function(y){dnorm(y,0,1)}
    integrate(f=fn,lower=-4,upper=4,subdivisions=10)
    ## 0.9999367 with absolute error < 4.8e-12