Chapter 5 Week 9
5.1 prop.test
Test of Equal or Given Proportions
- When applied to multi-sample data, the prop.test() command performs a test for proportions. With two groups, it also gives a confidence interval for the difference in proportions as part of the output.
prop.test(x, n, p = NULL,
alternative = c("two.sided", "less", "greater"),
conf.level = 0.95, correct = TRUE)
The input could be:
- x only: a two-dimensional table (or matrix) with 2 columns. The first column holds the counts of successes (e.g. \(n(D)\)), and the second the counts of failures (e.g. \(n(\overline D)\)). If x is a table or matrix, n is ignored.
- x and n:
  - x: a vector of counts of successes
  - n: a vector of counts of trials
- x, n and p:
  - p: a vector of null probabilities of success.
Note: The length of n or p must be the same as the number of groups specified by x.
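As a minimal sketch of the three input forms, the following uses made-up counts for three groups. The numbers and the names `successes`, `trials`, and `counts` are illustrative assumptions, not data from these notes. It checks that the matrix form and the `x`, `n` form describe the same test.

```r
# Hypothetical counts for three groups (illustrative only)
successes <- c(15, 25, 30)   # x: number of successes per group
trials    <- c(50, 50, 50)   # n: number of trials per group

# Form 1: x only, as a 2-column matrix of successes and failures
counts <- cbind(success = successes, failure = trials - successes)
fit_matrix <- prop.test(counts)

# Form 2: x and n as vectors
fit_vectors <- prop.test(successes, trials)

# Form 3: x, n and p, testing against stated null probabilities
fit_null <- prop.test(successes, trials, p = c(0.5, 0.5, 0.5))

# Forms 1 and 2 give the same statistic and degrees of freedom
all.equal(unname(fit_matrix$statistic), unname(fit_vectors$statistic))
fit_matrix$parameter   # df when p is estimated from the data
fit_null$parameter     # df when p is supplied by the user
```

Printing `fit_matrix`, `fit_vectors`, or `fit_null` shows the full test summary for each form.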
Test Assumptions: The function operates on the assumption that each of the length(x) samples is independent of the others, and that each sample consists of a pre-determined number n[i] of independent trials, for which the true probability of success is constant.
Hypothesis: If the argument p = NULL and there are at least two groups, the null hypothesis states that the true probability of success is the same in every group.
When there are two groups, the alternative hypothesis asserts that the probability of success in the first group is greater than, less than, or simply not equal to that in the second group, depending on the value of the argument alternative.
When there are more than two groups, the alternative hypothesis is that there is at least one group whose probability of success is different from the others; thus alternative is two.sided.
If the argument p is not NULL, the null hypothesis states that the true probability of success in group i is p[i], for each value of i. The alternative hypothesis, when there are at least two groups, is that there is some group for which this relation does not hold; thus alternative is two.sided.
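To illustrate the two-group case, here is a small sketch of a one-sided test. It uses the 1st- and 3rd-class death counts from the Titanic table in 5.1.1; the choice of alternative = "less" and the name `fit` are assumptions made for the example.

```r
# Died / total for 1st and 3rd class, taken from the table in 5.1.1
x <- c(123, 528)
n <- c(323, 709)

# H0: p1 = p3;  HA: p1 < p3 (death is less likely in 1st class)
fit <- prop.test(x, n, alternative = "less")

fit$estimate   # sample proportions in each group
fit$p.value    # one-sided p-value
fit$conf.int   # one-sided confidence interval for p1 - p3
```

With a one-sided alternative the interval is also one-sided: for alternative = "less" the lower limit is fixed at -1 and only the upper limit is informative.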
5.1.1 Example: Survive by multiple levels of ticket class
- Using the data set titanic, conduct an overall test of association between dying (survived) and a passenger’s ticket class (pclass: 1st, 2nd, or 3rd).
table <- with(titanic, table(pclass, survived))
table
##       survived
## pclass Died Survived
##      1  123      200
##      2  158      119
##      3  528      181
# Either column can be used as the event count
x <- table[, "Died"] # counts of D (Died); equivalently x <- table[, 1]
x
##   1   2   3
## 123 158 528
n <- rowSums(table) # total sample size in each level
n
##   1   2   3
## 323 277 709
p <- rep(sum(x)/sum(n), 3) # H0: probabilities of D (Died) are the same among 3 exposure levels
p
## [1] 0.618029 0.618029 0.618029
\(H_0\): Risk of dying as a passenger is independent of the ticket class.
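The test can be run in R in any of the following forms. All three give the same X-squared statistic; note that when p is supplied, prop.test() treats it as fixed rather than estimated and reports df = 3 instead of df = 2:
prop.test(table)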
prop.test(x, n)
prop.test(x, n, p)
prop.test(table)
##
## 3-sample test for equality of proportions without continuity
## correction
##
## data: table
## X-squared = 127.86, df = 2, p-value < 2.2e-16
## alternative hypothesis: two.sided
## sample estimates:
## prop 1 prop 2 prop 3
## 0.3808050 0.5703971 0.7447109
Comparing 127.86 to a \(\chi^2\) distribution with 3-1=2 degrees of freedom, we get a \(p\)-value very close to zero.
We therefore reject the null hypothesis that the risk of dying is equal across ticket classes (i.e. that death and ticket class are independent).
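The statistic and its p-value can also be reproduced by hand. The sketch below assumes the usual Pearson form, the sum of (observed - expected)^2 / expected over all six cells, with expected counts built from the pooled proportion. The variable names are chosen for the example.

```r
x <- c(123, 158, 528)   # deaths by ticket class
n <- c(323, 277, 709)   # passengers by ticket class

p_pooled <- sum(x) / sum(n)   # common death proportion under H0

observed <- cbind(x, n - x)                          # died, survived
expected <- cbind(n * p_pooled, n * (1 - p_pooled))  # expected under H0

chi_sq  <- sum((observed - expected)^2 / expected)
p_value <- pchisq(chi_sq, df = length(x) - 1, lower.tail = FALSE)

chi_sq    # close to the 127.86 reported by prop.test()
p_value   # upper-tail probability from a chi-squared with 2 df
```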
Note that, for a table with two columns (2 x 2 or, as here, 3 x 2), the standard chi-square test in chisq.test() is equivalent to prop.test() with p = NULL, but it works with data in matrix form.
# perform the chi-square test of association
chisq.test(table)
##
## Pearson's Chi-squared test
##
## data: table
## X-squared = 127.86, df = 2, p-value < 2.2e-16
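The agreement between the two functions can be checked directly. This sketch rebuilds the 3 x 2 table from the counts printed above, so it does not need the titanic data frame; the object names are illustrative.

```r
# Rebuild the pclass-by-survived table from the printed counts
counts <- matrix(c(123, 158, 528,    # Died, by class
                   200, 119, 181),   # Survived, by class
                 ncol = 2,
                 dimnames = list(pclass = c("1", "2", "3"),
                                 survived = c("Died", "Survived")))

fit_prop  <- prop.test(counts)
fit_chisq <- chisq.test(counts)

# Same statistic and same degrees of freedom
all.equal(unname(fit_prop$statistic), unname(fit_chisq$statistic))
c(fit_prop$parameter, fit_chisq$parameter)
```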
5.2 prop.trend.test
Test for trend in proportions
prop.trend.test(x, n, score = seq_along(x))
Input: Note that the input for prop.trend.test cannot be a matrix.
- x: Number of events
- n: Number of trials
- score: Group score
Hypotheses: With at least three groups, the null hypothesis states that there is no trend among the proportions (independence). The alternative states that the proportions have an increasing or decreasing trend.
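The role of the score argument can be seen in a short sketch. It uses the Titanic death counts from 5.1.1; the alternative score vectors are hypothetical choices made only to show the behaviour. Scores with the same relative spacing give the same statistic, while changing the spacing changes the test.

```r
x <- c(123, 158, 528)   # deaths by ticket class
n <- c(323, 277, 709)   # passengers by ticket class

fit_default <- prop.trend.test(x, n)                        # scores 1, 2, 3
fit_shifted <- prop.trend.test(x, n, score = c(0, 10, 20))  # same spacing, rescaled
fit_uneven  <- prop.trend.test(x, n, score = c(1, 2, 4))    # unequal spacing

# Rescaling the scores does not change the statistic
all.equal(unname(fit_default$statistic), unname(fit_shifted$statistic))

# Unequal spacing asks a different question and gives a different statistic
c(default = unname(fit_default$statistic),
  uneven  = unname(fit_uneven$statistic))
```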
5.2.1 Example: Test of trend: Survive by multiple levels of ticket class
Using the data set titanic, conduct a test of trend in dying (survived) as a passenger’s ticket class (pclass) changes from 1st to 2nd to 3rd.
Null hypothesis: \(H_0: P(Died|pclass = 1) = P(Died|pclass = 2) = P(Died|pclass = 3)\)
Alternative hypothesis: \(H_A: P(Died|pclass = 1) < P(Died|pclass = 2) < P(Died|pclass = 3)\) or \(P(Died|pclass = 1) > P(Died|pclass = 2) > P(Died|pclass = 3)\)
prop.trend.test(x, n)
##
## Chi-squared Test for Trend in Proportions
##
## data: x out of n ,
## using scores: 1 2 3
## X-squared = 127.81, df = 1, p-value < 2.2e-16
Comparing 127.81 to a \(\chi^2\) distribution with 1 degree of freedom, we get a \(p\)-value very close to zero.
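We therefore reject the null hypothesis that the risk of death is equal across ticket classes, in favor of a trend: the sample proportions dying rise from 0.38 in 1st class to 0.57 in 2nd and 0.74 in 3rd.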
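The trend statistic (df = 1) is almost as large as the overall statistic from prop.test() (df = 2). As a sketch, assuming the standard partition of the overall chi-square into a linear-trend part and a departure-from-linearity part, the remainder can be examined on 1 degree of freedom. The names `departure` and `p_departure` are chosen for the example.

```r
x <- c(123, 158, 528)   # deaths by ticket class
n <- c(323, 277, 709)   # passengers by ticket class

overall <- prop.test(x, n)         # equality of proportions, df = 2
trend   <- prop.trend.test(x, n)   # linear trend in proportions, df = 1

# What is left after removing the linear trend component
departure   <- unname(overall$statistic) - unname(trend$statistic)
p_departure <- pchisq(departure, df = 1, lower.tail = FALSE)

departure     # small relative to the trend statistic
p_departure   # a large value gives no evidence against a linear trend
```

A small departure suggests that a linear trend across the three classes describes the differences in death risk well.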