A.7 Answer: TW 7 tutorial
Answers for Sect. 7.2
This is just one box of matches (one observation), but the claim is about the population mean. Some boxes will have more than \(45\), and some fewer.
Jake is correct in one sense: You can't have \(0.9\) of a match. But the value is the mean number, and that can be a decimal. Suppose \(10\) boxes had \(49\) matches, and \(10\) boxes had \(50\) matches... is the mean \(49\), or is it \(50\)? Neither is correct; the mean is \(49.5\).
Jake is confusing the sample and population mean. The claim is that the population mean is \(45\). The sample produced a mean of \(44.9\).
Why should the mean of two different things be the same? It's like expecting your height and your Mum's height to be the same: they are both heights, but of different things. Why should they be the same?
Of course, every sample will produce a different sample mean. This sample may just have an unusually low number of matches.
Either (1) The manufacturer is lying; or (2) the manufacturer is not lying, and this sample just happens to have a smaller number of matches: (bad) luck.
A CI gives some indication of the variation implied by the sample.
The standard error for the mean is \(0.124\div\sqrt{25} = 0.0248\). So the approximate \(95\)% CI is: \(44.9 \pm (2\times 0.0248)\), or \(44.9\pm 0.0496\), or from \(44.85\) to \(44.95\).
No. A \(95\)% CI may or may not contain the population mean. Of course, the manufacturer may indeed be lying... but we'd need to be cautious about making such a bold claim on just this evidence. Ideally, we would repeat this study a few times or take a larger sample. But it is looking suspicious...
If we had many, many sets of \(25\) boxes of matches, \(95\)% of these sets of \(25\) would have a mean between \(44.85\) and \(44.95\).
\(\bar{x}\) is the mean of the sample, so \(\bar{x} = 44.9\).
\(\mu\) is the mean of the population; the true mean if you like. \(\mu\) is claimed to be \(45\), but the value of \(\bar{x}\) will, of course, vary.
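The standard error and approximate \(95\)% CI calculated above can be checked with a few lines of Python. This is only a sketch: the function name `approx_ci_mean` is made up for illustration, the value \(0.124\) is assumed to be the sample standard deviation, and the multiplier of \(2\) is the approximate one used in these answers.

```python
import math

def approx_ci_mean(xbar, s, n, multiplier=2):
    """Approximate CI for a population mean from summary statistics.

    xbar: sample mean; s: sample standard deviation (assumed); n: sample size.
    """
    se = s / math.sqrt(n)        # standard error of the sample mean
    margin = multiplier * se     # half-width of the interval
    return se, xbar - margin, xbar + margin

se, lower, upper = approx_ci_mean(xbar=44.9, s=0.124, n=25)
print(f"s.e. = {se:.4f}")
print(f"approx. 95% CI: {lower:.2f} to {upper:.2f}")
print("Claimed mean of 45 inside the CI?", lower <= 45 <= upper)
```

The last line reports whether the claimed population mean falls inside the interval; as the answers note, a value outside the interval is suspicious rather than conclusive.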
Answers for Sect. 7.3
Answers implied by H5P.
\(\displaystyle \text{s.e.}(\hat{p}) = \sqrt{ \big(\hat{p}\times(1 - \hat{p}) \big)/n}\), where \(\hat{p}\) is the sample proportion; \(n\) the sample size; "s.e." the "standard error".
Answers for Sect. 7.4
- \(123/(404 - 123) = 123/281 = 0.44\).
- \(\hat{p} = 123/404 = 0.304455\).
- The odds: The probability of surviving is about \(0.44\) times the probability of dying (i.e., it is lower). Or: For every \(100\) that die, about \(44\) survive.
- No: sampling variation!
- \(\text{s.e.}(\hat{p}) = \sqrt{0.304455 \times (1 - 0.304455)/404} = \sqrt{0.00052416} = 0.022894\), or about \(0.023\).
A definition can be found in the textbook Glossary. Essentially, each sample is likely to produce a different value for the sample proportion, \(\hat{p}\) (the estimate of the population proportion, \(p\)), and that is what we mean by "sampling variation".
- (Not provided.)
- The values of \(\hat{p}\) will have an approximate normal distribution, with a standard deviation equal to the standard error (\(0.023\)) and centred around the true proportion \(p\). A \(95\)% CI: \(0.304455 \pm (2\times0.022894)\), or \(0.30446\pm0.04579\), or \(0.26\) to \(0.35\).
- One way of communicating this: “The population proportion of patients surviving after BVM treatment has a \(95\)% chance of lying between \(26\)% and \(35\)%.” This is not strictly correct, but acceptable and very, very commonly used (as explained in the textbook).
- Larger, to get a tighter (more precise) CI than the one calculated.
- The numbers of surviving and non-surviving patients both exceed \(5\).
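The arithmetic in these answers can be reproduced as follows. This is a minimal sketch using only the counts quoted above (\(123\) survivors out of \(404\)); the variable names are illustrative, and the multiplier of \(2\) is the approximate one used throughout.

```python
import math

survived, total = 123, 404
died = total - survived                      # 281 did not survive

odds = survived / died                       # odds of surviving
p_hat = survived / total                     # sample proportion surviving
se = math.sqrt(p_hat * (1 - p_hat) / total)  # standard error of p-hat

# Approximate 95% CI: estimate plus or minus two standard errors
lower, upper = p_hat - 2 * se, p_hat + 2 * se

# Validity condition used in the answers: both counts must exceed 5
valid = survived > 5 and died > 5

print(f"odds = {odds:.2f}, p-hat = {p_hat:.6f}, s.e. = {se:.6f}")
print(f"approx. 95% CI: {lower:.2f} to {upper:.2f}; valid: {valid}")
```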
Answers for Sect. 7.5
- \(\mu\) is the population mean diameter size of all EB pizzas; \(\bar{x}\) is the mean diameter of the pizzas in the sample.
- \(\bar{x} = 11.486\) inches; it's not sensible to quote the diameter to \(0.001\) of an inch; what would be sensible? We don't know the value of \(\mu\), and we never will. Our best estimate is the value of \(\bar{x}\).
- \(s = 0.24658\) inches. It's not sensible to quote the diameter to \(0.001\) of an inch though. \(\sigma\) is the standard deviation of the population. We don't know the value of \(\sigma\), and we never will.
- \(\displaystyle\text{s.e.}(\bar{x}) = s/\sqrt{n} = 0.24658/\sqrt{125} = 0.02205\).
- The first measures the variation in the diameters of individual pizzas; the second measures the precision of the sample mean when used to estimate the population mean.
- Almost certainly not the same. Probably close to \(\bar{x} = 11.486\) inches. More precisely, probably within three standard errors (\(3\times 0.022\)) of \(\bar{x}\).
- Normal; mean \(\mu\); std. dev is the standard error of \(0.02205\).
- The approximate \(95\)% CI is \(11.486\pm (2\times 0.02205)\) or \(11.486\pm0.044\), which is from \(11.44\) to \(11.53\) inches.
- Based on the sample, a \(95\)% confidence interval for the population mean pizza diameter is from \(11.44\) to \(11.53\) inches.
- Either \(n > 25\), or \(n\le 25\) and the population has a normal distribution.
- We do not need to assume that \(n > 25\) because we know that it is. We do not require that the sample or the population has a normal distribution. We require that the sample means have an approximate normal distribution, which they will if \(n > 25\). So the CI is statistically valid, and the histogram is not needed.
- Based on the CI, the population mean diameter is probably not \(12\) inches: \(12\) lies above the upper limit of \(11.53\) inches.
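The claim that sample means are approximately normal when \(n > 25\), with a standard deviation equal to the standard error, can be illustrated by simulation. The sketch below is not based on the pizza data: it assumes a hypothetical, deliberately non-normal population (diameters uniform between \(11\) and \(12\) inches), and the function name, seed and number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, low=11.0, high=12.0, seed=1):
    """Means of `reps` random samples of size n from a uniform population.

    The uniform(low, high) population is a hypothetical stand-in,
    chosen because it is clearly not normal.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.uniform(low, high) for _ in range(n)])
        for _ in range(reps)
    ]

n = 125
means = simulate_sample_means(n=n, reps=2000)

mu = 11.5                                   # mean of the uniform(11, 12) population
sigma = (12.0 - 11.0) / math.sqrt(12)       # its standard deviation
theoretical_se = sigma / math.sqrt(n)       # sigma / sqrt(n)
observed_sd = statistics.stdev(means)       # spread of the simulated sample means

# Share of sample means within two standard errors of the population mean
within_2se = sum(abs(m - mu) <= 2 * theoretical_se for m in means) / len(means)

print(f"theoretical s.e. = {theoretical_se:.4f}")
print(f"std. dev. of simulated sample means = {observed_sd:.4f}")
print(f"proportion within 2 s.e. of mu = {within_2se:.3f}")
```

If the reasoning in the answers holds, the two spread values should be close, and roughly \(95\)% of the simulated sample means should fall within two standard errors of the population mean.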
Answers for Sect. 7.6
- \(\sqrt{0.70\times(1 - 0.70)/25} = \sqrt{0.0084} = 0.09165151\), or about 0.09165.
- \(\sqrt{0.25\times(1 - 0.25)/100} = \sqrt{0.001875} = 0.04330127\), or about 0.04330.
- \(0.08724964\), or about \(0.08725\).
- \(0.0534479\), or about \(0.05345\).
All statistically valid.
Note: Students commonly forget to take the square root.
Note: If your calculator gives an answer something like 1.875 E-03 or similar, it is using scientific notation. It means \(1.875\times 10^{-3}\), or \(0.001875\).
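Both notes can be seen in a short Python sketch, using the values \(\hat{p} = 0.25\) and \(n = 100\) from the answers above. The helper name `se_proportion` is made up for illustration.

```python
import math

def se_proportion(p_hat, n):
    """Standard error of a sample proportion."""
    inside = p_hat * (1 - p_hat) / n   # the value under the square root
    return math.sqrt(inside)           # do not forget the square root

inside = 0.25 * (1 - 0.25) / 100

print(f"{inside:.3E}")                     # scientific notation, as a calculator may show it
print(f"{inside:.6f}")                     # the same number in decimal notation
print(f"{se_proportion(0.25, 100):.5f}")   # the standard error itself
```

The first two lines printed are the same number written two ways; only the third, after taking the square root, is the standard error.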
Answers for Sect. 7.7.1
See Table A.2.
| | Applies to \(\mu\) | Applies to \(\bar{x}\) |
|---|---|---|
| Has a standard error | No | YES |
| Refers to the population | YES | No |
| Refers to the sample | No | YES |
| Value is known before the sample is taken | YES | No |
| Value is unknown until sample is taken | No | YES |
| Value is estimated | YES | No |
Answers for Sect. 7.7.2
- \(\bar{x} = 16.02\)m.
- \(s = 7.145\)m; \(\text{s.e.}(\bar{x}) = s/\sqrt{n} = 7.145/\sqrt{44} = 1.077\)m. The first is a measure of the variation in the original data; the second is a measure of the precision of the sample mean when estimating the population mean.
- \(95\)% CI for population mean guess: \(13.85\) to \(18.19\) m.
- The population of guesses has a normal distribution, and/or \(n > 25\) or so.
- Since \(n > 25\), all OK if the histogram isn't severely skewed; probably OK.
- Not really; the CI doesn't contain the true width. But was this just due to the metric units... or perhaps students are just very poor at estimating widths in general! In fact, the Professor also had the students estimate the width of the hall in imperial units, as a comparison.
Answers for Sect. 7.7.3
- Relational.
- \(\hat{p} = 352/2\ 864 = 0.12291\).
- \(\text{s.e.}(\hat{p}) = \sqrt{0.12291 \times (1 - 0.12291)/2864} = 0.006135\).
- An approximate \(95\)% CI is \(0.12291 \pm (2\times 0.006135)\), or \(0.12291\pm 0.01227\), or from \(0.111\) to \(0.135\). Either the '\(0.123\pm 0.012\)' form or the '\(0.111\) to \(0.135\)' form is fine; percentages or proportions are fine (but the calculations must be done with the proportions, not the percentages).
- We need the number of boys who are late maturers and who are not late maturers to both be greater than \(5\). This is true, so the calculations are valid.
- Smaller; the current sample size estimates \(p\) to within about \(1.2\%\), and a less precise estimate needs fewer boys in the sample.
- \(n = 1/(0.02)^2 = 2500\) boys.
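The sample-size calculation can be sketched as below. The rule \(n = 1/(\text{margin})^2\) is the conservative one that follows from setting \(2\times\sqrt{p(1-p)/n}\) equal to the margin with \(p = 0.5\), which is presumably the rule intended here; the function name is hypothetical.

```python
import math

def conservative_sample_size(margin):
    """Sample size so an approximate 95% CI for a proportion is
    within +/- `margin`, using the conservative rule n = 1 / margin**2."""
    # round() removes floating-point noise before rounding up to a whole person
    return math.ceil(round(1 / margin**2, 6))

print(conservative_sample_size(0.02))
```

Rounding up reflects that a sample size must be a whole number, and that rounding down would give slightly less precision than requested.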