2.5 Some statistical tests
2.5.1 Parametric statistical tests
2.5.1.1 \(t\)-test
2.5.1.1.1 One-sample \(t\)-test
This test is used to determine if the mean of a single population is significantly different from a specified value \(\mu_0\).
- Null hypothesis: \(H_0: \mu = \mu_0\)
- Alternative hypothesis: \(H_1: \mu \neq \mu_0\) (two-sided), \(H_1: \mu > \mu_0\) (one-sided), or \(H_1: \mu < \mu_0\) (one-sided)
The test statistic is \[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \] where \(\bar{x}\) is the sample mean, \(s\) is the sample standard deviation, and \(n\) is the sample size.
The degrees of freedom are \(df = n - 1\).
Reject \(H_0\) if \(|t| > t_{\alpha/2, df}\) (two-sided) or \(t > t_{\alpha, df}\) (one-sided, upper tail) or \(t < -t_{\alpha, df}\) (one-sided, lower tail), where \(t_{\alpha/2, df}\) and \(t_{\alpha, df}\) are the critical values from the \(t\)-distribution with \(df\) degrees of freedom and significance level \(\alpha\).
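As a sketch, the one-sample test can be run in Python with SciPy's `stats.ttest_1samp`; the simulated data, the seed, and the value of \(\mu_0\) below are illustrative assumptions, not values from this section.

```python
# Minimal sketch of a one-sample t-test (two-sided) with SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.3, scale=1.2, size=30)   # hypothetical sample
mu_0 = 5.0                                    # hypothesized population mean

# Test statistic and two-sided p-value from SciPy
t_stat, p_value = stats.ttest_1samp(x, popmean=mu_0)

# The same statistic computed directly from the formula above
t_manual = (x.mean() - mu_0) / (x.std(ddof=1) / np.sqrt(len(x)))

alpha = 0.05
print(f"t = {t_stat:.3f} (manual: {t_manual:.3f}), p = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Comparing the p-value with \(\alpha\) is equivalent to comparing \(|t|\) with the critical value \(t_{\alpha/2, df}\) for the two-sided test.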
2.5.1.1.2 Two-sample \(t\)-test
This test is used to determine if the means of two independent populations are significantly different, assuming that the populations have equal variances.
- Null hypothesis: \(H_0: \mu_1 = \mu_2\)
- Alternative hypothesis: \(H_1: \mu_1 \neq \mu_2\) (two-sided), \(H_1: \mu_1 > \mu_2\) (one-sided), or \(H_1: \mu_1 < \mu_2\) (one-sided)
The test statistic is \[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \] where \(\bar{x}_1\) and \(\bar{x}_2\) are the sample means, \(n_1\) and \(n_2\) are the sample sizes, and \(s_p\) is the pooled sample standard deviation:
\[ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \] where \(s_1\) and \(s_2\) are the sample standard deviations.
The degrees of freedom are \(df = n_1 + n_2 - 2\).
Reject \(H_0\) if \(|t| > t_{\alpha/2, df}\) (two-sided) or \(t > t_{\alpha, df}\) (one-sided, upper tail) or \(t < -t_{\alpha, df}\) (one-sided, lower tail).
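A sketch of the pooled (equal-variance) version in Python follows; SciPy's `stats.ttest_ind` with `equal_var=True` computes this test, and the two samples below are simulated purely for illustration.

```python
# Minimal sketch of a pooled two-sample t-test with SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1 = rng.normal(loc=10.0, scale=2.0, size=25)  # hypothetical sample 1
x2 = rng.normal(loc=11.0, scale=2.0, size=30)  # hypothetical sample 2

# equal_var=True selects the pooled-variance (Student) test described above
t_stat, p_value = stats.ttest_ind(x1, x2, equal_var=True)

# Pooled standard deviation and test statistic from the formulas above
n1, n2 = len(x1), len(x2)
s_p = np.sqrt(((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1))
              / (n1 + n2 - 2))
t_manual = (x1.mean() - x2.mean()) / (s_p * np.sqrt(1 / n1 + 1 / n2))

print(f"t = {t_stat:.3f} (manual: {t_manual:.3f}), "
      f"p = {p_value:.3f}, df = {n1 + n2 - 2}")
```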
2.5.1.1.3 Welch’s \(t\)-test
This test is used to determine if the means of two independent populations are significantly different, without assuming that the populations have equal variances.
- Null hypothesis: \(H_0: \mu_1 = \mu_2\)
- Alternative hypothesis: \(H_1: \mu_1 \neq \mu_2\) (two-sided), \(H_1: \mu_1 > \mu_2\) (one-sided), or \(H_1: \mu_1 < \mu_2\) (one-sided)
The test statistic is \[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]
The degrees of freedom are approximated by the Welch–Satterthwaite equation: \[ df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}} \]
Reject \(H_0\) if \(|t| > t_{\alpha/2, df}\) (two-sided) or \(t > t_{\alpha, df}\) (one-sided, upper tail) or \(t < -t_{\alpha, df}\) (one-sided, lower tail). Note that the degrees of freedom will typically be a non-integer value, so you will need to interpolate in a \(t\)-table or use statistical software.
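As a sketch, Welch's test is obtained from SciPy's `stats.ttest_ind` with `equal_var=False`; the samples below are simulated with deliberately unequal variances and are assumptions for illustration only.

```python
# Minimal sketch of Welch's t-test (unequal variances) with SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x1 = rng.normal(loc=10.0, scale=1.0, size=20)  # hypothetical sample 1
x2 = rng.normal(loc=11.0, scale=3.0, size=40)  # hypothetical sample 2

# equal_var=False selects Welch's test
t_stat, p_value = stats.ttest_ind(x1, x2, equal_var=False)

# Welch-Satterthwaite degrees of freedom from the approximation above
v1 = x1.var(ddof=1) / len(x1)
v2 = x2.var(ddof=1) / len(x2)
df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x1) - 1) + v2 ** 2 / (len(x2) - 1))

print(f"t = {t_stat:.3f}, p = {p_value:.3f}, df ~ {df:.1f}")
```

The software handles the non-integer degrees of freedom directly, which is why this route is usually preferred over interpolating in a \(t\)-table by hand.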