16 Hypothesis testing — two means

16.1 Testing two means – sampling distribution

It often happens that we test not just one but two populations and are interested in the difference between their means (\(\mu_1-\mu_2\)).

The statistic used for estimation and testing is the difference between the sample means, \(\bar{x}_1-\bar{x}_2\).

The variable \(\bar{X}_1-\bar{X}_2\) follows a distribution with a mean of \(\mu_1-\mu_2\).

If the two samples are independent, then the variable \(\bar{X}_1-\bar{X}_2\) has a standard deviation of \(\sigma_{(\bar{X}_1-\bar{X}_2)}=\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}\).

If the distributions of the variables \(X_1\) and \(X_2\) in the populations (from which the means are calculated) are normal, then the above statistic follows a normal distribution.

If the samples are sufficiently large, the central limit theorem allows us to assume that the above statistic follows an approximately normal distribution.
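
A minimal simulation sketch in R can make both facts concrete (the population values below are illustrative; they echo Exercise 16.1):

# Simulated sampling distribution of xbar1 - xbar2 (illustrative values):
set.seed(123)
mu1 <- 14; sigma1 <- 4; n1 <- 100
mu2 <- 10; sigma2 <- 3; n2 <- 100

# Draw many pairs of independent samples and record the difference in means:
diffs <- replicate(10000, mean(rnorm(n1, mu1, sigma1)) - mean(rnorm(n2, mu2, sigma2)))

mean(diffs)                       # close to mu1 - mu2 = 4
sd(diffs)                         # close to the theoretical value below
sqrt(sigma1^2/n1 + sigma2^2/n2)   # = 0.5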

16.2 Two-sample \(z\)-test

  1. In the two-mean test, the null hypothesis states that the difference between the means of the populations from which the two samples are drawn equals \(D_0\) (very often the null hypothesis is that there is no difference, i.e., \(D_0=0\)):

\[ H_0: \mu_1-\mu_2 = D_0 \]

  2. There are three options for the alternative hypothesis.
  • In a two-tailed test:

\[H_A: \mu_1-\mu_2 \ne D_0\]

  • In a left-tailed test:

\[H_A: \mu_1-\mu_2 < D_0 \]

  • In a right-tailed test:

\[H_A: \mu_1-\mu_2 > D_0 \]

If \(D_0=0\), the above hypotheses can be written as \(H_A:\mu_1\ne\mu_2\), \(H_A:\mu_1<\mu_2\), and \(H_A:\mu_1>\mu_2\), and the null hypothesis as \(H_0: \mu_1=\mu_2\).

  3. Test statistic \(z\):

\[z = \frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\sigma_1^2/n_1 +\sigma_2^2/n_2}}\:\:\text{ or }\:\: z\approx\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{s_1^2/n_1 +s_2^2/n_2}}, \tag{16.1}\]

where \(\bar{x}_i\) is the sample mean for sample \(i\), \(D_0\) is the hypothesized difference between the two population means under the null hypothesis, \(\sigma_i\) is the population standard deviation for population \(i\) (used when known), \(s_i\) is the sample standard deviation for sample \(i\), and \(n_i\) is the sample size for sample \(i\).

  4. The rejection region is determined in the same way as in other \(z\)-tests.

  5. It is important to note that the test assumes the samples are independently drawn from the two studied populations (or processes). Another requirement for using the test is that both samples must be sufficiently large (in practice, for the purposes of this course, assume \(n_1\geq 30\) and \(n_2\geq 30\)).

Note! The test can also be applied to small samples if the population standard deviations \(\sigma\) are known (which is rare but sometimes occurs) and if the population distributions are normal.

16.3 Two-sample \(t\)-test

For small samples, we often assume homogeneity of variance, meaning that both populations have the same variance (and standard deviation). In such a case, we can estimate the pooled variance:

\[ s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} \tag{16.2}\]

  1. In this test, the null and alternative hypotheses can be formulated in the same way as in the \(z\)-test.

  2. Test statistic \(t\):

\[t=\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{s_p^2 \left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \tag{16.3}\]

  3. The rejection region is determined in the same way as in other \(t\)-tests. Under the assumption that the null hypothesis is true, the \(t\) statistic follows a \(t\)-distribution with \(n_1+n_2-2\) degrees of freedom.

  4. The \(t\)-test described in this section applies when the samples are randomly and independently drawn from the two studied populations. We should assume that the population distributions are normal (or at least approximately normal) and that the variances in both populations are equal. Similar assumptions apply when constructing confidence intervals using the \(t\) statistic.

Note! The assumption of equal variance can be relaxed; in such a case, we can use the Welch–Satterthwaite formula (16.7).
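
As a quick sketch of equations (16.2) and (16.3) in R (the two small samples are reused from the raw-data example in section 16.9, so the numbers are purely illustrative):

x1 <- c(1.2, 3.1, 1.7, 2.8, 3.0)
x2 <- c(4.2, 2.7, 3.6, 3.9)
n1 <- length(x1); n2 <- length(x2)

# Pooled variance, equation (16.2):
sp2 <- ((n1 - 1)*var(x1) + (n2 - 1)*var(x2)) / (n1 + n2 - 2)

# Test statistic, equation (16.3), with D0 = 0:
test_t <- (mean(x1) - mean(x2)) / sqrt(sp2 * (1/n1 + 1/n2))
c(sp2 = sp2, t = test_t, df = n1 + n2 - 2)

# The built-in test should agree; var.equal = FALSE would switch to the
# Welch version mentioned in the note above:
t.test(x1, x2, var.equal = TRUE)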

16.4 Paired samples

Sometimes, we can estimate the difference between means based on paired samples. Some textbooks refer to such situations as "dependent samples" (e.g., "test for dependent samples").

Examples:

  • The difference in average fuel consumption in the city and outside the city – each pair of observations consists of results obtained for the same vehicle.

  • The difference in average reading speed before and after a course – each pair of observations consists of results for the same participant measured before and after the course.

In such situations, the formulas and procedures used are analogous to those applied in the one-sample test. The difference within each pair is treated as a single observation.
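
A minimal sketch in R (with hypothetical before/after data) shows this equivalence: the paired test is just a one-sample \(t\)-test on the within-pair differences.

# Hypothetical paired data: the same participant measured before and after a course
before <- c(210, 195, 230, 188, 202, 215)
after  <- c(225, 204, 241, 190, 219, 220)

# One-sample t-test on the differences...
t.test(after - before, mu = 0)

# ...is equivalent to the built-in paired test:
t.test(after, before, paired = TRUE)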

Since the paired-sample test is more powerful (has greater statistical power) than the independent-sample test, researchers sometimes attempt to create pairs even when they do not naturally exist. For example, average salaries of men and women may be compared by pairing individuals with similar skills and experience, or average prices in two retail chains may be compared by pairing stores located next to each other. It is crucial to establish such pairs before conducting the study (pairing post factum is unjustified and intellectually dishonest).

16.5 Formulas

16.5.1 Confidence intervals for differences in parameters (formulas)

Population means (z):

\[\begin{equation} (\bar{x}_1-\bar{x}_2)\pm z_{\alpha/2}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}} \tag{16.4} \end{equation}\]

Population means (t), assuming equal variances:

\[\begin{equation} \begin{split} (\bar{x}_1-\bar{x}_2)\pm t_{\alpha/2}{\sqrt{s_p^2 \left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \\ {df}=n_1+n_2-2, \end{split} \tag{16.5} \end{equation}\]

where \(s_p^2\) is the pooled variance estimate:

\[\begin{equation} s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} \tag{16.6} \end{equation}\]

Without equal variances assumption, one can use:

\[\begin{equation} \begin{split} (\bar{x}_1-\bar{x}_2)\pm t_{\alpha/2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} \\ {df'}= \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1-1}+\frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2-1}}\\ \end{split} \tag{16.7} \end{equation}\]

The degrees of freedom (\(df'\)) obtained from the above formula are usually fractional. When using statistical tables (or spreadsheet software, where the \(t\)-distribution is available only for integer \(df\) values), we round the degrees of freedom to the nearest whole number.
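
A short R sketch of formula (16.7), reusing the summary statistics from the \(t\)-test template in section 16.9:

n1 <- 14; s1 <- 7.5261
n2 <- 19; s2 <- 5.0471

se1 <- s1^2/n1
se2 <- s2^2/n2

# Welch-Satterthwaite degrees of freedom, formula (16.7):
df_prime <- (se1 + se2)^2 / (se1^2/(n1 - 1) + se2^2/(n2 - 1))
df_prime         # fractional, about 21.35
round(df_prime)  # rounded value for statistical tables or spreadsheets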

16.5.2 Tests for differences in means (formulas)

Population means (z):

\[\begin{equation} z=\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}} \tag{16.8} \end{equation}\]

Population means (t), assuming equal variances:

\[\begin{equation} \begin{split} t=\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{s_p^2 \left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}, \:\:\:\:\:\: df=n_1+n_2-2 \\ s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} \end{split} \tag{16.9} \end{equation}\]

Population means (t), without the assumption of equal variances:

\[\begin{equation} \begin{split} t=\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} \\ {df'}= \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1-1}+\frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2-1}}\\ \end{split} \tag{16.10} \end{equation}\]

16.5.3 Confidence intervals – paired samples

Average difference in means (z):

\[\begin{equation} \bar{x}_d\pm z_{\alpha/2}\left({\frac{\sigma_d}{\sqrt{n_d}}}\right) \tag{16.11} \end{equation}\]

Average difference in means (t):

\[\begin{equation} \bar{x}_d\pm t_{\alpha/2}\left({\frac{s_d}{\sqrt{n_d}}}\right), \:\:\: {df}=n_d-1 \tag{16.12} \end{equation}\]

16.5.4 Tests – paired samples

Average difference in means (z):

\[\begin{equation} z=\frac{\bar{x}_d-D_0}{\sigma_d/\sqrt{n_d}} \tag{16.13} \end{equation}\]

Average difference in means (t):

\[\begin{equation} t=\frac{\bar{x}_d-D_0}{s_d/\sqrt{n_d}}, \:\:\: {df}=n_d-1 \tag{16.14} \end{equation}\]

16.6 Confidence intervals versus tests

When interpreting test results and confidence intervals for the difference in means, consider the following:

  1. If a \((1-\alpha) \cdot 100\)% confidence interval (e.g., a 95% confidence interval) includes zero, a two-tailed test at a significance level of \(\alpha\) will not provide sufficient evidence to reject the null hypothesis that the parameter equals zero (see the R sketch after this list).

  2. If the confidence interval contains only positive values, this indicates that \(\mu_1 - \mu_2\) is positive, meaning \(\mu_1 > \mu_2\).

  3. If the confidence interval contains only negative values, this suggests that \(\mu_1 - \mu_2\) is negative, meaning \(\mu_1 < \mu_2\).
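
A quick R illustration of point 1, using simulated (hypothetical) data:

set.seed(1)
g1 <- rnorm(40, mean = 10.0, sd = 2)
g2 <- rnorm(40, mean = 10.5, sd = 2)

res <- t.test(g1, g2)
res$conf.int   # if this 95% interval covers 0...
res$p.value    # ...then the p-value exceeds 0.05 (and vice versa)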

16.7 Effect size

When we say that the difference between sample means from two populations (or another relationship measured in a sample) is statistically significant, we mean that this difference (or the observed relationship) provides enough evidence to reject the null hypothesis. Statistical significance does not imply significance in the common or practical sense. In statistics, practical significance (e.g., the strength of a relationship) is referred to as the effect size.

The most common measure of effect size for the difference in means is Cohen's d. For sample data, it is calculated using the following formula:

\[\begin{equation} d = \frac{\bar{x}_1-\bar{x}_2}{s_p}, \tag{16.15} \end{equation}\]

where \(s_p\) is the pooled standard deviation, which is the square root of the pooled variance described by equation (16.2).

The interpretation of effect size depends on the field and context. For Cohen's d, a general guideline is as follows: 0.2 – small effect, 0.5 – medium effect, 0.8 – large effect, 1.2 – very large effect.
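
A minimal R sketch of formula (16.15), reusing the two small samples from the raw-data example in section 16.9:

x1 <- c(1.2, 3.1, 1.7, 2.8, 3.0)
x2 <- c(4.2, 2.7, 3.6, 3.9)
n1 <- length(x1); n2 <- length(x2)

# Pooled standard deviation: the square root of the pooled variance (16.2):
sp <- sqrt(((n1 - 1)*var(x1) + (n2 - 1)*var(x2)) / (n1 + n2 - 2))

# Cohen's d, formula (16.15); about -1.6 here, i.e., a very large effect
# (the sign only shows the direction of the difference):
(mean(x1) - mean(x2)) / sp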

16.9 Templates

Spreadsheets

Test and confidence intervals for 2 means — Google spreadsheet

Test and confidence intervals for 2 means — Excel template

R code

# Test z for 2 means
# Sample 1 size:
n1 <- 100
# Sample 1 mean:
xbar1 <- 76.5
# Sample 1 standard deviation:
s1 <- 38.0

# Sample 2 size:
n2 <- 100
# Sample 2 mean:
xbar2 <- 88.1
# Sample 2 standard deviation:
s2 <- 40.0

# Significance level:
alpha <- 0.05

# Null value (usually 0):
mu0 <- 0

# Alternative (sign): "<"; ">"; "<>"; "≠"
alt <- "<"

# Calculations:
# Z test statistic:
test_z <- (xbar1-xbar2-mu0)/sqrt(s1^2/n1+s2^2/n2)

# Z test critical value:
crit_z <- if (alt == "<") {qnorm(alpha)} else if (alt == ">") {qnorm(1-alpha)} else {qnorm(1-alpha/2)}

# Z test p-value:
p.value.z <- if(alt == ">"){1-pnorm(test_z)} else if (alt == "<") {pnorm(test_z)} else {2*(1-pnorm(abs(test_z)))}

print(c('Mean 1' = xbar1, 
        'SD 1' = s1,
        'Sample 1 size' = n1,
        'Mean 2' = xbar2, 
        'SD 2' = s2,
        'Sample 2 size' = n2,
        'Null hypothesis' = paste0('mu1-mu2 = ', mu0),
        'Alternative hypothesis' = paste0('mu1-mu2 ', alt, ' ', mu0),
        'Z test statistic' = test_z,
        'Z test critical value' = crit_z,
        'Z test p-value' = p.value.z
))
##                 Mean 1                   SD 1          Sample 1 size                 Mean 2                   SD 2 
##                 "76.5"                   "38"                  "100"                 "88.1"                   "40" 
##          Sample 2 size        Null hypothesis Alternative hypothesis       Z test statistic  Z test critical value 
##                  "100"          "mu1-mu2 = 0"          "mu1-mu2 < 0"     "-2.1024983574238"    "-1.64485362695147" 
##         Z test p-value 
##   "0.0177548216928505"
# Test t for 2 means
# Sample 1 size:
n1 <- 14
# Sample 1 mean:
xbar1 <- 185.2142
# Sample 1 standard deviation:
s1 <- 7.5261

# Sample 2 size:
n2 <- 19
# Sample 2 mean:
xbar2 <- 184.8421
# Sample 2 standard deviation:
s2 <- 5.0471


# Significance level:
alpha <- 0.05

# Null value (usually 0):
mu0 <- 0

# Alternative (sign): "<"; ">"; "<>"; "≠"
alt <- "≠"

# Assume equal variances? (TRUE/FALSE):
eqvar <- FALSE

# Calculations
# Pooled standard deviation:
sp <- sqrt(((n1-1)*s1^2+(n2-1)*s2^2)/(n1+n2-2))

# T test statistic:
test_t <- if(eqvar) {(xbar1-xbar2-mu0)/sqrt(sp^2*(1/n1+1/n2))} else {(xbar1-xbar2-mu0)/sqrt(s1^2/n1+s2^2/n2)}

# Degrees of freedom:
df <- if(eqvar) {n1+n2-2} else {(s1^2/n1+s2^2/n2)^2/((s1^2/n1)^2/(n1-1)+(s2^2/n2)^2/(n2-1))}

# T critical value:
crit_t <- if (alt == "<") {qt(alpha, df)} else if (alt == ">") {qt(1-alpha, df)} else {qt(1-alpha/2, df)}

# T test p-value:
p.value.t <- if(alt == ">"){1-pt(test_t, df)} else if (alt == "<") {pt(test_t, df)} else {2*(1-pt(abs(test_t),df))}

print(c('Mean 1' = xbar1, 
        'SD 1' = s1,
        'Sample 1 size' = n1,
        'Mean 2' = xbar2, 
        'SD 2' = s2,
        'Sample size 2' = n2,
        'Null hypothesis' = paste0('mu1-mu2 = ', mu0),
        'Alt. hypothesis' = paste0('mu1-mu2 ', alt, ' ', mu0),
        'T test statistic' = test_t,
        'T test critical value' = crit_t,
        'T test p-value' = p.value.t
))
##                Mean 1                  SD 1         Sample 1 size                Mean 2                  SD 2 
##            "185.2142"              "7.5261"                  "14"            "184.8421"              "5.0471" 
##         Sample size 2       Null hypothesis       Alt. hypothesis      T test statistic T test critical value 
##                  "19"         "mu1-mu2 = 0"         "mu1-mu2 ≠ 0"   "0.160325899672522"    "2.07753969816904" 
##        T test p-value 
##      "0.874131604618"
# Using raw data (t test)

# Two data vectors:
data1 <- c(1.2, 3.1, 1.7, 2.8, 3.0)
data2 <- c(4.2, 2.7, 3.6, 3.9)

# Storing test results as an object.
# Choose the alternative: "two.sided" (default), "less", or "greater", and the equal variances assumption var.equal = TRUE/FALSE (default is FALSE)
test_result <- t.test(data1, data2, alternative="two.sided", var.equal = TRUE)

# Printing test results. Single components can be printed using for example test_result$statistic.
print(test_result)
## 
##  Two Sample t-test
## 
## data:  data1 and data2
## t = -2.3887, df = 7, p-value = 0.04826
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.46752406 -0.01247594
## sample estimates:
## mean of x mean of y 
##      2.36      3.60

Python code

import math
from scipy.stats import norm

# Test z for 2 means
# Sample 1 size:
n1 = 100
# Sample 1 mean:
xbar1 = 76.5
# Sample 1 standard deviation:
s1 = 38.0
# Sample 2 size:
n2 = 100
# Sample 2 mean:
xbar2 = 88.1
# Sample 2 standard deviation:
s2 = 40.0
# Significance level:
alpha = 0.05
# Null value (usually 0):
mu0 = 0
# Alternative (sign): "<"; ">"; "<>"/"≠"
alt = "<"

# Calculations:
# Z test statistic:
test_z = (xbar1 - xbar2 - mu0) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Z test critical value:
if alt == "<":
    crit_z = norm.ppf(alpha)
elif alt == ">":
    crit_z = norm.ppf(1 - alpha)
else:
    crit_z = norm.ppf(1 - alpha / 2)

# Z test p-value:
if alt == ">":
    p_value_z = 1 - norm.cdf(test_z)
elif alt == "<":
    p_value_z = norm.cdf(test_z)
else:
    p_value_z = 2 * (1 - norm.cdf(abs(test_z)))

results = {
    'Mean 1': xbar1,
    'SD 1': s1,
    'Sample 1 size': n1,
    'Mean 2': xbar2,
    'SD 2': s2,
    'Sample 2 size': n2,
    'Null hypothesis': f'mu1-mu2 = {mu0}',
    'Alt. hypothesis': f'mu1-mu2 {alt} {mu0}',
    'Z test statistic': test_z,
    'Z test critical value': crit_z,
    'Z test p-value': p_value_z
}

for key, value in results.items():
    print(f"{key}: {value}")
## Mean 1: 76.5
## SD 1: 38.0
## Sample 1 size: 100
## Mean 2: 88.1
## SD 2: 40.0
## Sample 2 size: 100
## Null hypothesis: mu1-mu2 = 0
## Alt. hypothesis: mu1-mu2 < 0
## Z test statistic: -2.102498357423799
## Z test critical value: -1.6448536269514729
## Z test p-value: 0.017754821692850486
# T test for 2 means
from scipy.stats import t
# Sample 1 size:
n1 = 14
# Sample 1 mean:
xbar1 = 185.2142
# Sample 1 standard deviation:
s1 = 7.5261
# Sample 2 size:
n2 = 19
# Sample 2 mean:
xbar2 = 184.8421
# Sample 2 standard deviation:
s2 = 5.0471
# Significance level:
alpha = 0.05
# Null value (usually 0):
mu0 = 0
# Alternative (sign): "<"; ">"; "<>"/"≠"
alt = "≠"
# Assume equal variances? (True/False):
eqvar = False

# Calculations
# Pooled standard deviation:
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# T test statistic and degrees of freedom:
if eqvar:
    test_t = (xbar1 - xbar2 - mu0) / math.sqrt(sp**2 * (1/n1 + 1/n2))
    df = n1 + n2 - 2
else:
    test_t = (xbar1 - xbar2 - mu0) / math.sqrt(s1**2 / n1 + s2**2 / n2)
    df = (s1**2 / n1 + s2**2 / n2)**2 / ((s1**2 / n1)**2 / (n1 - 1) + (s2**2 / n2)**2 / (n2 - 1))

# T test critical value:
if alt == "<":
    crit_t = t.ppf(alpha, df)
elif alt == ">":
    crit_t = t.ppf(1 - alpha, df)
else:
    crit_t = t.ppf(1 - alpha / 2, df)

# T test p-value:
if alt == ">":
    p_value_t = 1 - t.cdf(test_t, df)
elif alt == "<":
    p_value_t = t.cdf(test_t, df)
else:
    p_value_t = 2 * (1 - t.cdf(abs(test_t), df))

results = {
    'Mean 1': xbar1,
    'SD 1': s1,
    'Sample 1 size': n1,
    'Mean 2': xbar2,
    'SD 2': s2,
    'Sample 2 size': n2,
    'Null hypothesis': f'mu1-mu2 = {mu0}',
    'Alt. hypothesis': f'mu1-mu2 {alt} {mu0}',
    'T test statistic': test_t,
    'T test critical value': crit_t,
    'T test p-value': p_value_t
}

for key, value in results.items():
    print(f"{key}: {value}")
## Mean 1: 185.2142
## SD 1: 7.5261
## Sample 1 size: 14
## Mean 2: 184.8421
## SD 2: 5.0471
## Sample 2 size: 19
## Null hypothesis: mu1-mu2 = 0
## Alt. hypothesis: mu1-mu2 ≠ 0
## T test statistic: 0.16032589967252212
## T test critical value: 2.0775396981690264
## T test p-value: 0.8741316046180003
# Using raw data (t test)
from scipy.stats import ttest_ind, t

# Two data vectors
data1 = [1.2, 3.1, 1.7, 2.8, 3.0]
data2 = [4.2, 2.7, 3.6, 3.9]

# Storing test results as an object.
# Choose the alternative: "two-sided" (default), "less", or "greater", and the equal variances assumption equal_var=True/False (default is True)
test_result = ttest_ind(data1, data2, alternative='two-sided', equal_var=True)

print(test_result)
## TtestResult(statistic=-2.3886571085065054, pvalue=0.04826397365151946, df=7.0)

16.10 Questions

Question 16.1 The confidence interval for \(\mu_1 - \mu_2\) is \((-20, 8)\). Which of the following conclusions would we reach when conducting the appropriate hypothesis test?

Question 16.2 The confidence interval for \(\mu_1 - \mu_2\) is \((-20, -8)\). Which of the following conclusions would we reach when testing the appropriate hypothesis?

16.11 Exercises

Exercise 16.1 (McClave and Sincich 2012) Independent random samples (100 observations each) are drawn from two normally distributed populations with the following means and standard deviations:

Population 1: \(\mu_1=14\), \(\sigma_1=4\)

Population 2: \(\mu_2=10\), \(\sigma_2=3\)

Let \(\bar{x}_1\) and \(\bar{x}_2\) represent the sample means of these two samples.

  1. Provide the mean and standard deviation of the sampling distribution of \(\bar{X}_1\).

  2. Provide the mean and standard deviation of the sampling distribution of \(\bar{X}_2\).

  3. Suppose you need to compute the difference between the two sample means (\(\bar{x}_1-\bar{x}_2\)). Provide the mean and standard deviation of the distribution of the variable \(Y=\bar{X}_1-\bar{X}_2\).

  4. Will the statistic \(Y\) follow a normal distribution?

Exercise 16.2 (McClave and Sincich 2012) Two independent random samples from two normally distributed populations yielded the following results:

Sample 1: 1.2; 3.1; 1.7; 2.8; 3.0

Sample 2: 4.2; 2.7; 3.6; 3.9

  1. Determine the pooled estimate of the common variance \(\sigma^2\) for both populations.

  2. Do the data provide sufficient evidence to conclude that \(\mu_2 > \mu_1\)? Perform a hypothesis test at \(\alpha = 0.10\).

  3. Find the 90% confidence interval for (\(\mu_1 - \mu_2\)).

  4. Which of the two statistical inference procedures provides more information on (\(\mu_1-\mu_2\)): the hypothesis test (point 2) or the confidence interval (point 3)?

Exercise 16.3 (Aczel and Sounderpandian 2018) A car manufacturer wants to evaluate the performance of an engine powered by a new fuel mixture compared to regular gasoline. In 100 trials using the mixture, the average rating was 76.5 (on a scale from 0 to 100) with a standard deviation of 38. In 100 trials using gasoline, the average rating was 88.1 with a standard deviation of 40.

Conduct a two-tailed test at \(\alpha = 0.05\), then determine the 95% confidence interval for the difference between the means.

Exercise 16.4 (Aczel and Sounderpandian 2018) “Active Trader” compared investment returns of companies using two strategies: announcing or not announcing their preliminary earnings. Two random samples of companies were analysed, both with a sample size of 28. In the first group, the average rate of return was 0.19%, while in the second group, it was 0.72%. The standard deviations in both samples were 5.72% and 5.10%, respectively.

Conduct a test for equality of means at \(\alpha = 0.01\) and construct a 99% confidence interval for the difference between the means. What assumptions had to be made?

Exercise 16.5 (Aczel and Sounderpandian 2018) A company is considering offering its employees one of two benefit packages. A random sample of employees was selected, with each employee rating both packages on a scale from 0 to 100. The order in which the two packages were presented to each employee was random. The following results were obtained (for both series, the first number represents the rating given by the first employee, the second number represents the rating given by the second employee, and so on):

Package A: 45, 67, 63, 59, 77, 69, 45, 39, 52, 58, 70, 46, 60, 65, 59, 80
Package B: 56, 70, 60, 45, 85, 79, 50, 46, 50, 60, 82, 40, 65, 55, 81, 68

Can it be claimed that employees prefer one of the packages? Justify your answer.

Exercise 16.6 In 2022, students of mathematical statistics who sent a meme had better results than those who did not send a meme.

Was the difference statistically significant?

What assumptions should be made?

Data

Exercise 16.7 (Agresti, Franklin, and Klingenberg 2016) In a certain study, men and women were asked how much time they spend weekly on household chores. The following data were obtained (results are given in hours):

Sex     Sample size   Mean   Standard deviation
Women   476           33.0   21.9
Men     496           19.9   14.6

By how many hours, on average, did women spend more on household chores than men in the sample? What is the estimate of this difference for the population (provide a 95% confidence interval)? Conduct an appropriate test to determine whether the average time spent on household chores by women is different from the time spent by men.

Exercise 16.8 The human resources department in an American electronics manufacturing company conducted a study on the wage expectations of candidates for a skilled worker position on the production floor.

The sample included 48 individuals with work experience and 47 individuals without work experience in production. The average expected wage in the sample of individuals without experience was 18.7 USD/hour, with a standard deviation of 10.9 USD/hour. The average expected wage in the sample of individuals with work experience was 31.8 USD/hour, with a standard deviation of 14.5 USD/hour.

Conduct a test to determine whether the difference between the means in the two groups is significant at the 0.01 significance level. Assume a one-tailed alternative hypothesis: individuals with work experience have higher average expectations.

Provide a 95% confidence interval for the difference in USD/hour between the two groups. Calculate Cohen's d.

Literature

Aczel, A. D., and J. Sounderpandian. 2018. Statystyka w Zarządzaniu. PWN. https://ksiegarnia.pwn.pl/Statystyka-w-zarzadzaniu,731934758,p.html.
Agresti, Alan, Christine Franklin, and Bernhard Klingenberg. 2016. Statistics: The Art and Science of Learning from Data. 4th edition. Pearson.
McClave, J. T., and T. T. Sincich. 2012. Statistics. Pearson Education. https://books.google.pl/books?id=gcYsAAAAQBAJ.