Coding in R - Final exam

Introduction

The final exam is worth 30% of your final grade. The exam consists of a progression of tasks with increasing difficulty. Different tasks are worth different points.

If you successfully complete a task, you will receive full points for that task and may move on to the next task.
If you don’t complete a task, or you have errors in your script or output, or R shows warning messages, you will receive 0 points for the current task and all following tasks (unless stated otherwise). Hence, make sure to successfully complete each task before moving to the next one.

For the successful completion of this test, you may use your lecture notes, old R scripts, and, of course, all of your other friends: Google, Stack Overflow, cheat sheets, and so forth. However, note that the relevant University of Windsor policies on academic misconduct still apply, hence you are not allowed to communicate with other students. If you have any questions, contact me by sending me a message in the course’s Teams chat, and I will respond as soon as I can.

You’re free to use whatever R commands you want for completing the tasks, unless instructed differently.

Disclaimer: This file is property of the University of Windsor. Do not share this file.

Preface

You are a great scientist (congrats!). Sir Richard Branson heard about your great success and wants to pay you a ton of money to run a study to investigate the effect of automation on spacecraft pilots’ performance. In short, Richard is interested in understanding whether or not using automation will improve pilots’ performance.

The data from the experiment that you just finished running is saved in a csv file and is available here.

The dataset is organized as follows:

Participant: participant # 1 to 12
Mode: Each pilot underwent two within-subject conditions: Manual and Automated. In the manual condition, the system was operated manually. In the automated condition, the automated system was engaged.
Events: to understand pilots’ reactions to unexpected events, pilots were instructed to press a button every time they were presented with an event (a light in this case). They were presented with 6 events in total: event #1 to 6.
RT: Response Times (RT) to the event detection task were recorded in milliseconds.

Points breakdown

Below are the tasks included in the exam, each with the number of points assigned to it.

Task	Task 1	Task 2	Task 3	Task 4	Task 5	Task 6	Task 7	Task 8	Task 9	Task 10
Points	1	2	4	4	3	3	4	2	4	3

How to submit the final script

At the end of the exam, submit your final R file as LastName_studentID.R in Blackboard under Final Exam. Make sure the file has the correct extension.

Task 1

Read the dataset into R, and assign the variables as follows:

Participant is a factor
Mode is a factor
RT is numeric
NAs are removed

Note that for this particular task you will need to set a working directory that is specific to your machine and local folder. That’s totally fine. When grading, I will change the directory in your script to a directory on my machine. However, make sure to assign a name to your dataset and always reference that dataset (or datasets derived from it) throughout the script, and never reference the original csv file again in your script.
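As a rough illustration of the pattern (read once, coerce the column types, drop NAs, then work only with the named object), here is a minimal sketch. It writes a made-up four-row CSV to a temporary file; the path, the object name toy, and the values are all hypothetical and are not the exam data.

```r
# Toy CSV written to a temporary file; the real file name and path will differ.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Participant,Mode,Events,RT",
  "1,Manual,1,450",
  "1,Automated,1,620",
  "2,Manual,1,NA",
  "2,Automated,1,1100"
), csv_path)

# Read the file once and keep working with this object from here on.
toy <- read.csv(csv_path)

# Coerce the column types explicitly instead of relying on read.csv's guesses.
toy$Participant <- factor(toy$Participant)
toy$Mode        <- factor(toy$Mode)
toy$RT          <- as.numeric(toy$RT)

# Drop every row that contains at least one NA.
toy <- na.omit(toy)

str(toy)
```

Checking the result with str() is a quick way to confirm each column has the type you intended before moving on.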

Task 2

Create the factor Gender and add it to the dataset.

Participants 1-6 are males
Participants 7-12 are females
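One common way to derive a two-level factor from a range of ids is ifelse() wrapped in factor(). The sketch below uses a made-up data frame with four participants and the hypothetical labels "A" and "B"; adapt the cut-off and labels to the task.

```r
# Toy data: four hypothetical participants, two rows each.
toy <- data.frame(Participant = factor(rep(1:4, each = 2)))

# Factor levels are labels, so go through character to get the numeric ids.
id <- as.numeric(as.character(toy$Participant))

# Hypothetical split: ids 1-2 form group "A", ids 3-4 form group "B".
toy$Group <- factor(ifelse(id <= 2, "A", "B"))

# Cross-tabulate to verify that each participant landed in one group only.
table(toy$Participant, toy$Group)
```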

Task 3

Create the factor Age and add it to the dataset. This factor has three levels: young, mid, old.

Participants 1, 3, 4, 7 are young
Participants 2, 5, 6, 8 are mid
Participants 9 through 12 are old
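When the ids in each level do not form a simple range, a named lookup vector is one possible approach. The sketch below is only an illustration: the six ids, the labels "low", "mid", "high", and the name lookup are all made up.

```r
# Toy data: six hypothetical participants, two rows each.
toy <- data.frame(Participant = factor(rep(1:6, each = 2)))

# Hypothetical lookup: names are participant ids, values are category labels.
lookup <- c("1" = "low", "2" = "high", "3" = "low",
            "4" = "mid", "5" = "high", "6" = "mid")

# Index the lookup by the id (as text), drop the names, and fix the level order.
toy$Category <- factor(unname(lookup[as.character(toy$Participant)]),
                       levels = c("low", "mid", "high"))

table(toy$Participant, toy$Category)
```

Setting levels explicitly keeps the categories in a meaningful order rather than the alphabetical default.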

Task 4

Using mutate(), alone or in combination with other functions, create a Response variable from RT and add it to the dataset.

If RT < 500, then Response is short
If 500 ≤ RT ≤ 1000, then Response is medium
If RT > 1000, then Response is long
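A minimal sketch of binning a numeric column with mutate() and case_when() from dplyr follows. The toy values are made up, and the cut-offs (500 and 1000 ms) and the treatment of values that fall exactly on a boundary are assumptions for illustration; check them against the rules above.

```r
library(dplyr)

# Toy response times in ms, including values exactly on the cut-offs.
# Assumes NAs were already removed: case_when() would send an NA to "long" here.
toy <- data.frame(RT = c(320, 500, 750, 1000, 1400))

toy <- toy %>%
  mutate(Response = factor(
    case_when(
      RT < 500   ~ "short",
      RT <= 1000 ~ "medium",  # only reached when RT >= 500
      TRUE       ~ "long"     # everything above 1000
    ),
    levels = c("short", "medium", "long")
  ))

print(toy)
```

case_when() evaluates its conditions in order, so each row gets the label of the first condition that is TRUE.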

Task 5

Summarize the dataset to create data_sum that has mean, standard deviation and standard error of RT broken down by mode and gender. All nonnumeric values may need to be removed.
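For reference, the standard error of the mean is the standard deviation divided by the square root of the number of observations. The sketch below shows one way to compute it per group with dplyr; the toy data and the names toy_sum, mean_RT, sd_RT, and se_RT are hypothetical.

```r
library(dplyr)

# Toy data: 2 modes x 2 genders, two observations per cell.
toy <- data.frame(
  Mode   = rep(c("Manual", "Automated"), each = 4),
  Gender = rep(c("male", "female"), times = 4),
  RT     = c(400, 500, 600, 700, 800, 900, 1000, 1100)
)

toy_sum <- toy %>%
  group_by(Mode, Gender) %>%
  summarise(
    mean_RT = mean(RT, na.rm = TRUE),
    sd_RT   = sd(RT, na.rm = TRUE),
    n       = sum(!is.na(RT)),     # count only non-missing values
    se_RT   = sd_RT / sqrt(n),     # standard error of the mean
    .groups = "drop"
  )

print(toy_sum)
```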

Task 6

Use data_sum to create a bar chart of the mean RTs with:

RT on the y axis
Mode on the x axis
gender as grouping variable
Response Times (in ms) as the y-axis title
System Mode as the x-axis title

Assign the name myPlot to it.
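Because a summary table already holds one mean per cell, a plot of this kind is usually drawn as grouped bars with geom_col() from ggplot2. The sketch below is an illustration on a made-up summary table; the column names (Condition, Group, mean_val, se_val), the axis labels, and the error bars are assumptions, not requirements of the task.

```r
library(ggplot2)

# Hypothetical summary table: one row per Condition x Group cell.
toy_sum <- data.frame(
  Condition = rep(c("A", "B"), each = 2),
  Group     = rep(c("g1", "g2"), times = 2),
  mean_val  = c(500, 550, 620, 600),
  se_val    = c(20, 25, 30, 22)
)

# Grouped bars: fill colour separates the groups, dodge puts them side by side.
toy_plot <- ggplot(toy_sum, aes(x = Condition, y = mean_val, fill = Group)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = mean_val - se_val, ymax = mean_val + se_val),
                position = position_dodge(width = 0.9), width = 0.2) +
  labs(x = "Condition", y = "Mean value")

print(toy_plot)
```

Using the same dodge width for the bars and the error bars keeps each error bar centred on its bar.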

Task 7

Complete 7.1 and 7.2

7.1 Run an ANOVA with RT as dependent measure and mode as within-subject factor.

A few important things to keep in mind:

You may need to omit empty cells or NAs from your dataset first.
When running the ANOVA, R may show you the following message.

Warning: Collapsing data to cell means. *IF* the requested effects are a subset of the full design, you must use the "within_full" argument, else results may be inaccurate.

If you see this or a similar message together with the results of the ANOVA, that’s fine and you can move on to the following task. Make sure the results of the ANOVA are shown though.

7.2 Run an ANOVA with gender as between-subject variable.

You might see a warning message similar to the one in 7.1. If so (and even if you don’t), move on to the next task and consider this task successfully completed, provided the results of the ANOVA are shown.

Task 8

Following what you have done for task 7.2, run an independent t-test to investigate the effect of gender on RT. In addition, calculate Cohen’s d for this comparison.
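As a reminder of the mechanics, the sketch below runs t.test() on made-up data and computes Cohen's d by hand as the difference in means divided by the pooled standard deviation. The data, the column names Group and Score, and the choice of the pooled-SD formula are assumptions for illustration; packages that compute d may use slightly different conventions.

```r
# Toy data: two hypothetical groups of five scores each.
toy <- data.frame(
  Group = factor(rep(c("A", "B"), each = 5)),
  Score = c(10, 12, 11, 13, 14, 15, 17, 16, 18, 19)
)

# Independent-samples t-test; R uses Welch's version unless var.equal = TRUE.
tt <- t.test(Score ~ Group, data = toy)
print(tt)

# Cohen's d: mean difference divided by the pooled standard deviation.
a <- toy$Score[toy$Group == "A"]
b <- toy$Score[toy$Group == "B"]
pooled_sd <- sqrt(((length(a) - 1) * var(a) + (length(b) - 1) * var(b)) /
                    (length(a) + length(b) - 2))
cohens_d <- (mean(a) - mean(b)) / pooled_sd
print(cohens_d)
```

The sign of d depends only on which group is subtracted from which; its magnitude is what is usually reported.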

Task 9

Convert the dataset from long format to wide format so that the RTs for the Manual and Automated modes are in two separate columns. Your new dataset should look like the one in the image below.

Dataset in the Wide format
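One way to reshape from long to wide is pivot_wider() from tidyr, which spreads the values of one column into new columns named after the levels of another. The sketch below uses a made-up two-participant dataset; the object names toy_long and toy_wide are hypothetical.

```r
library(tidyr)

# Toy long-format data: one row per participant x event x mode.
toy_long <- data.frame(
  Participant = rep(1:2, each = 4),
  Events      = rep(rep(1:2, each = 2), times = 2),
  Mode        = rep(c("Manual", "Automated"), times = 4),
  RT          = c(450, 620, 480, 700, 510, 590, 530, 640)
)

# Each level of Mode becomes its own column, filled with the matching RT.
toy_wide <- pivot_wider(toy_long, names_from = Mode, values_from = RT)

print(toy_wide)
```

The columns not named in names_from or values_from (here Participant and Events) identify the rows of the wide table.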

Task 10

You run a survey study where you ask 30 participants, with different ages and genders, five questions each: Q1, Q2, Q3, Q4, Q5. Participants answer the five questions on a scale from 1 to 7. When importing the data into an Excel file, you do so by using a wide format. Unbeknownst to you, however, your adviser is fervently against using wide formats. So, before you present your work to them, you decide to quickly change the dataset from wide to long.

Access the dataset here, and turn it into a long format before your adviser finds out.

Here is what your dataset looks like now: Wide

Here is what your dataset will look like later:

Long
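The reverse reshape can be sketched with pivot_longer() from tidyr, which stacks a set of columns into a name column and a value column. The toy table below has only two question columns, and the names ID, Question, and Answer are hypothetical; the real dataset will have different columns.

```r
library(tidyr)

# Toy wide-format data: one row per respondent, one column per question.
toy_wide <- data.frame(
  ID = 1:3,
  Q1 = c(5, 3, 7),
  Q2 = c(4, 6, 2)
)

# Stack the question columns: names go to Question, values go to Answer.
toy_long <- pivot_longer(toy_wide, cols = c(Q1, Q2),
                         names_to = "Question", values_to = "Answer")

print(toy_long)
```

Columns not listed in cols (here ID) are repeated on every row that belongs to the same respondent.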