Chapter 2 Setting up a Project
Shannon
2.1 Git & GitHub
Here, we’ll review our basic workflow to set up a new project with version control and collaboration through GitHub.
This assumes you have RStudio and Git installed and connected to your GitHub account (the Git pane appears in RStudio). If this isn’t the case, refer to Jenny Bryan’s book for detailed instructions on that initial setup. Jenny’s book is also the best resource for solving tricky Git issues.
To initialize a project using RProjects and GitHub…
1. Go to January Advisors’ GitHub and click the green New button
2. Give the repository a short and specific name
3. Select Private repository
4. Check the boxes for Add a README file and Add .gitignore. For .gitignore, choose the R template.
5. Click Create Repository
6. Click the green Code button and copy the HTTPS link
7. Open RStudio
8. Click File → New Project
9. On the popup, select Version Control → Git
10. Paste in the repository HTTPS link
11. It will automatically populate a local directory name, matching the name you selected in step 2. Don’t change this: it’s helpful for folder names in GitHub and your local computer to match
12. It will also automatically populate a path to that directory. All of your GitHub project folders should exist in the same parent directory. For me, that’s My Documents → January Advisors.
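The clone-and-open part of the steps above (copying the HTTPS link through choosing the parent directory) can also be scripted. The sketch below assumes the usethis package is installed and that GitHub credentials are already configured; usethis is not part of the workflow described here, and the owner name, repository name, and parent directory are hypothetical placeholders.

```r
# Hypothetical names: replace with the real GitHub organization and repository.
repo_owner <- "january-advisors"
repo_name  <- "my-new-project"
repo_spec  <- paste(repo_owner, repo_name, sep = "/")

# Hypothetical parent directory that holds all GitHub project folders.
parent_dir <- file.path(path.expand("~"), "January Advisors")

# Clone only when run interactively and when usethis is available.
# The local folder is named after the repository, so the names match.
if (interactive() && requireNamespace("usethis", quietly = TRUE)) {
  usethis::create_from_github(
    repo_spec,
    destdir  = parent_dir,
    fork     = FALSE,    # clone the repository itself rather than a fork
    protocol = "https"
  )
}
```

The repository still has to be created on GitHub first (steps 1 to 5); this only replaces the RStudio dialog.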
Once your project is established on GitHub and locally, and before getting too involved with the project, you will want to set up the general file structure.
2.2 File Structures
2.2.1 Shiny Apps
For Shiny App projects, start in RStudio: File → New File → Shiny Web App...
Application name = temp_app (we’ll delete this later)
For “Application Type”, select Multiple File (ui.R/server.R)
This will make a subdirectory within the main project directory with the app files. We don’t want this: it’s easiest if the app files live in the uppermost project directory. So, go to the project in your local file explorer, open the temp_app folder, and move the contents (ui.R and server.R) into the parent directory. You can now delete the temp_app folder. You should now have 4 files in your local directory:
README.md
.Rproj file
server.R
ui.R
In your file explorer, make six new folders:
data-clean
data-raw
figures
rscripts
text
www
And one new R script (in RStudio, File → New File → R Script):
read-data.R
Your project folder should now look like this (.gitignore is hidden and can be seen/edited in RStudio)
It’s pretty easy to guess the contents of most of the folders, but it’s worth reviewing:
- data-clean: data that has undergone cleaning and is ready for the app
- data-raw: source data, untouched
- figures: a place for exploratory data analysis figures you want to save, charts to email clients… almost always some ad hoc data viz comes up and it’s good to have a place for it that doesn’t clutter the main directory
- rscripts: data cleaning scripts, helper functions, and other scripts that support the app upstream but don’t make it run
- text: markdown files that contain copy for the app. For some apps with little text info, you might not need this
- www: .css style sheets and images included in the app
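If you would rather script the folder setup than click through the file explorer, a minimal sketch in base R follows. The folder and script names come from the lists above; the helper name create_project_skeleton() and its arguments are illustrative, not an existing function.

```r
# Folder names from the Shiny file structure above.
shiny_dirs <- c("data-clean", "data-raw", "figures", "rscripts", "text", "www")

# Hypothetical helper: creates the folders and empty starter scripts under `path`.
# Existing folders and scripts are left untouched.
create_project_skeleton <- function(path = ".",
                                    dirs = shiny_dirs,
                                    scripts = "read-data.R") {
  for (d in file.path(path, dirs)) {
    dir.create(d, showWarnings = FALSE, recursive = TRUE)
  }
  script_paths <- file.path(path, scripts)
  # Only create scripts that do not exist yet, so nothing is overwritten.
  file.create(script_paths[!file.exists(script_paths)])
  invisible(file.path(path, c(dirs, scripts)))
}

# Run from the project root, for example:
# create_project_skeleton()
```

Passing a different dirs vector reuses the same helper for projects with another layout.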
Read more about Shiny Apps in the Shiny Chapter (6).
2.2.2 Analysis/Report Projects
File structure here is a little less important and potentially variable based on the type of report and analyses done. Almost universally, you will want at minimum the following folders to keep things tidy.
data-clean
data-raw
graphics
data-cleaning-scripts
2.3 Commenting Standards
Truthfully, we could do more here. We make heavy use of comments to create sections that organize our code, but we are less diligent about commenting individual lines or chunks to explain what the code is doing.
2.3.1 Shiny Apps
For Shiny apps, the single most useful thing you can do is comment close parentheses in the ui.R code. Having each close parenthesis on a separate line and commenting # closes home page, # closes wellPanel, etc. will make it much easier to fix syntax errors when adding, editing, or rearranging sections.
For Shiny apps, we also use comments to make headers so that we can easily navigate between sections of the app. For this, try to make the section names match in ui.R and server.R, and also make these consistent with the data and object naming conventions. This will look a little different in each app because it should be guided by the overall structure of the app, number of pages, subsections, etc. It’s really useful to think about this organization upfront and use comments/sections to uphold it in the code.
Here’s a stripped-back example of comments as an organizing method for our Aim Hire Texas app. In ui.R:
###--- HOME PAGE ----------------------------
tabPanel(title = "Home"
  # closes home page
),

###--- WDA PAGE ---------------------------
tabPanel(title = "Workforce Development Areas",

  ## * Well panel -----------
  wellPanel(
    p("Make your selections here")
    # closes wellPanel
  ),

  ## * Main panel -----------
  ## 1. living wage households --------
  h2("Living Wage Households"),

  ## 2. trends in working age adults --------
  h2("Future Workforce"),

  ## 3. employment by education --------
  h2("Education Pipeline")

  # closes WDA page
)
In server.R, the sections should be named and organized similarly. For each page or section, I usually like to make subsections for content, reactives, and observes.
###--- HOME PAGE ----------------------------
## Reactives ----
selected_wda_sf <- reactive({
  sf <- wda_sf %>%
    filter(wda == input$select_wda)
})

## Content -----
## map
output$home_map <- renderLeaflet({

})

## Observes -----
## respond to map click
observeEvent(input$home_map_shape_click$id, {
  # update select input and change page
  updateSelectizeInput()
  updateNavbarPage()
})

###--- WDA PAGE ----------------------------
## 1. living wage households --------
## reactives
filter_lwh <- reactive({
  df <- alice_hh_counts %>%
    filter(wda == input$select_wda)
})

## content
output$lwh_plot_year <- renderHighchart({
  filter_lwh() %>%
    hchart(type = "column", hcaes(x = year, y = value, group = name))
})

## 2. trends in working age adults --------
Lastly, for Shiny apps, reactive() and observeEvent() functions are really helpful to comment. Often, these functions get really complicated or work in tandem with other observes and user input. In the comment, be explicit about when a particular event is triggered and what distinguishes it from similar reactives/events. For example, a lot of times we have a design where users can select an area via a selectInput() or by clicking a map. These two actions each need a reactive, but sometimes you forget why you have two nearly identical reactives when you look at the code later. Comments should clarify the specific and unique purpose of each reactive() and observeEvent().
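As an illustration of the kind of comment meant here, the sketch below shows one possible arrangement for the dropdown-or-map-click design. It assumes the shiny package; the input IDs select_wda and home_map follow the example above, while the object name selected_wda and the exact wiring are hypothetical rather than taken from the Aim Hire Texas app.

```r
library(shiny)

# Hypothetical server fragment for the "select via dropdown or map click" design.
server <- function(input, output, session) {

  ## Reactives ----
  # selected_wda(): the area currently chosen in the dropdown.
  # Triggered when: input$select_wda changes, whether the user picked it
  #   directly or the map-click observer below updated it.
  # Purpose: single source of truth for every chart that filters by area.
  selected_wda <- reactive({
    req(input$select_wda)
    input$select_wda
  })

  ## Observes ----
  # Triggered when: the user clicks a shape on the home map.
  # Differs from selected_wda(): returns nothing; it only pushes the clicked
  #   id into the dropdown so both selection routes end in the same input.
  observeEvent(input$home_map_shape_click$id, {
    updateSelectInput(session, "select_wda",
                      selected = input$home_map_shape_click$id)
  })
}
```

Each comment states the trigger and what sets the block apart from its near-twin, which is the information that tends to be forgotten when revisiting the code.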
2.3.2 Analysis/Report Projects
We don’t have as many general principles here. Best practice, and something I do occasionally, would be to go through the entire codebase at the end of a project and clean and comment the code. Specifically, you’d want to comment new or tricky things. Some helpful things you might want to include in a comment are:
- point out functions we don’t commonly use and note what they do
- for long data cleaning pipes, some details about the input and output (this is really helpful if the pipe won’t run in the future- knowing what the output should look like is a big help)
- provide the reason why a simpler method failed (constantly, when I go back and run old code, I think “why did I do x, y and z, it would have been so much easier to just do p”, and every time there was a reason I had to do the more convoluted thing; I just didn’t remember it and didn’t write it down)
- a URL for any particular blog or resource that was helpful in solving an issue