Mapping species distribution

Mauro Lepore

2017-11-03

Overview

This vignette shows you how to map species distribution with map_sp() and map_sp_pdf() from the forestr package.

# To install from a private repo, generate a personal access token (PAT) in
# https://github.com/settings/tokens and supply to this argument.
GITHUB_PAT <- "your token"
# install_github("forestgeo/forestr", auth_token = GITHUB_PAT)
library(forestr)

# Also using the package dplyr for easier data manipulation
# install.packages(dplyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
# Print only a few rows of tibbles (modern dataframes) to save space
options(dplyr.print_min = 6, dplyr.print_max = 6)

Loading data for examples.

# Converting dataframe to tibble for better printing
example_data <- as_tibble(forestr::bci12t7mini)

# Let's find the top-3 most abundant species
sp_n <- count(example_data, sp)
sp_arranged <- arrange(sp_n, desc(n))
sp_top3 <- sp_arranged$sp[1:3]
sp_top3
#> [1] "hybapr" "faraoc" "des2pa"

# For a small example, keeping data of only the 3 most abundant species
census <- filter(example_data, sp  %in% sp_top3)
census
#> # A tibble: 315 x 20
#>   treeID stemID    tag StemTag     sp quadrat    gx    gy MeasureID
#>    <int>  <int>  <chr>   <chr>  <chr>   <chr> <dbl> <dbl>     <int>
#> 1   8651      1 010554         hybapr    4811   974 227.6      6832
#> 2   9480      1 011419         faraoc    4814   977 290.0      7373
#> 3  10179     11 012145    <NA> hybapr    4819   979 395.3      7898
#> 4  15240     NA 019314    <NA> hybapr    4223   845 461.2        NA
#> 5  15680     NA 019758    <NA> hybapr    4824   979 483.4        NA
#> 6  15888     NA 020180    <NA> faraoc    4701   950  29.8        NA
#> # ... with 309 more rows, and 11 more variables: CensusID <int>,
#> #   dbh <dbl>, pom <chr>, hom <dbl>, ExactDate <chr>, DFstatus <chr>,
#> #   codes <chr>, nostems <dbl>, date <dbl>, status <chr>, agb <dbl>

The next section shows you how to explore data quickly. You will produce good quality maps with available defaults. The section after the next one shows you how to customize you maps by making some common and less common changes. And the last section shows you how to fine tune your maps.

Exploring Data Quickly: Using Defaults

Mapping Species Distribution

Although the output of map_sp() and map_sp_pdf() is printed to different devices – i.e, the screen or a .pdf file – those functions are almost identical. They are documented jointly and share the same arguments (except for the file argument that only belongs to map_sp_pdf(). (These functions are not merge into a single function because their outputs are different; functions that produce different output tend to cause more problems that those which output is consistently the same).

?map_sp
# Or
?map_sp_pdf

Notice that only the first two arguments of these functions are strictly necessary.

# Print to screen
map_sp(census, c("hybapr", "faraoc"))
#> $hybapr
#> 
#> $faraoc

# Defaults to save in working directory. Returns invisibly.
map_sp_pdf(census, c("hybapr", "faraoc"))
#> Saving as map.pdf
# Force printing the invisible output
print(map_sp_pdf(census, c("hybapr", "faraoc")))
#> Saving as map.pdf
#> $hybapr
#> 
#> $faraoc

# (Printing side by side to save space)

Keep in mind that your census data must have specific variable-names:

# Show behaviour with wrong name
with_wrong_name <- rename(census, SP = sp)

# This will fail because the data set has name `SP`, not `sp`
map_sp(with_wrong_name, c("hybapr", "faraoc"))
#> Error: Check that your data contains these names: gx, gy, sp

Adding Elevation

You may have elevation data from your site.

my_elev <- forestr::bci_elevation
my_elev
#> # A tibble: 20,301 x 3
#>       x     y  elev
#> * <int> <int> <dbl>
#> 1     0     0   121
#> 2     0     5   121
#> 3     0    10   121
#> 4     0    15   121
#> 5     0    20   121
#> 6     0    25   123
#> # ... with 2.03e+04 more rows

Note that the elevation data set must have very specific variable names; or it will fail:

# This will fail because the names of `my_elev` are not correct
map_sp(census, c("hybapr", "faraoc"), elevation = my_elev)
#> Error: Check that your data contains these names: gx, gy, elev

If necessary, rename your variables:

# Renaming data and trying again
elev <- rename(my_elev, gx = x, gy = y)
map_sp(census, c("hybapr", "faraoc"), elevation = elev)
#> $hybapr
#> 
#> $faraoc

Dealing with Overplotting

Sometimes a species is so abundant that its distribution overplots.

# Fake over abundant species
crowded <- dplyr::tibble(
  sp = sample(c("species1"), 10000, replace = TRUE),
  gx = sample.int(1000, 10000, replace = TRUE),
  gy = sample.int(500, 10000, replace = TRUE)
)
map_sp(crowded, c("species1"))
#> $species1

To reduce overplotting, you can do a number of things. First, you can try reducing the size or opacity of the points, or changing their shape.

# smaller and semi transparent
map_sp(crowded, c("species1"), size = 1, alpha = 2/10)
#> $species1

# Smaller and open
map_sp(crowded, c("species1"), size = 1, shape = 21)
#> $species1

# (Printing side by side to save space)

If your research question allows, you can try subsampling your data.

# A random sample of 2000 stems
smaller <- sample_n(crowded, 2000)
map_sp(smaller, c("species1"), shape = 21)
#> $species1

Customizing Your Maps

You may want to customize your maps. For example, you may want to change some properties of the points or elevation lines (e.g. colour); or the general theme of the map. These are relatively common changes.

Most Common Changes

Changing Points’ Properties

Both map_sp() and map_sp_pdf() have the argument ..., which lets you pass arguments to ggplot2::geom_point() to change points’ properties such as the shape, size, fill, colour, and stroke. Read ?ggplot2::geom_point() to learn all your options; here you can see some common ones:

map_sp(census, "hybapr",
  # Arguments passed to ggplot2::geom_point()
  size = 4, shape = 22, fill = "green", colour = "black", stroke = 2)
#> $hybapr

Changing Properties of the Elevation Lines

You can also customize the elevation lines. This example shows all the available options:

  • Change line size;
  • set colours to represent low and high elevation;
  • set the number of intervals in which the elevation variable is to be cut (note that the cut results in ugly numbers, with many decimals; also, with increasing bins’ number the function takes longer to run).
map_sp(census, "hybapr", 
  elevation = elev, line_size = 1, low = "red", high = "blue", bins = 4)
#> $hybapr

Changing the Theme

You can change the general look of your map by changing its theme. A number of pre-set themes are available in the ggplot2 package (see ?ggplot2::theme_bw()).

map_sp(census, "hybapr", theme = ggplot2::theme_classic())
#> $hybapr
map_sp(census, "hybapr", theme = ggplot2::theme_dark())
#> $hybapr

# (Printing side by side to save space)

Less Common Changes

Here are some changes that you may need less often but are still good to keep in mind.

Change File Name

When printing to .pdf, you can provide a custom file name (including the path to a directory); but note the file name must end in .pdf or it will be replaced by a default name:

spp <- c("hybapr", "faraoc")

map_sp_pdf(census, spp, elevation = elev, file = "bad-name1")
#> Warning in check_file_extension(file): File extension should be .pdf.
#>   * Replacing given file name by default file name
#> Saving as map.pdf

map_sp_pdf(census, spp, elevation = elev, file = "bad-name2.png")
#> Warning in check_file_extension(file): File extension should be .pdf.
#>   * Replacing given file name by default file name
#> Saving as map.pdf

This is OK. (Note that you can suppress messages)

map_sp_pdf(census, spp, elevation = elev, file = "good-name.pdf")
#> Saving as good-name.pdf

# Same without any message
suppressMessages(
  map_sp_pdf(census, spp, elevation = elev, file = "good-name.pdf")
)

Change x and y Limits

Occasionally you may need to change the map limits. The map limits default to the range 0-max, where max is the maximum value of the variable gx or gy in the census data. In the unlikely case where no stem occurs at the maximum limits of your plot, the default would result in a map that spans less than the defined limits of the site’s plot.

# Fake case: The maximum-limits of the site's plot are gx = 1000 and gy = 500;
# but all stems in the census occurr at lower gx and gy limits
odd_data <- filter(census, gx < 800, gy < 200)
two_species <- unique(odd_data$sp)[1:2]
map_sp(odd_data, two_species)
#> $hybapr
#> 
#> $faraoc

# (Printing side by side to save space)

You can overwrite the default limits with the arguments xlim and ylim:

map_sp(odd_data, two_species, xlim = c(0, 1000), ylim = c(0, 500))
#> $hybapr
#> 
#> $faraoc

# (Printing side by side to save space)

Fine Tunning

This last section shows you how to fine tune your plots. For this, you will need a couple of functions from packages other than forestr.

Multiple Plots in One Page

To plot multiple maps on one page you can use gridExtra::maparrange():

library(gridExtra)
#> 
#> Attaching package: 'gridExtra'
#> The following object is masked from 'package:dplyr':
#> 
#>     combine

all_species <- unique(census$sp)
maps <- map_sp(census, all_species)
multipaged <- marrangeGrob(maps, nrow = 1, ncol = 2)
multipaged

The multi-paged maps can also be saved to a .pdf.

# Saving to .pdf: Option 1
ggplot2::ggsave("multi-paged.pdf", multipaged)
#> Saving 6 x 3.7 in image

# Saving to .pdf: Option 2
pdf()
multipaged
dev.off()
#> png 
#>   2

It is also possible to save the multi-paged maps to devices other than .pdf – the png() device, for example, will automatically create one file per page.

# Saving to .png; this will create multiple files.
png()
multipaged
dev.off()
#> png 
#>   2

Customizing the Theme

Before, you used a pre-made theme. But you can create your own theme with ggplot2::theme(), save it as an object, and pass that object to map_sp() or map_sp_pdf().

# Showing only a few options; see all the options with ?ggplot2::theme()
my_theme <- ggplot2::theme(
  text = element_text(size = 25, face = "bold.italic", colour = "white"),
  plot.background = element_rect(fill = "black"),
  plot.margin = margin(2, 2, 2, 2, "cm"),
  strip.background = element_rect(fill = "darkgreen"),
  strip.text = element_text(colour = "white"),
  # make grid to dissapear by matching background colour
  panel.background = element_rect(fill = "lightgreen"),
  panel.grid.minor = element_line(colour = "lightgreen"),
  panel.grid.major = element_line(colour = "lightgreen")
)
map_sp(census, "hybapr", theme = my_theme)
#> $hybapr

Extending with ggplot2

Finally, you can also use the output of map_sp() and map_pdf() as the starting point for further customization with ggplot2 (the next section provides resources to learn ggplot2). That may be convenient for minor extensions; for larger ones you may prefer to start from scratch. Note that map_sp() and map_sp_pdf() output not just one plot but a list of plots. That is important to keep in mind if you want to iterate over all the plot-elements of the output list. For example, this is how you would add a new layer to one element of the plots’ list:

p0 <- map_sp(census, c("hybapr", "faraoc"))

# Adding a new layer to one element of the plots' list
p0[["hybapr"]] + geom_vline(aes(xintercept = 300), colour = "red")

And this is how you can apply new layers to all the plot-elements of a list (note that the function + is applied to each item of the plots’ list):

p1 <- lapply(p0, `+`, geom_vline(aes(xintercept = 300), colour = "red"))
p1
#> $hybapr
#> 
#> $faraoc

p2 <- lapply(p1, `+`, geom_hline(aes(yintercept = 400), colour = "blue"))
p2
#> $hybapr
#> 
#> $faraoc

# (Printing side by side to save space)

Learning More

There are many visualization systems in R. The system built in map_sp() and map_sp_pdf() is that of the ggplot2 package. ggplot2 is not an arbitrary collection of functions; it is one implementation of The Grammar of Graphics. Good places to learn ggplot2 include this website; this specialized, paid book: ggplot2: Elegant Graphics for Data Analysis; and this general, free book R for Data Science.