Overview

This tutorial will show you how to map the latest tree status in subquadrats with map_tag().

Motivation

Imagine you need some maps to work in the field at the site Yosemite. All you need is a ViewFullTable and the function map_tag(). Here is a glimpse of your data.

glimpse(yose_vft)
#> Observations: 92
#> Variables: 32
#> $ DBHID            <int> 1, 34881, 2, 34882, 3, 34883, 4, 34884, 5, 34...
#> $ PlotName         <chr> "Yosemite Forest Dynamics Plot", "Yosemite Fo...
#> $ PlotID           <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
#> $ Family           <chr> "Pinaceae", "Pinaceae", "Pinaceae", "Pinaceae...
#> $ Genus            <chr> "Pinus", "Pinus", "Abies", "Abies", "Abies", ...
#> $ SpeciesName      <chr> "lambertiana", "lambertiana", "concolor", "co...
#> $ Mnemonic         <chr> "PILA", "PILA", "ABCO", "ABCO", "ABCO", "ABCO...
#> $ Subspecies       <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ SpeciesID        <int> 5, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
#> $ SubspeciesID     <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ QuadratName      <chr> "A01", "A01", "A01", "A01", "A01", "A01", "A0...
#> $ QuadratID        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
#> $ PX               <dbl> 2.82, 2.82, 1.06, 1.06, 3.89, 3.89, 5.86, 5.8...
#> $ PY               <dbl> 11.25, 11.25, 3.13, 3.13, 2.17, 2.17, 3.28, 3...
#> $ x                <dbl> 2.45, 2.45, 0.69, 0.69, 3.52, 3.52, 5.49, 5.4...
#> $ y                <dbl> 10.58, 10.58, 2.46, 2.46, 1.50, 1.50, 2.61, 2...
#> $ TreeID           <int> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, ...
#> $ Tag              <chr> "01-0001", "01-0001", "01-0002", "01-0002", "...
#> $ StemID           <int> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, ...
#> $ StemNumber       <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
#> $ StemTag          <chr> "01-0001", "01-0001", "01-0002", "01-0002", "...
#> $ PrimaryStem      <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ CensusID         <int> 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, ...
#> $ PlotCensusNumber <int> 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, ...
#> $ DBH              <chr> "149", "152.4", "54.6", "56.1", "5", "5.4", "...
#> $ HOM              <dbl> 1.37, 1.37, 1.37, 1.37, 1.37, 1.37, 1.37, 1.3...
#> $ ExactDate        <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ Date             <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ ListOfTSM        <chr> "v1;A", "v1;A;b2", "v2;A", "v1;A;b2", "v1;A",...
#> $ HighHOM          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
#> $ LargeStem        <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ Status           <chr> "alive", "alive", "alive", "alive", "alive", ...

First you filter the specific plot you want to produce maps for.

yose_vft1 <- filter(yose_vft, PlotID == 1)
# Using a private data set; and using only one quadrat for a small example
maps <- map_tag(yose_vft1)

Here is the first set of four subquadrats.

maps[1]
#> $A01_1

You can output a .pdf with pdf().

pdf("example-yosemite.pdf", paper = "a4")
maps
#> $A01_1
#> 
#> $A01_2
#> 
#> $A01_3
#> 
#> $A01_4
dev.off()
#> png 
#>   2
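When you move this step into a loop or a function, R no longer auto-prints, so each page must be drawn explicitly. The sketch below shows that pattern with base graphics standing in for the maps. The list pages and the temporary file are made up for illustration. With the real output of map_tag() you would presumably call print() on each element instead of plot().

```r
# Toy stand-ins for maps: one data frame of points per page (hypothetical data)
pages <- list(
  A01_1 = data.frame(x = c(1, 2, 3), y = c(2, 4, 1)),
  A01_2 = data.frame(x = c(6, 7, 9), y = c(3, 1, 4))
)

out_file <- tempfile(fileext = ".pdf")
pdf(out_file, paper = "a4")
for (nm in names(pages)) {
  # Each high-level plot call starts a new page in the open pdf device
  plot(pages[[nm]]$x, pages[[nm]]$y, main = nm, xlab = "x", ylab = "y")
}
# Close the device; the file is not complete until you do
invisible(dev.off())

file.exists(out_file)
```

Closing the device with dev.off() is what finalizes the file, so a forgotten dev.off() is a common cause of empty or unreadable PDFs.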

What Does map_tag() Do?

Let’s review what has just happened. Although your ViewFullTable may have data from multiple censuses, map_tag() focuses exclusively on the latest census (via PlotCensusNumber). map_tag() plots not the status of individual stems but the status of each tree. That is, if in the latest census at least one stem of a tree has status “alive”, then map_tag() plots the tree’s status as “alive”; if no stem is “alive”, map_tag() plots its status as “other”. Why “other” and not “dead”? Because stems are not classified simply as “dead” or “alive”: among the stems that are not “alive”, the value of status may be, for example, “dead” for one stem of a particular tree and “broken below” for another stem of the same tree. Yet map_tag() is designed to plot only one value of status per individual tag. The solution is to group all values that are not “alive” into the single category “other”.
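To make the rule concrete, here is a minimal base-R sketch of the same logic on toy data. It is not the code inside map_tag(); the data frame stems and its column names (tag, census, status) are invented for illustration.

```r
# Toy stem-level data (hypothetical): two censuses, some multi-stem trees
stems <- data.frame(
  tag    = c("t1", "t1", "t1", "t2", "t2", "t3", "t3"),
  census = c(1, 2, 2, 2, 2, 1, 2),
  status = c("alive", "dead", "alive", "dead", "broken below", "alive", "dead"),
  stringsAsFactors = FALSE
)

# 1. Keep only the rows of the latest census
latest <- stems[stems$census == max(stems$census), ]

# 2. One status per tag: "alive" if any stem is alive, otherwise "other"
is_alive <- tapply(latest$status == "alive", latest$tag, any)
tree_status <- data.frame(
  tag = names(is_alive),
  status = ifelse(as.vector(is_alive), "alive", "other"),
  stringsAsFactors = FALSE
)
tree_status
```

Tree t1 comes out as “alive” because one of its two stems is alive in the latest census; t2 and t3 come out as “other”.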

OK, that was the bad news – map_tag() won’t let you change the values of the variable status. The good news is that you can change almost anything else. You will learn how to do that in the rest of this tutorial.

Preparation: Packages and Data

The packages you will use are try, ggplot2 and dplyr. try contains the function map_tag(); after I receive some feedback I will move it to the package forestr. ggplot2 and dplyr provide powerful tools for visualizing and wrangling data. These packages are part of the tidyverse (https://www.tidyverse.org/), a collection of R packages designed for data science that share an underlying philosophy and common APIs.

# remotes::install_github("forestgeo/try")
library(try)
# install.packages("ggplot2")
library(ggplot2)
# install.packages("dplyr")
library(dplyr)

# Print only a few rows of data frames to save time and space
options(dplyr.print_min = 6, dplyr.print_max = 6)

From now on, the data you will use is an example data set from Barro Colorado Island made public in 2012 (see https://repository.si.edu/handle/10088/20925 and ?try::bci12vft_mini).

# Subset of a public ViewFullTable from BCI (source:
# https://repository.si.edu/handle/10088/20925).

# Convert to tibble (modern dataframe) for better printing
bci_vft <- as_tibble(bci12vft_mini)
glimpse(bci_vft)
#> Observations: 4,374
#> Variables: 28
#> $ MeasureID        <int> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1...
#> $ PlotID           <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
#> $ Plot             <chr> "bci", "bci", "bci", "bci", "bci", "bci", "bc...
#> $ Family           <chr> "Lecythidaceae", "Myristicaceae", "Malvaceae"...
#> $ GenusSpecies     <chr> "Gustavia superba", "Virola surinamensis", "Q...
#> $ Genus            <chr> "Gustavia", "Virola", "Quararibea", "Protium"...
#> $ SpeciesName      <chr> "superba", "surinamensis", "asterolepis", "te...
#> $ SubSpeciesName   <chr> "NULL", "NULL", "NULL", "NULL", "NULL", "NULL...
#> $ SpeciesID        <int> 460, 1075, 871, 828, 108, 828, 871, 871, 1144...
#> $ Mnemonic         <chr> "gustsu", "virosu", "quaras", "protte", "bros...
#> $ QuadratID        <int> 1250, 1250, 1250, 1249, 1249, 1248, 1248, 124...
#> $ QuadratName      <chr> "4924", "4924", "4924", "4923", "4923", "4922...
#> $ x                <dbl> 14.1, 10.5, 13.5, 12.7, 1.9, 3.7, 3.0, 3.0, 1...
#> $ y                <dbl> 8.3, 8.9, 18.3, 9.3, 13.5, 16.5, 12.6, 12.6, ...
#> $ gx               <dbl> 994, 990, 994, 993, 982, 984, 983, 983, 996, ...
#> $ gy               <dbl> 488, 489, 498, 469, 474, 456, 453, 453, 447, ...
#> $ TreeID           <int> 19, 21, 23, 24, 25, 29, 30, 30, 32, 33, 34, 3...
#> $ Tag              <chr> "000002", "000004", "000006", "000007", "0000...
#> $ StemID           <int> 1, 1, 7, 1, 1, 6, 2, 3, 2, 1, 1, 1, 6, 6, 1, ...
#> $ StemTag          <chr> NA, NA, "NULL", NA, NA, "NULL", "NULL", "NULL...
#> $ PrimaryStem      <chr> "main", "main", "main", "main", "main", "main...
#> $ CensusID         <int> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
#> $ PlotCensusNumber <int> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
#> $ DBH              <int> 304, 357, 313, 456, 1290, NA, 492, 33, NA, 42...
#> $ HOM              <dbl> 3.00, 1.30, 3.00, 1.30, 5.20, NA, 3.20, 1.30,...
#> $ ExactDate        <date> 2005-10-14, 2005-10-11, 2005-10-14, 2005-10-...
#> $ ListOfTSM        <chr> "B,cylY", "NULL", "B,cylY", "NULL", "B,cylN",...
#> $ Status           <chr> "alive", "alive", "alive", "alive", "alive", ...
bci_vft1 <- filter(bci_vft, PlotID == 1)

Not all data sets are named appropriately. map_tag() expects some variables to have specific names. If the names of your variables differ from the expected ones, you’ll get an error.

# Replacing a crucial name with a bad name
bci_vft_bad_nms <- rename(bci_vft1, bad_x = x)

# Fails
map_tag(bci_vft_bad_nms)
#> Error: Ensure your data set has these variables (regardles of the case):
#> tag, x, y, status, quadratname, censusid, plotid

Reading the error message will help you identify which variables you need to rename.

bci_vft1_rnm <- dplyr::rename(bci_vft_bad_nms, x = bad_x)
# Using lowercase names for simplicity
names(bci_vft1_rnm) <- tolower(names(bci_vft1_rnm))
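If you prefer to check the names yourself before calling map_tag(), a small helper along these lines may do. It is a sketch only: missing_names() is a hypothetical function, the list of required names is copied from the error message above, and the toy data frame is invented.

```r
# Variables named in the error message above
required <- c("tag", "x", "y", "status", "quadratname", "censusid", "plotid")

# Hypothetical helper: which required names are missing, ignoring case?
missing_names <- function(data, required) {
  setdiff(required, tolower(names(data)))
}

# Toy data with one badly named column
toy <- data.frame(
  Tag = "000002", bad_x = 14.1, y = 8.3, Status = "alive",
  QuadratName = "4924", CensusID = 6, PlotID = 1
)
missing_names(toy, required)

# After renaming, nothing is missing
names(toy)[names(toy) == "bad_x"] <- "x"
missing_names(toy, required)
```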

With the techniques you’ll learn here, you can produce maps of an entire data set, or of just a few quadrats. To save space, let’s focus just on (any) one quadrat.

any_quadrat <- sample(unique(bci_vft1_rnm$quadratname), 1)
# Keep only the rows of the chosen quadrat
bci_vft1_rnm <- filter(bci_vft1_rnm, quadratname == any_quadrat)

Customizing Your Maps

This section shows how you can change the defaults of your maps (see also ?map_tag()). For example, you can customize the plot title, the points and the tags.

maps <- map_tag(bci_vft1_rnm, 
  site_name = "BCI 2012", point_size = 3, point_shape = c(17, 6), tag_size = 5
)
maps[1]
#> $`4915_1`

Customizing The Header

The header can be customized in two ways. One way is to pass a string to the argument header.

map_tag(bci_vft1_rnm, site_name = "BCI 2012", header = "My header")[1]
#> $`4915_1`

The string can span multiple lines; to insert line breaks, use “\n”.

map_tag(bci_vft1_rnm, site_name = "BCI 2012", 
  header = "Line 1: _________\nLine 2:\nLine 3:....................."
)[1]
#> $`4915_1`

The second way is to use get_header() (see ?get_header()).

your_header <- get_header(
  line1 = "Your header-line 1: _____________________________",
  line2 = "Your header-line 2: _____________________________",
  line3 = "Your header-line 3: _____________________________"
)
map_tag(bci_vft1_rnm, site_name = "BCI 2012", header = your_header)[1]
#> $`4915_1`

Using Pre-Made and Custom Themes

Similarly to how you customized the header, you can also customize the plot theme. By default, theme = get_theme() (see ?get_theme()); but you can either use any pre-made theme (see ?ggplot2::theme_bw) or create a custom theme. This is how you can use a pre-made theme.

# Allow using pre-made themes (e.g. ggplot2::theme_bw()) and building custom
# themes (with ggplot2::theme()).
library(ggplot2)

map_tag(bci_vft1_rnm, site_name = "BCI 2012", theme = theme_gray())[1]
#> $`4915_1`

And this is how you can use a custom theme (see ?ggplot2::theme()).

# An extreme example -- to show that themes are extremely flexible
your_theme <- ggplot2::theme(
  legend.position = "bottom",
  legend.title = element_blank(),
  legend.text = element_text(size = 8, colour = "red"),
  text = element_text(size = 11, face = "bold.italic", colour = "white"),
  plot.background = element_rect(fill = "black"),
  plot.margin = margin(2, 2, 2, 2, "cm"),
  strip.background = element_rect(fill = "darkgreen"),
  strip.text = element_text(colour = "white"),
  # to make the grid disappear, set the grid colour to match panel.background
  panel.background = element_rect(fill = "lightgreen"),
  panel.grid.minor = element_line(colour = "black", linetype = "dotted"),
  panel.grid.major = element_line(colour = "black")
)
map_tag(bci_vft1_rnm, site_name = "BCI 2012", theme = your_theme)[1]
#> $`4915_1`

Extending The Grid Beyond the Plot Limits

Although they shouldn’t, trees are sometimes located beyond the limits of a quadrat. For example, if the side of your quadrats is 20 meters, some trees may plot at, say, x = 20.5, or y = 21. In such cases you may want to extend the plot grid to encompass those odd trees. To extend the grid use the argument extend_grid. (The example below has no trees beyond the quadrat limits, but hopefully you’ll still understand when to use extend_grid.)

map_tag(bci_vft1_rnm, site_name = "BCI 2012", extend_grid = 0.4)[4]
#> $`4915_4`

Customizing the Dimension of Quadrats and Subquadrats

You can customize the dimension of your quadrats and subquadrats to fit the range of x and y of your data. Let’s examine what the range is for the data you have been using so far.

x_and_y_variables <- select(bci_vft1_rnm, x, y)
lapply(x_and_y_variables, range)
#> $x
#> [1]  0.0 19.9
#> 
#> $y
#> [1]  0.0 19.9

The range is between around 0 and 20. That is why we have been using the default quadrat dimension of 20 meters (x_q = 20; and y_q = x_q = 20), and the default subquadrat dimension of 5 meters (x_sq = 5; and y_sq = x_sq = 5).

map_tag(bci_vft1_rnm, site_name = "BCI 2012", 
  x_q = 20, x_sq = 5,
  y_q = 20, y_sq = 5
)[1]
#> $`4915_1`

By default, y_q will be the same as x_q, so you don’t have to provide both (only x_q is mandatory). The same is true for y_sq and x_sq. With this data, where x and y range 0-20 meters, the quadrat and subquadrat dimensions used above are the right ones. But map_tag() won’t complain if you choose different parameters, so be careful not to shoot yourself in the foot.

The following two examples demonstrate the use of wrong dimensions. Let’s first use dimensions that are smaller than the range of x and y.

map_tag(bci_vft1_rnm, 
  site_name = "BCI 2012", x_q = 10, x_sq = 2.5, 
  # if not extended, the lines surrounding the map won't plot
  extend_grid = 0.25
)[1]
#> $`4915_1`

And now let’s use dimensions that are larger than the range of the data.

# Using dimensions larger than the range of the data
map_tag(bci_vft1_rnm, 
  site_name = "BCI 2012", x_q = 100, x_sq = 25
)[1]
#> $`4915_1`

This map should alert you to the problem: the range of the data is smaller than the dimensions you set for your quadrat and subquadrat. Notice that the points (the actual positions of the trees) all fall between 0 and 20 on both x and y. The tags, however, go beyond 20 because they automatically repel each other to avoid overlapping.
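Because map_tag() does not complain about mismatched dimensions, it may help to compare the range of your data with the dimensions you intend to use before mapping. The helper below is a hypothetical sketch, not part of the package; check_dims() and its 0.5 threshold are assumptions chosen for illustration.

```r
# Hypothetical helper: compare the range of x and y with the quadrat dimension
check_dims <- function(x, y, x_q, y_q = x_q) {
  c(
    # TRUE if some points fall beyond the quadrat limits
    x_beyond = max(x, na.rm = TRUE) > x_q,
    y_beyond = max(y, na.rm = TRUE) > y_q,
    # TRUE if the data fills less than half of a side (arbitrary threshold)
    x_sparse = max(x, na.rm = TRUE) < 0.5 * x_q,
    y_sparse = max(y, na.rm = TRUE) < 0.5 * y_q
  )
}

# Toy coordinates ranging 0-19.9, like the data used so far
x <- c(0, 3.7, 12.7, 19.9)
y <- c(0, 8.3, 16.5, 19.9)

check_dims(x, y, x_q = 20)   # dimensions fit the data: nothing flagged
check_dims(x, y, x_q = 10)   # too small: points beyond the limits
check_dims(x, y, x_q = 100)  # too large: data fills a small corner
```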

The Right Dimensions for the Right Data

Had x and y of your data ranged 0-100, then yes – the appropriate quadrat and subquadrat dimensions to use would be 100 and 25.

# Creating new data set with x and y ranging 0-100
bigger <- bci_vft1_rnm
n <- nrow(bigger)
bigger$x <- sample(0:100, n, replace = TRUE)
bigger$y <- sample(0:100, n, replace = TRUE)

map_tag(
  bigger, 
  x_q = 100, x_sq = 25, 
  extend_grid = -1.75
)[1]
#> $`4915_1`

And if x and y of your data range 0-10, then the appropriate quadrat and subquadrat dimensions to use would be 10 and 2.5.

# Creating new data set with x and y ranging 0-10
smaller <- bci_vft1_rnm
n <- nrow(smaller)
smaller$x <- sample(0:10, n, replace = TRUE)
smaller$y <- sample(0:10, n, replace = TRUE)

map_tag(smaller, x_q = 10, x_sq = 2.5, extend_grid = 0.25)[1]
#> $`4915_1`

Calling add_subquadrat() Directly

If you only want to calculate the variable subquadrat, you don’t need to use map_tag(); you can call add_subquadrat() directly.

with_subquadrat <- add_subquadrat(bci_vft1_rnm, x_q = 20, x_sq = 5)
#> Lowering names case
select(
  with_subquadrat, 
  # reorder variables to show first what's new 
  subquadrat, x, y, everything()
)
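To give a feel for what such a calculation involves, here is a rough base-R sketch that assigns each point to a cell of the subquadrat grid. It is not the algorithm used by add_subquadrat(): the function label_subquadrat() and its column_row labels are assumptions made for illustration, and the labels the package produces may differ.

```r
# Hypothetical sketch: label each point with the column and row of the
# subquadrat it falls in (this labelling scheme is an assumption)
label_subquadrat <- function(x, y, x_q, x_sq, y_q = x_q, y_sq = x_sq) {
  # Number of subquadrats along each side
  n_col <- x_q / x_sq
  n_row <- y_q / y_sq
  # floor() gives a 0-based index; pmin() and pmax() keep points that sit on
  # or beyond an edge inside the first or last subquadrat
  col <- pmax(pmin(floor(x / x_sq) + 1, n_col), 1)
  row <- pmax(pmin(floor(y / y_sq) + 1, n_row), 1)
  paste0(col, "_", row)
}

label_subquadrat(
  x = c(0, 4.9, 5, 19.9, 20),
  y = c(0, 12.6, 18.3, 19.9, 20),
  x_q = 20, x_sq = 5
)
```

Note the last point, at x = 20 and y = 20: it sits exactly on the upper limit, and the sketch places it in the last subquadrat rather than creating a fifth column and row.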

But if you don’t mind going through a little more trouble, you can also get the subquadrat variable from the data that underlies the maps.

maps <- map_tag(bci_vft1_rnm, x_q = 20, x_sq = 5)
data_list <- purrr::map(maps, "data")
data_combined <- purrr::reduce(data_list, rbind)
select(
  data_combined,
  subquadrat, x, y, everything()
)

Acknowledgements

For ideas and guidance I thank Suzanne Lao, Stuart Davis, Shameema Jafferjee Esufa, David Kenfack and Anudeep Singh. Anudeep also wrote the algorithm of add_subquadrat() (which I translated from SQL to R).

