Overview

On Sept 6th, 2017, Gabriel, Sean and Mauro discussed what the biomass functions of the forestr package should look like. The main goal is to incorporate into the default behaviour the new tropical allometries (Chave et al. 2014). We are also interested in allowing more flexibility in the type of input that the user can provide. We plan three functions, to:

  1. decide which formula/equation applies to a given stem: get_allometry()
  2. apply the formula/equation to each stem: biomass_per_individual()
  3. apply biomass_per_individual() within user-defined groups. This is part of the abundance module and won’t be discussed here.

Get allometries, allowing site-, species-, and stem-level input from the user

The forestr package should contain functions to extract wood density from the Global Wood Density Database or CTFS databases. It may contain also allometric equations to estimate stem height. The user should use these functions to have a more complete dataset. The functions in the biomass module would assume that all the available information is already in the main data table, and that the allometric equation will use all the available information.

The function will return allometries only for the “with height” and the “without height” cases. It will return NA’s if there is no DBH or wood density. This mirrors the available defaults for tropical forests in Chave et al. (2014), Eq. 4 and 7.

The allometric equation to use may potentially depend on the site (or conditions of the site, E), the species (e.g. different equations for palms), and the stem (e.g. different equations for broken trees). The user should be allowed to provide input at the three levels.

The general flow in the get_allometry() function is:

  1. generate default allometries.
  2. overwrite with user-provided site-level allometries.
  3. overwrite with user-provided species-level allometries.
  4. overwrite with user-provided individual-level allometries.

The site-level input could take these forms:

  1. NULL: do nothing.
  2. a vector or a list with one or two allometric equations (for the with/without height cases)

The species-level input could take these forms:

  1. NULL: do nothing.
  2. a list, named with species codes, with one or two allometric equations per species.

If the user has family-level allometries, e.g. for palms, it should not be difficult to translate that into a species-level input. In any case, those could be accessory functions to be developed elsewhere within the package.

The stem-level input could take these forms:

  1. NULL: do nothing.
  2. a list, named with StemID’s, with one or two allometric equations per stem.

The function returns one and only one allometry per stem, based on data availability. It also returns what is the level of estimation for the selected allometry. It returns a table with as many rows as rows has the main dataset. If we find that performance is an issue, we would get rid of the stem-level options, and the function will work on (and return) a table with as many rows as species has the main dataset.

Note that the functions assume that stem height is called “H”, DBH is called “D” and wood density (wood specific gravity) is called “WSG”. This applies both to the allometric equations and the column names in the dataset. The get_allometry() function in particular relies on grepl("H", ...).

get_allometry <- function(x, siteID = NULL, E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
{
  # Determine the E value
  if(is.null(E))
  {
    if(is.null(siteID)) siteID = attr(x, "siteID")
    E = CTFS.E[siteID] # this will return zero length if unknown site
    if(length(E) == 0) E = NA
  }
  
  # Generate defaults using Chave et al. 2014. When height data are available,
  # a pantropical model is the best default (Eq. 4). When height data is not available,
  # then local models based on the E climatic parameter are the best default (Eq. 7).
  COLNAMES <- c("with.H", "without.H", "with.H.level", "without.H.level")
  allometries <- matrix(NA, nrow = nrow(x), ncol = length(COLNAMES), dimnames = list(NULL, COLNAMES))
  allometries[, "with.H"] <- paste0("0.0673 * (WSG * D^2 * H)^0.976")
  if(!is.na(E)) allometries[, "without.H"] <- paste0("exp(-1.803 - 0.976 * ", E, " + 0.976 * log(WSG) + 2.673 * log(D) - 0.0299 *(log(D)^2))")
  allometries[, "with.H.level"] <- "Chave et al. 2014, Eq. 4"
  if(!is.na(E)) allometries[, "without.H.level"] <- "Chave et al. 2014, Eq. 7"

  # Incorporate site-level input
  if(!is.null(site_level_allometry))
  {
    site_level_allometry <- unique(unlist(site_level_allometry))
    with.H <- site_level_allometry[grepl("H", site_level_allometry)]
    without.H <- site_level_allometry[!grepl("H", site_level_allometry)]
    
    if(length(with.H) == 1)
    {
      allometries[, "with.H"] <- with.H
      allometries[, "with.H.level"] <- "user-provided site-level allometry, using height"
    }
    
    if(length(without.H) == 1)
    {
      allometries[, "without.H"] <- without.H
      allometries[, "without.H.level"] <- "user-provided site-level allometry, without using height"
    }
    
  }
  
  # Incorporate species-level input
  #species_level_allometry <- list(a = c("2 * D * H", "3 * D"), b = c("2 * D"))
  if(!is.null(species_level_allometry))
  {
    with.H <- sapply(species_level_allometry, function(a) {
      a <- unique(a[grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    without.H <- sapply(species_level_allometry, function(a) {
      a <- unique(a[!grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    # Overwrite in the table:
    spp <- x$sp
    #spp <- c("a", "b", "c", "a")
    allometries[!is.na(with.H[spp]), "with.H"] <- with.H[spp][!is.na(with.H[spp])]
    allometries[!is.na(with.H[spp]), "with.H.level"] <- "user-provided species-level allometry, using height"
    allometries[!is.na(without.H[spp]), "without.H"] <- without.H[spp][!is.na(without.H[spp])]
    allometries[!is.na(without.H[spp]), "without.H.level"] <- "user-provided species-level allometry, without using height"
  }
  
  # Incorporate stem-level input
  #stem_level_allometry <- list(a = c("2 * D * H", "3 * D"), b = c("2 * D"))
  if(!is.null(stem_level_allometry))
  {
    with.H <- sapply(stem_level_allometry, function(a) {
      a <- unique(a[grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    without.H <- sapply(stem_level_allometry, function(a) {
      a <- unique(a[!grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    # Overwrite in the table:
    id <- x$StemID
    #spp <- c("a", "b", "c", "a")
    allometries[!is.na(with.H[spp]), "with.H"] <- with.H[spp][!is.na(with.H[spp])]
    allometries[!is.na(with.H[spp]), "with.H.level"] <- "user-provided stem-level allometry, using height"
    allometries[!is.na(without.H[spp]), "without.H"] <- without.H[spp][!is.na(without.H[spp])]
    allometries[!is.na(without.H[spp]), "without.H.level"] <- "user-provided stem-level allometry, without using height"
  }
  
  # Decide which allometry to use:
  allometry <- matrix(NA, nrow = nrow(x), ncol = 2, dimnames = list(x$StemID, c("allometry", "level")))
  with.H <- which(!is.na(x$D) & !is.na(x$WSG) & !is.na(x$H))
  without.H <- which(!is.na(x$D) & !is.na(x$WSG) & is.na(x$H))
  allometry[with.H, "allometry"] <- allometries[with.H, "with.H"]
  allometry[with.H, "level"] <- allometries[with.H, "with.H.level"]
  allometry[without.H, "allometry"] <- allometries[without.H, "without.H"]
  allometry[without.H, "level"] <- allometries[without.H, "without.H.level"]
  return(allometry)
}

Applying allometric equations to stems

The biomass_per_individual() function would apply the equations in a vectorized way for each chunk of stems sharing the same allometry. The allometry and the level of estimation is part of the output of this function, as it is potentially useful metadata (e.g. to report methods in publications).

The allometries argument is the output from the get_allometry() function. Since allometries are stored as raw text, a simple combination of parse and eval should work.


biomass_per_individual <- function(x, allometries)
{
    # then do this:
    allometries.id <- unique(allometries[, "allometry"])
    agb <- rep(NA, nrow(x))
    for(a in allometries.id)
    {
         i <- which(allometries[, "allometry"] == a)
         agb[i] <- eval(parse(text = a), envir = x[i,])
    }
    attr(agb, "allometries") <- allometries
    return(agb)
}

Example

library(bciex)

x <- bci12s1mini
x <- x[!is.na(x$dbh),]
x$dbh[c(3, 5, 7, 9)] <- NA
x$StemID <- paste0(x$treeID, "-", x$stemID)
x$D <- x$dbh
x$H <- NA
x$H[2:5] <- 2:5 * 10
x$WSG <- runif(nrow(x), 0.3, 1)

CTFS.E <- c(bci = 0.5, pasoh = 0.7, xxxx = 1.5)

a <- get_allometry(x = x, siteID = NULL, E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

a <- get_allometry(x = x, siteID = NULL, E = 777, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

s <- c("2 + D")
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = s, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

s1 <- c("2 + D")
s2 <- list(c("2 + D", "2 * D + H"), c("3 * D"))
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = s1, species_level_allometry = s2, stem_level_allometry = NULL)
head(a, 10)

s1 <- c("2 + D")
s2 <- list(apeime = c("2 + D", "2 * D + H"), quaras = c("3 * D"))
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = s1, species_level_allometry = s2, stem_level_allometry = NULL)
head(a, 10)


# This is the most typical case:
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

agb.per.i <- biomass_per_individual(x, allometries = a)
head(agb.per.i, 10)
---
title: "Biomass module in forestr"
author: "Gabriel Arellano, Sean McMahon & Mauro Lepore"
date: '2017-09-21'
output: html_notebook
---

```{r setup, echo = FALSE, message=FALSE, warning=FALSE}
# hadley's settings
set.seed(1014)
options(digits = 3)

knitr::opts_chunk$set(
  echo = TRUE,  # {mine}
  comment = "#>",
  collapse = TRUE,
  # cache = TRUE,
  out.width = "70%",
  fig.align = "center",
  fig.width = 6,
  fig.asp = 0.618,  # 1 / phi
  fig.show = "hold"
)

options(dplyr.print_min = 6, dplyr.print_max = 6)
```


## Overview

On Sept 6th, 2017, Gabriel, Sean and Mauro discussed what the biomass functions of the __forestr__ package should look like. The main goal is to incorporate into the default behaviour the new tropical allometries (Chave et al. 2014). We are also interested in allowing more flexibility in the type of input that the user can provide. We plan three functions, to:

1) decide which formula/equation applies to a given stem: `get_allometry()`
2) apply the formula/equation to each stem: `biomass_per_individual()`
3) apply `biomass_per_individual()` within user-defined groups. This is part of the `abundance` module and won't be discussed here.

## Get allometries, allowing site-, species-, and stem-level input from the user

The __forestr__ package should contain functions to extract wood density from the Global Wood Density Database or CTFS databases. It may contain also allometric equations to estimate stem height. The user should use these functions to have a more complete dataset. The functions in the `biomass` module would assume that all the available information is already in the main data table, and that the allometric equation will use all the available information.

The function will return allometries only for the "with height" and the "without height" cases. It will return NA's if there is no DBH or wood density. This mirrors the available defaults for tropical forests in Chave et al. (2014), Eq. 4 and 7.

The allometric equation to use may potentially depend on the site (or conditions of the site, E), the species (e.g. different equations for palms), and the stem (e.g. different equations for broken trees). The user should be allowed to provide input at the three levels.

The general flow in the `get_allometry()` function is:

1) generate default allometries.
2) overwrite with user-provided site-level allometries.
3) overwrite with user-provided species-level allometries.
4) overwrite with user-provided individual-level allometries.

The site-level input could take these forms:

1) NULL: do nothing.
2) a vector or a list with one or two allometric equations (for the with/without height cases)

The species-level input could take these forms:

1) NULL: do nothing.
2) a list, named with species codes, with one or two allometric equations per species.

If the user has family-level allometries, e.g. for palms, it should not be difficult to translate that into a species-level input. In any case, those could be accessory functions to be developed elsewhere within the package.

The stem-level input could take these forms:

1) NULL: do nothing.
2) a list, named with StemID's, with one or two allometric equations per stem.

The function returns one and only one allometry per stem, based on data availability. It also returns what is the level of estimation for the selected allometry. It returns a table with as many rows as rows has the main dataset. If we find that performance is an issue, we would get rid of the stem-level options, and the function will work on (and return) a table with as many rows as species has the main dataset.

Note that the functions assume that stem height is called "H", DBH is called "D" and wood density (wood specific gravity) is called "WSG". This applies both to the allometric equations and the column names in the dataset. The `get_allometry()` function in particular relies on `grepl("H", ...)`.

```{r}
get_allometry <- function(x, siteID = NULL, E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
{
  # Determine the E value
  if(is.null(E))
  {
    if(is.null(siteID)) siteID = attr(x, "siteID")
    E = CTFS.E[siteID] # this will return zero length if unknown site
    if(length(E) == 0) E = NA
  }
  
  # Generate defaults using Chave et al. 2014. When height data are available,
  # a pantropical model is the best default (Eq. 4). When height data is not available,
  # then local models based on the E climatic parameter are the best default (Eq. 7).
  COLNAMES <- c("with.H", "without.H", "with.H.level", "without.H.level")
  allometries <- matrix(NA, nrow = nrow(x), ncol = length(COLNAMES), dimnames = list(NULL, COLNAMES))
  allometries[, "with.H"] <- paste0("0.0673 * (WSG * D^2 * H)^0.976")
  if(!is.na(E)) allometries[, "without.H"] <- paste0("exp(-1.803 - 0.976 * ", E, " + 0.976 * log(WSG) + 2.673 * log(D) - 0.0299 *(log(D)^2))")
  allometries[, "with.H.level"] <- "Chave et al. 2014, Eq. 4"
  if(!is.na(E)) allometries[, "without.H.level"] <- "Chave et al. 2014, Eq. 7"

  # Incorporate site-level input
  if(!is.null(site_level_allometry))
  {
    site_level_allometry <- unique(unlist(site_level_allometry))
    with.H <- site_level_allometry[grepl("H", site_level_allometry)]
    without.H <- site_level_allometry[!grepl("H", site_level_allometry)]
    
    if(length(with.H) == 1)
    {
      allometries[, "with.H"] <- with.H
      allometries[, "with.H.level"] <- "user-provided site-level allometry, using height"
    }
    
    if(length(without.H) == 1)
    {
      allometries[, "without.H"] <- without.H
      allometries[, "without.H.level"] <- "user-provided site-level allometry, without using height"
    }
    
  }
  
  # Incorporate species-level input
  #species_level_allometry <- list(a = c("2 * D * H", "3 * D"), b = c("2 * D"))
  if(!is.null(species_level_allometry))
  {
    with.H <- sapply(species_level_allometry, function(a) {
      a <- unique(a[grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    without.H <- sapply(species_level_allometry, function(a) {
      a <- unique(a[!grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    # Overwrite in the table:
    spp <- x$sp
    #spp <- c("a", "b", "c", "a")
    allometries[!is.na(with.H[spp]), "with.H"] <- with.H[spp][!is.na(with.H[spp])]
    allometries[!is.na(with.H[spp]), "with.H.level"] <- "user-provided species-level allometry, using height"
    allometries[!is.na(without.H[spp]), "without.H"] <- without.H[spp][!is.na(without.H[spp])]
    allometries[!is.na(without.H[spp]), "without.H.level"] <- "user-provided species-level allometry, without using height"
  }
  
  # Incorporate stem-level input
  #stem_level_allometry <- list(a = c("2 * D * H", "3 * D"), b = c("2 * D"))
  if(!is.null(stem_level_allometry))
  {
    with.H <- sapply(stem_level_allometry, function(a) {
      a <- unique(a[grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    without.H <- sapply(stem_level_allometry, function(a) {
      a <- unique(a[!grepl("H", a)])
      if(length(a) != 1) a = NA
      a
    })
    
    # Overwrite in the table:
    id <- x$StemID
    #spp <- c("a", "b", "c", "a")
    allometries[!is.na(with.H[spp]), "with.H"] <- with.H[spp][!is.na(with.H[spp])]
    allometries[!is.na(with.H[spp]), "with.H.level"] <- "user-provided stem-level allometry, using height"
    allometries[!is.na(without.H[spp]), "without.H"] <- without.H[spp][!is.na(without.H[spp])]
    allometries[!is.na(without.H[spp]), "without.H.level"] <- "user-provided stem-level allometry, without using height"
  }
  
  # Decide which allometry to use:
  allometry <- matrix(NA, nrow = nrow(x), ncol = 2, dimnames = list(x$StemID, c("allometry", "level")))
  with.H <- which(!is.na(x$D) & !is.na(x$WSG) & !is.na(x$H))
  without.H <- which(!is.na(x$D) & !is.na(x$WSG) & is.na(x$H))
  allometry[with.H, "allometry"] <- allometries[with.H, "with.H"]
  allometry[with.H, "level"] <- allometries[with.H, "with.H.level"]
  allometry[without.H, "allometry"] <- allometries[without.H, "without.H"]
  allometry[without.H, "level"] <- allometries[without.H, "without.H.level"]
  return(allometry)
}

```

## Applying allometric equations to stems

The `biomass_per_individual()` function would apply the equations in a vectorized way for each chunk of stems sharing the same allometry. The allometry and the level of estimation is part of the output of this function, as it is potentially useful metadata (e.g. to report methods in publications).

The `allometries` argument is the output from the `get_allometry()` function. Since allometries are stored as raw text, a simple combination of `parse` and `eval` should work.

```{r}

biomass_per_individual <- function(x, allometries)
{
    # then do this:
    allometries.id <- unique(allometries[, "allometry"])
    agb <- rep(NA, nrow(x))
    for(a in allometries.id)
    {
         i <- which(allometries[, "allometry"] == a)
         agb[i] <- eval(parse(text = a), envir = x[i,])
    }
    attr(agb, "allometries") <- allometries
    return(agb)
}

```



## Example

```{r}
library(bciex)

x <- bci12s1mini
x <- x[!is.na(x$dbh),]
x$dbh[c(3, 5, 7, 9)] <- NA
x$StemID <- paste0(x$treeID, "-", x$stemID)
x$D <- x$dbh
x$H <- NA
x$H[2:5] <- 2:5 * 10
x$WSG <- runif(nrow(x), 0.3, 1)

CTFS.E <- c(bci = 0.5, pasoh = 0.7, xxxx = 1.5)

a <- get_allometry(x = x, siteID = NULL, E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

a <- get_allometry(x = x, siteID = NULL, E = 777, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

s <- c("2 + D")
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = s, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

s1 <- c("2 + D")
s2 <- list(c("2 + D", "2 * D + H"), c("3 * D"))
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = s1, species_level_allometry = s2, stem_level_allometry = NULL)
head(a, 10)

s1 <- c("2 + D")
s2 <- list(apeime = c("2 + D", "2 * D + H"), quaras = c("3 * D"))
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = s1, species_level_allometry = s2, stem_level_allometry = NULL)
head(a, 10)


# This is the most typical case:
a <- get_allometry(x = x, siteID = "bci", E = NULL, site_level_allometry = NULL, species_level_allometry = NULL, stem_level_allometry = NULL)
head(a, 10)

agb.per.i <- biomass_per_individual(x, allometries = a)
head(agb.per.i, 10)

```

