This vignette shows you how to calculate demographic metrics with the functions recruitment()
, mortality()
and growth()
from the forestr package – let’s load it now:
# To install from a private repo, generate a personal access token (PAT) in
# https://github.com/settings/tokens and supply to this argument.
GITHUB_PAT <- "your token"
# install_github("forestgeo/forestr@iss49_demography", auth_token = GITHUB_PAT)
library(forestr)
# Also using the package tibble for nicer printing
library(tibble)
# Print no more than 10 rows to save space and time
options(tibble.print_max = 10)
You may be familiar with the functions abundance()
, basal_area()
and biomass()
, which input data from a single census and calculate a single metric per individual stem or group of stems. Unlike those functions, the ones you’ll see here operate quite differently: recruitment()
, mortality()
and growth()
input data from two censuses and calculate multiple metrics across the entire data set, optionally spiting the data set by some variable. Because those two groups of functions work quite differently (the group of abundance()
versus that of recruitment()
) making their interphases and output feel similar is not straight forward – so prepare for a little-different experience.
This is how this document is organized. You’ll see recruitment()
and mortality()
together and growth()
separately. In each section you’ll first see the new <FUNCTION>_df()
versions of those functions and then you’ll see the traditional versions; and you will also see how all that compares to the orignal versions in the CTFS R Package. Then in the previous last section you’ll learn about some proposed changes to these functions, and at the end you’ll see some details only relevant to those interested in the software development side of things.
Currently recruitment()
and mortality()
are more similar to each other than they are to growth()
. (From a technical point of view the most important difference is that growth()
has currently many more arguments than the other two functions). To reflect that relationship recruitment()
and mortality()
are documented together. The next two commands present you with the same help file.
?recruitment
# same
?mortality
# Some data to play with
census1 <- forestr::bci12t6mini
census2 <- forestr::bci12t7mini
# Using the new wrapper `<FUNCTION>_df()` ---------------------------------
# A REALISTIC CASE
# Note `recruitment_df()` warns if some groups have dbh values full of NA
split_by_sp <- recruitment_df(census1, census2, split1 = census1$sp)
#> Warning in check_if_all_dbh_is_na(census1, census2, split = split1): At least one split-group contains all `dbh` values equal to NA.
#> * Consider removing those groups before running this function.
split_by_sp
#> # A tibble: 1,152 x 3
#> split metric value
#> <chr> <chr> <dbl>
#> 1 acaldi N2 5.0000
#> 2 acaldi R 2.0000
#> 3 acaldi rate 0.1032
#> 4 acaldi lower 0.0254
#> 5 acaldi upper 0.3035
#> 6 acaldi time 4.9483
#> # ... with 1,146 more rows
# Same for `mortality_df()`; let's explore a few species
# We don't need to be warned again (suppressing warnings)
all_species <- suppressWarnings(
mortality_df(census1, census2, split1 = census1$sp)
)
few_species <- sample(unique(all_species$split), 5)
long_format <- subset(all_species, split %in% few_species)
long_format
#> # A tibble: 45 x 3
#> split metric value
#> <chr> <chr> <dbl>
#> 1 aegipa N 0
#> 2 aegipa D 0
#> 3 aegipa rate NA
#> 4 aegipa lower NA
#> 5 aegipa upper NA
#> 6 aegipa time NA
#> # ... with 39 more rows
# From long to wide format
library(tidyr)
wide_format <- tidyr::spread(long_format, key = split, value = value)
wide_format
#> # A tibble: 9 x 6
#> metric aegipa aspicr cecrin ocotce sipagu
#> * <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 D 0 0.00e+00 2.00e+00 0 0.00e+00
#> 2 date1 NA 1.66e+04 1.65e+04 NA 1.65e+04
#> 3 date2 NA 1.84e+04 1.84e+04 NA 1.83e+04
#> 4 dbhmean NA 3.40e+01 1.08e+02 NA 2.20e+01
#> 5 lower NA 0.00e+00 3.11e-02 NA 0.00e+00
#> 6 N 0 1.00e+00 4.00e+00 0 1.00e+00
#> 7 rate NA 0.00e+00 1.36e-01 NA 0.00e+00
#> 8 time NA 4.95e+00 5.10e+00 NA 4.96e+00
#> 9 upper NA 3.73e-01 3.76e-01 NA 3.72e-01
# THE SIMPLEST CASE
# split1 defaults to `NULL`; output keeps the variable `split` for consistency
split_null <- recruitment_df(census1, census2)
split_null
#> # A tibble: 8 x 3
#> split metric value
#> <int> <chr> <dbl>
#> 1 1 N2 5.54e+02
#> 2 1 R 7.10e+01
#> 3 1 rate 2.76e-02
#> 4 1 lower 2.19e-02
#> 5 1 upper 3.48e-02
#> 6 1 time 4.97e+00
#> 7 1 date1 1.66e+04
#> 8 1 date2 1.84e+04
# For this simple case, you may want to use this alternative approach
result <- recruitment(census1, census2)
result_df <- as.data.frame(unlist(result))
head(result_df, 10)
#> unlist(result)
#> N2 5.54e+02
#> R 7.10e+01
#> rate 2.76e-02
#> lower 2.19e-02
#> upper 3.48e-02
#> time 4.97e+00
#> date1 1.66e+04
#> date2 1.84e+04
# But that approach doesn't help with splitting variables because we loose
# information about what matric each value belongs to
result <- recruitment(census1, census2, split1 = census1$sp)
#> Warning in check_if_all_dbh_is_na(census1, census2, split = split1): At least one split-group contains all `dbh` values equal to NA.
#> * Consider removing those groups before running this function.
result_df <- as.data.frame(unlist(result))
head(result_df, 10)
#> unlist(result)
#> N2.acaldi 5
#> N2.aegipa 0
#> N2.alchco 0
#> N2.alibed 2
#> N2.allops 0
#> N2.alsebl 20
#> N2.anaxpa 1
#> N2.andiin 0
#> N2.annoac 3
#> N2.annosp 0
# Using the traditional functions -----------------------------------------
# The output is more ackward to explore
traditional_by_sp <- suppressWarnings(
recruitment(census1, census2, split1 = census1$sp)
)
str(traditional_by_sp)
#> List of 8
#> $ N2 : Named num [1:144] 5 0 0 2 0 20 1 0 3 0 ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ R : Named num [1:144] 2 0 0 1 0 6 0 0 1 0 ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ rate : Named num [1:144] 0.103 NA NA 0.141 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ lower: Named num [1:144] 0.0254 NA NA 0.0201 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ upper: Named num [1:144] 0.303 NA NA 0.48 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ time : Named num [1:144] 4.95 NA NA 4.92 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ date1: Named num [1:144] 16639 NA 16720 16533 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ date2: Named num [1:144] 18464 NA NA 18387 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
# In RStudio 1.1 the function `View()` will help you see ackward lists.
# Here showing a simple trick
lapply(traditional_by_sp, head)
#> $N2
#> acaldi aegipa alchco alibed allops alsebl
#> 5 0 0 2 0 20
#>
#> $R
#> acaldi aegipa alchco alibed allops alsebl
#> 2 0 0 1 0 6
#>
#> $rate
#> acaldi aegipa alchco alibed allops alsebl
#> 0.1032 NA NA 0.1408 NA 0.0713
#>
#> $lower
#> acaldi aegipa alchco alibed allops alsebl
#> 0.0254 NA NA 0.0201 NA 0.0315
#>
#> $upper
#> acaldi aegipa alchco alibed allops alsebl
#> 0.303 NA NA 0.480 NA 0.147
#>
#> $time
#> acaldi aegipa alchco alibed allops alsebl
#> 4.95 NA NA 4.92 NA 5.00
#>
#> $date1
#> acaldi aegipa alchco alibed allops alsebl
#> 16639 NA 16720 16533 NA 16551
#>
#> $date2
#> acaldi aegipa alchco alibed allops alsebl
#> 18464 NA NA 18387 NA 18385
# `recruitment()` can take up to two splitting variables; not `recruitment_df()`
traditional_by_sp_and_quadrat <- suppressWarnings(
recruitment(census1, census2, split1 = census1$sp, split2 = census1$quadrat)
)
str(traditional_by_sp_and_quadrat)
#> List of 8
#> $ N2 : num [1:144, 1:664] 0 0 0 0 0 0 0 0 0 0 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ R : num [1:144, 1:664] 0 0 0 0 0 0 0 0 0 0 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ rate : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ lower: num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ upper: num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ time : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ date1: num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ date2: num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
# This intentionally fails -- to keep the output easy to view and manipulate
recruitment_df(census1, census2, split1 = census1$sp, split2 = census1$quadrat)
#> Error in recruitment_df(census1, census2, split1 = census1$sp, split2 = census1$quadrat): unused argument (split2 = census1$quadrat)
# Warnings and errors -----------------------------------------------------
# Notice that the original `recruitment()` from ctfs gives no warnings
# No warning
with_ctfs <- ctfs::recruitment(census1, census2, split1 = census1$sp)
# Warning
with_forestr <- forestr::recruitment(census1, census2, split1 = census1$sp)
#> Warning in check_if_all_dbh_is_na(census1, census2, split = split1): At least one split-group contains all `dbh` values equal to NA.
#> * Consider removing those groups before running this function.
# What happens if input is wrong?
# Introducing a wrong name for an important variable
wrong_nm1 <- census1
wrong_nm2 <- census2
wrong_nm1$DBH <- wrong_nm1$dbh
wrong_nm1$dbh <- NULL
wrong_nm2$DBH <- wrong_nm2$dbh
wrong_nm2$dbh <- NULL
# Uninformative error (gets picked up too far into the function)
with_ctfs <- ctfs::recruitment(wrong_nm1, wrong_nm2, split1 = wrong_nm1$sp)
#> Error in split.default(X, group): first argument must be a vector
# More informative error (gets picket up at the top of the function)
with_forestr <- forestr::recruitment(wrong_nm1, wrong_nm2, split1 = wrong_nm1$sp)
#> Error: Check that your data contains these names: dbh, pom, status, date
Compared to recruitment()
and mortality()
, growth()
has currently many more arguments, so it’s documented separately. Yet much of the documentation is the same – growth()
inherits arguments and description from recruitment()
and mortality()
(see ?growth()
). This section starts with what growth()
shares with recruitment()
and mortality()
and then shows some arguments exclusive of growth()
.
# Using the same data as above.
# You can expect the same warnings about `dbh` eualt to NA within some groups
# defined by `split1 = census1$sp` and `split2 = census1$quadrat`
census1 <- forestr::bci12t6mini
census2 <- forestr::bci12t7mini
# Using the new wrapper `<FUNCTION>_df()` ---------------------------------
# A REALISTIC CASE
# Note `growth_df()` warns if some groups have dbh values full of NA
split_by_sp <- growth_df(census1, census2, split1 = census1$sp)
#> Warning in check_if_all_dbh_is_na(census1, census2, split = split1): At least one split-group contains all `dbh` values equal to NA.
#> * Consider removing those groups before running this function.
split_by_sp
#> # A tibble: 1,008 x 3
#> split metric value
#> <chr> <chr> <dbl>
#> 1 acaldi rate 7.09e-01
#> 2 acaldi N 2.00e+00
#> 3 acaldi clim 1.32e+00
#> 4 acaldi dbhmean 1.10e+01
#> 5 acaldi time 4.95e+00
#> 6 acaldi date1 1.67e+04
#> # ... with 1,002 more rows
# Same for `growth_df()`; let's explore a few species
# We don't need to be warned again (suppressing warnings)
all_species <- suppressWarnings(
growth_df(census1, census2, split1 = census1$sp)
)
few_species <- sample(unique(all_species$split), 5)
long_format <- subset(all_species, split %in% few_species)
long_format
#> # A tibble: 35 x 3
#> split metric value
#> <chr> <chr> <dbl>
#> 1 ery2ma rate 2.05e-01
#> 2 ery2ma N 1.00e+00
#> 3 ery2ma clim NA
#> 4 ery2ma dbhmean 1.40e+01
#> 5 ery2ma time 4.89e+00
#> 6 ery2ma date1 1.66e+04
#> # ... with 29 more rows
# From long to wide format
library(tidyr)
wide_format <- tidyr::spread(long_format, key = split, value = value)
wide_format
#> # A tibble: 7 x 6
#> metric ery2ma ingaqu ingaum protpa psycg3
#> * <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 clim NA NA 4.28e-01 3.10e-01 NA
#> 2 date1 1.66e+04 NA 1.65e+04 1.66e+04 NA
#> 3 date2 1.84e+04 NA 1.83e+04 1.84e+04 NA
#> 4 dbhmean 1.40e+01 NA 4.45e+01 2.88e+01 NA
#> 5 N 1.00e+00 0 2.00e+00 5.00e+00 0
#> 6 rate 2.05e-01 NA 6.97e-01 2.82e-01 NA
#> 7 time 4.89e+00 NA 5.02e+00 4.95e+00 NA
# THE SIMPLEST CASE
# split1 defaults to `NULL`; output keeps the variable `split` for consistency
split_null <- growth_df(census1, census2)
split_null
#> # A tibble: 7 x 3
#> split metric value
#> <int> <chr> <dbl>
#> 1 1 rate 1.13e+00
#> 2 1 N 3.83e+02
#> 3 1 clim 2.17e-01
#> 4 1 dbhmean 4.96e+01
#> 5 1 time 4.97e+00
#> 6 1 date1 1.66e+04
#> 7 1 date2 1.84e+04
# Using the traditional functions -----------------------------------------
# The output is more ackward to explore
traditional_by_sp <- suppressWarnings(
growth(census1, census2, split1 = census1$sp)
)
str(traditional_by_sp)
#> List of 7
#> $ rate : Named num [1:144] 0.709 NA NA 1.209 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ N : Named num [1:144] 2 0 0 1 0 13 0 0 0 0 ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ clim : Named num [1:144] 1.32 NA NA NA NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ dbhmean: Named num [1:144] 11 NA NA 12 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ time : Named num [1:144] 4.95 NA NA 4.96 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ date1 : Named num [1:144] 16671 NA NA 16533 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> $ date2 : Named num [1:144] 18477 NA NA 18345 NA ...
#> ..- attr(*, "names")= chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
# In RStudio 1.1 the function `View()` will help you see ackward lists.
# Here showing a simple trick
lapply(traditional_by_sp, head)
#> $rate
#> acaldi aegipa alchco alibed allops alsebl
#> 0.709 NA NA 1.209 NA 0.755
#>
#> $N
#> acaldi aegipa alchco alibed allops alsebl
#> 2 0 0 1 0 13
#>
#> $clim
#> acaldi aegipa alchco alibed allops alsebl
#> 1.32 NA NA NA NA 0.61
#>
#> $dbhmean
#> acaldi aegipa alchco alibed allops alsebl
#> 11.0 NA NA 12.0 NA 43.8
#>
#> $time
#> acaldi aegipa alchco alibed allops alsebl
#> 4.95 NA NA 4.96 NA 4.97
#>
#> $date1
#> acaldi aegipa alchco alibed allops alsebl
#> 16671 NA NA 16533 NA 16573
#>
#> $date2
#> acaldi aegipa alchco alibed allops alsebl
#> 18477 NA NA 18345 NA 18389
# `growth()` can take up to two splitting variables; not `growth_df()`
traditional_by_sp_and_quadrat <- suppressWarnings(
growth(census1, census2, split1 = census1$sp, split2 = census1$quadrat)
)
str(traditional_by_sp_and_quadrat)
#> List of 7
#> $ rate : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ N : num [1:144, 1:664] 0 0 0 0 0 0 0 0 0 0 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ clim : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ dbhmean: num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ time : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ date1 : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
#> $ date2 : num [1:144, 1:664] NA NA NA NA NA NA NA NA NA NA ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:144] "acaldi" "aegipa" "alchco" "alibed" ...
#> .. ..$ : chr [1:664] "0000" "0001" "0002" "0006" ...
# This intentionally fails -- to keep the output easy to view and manipulate
growth_df(census1, census2, split1 = census1$sp, split2 = census1$quadrat)
#> Error in growth_df(census1, census2, split1 = census1$sp, split2 = census1$quadrat): unused argument (split2 = census1$quadrat)
# Warnings and errors -----------------------------------------------------
# Notice that the original `growth()` from ctfs gives no warnings
# No warning
with_ctfs <- ctfs::growth(census1, census2, split1 = census1$sp)
# Warning
with_forestr <- forestr::growth(census1, census2, split1 = census1$sp)
#> Warning in check_if_all_dbh_is_na(census1, census2, split = split1): At least one split-group contains all `dbh` values equal to NA.
#> * Consider removing those groups before running this function.
# What happens if input is wrong?
# Introducing a wrong name for an important variable
wrong_nm1 <- census1
wrong_nm2 <- census2
wrong_nm1$DBH <- wrong_nm1$dbh
wrong_nm1$dbh <- NULL
wrong_nm2$DBH <- wrong_nm2$dbh
wrong_nm2$dbh <- NULL
# Uninformative error (gets picked up too far into the function)
with_ctfs <- ctfs::growth(wrong_nm1, wrong_nm2, split1 = wrong_nm1$sp)
#> Error in `[.data.frame`(census1, , growthcol): undefined columns selected
# More informative error (gets picket up at the top of the function)
with_forestr <- forestr::growth(wrong_nm1, wrong_nm2, split1 = wrong_nm1$sp)
#> Error: Check that your data contains these names: dbh, pom, status, date, stemID
# Some arguments exclusive of `growth()` ----------------------------------
# Here are exclusive arguments of growth; read about the others at `?growth()`
# Calculate not annual growth rate but relative growth rate
# Unlisting to fit all metrics in one line and save space
unlist(growth(census1, census2, method = "E"))
#> rate N clim dbhmean time date1 date2
#> 2.60e-02 3.83e+02 3.45e-03 4.96e+01 4.97e+00 1.66e+04 1.84e+04
# Return `sd` instead of `clim`
unlist(growth(census1, census2, stdev = TRUE))
#> rate N sd dbhmean time date1 date2
#> 1.13 383.00 2.16 49.56 4.97 16578.33 18393.57
# Include all living trees
unlist(growth(census1, census2, mindbh = NULL))
#> rate N clim dbhmean time date1 date2
#> 1.13e+00 3.83e+02 2.17e-01 4.96e+01 4.97e+00 1.66e+04 1.84e+04
# Measure growth not based on `dbh` but on `agb`
unlist(growth(census1, census2, growthcol = "agb"))
#> rate N clim dbhmean time date1 date2
#> 1.74e-03 3.83e+02 9.43e-04 4.96e+01 4.97e+00 1.66e+04 1.84e+04
All three functions, recruitment()
, mortality()
and growth()
, could be defined with less arguments and the same arguments. That would make the functions easier to use and easier to document. The removed arguments could be either dropped completely or transformed into a function on their own. For example, the argument mindbh
filters stems which dbh
is at least a given value; such filtering is useful in many other situations. If we extract it into its own function, if can be reused without the need of duplicating code inside each function that currently has the mindbh
argument. That way, a census would be filtered before it is passed to any demography function. That helps not only users but also code-readers to see more-obviously what’s going on.
If all demography functions have (almost) the same arguments [not the case now – growth()
has many more arguments], all demography functions could share the same documentation. That will help users to understand how each function relates to the oters, and also help the maintainers to keep the documentation synchronized [although roxygen2 provides tools for inheriting documentation].
Some arguments’ defaults may change to more conservative values. For example, mindbh = 10
is the current default; that may change to mindbh = NULL
, which conservatively includes all stems (even if excluding stems is common, in this case safety may be more important than comfort).
If time allows all demography functions shown here should be reviewed and either refactored to make them easier to maintain or completely rewritten (as we did with abundance()
, basal_area()
and biomass()
).
The single biggest way to improve both the quality of your code and your productivity isto reuse good code.
― from “Code Complete (Developer Best Practices)” (https://goo.gl/83hsHb)
The forestr package evolves from the CTFS R Package (http://ctfs.si.edu/Public/CTFSRPackage/). In general, functions in forestr and the CTFS R Package may or may not share the same names or code. But the functions shown here, in particular, are almost identical to those in the CTFS R Package, both in name and code. While small, the changes in forestr make the functions considerably more reliable: They provide more informative warnings and when inputs are wrong they throw clearer error messages. The differences may become bigger with time, but now, by reusing big chunks of code from the CTFS R Package, users can get good a number of useful futures relatively quickly.
That gain in development-speed comes at a cost: With the inherited code comes inherited complexity. Consider the functions used here. The following three figures show that recruitment()
and mortality()
depend on just a few functions, but the dependencies of growth()
are many more.
recruitment()
:mortality()
:growth()
:While reducing code complexity is central to developing good software, providing users with the functionality they need is arguably the priority. Taking time into account, I propose to add the most important features first and to reduce the system’s complexity second – maybe after we release first version.