2.7 Other language engines
A less well-known fact about R Markdown is that many other languages are also supported, such as Python, Julia, C++, and SQL. The support comes from the knitr package, which provides a large number of language engines. Language engines are essentially functions registered in the object `knitr::knit_engines`. You can list the names of all available engines via:
names(knitr::knit_engines$get())
## [1] "awk" "bash" "coffee"
## [4] "gawk" "groovy" "haskell"
## [7] "lein" "mysql" "node"
## [10] "octave" "perl" "php"
## [13] "psql" "Rscript" "ruby"
## [16] "sas" "scala" "sed"
## [19] "sh" "stata" "zsh"
## [22] "asis" "asy" "block"
## [25] "block2" "bslib" "c"
## [28] "cat" "cc" "comment"
## [31] "css" "ditaa" "dot"
## [34] "embed" "eviews" "exec"
## [37] "fortran" "fortran95" "go"
## [40] "highlight" "js" "julia"
## [43] "python" "R" "Rcpp"
## [46] "sass" "scss" "sql"
## [49] "stan" "targets" "tikz"
## [52] "verbatim" "theorem" "lemma"
## [55] "corollary" "proposition" "conjecture"
## [58] "definition" "example" "exercise"
## [61] "hypothesis" "proof" "remark"
## [64] "solution" "marginfigure"
Most engines have been documented in Chapter 11 of Xie (2015). The engines from `theorem` to `solution` are only available when you use the bookdown package, and the rest are shipped with the knitr package. To use a different language engine, you can change the language name in the chunk header from `r` to the engine name, e.g.,
```{python}
x = 'hello, python world!'
print(x.split(' '))
```
For engines that rely on external interpreters such as `python`, `perl`, and `ruby`, the default interpreters are obtained from `Sys.which()`, i.e., using the interpreter found via the environment variable `PATH` of the system. If you want to use an alternative interpreter, you may specify its path in the chunk option `engine.path`. For example, you may want to use Python 3 instead of the default Python 2, and we assume Python 3 is at `/usr/bin/python3` (may not be true for your system):
```{python, engine.path = '/usr/bin/python3'}
import sys
print(sys.version)
```
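If you are unsure which interpreter a bare name such as `python3` resolves to on your system, you can check the `PATH` lookup yourself before setting `engine.path`. The following is a minimal Python sketch of that lookup, analogous to what `Sys.which()` reports in R; the helper name `resolve_interpreter` is hypothetical and is not part of knitr.

```python
import shutil
import sys


def resolve_interpreter(name):
    """Return the full path of `name` as found on PATH, or None if absent."""
    return shutil.which(name)


# The interpreter running this code, for comparison with the PATH lookup.
current = sys.executable

for name in ("python3", "perl", "ruby"):
    # Prints the resolved path, or None when the program is not on PATH.
    print(name, "->", resolve_interpreter(name))
print("running under:", current)
```

A path printed here is a reasonable candidate for `engine.path`; a result of `None` means the engine would have no default interpreter to fall back on.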
You can also change the engine interpreters globally for multiple engines, e.g.,
knitr::opts_chunk$set(engine.path = list(
  python = '~/anaconda/bin/python',
  ruby = '/usr/local/bin/ruby'
))
Note that you can use a named list to specify the paths for different engines.
Most engines will execute each code chunk in a separate new session (via a `system()` call in R), which means objects created in memory in a previous code chunk will not be directly available to later code chunks. For example, if you create a variable in a `bash` code chunk, you will not be able to use it in the next `bash` code chunk. Currently the only exceptions are `r`, `python`, and `julia`. Only these engines execute code in the same session throughout the document. To clarify, all `r` code chunks are executed in the same R session, all `python` code chunks are executed in the same Python session, and so on, but the R session and the Python session are independent.[4]
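The effect of running each chunk in a new session can be reproduced outside knitr. The sketch below is only an illustration in Python, not how knitr is implemented: it starts two fresh interpreter processes, defines a variable in the first, and shows that the second cannot see it. The helper name `run_fresh` is hypothetical.

```python
import subprocess
import sys


def run_fresh(code):
    """Run `code` in a brand-new Python process.

    Returns a tuple (return code, stdout, stderr).
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip(), proc.stderr.strip()


# "Chunk 1" defines x and exits; its memory disappears with the process.
rc1, out1, _ = run_fresh("x = 42; print(x)")

# "Chunk 2" starts from scratch, so x is undefined there.
rc2, _, err2 = run_fresh("print(x)")

print("first process:", rc1, out1)
print("second process failed:", rc2 != 0)
print("NameError reported:", "NameError" in err2)
```

This is the situation for engines such as `bash`; the `r`, `python`, and `julia` engines avoid it by keeping one session alive for the whole document.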
I will introduce some specific features and examples for a subset of language engines in knitr below. Note that most chunk options should work for both R and other languages, such as `eval` and `echo`, so these options will not be mentioned again.
2.7.1 Python
The `python` engine is based on the reticulate package (Ushey, Allaire, and Tang 2023), which makes it possible to execute all Python code chunks in the same Python session. If you actually want to execute a certain code chunk in a new Python session, you may use the chunk option `python.reticulate = FALSE`. If you are using a knitr version lower than 1.18, you should update your R packages.
Below is a relatively simple example that shows how you can create/modify variables, and draw graphics in Python code chunks. Values can be passed to or retrieved from the Python session. To pass a value to Python, assign to `py$name`, where `name` is the variable name you want to use in the Python session; to retrieve a value from Python, also use `py$name`.
---
title: "Python code chunks in R Markdown"
date: 2018-02-22
---
## A normal R code chunk
```{r}
library(reticulate)
x = 42
print(x)
```
## Modify an R variable
In the following chunk, the value of `x` on the right hand side is `r x`, which was defined in the previous chunk.
```{r}
x = x + 12
print(x)
```
## A Python chunk
This works fine and as expected.
```{python}
x = 42 * 2
print(x)
```
The value of `x` in the Python session is `r py$x`. It is not the same `x` as the one in R.
## Modify a Python variable
```{python}
x = x + 18
print(x)
```
Retrieve the value of `x` from the Python session again:
```{r}
py$x
```
Assign to a variable in the Python session from R:
```{r}
py$y = 1:5
```
See the value of `y` in the Python session:
```{python}
print(y)
```
## Python graphics
You can draw plots using the **matplotlib** package in Python.
```{python}
import matplotlib.pyplot as plt
plt.plot([0, 2, 1, 4])
plt.show()
```
You may learn more about the reticulate package from https://rstudio.github.io/reticulate/.
2.7.2 Shell scripts
You can also write Shell scripts in R Markdown, if your system can run them (the executable `bash` or `sh` should exist). Usually this is not a problem for Linux or macOS users. It is not impossible for Windows users to run Shell scripts, but you will have to install additional software (such as Cygwin or the Windows Subsystem for Linux).
```{bash}
echo "Hello Bash!"
cat flights1.csv flights2.csv flights3.csv > flights.csv
```
Shell scripts are executed via the `system2()` function in R. Basically knitr passes a code chunk to the command `bash -c` to run it.
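To make the `bash -c` mechanism concrete, here is a rough Python analogue, assuming a Unix-like system with `bash` or `sh` on the `PATH`. It passes the whole text of a chunk as one argument to the shell and captures what the shell prints. This is only a sketch of the idea, not the code knitr runs; the variable names are hypothetical.

```python
import shutil
import subprocess

# The body of a shell chunk, as a single multi-line string.
chunk = 'greeting="Hello Bash!"\necho "$greeting"\n'

# Prefer bash, fall back to sh; stop early if neither is installed.
shell = shutil.which("bash") or shutil.which("sh")
if shell is None:
    raise RuntimeError("no bash or sh executable found on PATH")

# Equivalent to running:  bash -c '<chunk text>'
result = subprocess.run(
    [shell, "-c", chunk],
    capture_output=True,
    text=True,
    check=True,
)
output = result.stdout
print(output, end="")
```

Because the shell process exits once the chunk text has run, the variable `greeting` does not survive into any later chunk.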
2.7.3 SQL
The `sql` engine uses the DBI package to execute SQL queries, print their results, and optionally assign the results to a data frame.
To use the `sql` engine, you first need to establish a DBI connection to a database (typically via the `DBI::dbConnect()` function). You can make use of this connection in a `sql` chunk via the `connection` option. For example:
```{r}
library(DBI)
db = dbConnect(RSQLite::SQLite(), dbname = "sql.sqlite")
```
```{sql, connection=db}
SELECT * FROM trials
```
By default, `SELECT` queries will display the first 10 records of their results within the document. The number of records displayed is controlled by the `max.print` option, which is in turn derived from the global knitr option `sql.max.print` (e.g., `knitr::opts_knit$set(sql.max.print = 10)`; N.B. it is `opts_knit` instead of `opts_chunk`). For example, the following code chunk displays the first 20 records:
```{sql, connection=db, max.print = 20}
SELECT * FROM trials
```
You can specify no limit on the records to be displayed via `max.print = -1` or `max.print = NA`.
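The row-limit behavior described above can be sketched in a few lines. The following Python example uses the standard `sqlite3` module with a made-up in-memory `trials` table; the function `fetch_for_display` and its `max_print` argument are hypothetical stand-ins for the `max.print` option (with `None` playing the role of `NA`), and this is not the code the `sql` engine uses.

```python
import sqlite3


def fetch_for_display(conn, query, max_print=10):
    """Return at most `max_print` rows; -1 or None means no limit."""
    cur = conn.execute(query)
    if max_print is None or max_print < 0:
        return cur.fetchall()
    return cur.fetchmany(max_print)


# A throwaway in-memory table with 25 rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trials (id INTEGER)")
conn.executemany("INSERT INTO trials VALUES (?)", [(i,) for i in range(1, 26)])

query = "SELECT id FROM trials ORDER BY id"
default_rows = fetch_for_display(conn, query)                # default limit
twenty_rows = fetch_for_display(conn, query, max_print=20)   # explicit limit
all_rows = fetch_for_display(conn, query, max_print=-1)      # no limit

print(len(default_rows), len(twenty_rows), len(all_rows))
conn.close()
```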
By default, the `sql` engine includes a caption that indicates the total number of records displayed. You can override this caption using the `tab.cap` chunk option. For example:
```{sql, connection=db, tab.cap = "My Caption"}
SELECT * FROM trials
```
You can specify that you want no caption at all via `tab.cap = NA`.
If you want to assign the results of the SQL query to an R object as a data frame, you can do this using the `output.var` option, e.g.,
```{sql, connection=db, output.var="trials"}
SELECT * FROM trials
```
When the results of a SQL query are assigned to a data frame, no records will be printed within the document (if desired, you can manually print the data frame in a subsequent R chunk).
If you need to bind the values of R variables into SQL queries, you can do so by prefacing R variable references with a `?`. For example:
```{r}
subjects = 10
```
```{sql, connection=db, output.var="trials"}
SELECT * FROM trials WHERE subjects >= ?subjects
```
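Binding a variable into a query, rather than pasting its value into the SQL text, is a general technique and not specific to knitr. As a rough analogue, the Python sketch below uses the standard `sqlite3` module, where the placeholder is written `:subjects` instead of `?subjects`. The table layout and data are made up for illustration.

```python
import sqlite3

# A throwaway in-memory database with a hypothetical `trials` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trials (id INTEGER, subjects INTEGER)")
conn.executemany(
    "INSERT INTO trials VALUES (?, ?)",
    [(1, 5), (2, 10), (3, 25)],
)

subjects = 10

# The value of `subjects` is bound to the named placeholder :subjects;
# it is never interpolated into the SQL string itself.
rows = conn.execute(
    "SELECT id, subjects FROM trials WHERE subjects >= :subjects ORDER BY id",
    {"subjects": subjects},
).fetchall()

print(rows)  # [(2, 10), (3, 25)]
conn.close()
```

Binding keeps the query text fixed and leaves quoting of the value to the database driver, which is the main reason to prefer it over string concatenation.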
If you have many SQL chunks, it may be helpful to set a default for the `connection` chunk option in the setup chunk, so that it is not necessary to specify the connection on each individual chunk. You can do this as follows:
```{r setup}
library(DBI)
db = dbConnect(RSQLite::SQLite(), dbname = "sql.sqlite")
knitr::opts_chunk$set(connection = "db")
```
Note that the `connection` option should be a string naming the connection object (not the object itself). Once set, you can execute SQL chunks without specifying an explicit connection:
```{sql}
SELECT * FROM trials
```
2.7.4 Rcpp
The `Rcpp` engine enables compilation of C++ into R functions via the Rcpp `sourceCpp()` function. For example:
```{Rcpp}
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;
}
```
Executing this chunk will compile the code and make the C++ function `timesTwo()` available to R.
You can cache the compilation of C++ code chunks using standard knitr caching, i.e., add the `cache = TRUE` option to the chunk:
```{Rcpp, cache=TRUE}
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;
}
```
In some cases, it is desirable to combine all of the `Rcpp` code chunks in a document into a single compilation unit. This is especially useful when you want to intersperse narrative between pieces of C++ code (e.g., for a tutorial or user guide). It also reduces total compilation time for the document (since there is only a single invocation of the C++ compiler rather than multiple).
To combine all Rcpp chunks into a single compilation unit, you use the `ref.label` chunk option along with the `knitr::all_rcpp_labels()` function to collect all of the `Rcpp` chunks in the document. Here is a simple example:
All C++ code chunks will be combined to the chunk below:
```{Rcpp, ref.label=knitr::all_rcpp_labels(), include=FALSE}
```
First we include the header `Rcpp.h`:
```{Rcpp, eval=FALSE}
#include <Rcpp.h>
```
Then we define a function:
```{Rcpp, eval=FALSE}
// [[Rcpp::export]]
int timesTwo(int x) {
  return x * 2;
}
```
The two `Rcpp` chunks that include code will be collected and compiled together in the first `Rcpp` chunk via the `ref.label` chunk option. Note that we set the `eval = FALSE` option on the `Rcpp` chunks with code in them to prevent them from being compiled again.
2.7.5 Stan
The `stan` engine enables embedding of the Stan probabilistic programming language within R Markdown documents.
The Stan model within the code chunk is compiled into a `stanmodel` object, and is assigned to a variable with the name given by the `output.var` option. For example:
```{stan, output.var="ex1"}
parameters {
  real y[2];
}
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
}
```
```{r}
library(rstan)
fit = sampling(ex1)
print(fit)
```
2.7.6 JavaScript and CSS
If you are using an R Markdown format that targets HTML output (e.g., `html_document` and `ioslides_presentation`, etc.), you can include JavaScript to be executed within the HTML page using the JavaScript engine named `js`.
For example, the following chunk uses jQuery (which is included in most R Markdown HTML formats) to change the color of the document title to red:
```{js, echo=FALSE}
$('.title').css('color', 'red')
```
Similarly, you can embed CSS rules in the output document. For example, the following code chunk turns text within the document body red:
```{css, echo=FALSE}
body {
  color: red;
}
```
Without the chunk option `echo = FALSE`, the JavaScript/CSS code will be displayed verbatim in the output document, which is probably not what you want.
2.7.7 Julia
The Julia language is supported through the JuliaCall package (Li 2022). Similar to the `python` engine, the `julia` engine runs all Julia code chunks in the same Julia session. Below is a minimal example:
```{julia}
a = sqrt(2); # the semicolon inhibits printing
```
2.7.8 C and Fortran
For code chunks that use C or Fortran, knitr uses `R CMD SHLIB` to compile the code and load the shared object (a `*.so` file on Unix or `*.dll` on Windows). Then you can use `.C()` / `.Fortran()` to call the C / Fortran functions, e.g.,
```{c, test-c, results='hide'}
void square(double *x) {
  *x = *x * *x;
}
```
Test the `square()` function:
```{r}
.C('square', 9)
.C('square', 123)
```
You can find more examples on different language engines in the GitHub repository https://github.com/yihui/knitr-examples (look for filenames that contain the word “engine”).
[4] This is not strictly true, since the Python session is actually launched from R. What I mean here is that you should not expect to use R variables and Python variables interchangeably without explicitly importing/exporting variables between the two sessions.