Chapter 5 Converting to and from non-tidy formats
Non-tidy data structures, in particular matrices, are essential in topic modeling, where other R packages for NLP play a major role.
The book has a diagram showing the “glue” role that the functions in this chapter play:
Figure 5.1: Taken from the book, Chapter 5
As shown in the figure, a tidied DTM is typically equivalent to a one-token-per-row data frame after counting: it has one row per document-term pair with a nonzero count.
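
A minimal sketch of that round trip, assuming the `dplyr`, `tidytext`, and `tm` packages are installed. The two toy documents and the column names `document`, `text`, and `word` are made up for illustration and do not come from the book:

```r
library(dplyr)
library(tidytext)

# Two toy documents (hypothetical data, for illustration only)
docs <- tibble(
  document = c("a", "b"),
  text = c("the cat sat", "the dog sat on the mat")
)

# One-token-per-row data frame, then counted per document
word_counts <- docs %>%
  unnest_tokens(word, text) %>%
  count(document, word)

# Tidy -> non-tidy: cast the counts into a DocumentTermMatrix (needs tm)
dtm <- word_counts %>%
  cast_dtm(document, word, n)

# Non-tidy -> tidy: columns document, term, count; nonzero entries only
tidied <- tidy(dtm)
```

Under these assumptions, `tidied` should hold the same document-term counts as `word_counts`, with the columns renamed to `term` and `count`.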