---
title: "Markdown Literate Programming"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Printing Markdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r libraries, warning=FALSE, message=FALSE, error=FALSE}
library(gluedown)
library(stringr)
library(rvest)
library(glue)
```

In literate programming, the typical paradigm of source code is reversed; instead of a wall of code with the occasional comment, the user writes _human_ readable text (like this paragraph) with source code interspersed. In the R language, this is primarily done with the [`rmarkdown`][rmd] package, which takes a plain text _R_ markdown file (`.Rmd`) containing code "chunks" and executes that code when converting to a regular markdown file (`.md`) and then possibly some other format (`.html`, `.pdf`, etc.).

[rmd]: https://rmarkdown.rstudio.com/

Markdown is a lightweight plain-text language used to format text. Let's look at the original description of markdown from the website of John Gruber, the creator of the markdown standard. Using the `rvest` package, we can programmatically scrape Gruber's blog, extract HTML paragraph tags, and convert those tags to character vectors.

```{r markdown_desc, eval=FALSE}
markdown_blog <- read_html("https://daringfireball.net/projects/markdown/") %>%
  html_elements("p") %>%
  html_text()
```

```{r echo=FALSE}
# try to read the page, returning NULL if the site is unreachable
x <- tryCatch(
  expr = read_html("https://daringfireball.net/projects/markdown/"),
  error = function(e) NULL
)
if (!is.null(x)) {
  markdown_blog <- html_text(html_elements(x, "p"))
} else {
  # placeholder text so the later chunks still run offline
  markdown_blog <- LETTERS
}
```

Gruber first explains _what_ exactly his markdown language is.

```{r quote_what, results='asis'}
md_quote(markdown_blog[4])
```

He continues by outlining _why_ markdown was created, his rationale for its format, and some inspiration for its syntax.
```{r quote_why, results='asis'}
md_quote(markdown_blog[6])
```

This entire vignette was written in markdown and converted to HTML using [pandoc][pandoc]. However, as you may have noticed, we haven't exactly been conforming to this original desire for markdown to be readable as is. We didn't copy the text from his blog and paste it as text into this vignette. This is where the `gluedown` package comes in.

[pandoc]: https://pandoc.org/

The `gluedown` package helps ease the transition between the incredibly powerful vector support in R and the readability of markdown. Since this vignette was written in _R_ Markdown (`.Rmd`), we are able to (1) use the power of packages like `rvest` to collect, process, and/or analyze some kind of data and then (2) transition that result to the human-readable markdown format.

When writing this vignette, _three_ kinds of files are used.

1. The `.Rmd` file containing source code is a programming environment.
2. The `.md` file created by `rmarkdown` is a human-readable plain text version of that input code also containing the output text.
3. The `.html` format of this vignette created by `pandoc` is the final presentation format.

In the rest of this vignette, we will see some of the various use cases for `gluedown`. We will see how easy it is to transition between R vectors and readable results in markdown/HTML.

## Vector Lists

Printing vectors as markdown lists was the initial inspiration for the package. In R, atomic vectors are the fundamental object type that composes more complex objects like lists and data frames. The `state.name` vector built into base R is a character vector of all 50 state names.

```{r state.name}
str(state.name, vec.len = 3)
```

If we, as users, want to use those state names as _text_ in our markdown document, we can use the `cat()` function and tell `rmarkdown` to print the results of that function "as is" (rather than as code output).
```{r cat_plain, results='asis'}
cat(state.name[1:3])
```

That output obviously isn't very appealing. We could tweak our use of `cat()` a little to separate them on new lines.

```{r cat_newline, results='asis'}
cat(state.name[1:3], sep = "\n\n")
```

This is more readable, but with some more work, we can use `cat()` to print an ordered list.

```{r cat_order, results='asis'}
cat(paste0(1:3, ". ", state.name[1:3]), sep = "\n")
```

This workflow gets tiresome, although it's made slightly simpler with the fantastic [`glue`][glue] package from Jim Hester.

```{r glue_order, results='asis'}
glue("{1:3}. {state.name[1:3]}")
```

This is the technique used in this package. Vector inputs are passed to `glue::glue()` and the appropriate markdown syntax is implemented.

[glue]: https://github.com/tidyverse/glue

The `md_order()` function simplifies the `glue::glue()` workflow and allows users to more easily customize the appearance of the list in _markdown_ format.

```{r order_list_raw}
# markdown only cares about the first number
md_order(state.name[1:3], seq = FALSE)
# markdown ignores padding and allows for use of parentheses
md_order(state.name[1:10], seq = TRUE, pad = TRUE, marker = ")")
```

Although, as we can see below, all these different options are rendered as the same kind of HTML `
` tag
1. Convert that tag to a character vector
1. Remove Wikipedia's bracketed note
1. Print that vector as a markdown block quote

```{r blockquote, results='asis', eval=FALSE}
read_html("https://w.wiki/A58") %>% # 1
  html_element("blockquote") %>% # 2
  html_text(trim = TRUE) %>% # 3
  str_remove("\\[(.*)\\]") %>% # 4
  md_quote() # 5
```

```{r results='asis', echo=FALSE}
w <- "https://en.wikipedia.org/wiki/Preamble_to_the_United_States_Constitution"
# try to read the page, returning NULL if the site is unreachable
x <- tryCatch(
  expr = read_html(w),
  error = function(e) NULL
)
if (!is.null(x)) {
  x %>%
    html_element("blockquote") %>%
    html_text(trim = TRUE) %>%
    str_remove("\\[(.*)\\]") %>%
    md_quote()
}
```

[pipe]: https://magrittr.tidyverse.org/reference/pipe.html

## Extensions

The package primarily uses [GitHub Flavored Markdown][gfm] (GFM), a site-specific version of the [CommonMark specification][cm], an unambiguous implementation of John Gruber's [original Markdown][df]. With this flavor, some useful extensions like [task lists][task] are supported on GitHub. Elsewhere, like this HTML vignette, a task list will just render as a bullet list. You can learn more about how GFM is implemented in this package's other vignette.

```{r ex_task, results='asis'}
legislation <- c("House passes", "Senate concurs", "President signs")
md_task(legislation, check = 1:2)
```

[task]: https://help.github.com/en/articles/about-task-lists
[gfm]: https://github.github.com/gfm/
[cm]: https://spec.commonmark.org/
[df]: https://daringfireball.net/projects/markdown/

Markdown tables are another extremely useful extension. The `md_table()` function wraps around the much more powerful `knitr::kable()` function, which allows data frames to be printed in a number of alternative formats.

Printing data frames is a very typical use case for documenting the process of data science. With small summary tables like the one below, a markdown table is much more readable than the plain text tibble or data frame printed by default.
```{r print_mass}
print(head(state.x77))
```

```{r table_mass, results='asis'}
md_table(head(state.x77), digits = 2)
```

## Inlines

You can also use `gluedown` to format R [inline code results][inline]. First, use R to calculate a result.

```{r ex_inline}
rand <- sample(state.name, 1) # `r md_bold(rand)`
var <- sample(colnames(state.x77), 1) # `r md_code(var)`
```

Then, you can easily print that result in the middle of regular text with markdown formatting. In this case, our randomly selected state is... `r md_bold(rand)` and the `r md_code(var)` variable was randomly selected from the `state.x77` matrix.

Calculating results and using those calculations in the body of a text document increases reproducibility. In a [meta-study of psychology journals][pmc], researchers found that "around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect." These errors can be mitigated by using inline printing of results like we did above. With the `gluedown` package, programmers can **emphasize** those results without worry.

[pmc]: https://web.archive.org/web/20220723000032/https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3174372/
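
To make the inline idea concrete, here is a minimal base R sketch of wrapping a computed value in markdown syntax before placing it in a sentence. The `sketch_bold()` and `sketch_code()` helpers are hypothetical stand-ins written for this example; we are only assuming that they produce the same kind of strings as `md_bold()` and `md_code()`, and they are not the `gluedown` source.

```r
# Hypothetical stand-ins (not gluedown's implementation): wrap each element
# of a vector in markdown strong-emphasis or code-span delimiters.
sketch_bold <- function(x) paste0("**", x, "**")
sketch_code <- function(x) paste0("`", x, "`")

# Compute the value once, then reuse it in the text instead of retyping it.
n_states <- length(state.name)

sentence <- paste(
  "The", sketch_code("state.name"), "vector holds",
  sketch_bold(n_states), "names."
)
print(sentence)
#> [1] "The `state.name` vector holds **50** names."
```

Because the number is calculated rather than typed by hand, the sentence stays correct if the underlying data ever changes.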