--- title: "Color data frame output in R terminal" output: html_document: toc: true toc_float: true number_sections: true theme: sandstone vignette: > %\VignetteIndexEntry{Color data frame output in R terminal} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "", R.options=list(width=100) ) ``` ```{r echo=FALSE,include=FALSE,eval=TRUE} options(crayon.enabled = TRUE) options(crayon.colors = 256) knitr::knit_hooks$set(output = function(x, options){ paste0( '
',
    fansi::sgr_to_html(x = htmltools::htmlEscape(x), warn = FALSE),
    '
' ) }) ``` ```{r setup,echo=FALSE,results="hide",include=FALSE} library(colorDF) library(dplyr) library(data.table) ``` # Quick start ```{r} library(dplyr) data(starwars) sw <- starwars[, c(1:3, 7:8)] sw %>% colorDF colorDF(sw) %>% summary ``` # Colorful data frames Your average terminal in which you run R is capable of displaying colors, styles and unicode characters. Wouldn't it be nice to add some color to the data frames you are displaying? For example, that factors are shown in a distinct color (no confusing of strings and factors any more!) or that significant p-values are colored in red? This was my motivation when writing this tiny package. Of course, changing default method for printing data frames is nothing a package is allowed to do (but read on!). However, this package defines everything you need to get dynamic, colorful output when viewing data frames. There are two things about colorDF which are important: 1. colorDF *never* modifies the behavior of the data frame like object or its contents (i.e. it does not redefine methods like `[<-`, removes row names etc.^[Strictly speaking, that is not true, as there is a `[.colorDF` method which serves only as a wrapper around the respective `[` of the underlying data frame, tibble or data table. The only reason is exist is that otherwise the style and column type attributes will be lost.]). The only two things that change are (i) the default print method (visualization), and (ii) the ".style" and ".coltp" attributes of the object, and that only if you really change the class of the object, which is often unnecessary. 2. Any data frame like object can be used in colorDF, and you don't need to modify these objects to use the colorful visualizations. Yes, you can color *any* object that can be cast into a data frame with this or related functions! For example, you can apply it to both tibbles and data.table objects: ```{r,results="hide"} ## works with standard data.frames colorDF(mtcars) ## works with tidyverse tibbles mtcars %>% as_tibble %>% colorDF ## works with data.table colorDF(data.table(mtcars)) ``` The output of these three commands is identical: ```{r echo=FALSE} colorDF(mtcars) ``` # Column types Column types are mostly like classes, but colorDF introduces some additional distinctions, specifically "identifier" (such that character columns which contain identifiers can be shown with a particular, distinct style) and "pval", to show significant p-values in a different color (and use `format.pval()` for formatting). Column types are stored in the `.coltp` attribute of the colorDF object. colorDF tries to guess how each column should be displayed. First it checks whether any column types have been assigned explicitely using the `col_type<-` function and stored in the `.coltp` attribute of the object. Next, it looks up whether it can guess the contents of the column by looking at the column name (ID, p-value). Finally, it determines the class of the column (character, integer, numeric, logical, factor). To assign a particular column type, you need first to turn a data frame colorful and then modify the column type: ```{r} sw <- sw %>% as.colorDF col_type(sw, "name") <- "identifier" col_type(sw, "gender") <- "factor" sw$probability <- runif(nrow(sw), 0, 0.1) col_type(sw, "probability") <- "pval" sw ``` Note that changing the column type **does not** change the class of the column in the data frame! colorDF never touches the data frame contents, the only operations concern the "class", ".style" and ".coltp" attributes. So while you may set a column type to "character" instead of "factor", even though it will be looking like a character type on the terminal output, the column class will still be a factor. You can also hide a column: ```{r} sw <- colorDF(starwars) col_type(sw, c("vehicles", "films", "starships")) <- "hidden" sw ``` # Styles and Themes I am a bit confused when it comes to distinguishing the two. Themes are basically internally predefined styles. Styles are simply lists that hold information how different columns, column and row headers, separators between the columns and highlighted rows are displayed. Themes can be set using the `options(colorDF_theme="")` command or by directly specifying the option in a call to `colorDF`: ```{r} colorDF(sw, theme="bw") ``` Here is an overview of the themes. Some of them are intended for dark background and will not look great on a light background, which is why we use `force_bg=TRUE` to force black on white background for these themes: ```{r} colorDF_themes_show(force_bg=TRUE) ``` You can add your own themes using `add_colorDF_theme()` (see the example section on the help page). # Column styles Styles of a colorDF object can be directly manipulated using `df_style`: ```{r} mtcars.c <- colorDF(mtcars) df_style(mtcars.c, "sep") <- "; " ``` If interested, read the help file for `df_style()`. # Utilities ## Summaries colorDF comes with a couple of utility functions. Firstly, it defines a summary method for colorful data frames which can also be used for any other data frame like object and which I find much more useful than the regular summary: ```{r} starwars %>% as.colorDF %>% summary ``` There is a directly visible (exported) version of the colorful summary called `summary_colorDF`: ```{r eval=FALSE} starwars %>% summary_colorDF ``` As you can see, the summary is much more informative than the default `summary.data.frame` function. Not only this, but the object does not need to be a data frame – any list can do! ```{r summary_mtcars,eval=TRUE} mtcars_cyl <- split(mtcars$mpg, mtcars$cyl) sapply(mtcars_cyl, length) ``` The list `mtcars_cyl` is the miles per gallon column split by number of cylinders. We can use `summary_colorDF` to create a (semi)graphical summary of this list: ```{r summary_mtcars2,eval=TRUE} summary_colorDF(mtcars_cyl, numformat="g", width=90) ``` In fact, this is so useful (especially if an interactive graphic device is not practical, e.g. when running R over ssh/screen) that I implemented a terminal boxplotting function: ```{r summary_iris} term_boxplot(Sepal.Length ~ Species, data=iris, width=90) ``` ## Highlighting The `highlight()` function allows to mark selected rows from the table: ```{r, eval=TRUE} foo <- starwars %>% select(name, species, homeworld) %>% highlight(.$homeworld == "Tatooine") ``` (Unfortunately, the HTML representation of the ANSI terminal doesn't show that one correctly). ## Data frame search The `df_search()` function looks through a data frame for occurence of a pattern in all columns (or a subset, if the parameter `cols` is used) and where the pattern matches, it colors the contents of the cell in red: ```{r, eval=TRUE} starwars %>% df_search("blue") ``` # Setting up colorDF as the default data frame print method You can use colorDF as the default method for displaying data frames and similar objects. For this, you need to use the `colorDF:::print.colorDF` function: ```{r eval=FALSE} ## for regular data frames print.data.frame <- colorDF:::print_colorDF ## for tidyverse tibbles print.tbl <- colorDF:::print_colorDF ## for data.tables print.data.table <- colorDF:::print_colorDF ``` This will not replace or modify the original functions from data.table or tibble packages, but merely mask these. And from now on, every data frame like object will be shown in color, but otherwise, its behavior will not change. Should you want to go back to the original print functions, just remove these new functions: ```{r eval=FALSE} rm(print.data.frame, print.tbl, print.data.table) ``` This is a bit more complicated in case of S4 objects. One such object type is a DataFrame defined in the S4Vectors package. It is commonly used in many Bioconductor packages such as DESeq2. Unfortunately, the `show` method defined for DataFrames is not convenient, for example it always displays a ridiculous number of significant digits, cluttering the output. `print_colorDF` can print these classes, as it can work on anything that can be cast into a data frame using an `as.data.frame` method. To take over the output of DataFrames and all other objects inheriting from it (such as DESeqResults), we need to use the S4 convention of defining the methods: ```{r eval=FALSE} setMethod("show", "DataFrame", function(object) colorDF::print_colorDF(object)) ``` Since methods can be only defined for existing classes, if you want to put it in your `.Rprofile`, you need to first load (but not necessarily attach) the S4Vectors package: ```{r eval=FALSE} loadNamespace("S4Vectors") setMethod("show", "DataFrame", function(object) colorDF::print_colorDF(object)) ``` # Global options There is a number of options which override whatever has been defined in a particular theme / style. You can view them with `colorDF_options()`: ```{r alloptions,eval=TRUE,results="markdown",R.options=list(width=100)} colorDF_options() ``` To change these options, use `options()` just like with any other global option. For example, ```{r problemchild, eval=TRUE,results="markdown"} options(colorDF_tibble_style=TRUE) options(colorDF_sep= " ") options(colorDF_n=5) colorDF(starwars) ``` # Rmarkdown The package is intended to be used in terminal. However, as you see above, it is possible to get the colored tables also in an rmarkdown document. For this, include the following chunk at the beginning of your document: ````markdown `r ''````{r echo=FALSE} options(crayon.enabled = TRUE) knitr::knit_hooks$set(output = function(x, options){ paste0( '
',
    fansi::sgr_to_html(x = htmltools::htmlEscape(x), warn = FALSE),
    '
' ) }) ``` ````