---
title: "Color data frame output in R terminal"
output:
  html_document:
    toc: true
    toc_float: true
    number_sections: true
    theme: sandstone
vignette: >
  %\VignetteIndexEntry{Color data frame output in R terminal}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "",
  R.options = list(width = 100)
)
```
```{r echo=FALSE,include=FALSE,eval=TRUE}
options(crayon.enabled = TRUE)
options(crayon.colors = 256)
## convert ANSI escape sequences in chunk output to HTML
knitr::knit_hooks$set(output = function(x, options){
  paste0(
    '<pre class="r-output"><code>',
    fansi::sgr_to_html(x = htmltools::htmlEscape(x), warn = FALSE),
    '</code></pre>'
  )
})
```
```{r setup,echo=FALSE,results="hide",include=FALSE}
library(colorDF)
library(dplyr)
library(data.table)
```
# Quick start
```{r}
library(dplyr)
data(starwars)
sw <- starwars[, c(1:3, 7:8)]
sw %>% colorDF
colorDF(sw) %>% summary
```
# Colorful data frames
Most terminals in which you run R can display colors, styles and unicode
characters. Wouldn't it be nice to add some color to the data frames you
are displaying? For example, factors could be shown in a distinct color
(no more confusing strings with factors!) and significant p-values could
be colored in red.

This was my motivation when writing this tiny package. Of course, changing
the default method for printing data frames is nothing a package is allowed
to do (but read on!). However, this package defines everything you need to
get dynamic, colorful output when viewing data frames. Two things about
colorDF are important:

1. colorDF *never* modifies the behavior of the data frame-like object or
   its contents (i.e. it does not redefine methods like `[<-`, remove row names
   etc.^[Strictly speaking, that is not true, as there is a
   `[.colorDF` method which serves only as a wrapper around the respective
   `[` of the underlying data frame, tibble or data table. The only reason
   it exists is that otherwise the style and column type attributes would be
   lost.]).
   The only two things that change are (i) the default print method
   (visualization), and (ii) the ".style" and ".coltp" attributes of the
   object, and that only if you really change the class of the object,
   which is often unnecessary.
2. Any data frame-like object can be used in colorDF, and you don't need
   to modify these objects to use the colorful visualizations.

Yes, you can color *any* object that can be cast into a data frame
with this or related functions! For example, you can apply it to both
tibbles and data.table objects:
```{r,results="hide"}
## works with standard data.frames
colorDF(mtcars)
## works with tidyverse tibbles
mtcars %>% as_tibble %>% colorDF
## works with data.table
colorDF(data.table(mtcars))
```
The output of these three commands is identical:
```{r echo=FALSE}
colorDF(mtcars)
```
# Column types
Column types are mostly like classes, but colorDF introduces some
additional distinctions: "identifier" (so that character columns which
contain identifiers can be shown in a distinct style) and "pval" (to show
significant p-values in a different color and to format the values with
`format.pval()`). Column types are stored in the `.coltp` attribute of the
colorDF object.

colorDF tries to guess how each column should be displayed. First, it checks
whether a column type has been assigned explicitly using the
`col_type<-` function and stored in the `.coltp` attribute of the object.
Next, it tries to guess the contents of the column from the column name
(ID, p-value). Finally, it falls back on the class of the column
(character, integer, numeric, logical, factor).
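
The order of precedence described above can be illustrated with a small
stand-alone sketch. This is *not* the code colorDF uses internally: the
function name `guess_type_sketch` and the regular expressions for matching
column names are assumptions made up for this illustration. Only the order
of the three steps follows the description.

```{r}
## Illustrative only, not colorDF's implementation.
## The function name and the name patterns are hypothetical.
guess_type_sketch <- function(name, column, explicit = list()) {
  ## 1. an explicitly assigned column type always wins
  if (!is.null(explicit[[name]])) return(explicit[[name]])
  ## 2. otherwise, try to guess from the column name (assumed patterns)
  if (grepl("^id$|_id$", name, ignore.case = TRUE)) return("identifier")
  if (grepl("^p[._]?val", name, ignore.case = TRUE)) return("pval")
  ## 3. finally, fall back on the class of the column
  class(column)[1]
}

df <- data.frame(ID = c("a", "b"), pvalue = c(0.01, 0.5), n = 1:2)
guessed <- sapply(names(df), function(nm) guess_type_sketch(nm, df[[nm]]))
guessed

## an explicit assignment overrides both guesses
forced <- guess_type_sketch("n", df$n, explicit = list(n = "identifier"))
forced
```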

To assign a particular column type, first turn the data frame into a
colorful one and then modify the column type:
```{r}
sw <- sw %>% as.colorDF
col_type(sw, "name") <- "identifier"
col_type(sw, "gender") <- "factor"
sw$probability <- runif(nrow(sw), 0, 0.1)
col_type(sw, "probability") <- "pval"
sw
```
Note that changing the column type **does not** change the class of the
column in the data frame! colorDF never touches the data frame contents;
the only things it modifies are the "class", ".style" and ".coltp"
attributes. So if you set the column type of a factor to "character", the
column will look like a character column in the terminal output, but its
class will still be factor.
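
A quick way to convince yourself of this is to check the class of a column
after changing its column type. The toy data frame below is made up for the
purpose; the calls to `as.colorDF` and `col_type<-` are the same as in the
chunk above.

```{r}
library(colorDF)
## a small made-up data frame with one factor column
df <- data.frame(id = c("a", "b", "c"), group = factor(c("x", "y", "x")))
cdf <- as.colorDF(df)
## display the factor as if it were a character column
col_type(cdf, "group") <- "character"
## inspect the underlying column: only the display type was changed
class(cdf$group)
is.factor(cdf$group)
```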
You can also hide a column:
```{r}
sw <- colorDF(starwars)
col_type(sw, c("vehicles", "films", "starships")) <- "hidden"
sw
```
# Styles and Themes
The line between the two is blurry. Themes are basically predefined styles
that ship with the package. Styles are simply lists that hold information
on how different columns, column and row headers, separators between the
columns and highlighted rows are displayed.

Themes can be set using the `options(colorDF_theme="<theme name>")` command
or by directly specifying the `theme` argument in a call to `colorDF`:
```{r}
colorDF(sw, theme="bw")
```
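
Setting the theme globally could look like the sketch below. It uses the
"bw" theme from the chunk above and the `colorDF_theme` option mentioned
earlier. Saving and restoring the previous value is just a habit of mine,
not something colorDF requires.

```{r}
library(colorDF)
## remember the previous setting so that it can be restored later
old <- options(colorDF_theme = "bw")
## no theme argument here: the global option is used instead
colorDF(head(mtcars))
## restore the previous setting
options(old)
```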
Here is an overview of the themes. Some of them are intended for a dark
background and will not look great on a light one, which is why we
use `force_bg=TRUE` to force the background color for these themes:
```{r}
colorDF_themes_show(force_bg=TRUE)
```
You can add your own themes using `add_colorDF_theme()` (see the example
section on the help page).

# Column styles
Styles of a colorDF object can be directly manipulated using `df_style`:
```{r}
mtcars.c <- colorDF(mtcars)
df_style(mtcars.c, "sep") <- "; "
```
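
To check that a change has been registered, you can read the style element
back and print the object. The sketch below assumes that `df_style()`
called without an assignment returns the requested element; consult the
help page if it behaves differently in your version of the package. The
separator `" | "` is an arbitrary choice.

```{r}
library(colorDF)
cars_c <- colorDF(head(mtcars))
## set the column separator (arbitrary value chosen for this example)
df_style(cars_c, "sep") <- " | "
## read the element back (assumed getter form of df_style)
df_style(cars_c, "sep")
cars_c
```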
For details, see the help page of `df_style()`.

# Utilities

## Summaries

colorDF comes with a couple of utility functions. Firstly, it defines a summary
method for colorful data frames which can also be used for any other data
frame-like object and which I find much more useful than the regular
summary:
```{r}
starwars %>% as.colorDF %>% summary
```
There is a directly visible (exported) version of the colorful summary
called `summary_colorDF`:
```{r eval=FALSE}
starwars %>% summary_colorDF
```
As you can see, the summary is much more informative than the default
`summary.data.frame` function. What is more, the object does not need
to be a data frame: any list will do!
```{r summary_mtcars,eval=TRUE}
mtcars_cyl <- split(mtcars$mpg, mtcars$cyl)
sapply(mtcars_cyl, length)
```
The list `mtcars_cyl` is the miles per gallon column split by number of
cylinders. We can use `summary_colorDF` to create a (semi)graphical summary
of this list:
```{r summary_mtcars2,eval=TRUE}
summary_colorDF(mtcars_cyl, numformat="g", width=90)
```
In fact, this is so useful (especially when an interactive graphics device
is not practical, e.g. when running R over ssh/screen) that I implemented a
function which draws boxplots in the terminal:
```{r summary_iris}
term_boxplot(Sepal.Length ~ Species, data=iris, width=90)
```
## Highlighting
The `highlight()` function lets you mark selected rows of the table:
```{r, eval=TRUE}
foo <- starwars %>% select(name, species, homeworld) %>%
highlight(.$homeworld == "Tatooine")
```
(Unfortunately, the HTML representation of the ANSI terminal output doesn't
show that one correctly.)

## Data frame search

The `df_search()` function looks through a data frame for occurrences of a
pattern in all columns (or in a subset, if the parameter `cols` is used).
Where the pattern matches, it colors the contents of the cell in red:
```{r, eval=TRUE}
starwars %>% df_search("blue")
```
# Setting up colorDF as the default data frame print method
You can use colorDF as the default method for displaying data frames and
similar objects. For this, you need to use the `colorDF:::print_colorDF`
function:
```{r eval=FALSE}
## for regular data frames
print.data.frame <- colorDF:::print_colorDF
## for tidyverse tibbles
print.tbl <- colorDF:::print_colorDF
## for data.tables
print.data.table <- colorDF:::print_colorDF
```
This will not replace or modify the original functions from the data.table
or tibble packages, but merely mask them. From now on, every data
frame-like object will be shown in color, but otherwise its behavior will
not change.

Should you want to go back to the original print functions, just remove
these new functions:
```{r eval=FALSE}
rm(print.data.frame, print.tbl, print.data.table)
```
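
The masking mechanism itself does not depend on colorDF and can be tried
out with base R only. In the sketch below a toy function (made up for this
example) stands in for `print_colorDF`. Once it is removed, the original
`print.data.frame` from base R is used again, since it was never modified.

```{r}
## a toy replacement, standing in for colorDF's print function
print.data.frame <- function(x, ...) {
  cat("masked print method:", nrow(x), "rows\n")
  invisible(x)
}
masked <- capture.output(print(head(mtcars, 3)))
masked

## removing the mask brings back the original method
rm(print.data.frame)
original <- capture.output(print(head(mtcars, 3)))
original[1]
```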
Taking over the output is a bit more complicated in the case of S4 objects.
One such object type is the DataFrame defined in the S4Vectors package. It
is commonly used in many Bioconductor packages such as DESeq2.
Unfortunately, the `show` method defined for DataFrames is not convenient;
for example, it always displays a ridiculous number of significant digits,
cluttering the output.

`print_colorDF` can print these classes, as it can work on anything that
can be cast into a data frame using an `as.data.frame` method.

To take over the output of DataFrames and all other objects inheriting from
them (such as DESeqResults), we need to use the S4 convention of defining
methods:
```{r eval=FALSE}
setMethod("show", "DataFrame", function(object) colorDF::print_colorDF(object))
```
Since methods can only be defined for existing classes, if you want to put
this in your `.Rprofile`, you first need to load (but not necessarily
attach) the S4Vectors package:
```{r eval=FALSE}
loadNamespace("S4Vectors")
setMethod("show", "DataFrame", function(object) colorDF::print_colorDF(object))
```
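
Why a single `setMethod` call also covers derived classes can be seen in a
toy example that needs neither S4Vectors nor colorDF. The classes
`ToyFrame` and `ToyResults` below are made up and merely stand in for
DataFrame and DESeqResults. The `show` method is defined for the parent
class only, yet it is also used for objects of the derived class.

```{r}
library(methods)
## toy classes standing in for DataFrame and a class derived from it
setClass("ToyFrame", representation(values = "numeric"))
setClass("ToyResults", contains = "ToyFrame")

## a single show method, defined for the parent class only
setMethod("show", "ToyFrame", function(object) {
  cat("custom show for", class(object), "with",
      length(object@values), "values\n")
})

## an object of the derived class is printed with the parent's method
res <- new("ToyResults", values = c(0.1, 0.2, 0.3))
res
```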
# Global options
There are a number of options which override whatever has been defined in
a particular theme / style. You can view them with `colorDF_options()`:
```{r alloptions,eval=TRUE,results="markdown",R.options=list(width=100)}
colorDF_options()
```
To change these options, use `options()` just like with any other global
option. For example,
```{r problemchild, eval=TRUE,results="markdown"}
options(colorDF_tibble_style=TRUE)
options(colorDF_sep= " ")
options(colorDF_n=5)
colorDF(starwars)
```
# Rmarkdown
The package is intended to be used in a terminal. However, as you can see
above, it is also possible to get the colored tables in an rmarkdown
document. For this, include the following chunk at the beginning of your
document:
````markdown
`r ''````{r echo=FALSE}
options(crayon.enabled = TRUE)
knitr::knit_hooks$set(output = function(x, options){
  paste0(
    '<pre class="r-output"><code>',
    fansi::sgr_to_html(x = htmltools::htmlEscape(x), warn = FALSE),
    '</code></pre>'
  )
})
```
````