---
title: "Counts"
output: rmarkdown::html_vignette
fig.width: 8
fig.height: 5
vignette: >
  %\VignetteIndexEntry{Counts}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  markdown:
    wrap: 80
---
## Counts
A first step in many transportation safety analyses involves counting the number
of relevant crashes, fatalities, or people involved. `counts()` lets users
specify *what* to count, *where* to count them (rural/urban and/or in specified
states or regions), *who* to include, the *interval* over which to count
(annually or monthly), and factors *involved* in the crashes. It returns a
simple tibble that can be easily piped into `ggplot()` to quickly visualize
counts.
First, we load the required libraries:
```{r, message=FALSE}
library(dplyr)
library(ggplot2)
library(tidyr)
library(rfars)
```
### annual_counts
`rfars` includes `annual_counts`, a table of annual crash counts:
```{r}
rfars::annual_counts %>%
  filter(what == "crashes", involved == "any") %>%
  ggplot(aes(x = year, y = n)) +
  geom_col() +
  facet_wrap(. ~ source, nrow = 1, scales = "free_y") +
  labs(
    title = "Total annual crashes by type (FARS = fatal, CRSS = general)",
    x = NULL, y = NULL
  ) +
  theme_minimal()

rfars::annual_counts %>%
  filter(source == "FARS", involved != "any") %>%
  ggplot(aes(x = year, y = n)) +
  geom_col() +
  facet_wrap(. ~ involved, scales = "free_y") +
  labs(
    title = "Annual fatal crashes by factor involved",
    subtitle = "Derived from FARS data files",
    x = NULL, y = NULL
  ) +
  theme_minimal() +
  theme(plot.title.position = "plot")

rfars::annual_counts %>%
  filter(source == "CRSS", involved != "any") %>%
  ggplot(aes(x = year, y = n)) +
  geom_col() +
  facet_wrap(. ~ involved, scales = "free_y") +
  labs(
    title = "Annual crashes of all severity levels by factor involved",
    subtitle = "Derived from CRSS data files",
    x = NULL, y = NULL
  ) +
  theme_minimal() +
  theme(plot.title.position = "plot")
```
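Before plotting, it can help to see which combinations of data source, count type, and factor `annual_counts` covers. The sketch below assumes only the columns the plots above already use (`source`, `what`, `involved`); it makes no claim about which values appear in your installed version of the package.

```{r, results='asis'}
library(dplyr)

# List each distinct source / what / involved combination in the table.
# Only columns referenced in the plots above are used here.
rfars::annual_counts %>%
  distinct(source, what, involved) %>%
  arrange(source, what, involved) %>%
  knitr::kable(format = "html")
```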
### Generating Custom Counts
We can use `get_fars()` and then `counts()` to generate a variety of custom
counts. Below we pull two years of data:
```{r, message=FALSE}
myFARS <- get_fars(years = c(2022, 2023), proceed = TRUE)
```
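It can be useful to look at what `get_fars()` returned before counting anything. The sketch below uses only base R and reuses the `myFARS` object created above; it does not assume any particular component names, since those may depend on the installed version of `rfars`.

```{r}
# Print a one-level summary of the object returned by get_fars().
# max.level = 1 keeps the output to one line per top-level component.
str(myFARS, max.level = 1)
```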
Then we can use `counts()` to get the monthly number of fatal crashes in Virginia:
```{r, results='asis'}
my_counts <- counts(
  df = myFARS,
  where = list(states = "VA"),
  what = "crashes",
  interval = c("year", "month")
)
```
This returns the following data frame:
```{r, results='asis'}
knitr::kable(my_counts, format = "html")
```
We can graph these counts with one line per year:
```{r}
my_counts %>%
  # mutate_at() is superseded in dplyr; mutate() does the same conversion
  mutate(year = factor(year)) %>%
  ggplot(aes(x = month, y = n, group = year, color = year, label = scales::comma(n))) +
  geom_line(linewidth = 1.5) +
  labs(x = NULL, y = NULL, title = "Fatal Crashes in Virginia") +
  theme_minimal() +
  theme(plot.title.position = "plot")
```
Or we can plot the same counts as columns along a date axis:
```{r}
my_counts %>%
  mutate(date = lubridate::make_date(year, month)) %>%
  ggplot(aes(x = date, y = n, label = scales::comma(n))) +
  geom_col() +
  labs(x = NULL, y = NULL, title = "Fatal Crashes in Virginia") +
  theme(plot.title.position = "plot")
```
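The column chart above relies on `lubridate::make_date()`, which defaults the day to the first of the month when only a year and month are supplied, so each bar sits at the start of its month. Here is a small standalone illustration with made-up year and month values (not FARS data):

```{r}
# Illustrative values only; not taken from FARS.
# With no `day` argument, make_date() uses day = 1.
toy_dates <- lubridate::make_date(year = 2022, month = 1:3)
toy_dates
```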
We could alternatively count annual fatalities:
```{r, results='asis'}
counts(
  df = myFARS,
  where = list(states = "VA"),
  what = "fatalities",
  interval = "year"
) %>%
  knitr::kable(format = "html")
```
Or fatalities involving speeding:
```{r, results='asis'}
counts(
  df = myFARS,
  where = list(states = "VA"),
  what = "fatalities",
  interval = "year",
  involved = "speeding"
) %>%
  knitr::kable(format = "html")
```
Or fatalities involving speeding in rural areas:
```{r, results='asis'}
counts(
  df = myFARS,
  where = list(states = "VA", urb = "rural"),
  what = "fatalities",
  interval = "year",
  involved = "speeding"
) %>%
  knitr::kable(format = "html")
```
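The total and speeding-related fatality counts shown above can also be combined to express speeding as a share of all fatalities. This is a minimal sketch that reuses `myFARS` and the same `counts()` calls as above; it assumes the result has `year` and `n` columns, as in the earlier plots. The names `va_all`, `va_speeding`, `total`, `speeding`, and `share_speeding` are introduced here purely for illustration.

```{r, results='asis'}
library(dplyr)

# All fatalities in Virginia, by year
va_all <- counts(
  df = myFARS,
  where = list(states = "VA"),
  what = "fatalities",
  interval = "year"
) %>%
  select(year, total = n)

# Speeding-related fatalities in Virginia, by year
va_speeding <- counts(
  df = myFARS,
  where = list(states = "VA"),
  what = "fatalities",
  interval = "year",
  involved = "speeding"
) %>%
  select(year, speeding = n)

# Join the two tables on year and compute the speeding share
inner_join(va_all, va_speeding, by = "year") %>%
  mutate(share_speeding = scales::percent(speeding / total, accuracy = 0.1)) %>%
  knitr::kable(format = "html")
```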
We can also use `involved = "each"` to count crashes for every factor in one state:
```{r, results='asis'}
counts(
  df = myFARS,
  where = list(states = "VA"),
  what = "crashes",
  interval = "year",
  involved = "each"
) %>%
  pivot_wider(names_from = "year", values_from = "n") %>%
  arrange(desc(`2023`)) %>%
  knitr::kable(format = "html")
```
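The reshaping step in the chunk above turns one row per year and factor into one row per factor with a column for each year, then sorts by the most recent year. The standalone sketch below shows the same two calls on a toy table. The values are made up for illustration and are not FARS results, and `toy_counts` and `toy_wide` are hypothetical names.

```{r}
library(dplyr)
library(tidyr)

# Made-up counts for illustration only; these are not FARS results
toy_counts <- tibble(
  year     = c(2022, 2022, 2023, 2023),
  involved = c("speeding", "distracted driver", "speeding", "distracted driver"),
  n        = c(10, 4, 12, 3)
)

# One row per factor, one column per year, sorted by the 2023 column.
# Backticks are needed because `2023` is not a syntactic column name.
toy_wide <- toy_counts %>%
  pivot_wider(names_from = "year", values_from = "n") %>%
  arrange(desc(`2023`))

toy_wide
```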
### Comparing Counts
We can use `compare_counts()` to quickly produce comparison graphs. Below we
compare speeding-related fatalities in rural and urban areas:
```{r}
compare_counts(
  df = myFARS,
  interval = "year",
  involved = "speeding",
  what = "fatalities",
  where = list(states = "VA", urb = "rural"),
  where2 = list(states = "VA", urb = "urban")
) %>%
  ggplot(aes(x = factor(year), y = n, label = scales::comma(n))) +
  geom_col() +
  geom_label(vjust = 1.2) +
  facet_wrap(. ~ urb) +
  labs(x = NULL, y = NULL, title = "Speeding-Related Fatalities in Virginia", fill = NULL)
```
And here we compare speeding-related crashes to those related to distraction:
```{r}
compare_counts(
  df = myFARS,
  where = list(states = "VA"),
  interval = "year",
  involved = "speeding",
  involved2 = "distracted driver",
  what = "crashes"
) %>%
  ggplot(aes(x = factor(year), y = n, label = scales::comma(n))) +
  geom_col() +
  geom_label(vjust = 1.2) +
  facet_wrap(. ~ involved) +
  labs(
    x = NULL, y = NULL,
    title = "Speeding- and Distraction-Related Crashes in Virginia",
    fill = NULL
  )
```