---
title: "Specifying alerting rules"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Specifying alerting rules}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

While visual inspection is probably the most reliable way of identifying anomalies in time series, things can be missed when there are large numbers of time series to inspect. In this case, it can be useful to additionally test the individual time series for certain conditions, such as the presence of missing values or values breaching pre-set limits, and to alert the user to them.

The alert results are displayed in different ways depending on the chosen type of output. Tabs containing time series which failed at least one alert are highlighted. A separate tab containing all the alert results is added at the end.

It is also possible to just generate the alert results as a data frame.

Note: This functionality is still in the early stages of development, and can increase the document size and rendering time of the report considerably.

## Overview

One or more alerting rules can be specified within an `alert_rules()` object, which is then passed into the `alert_rules` parameter of `alertspec()`. When creating a standard mantis report, the `alertspec` parameter of `mantis_report()` also controls how the results are displayed in the final tab.

## Built-in rules

There are a number of built-in rules that can be used to test for simple conditions, the simplest of which is testing for the presence of missing values. The `extent_type` and `extent_value` parameters can be used to adjust the tolerance. E.g.

```{r, eval=FALSE}
# alert if all values are NA
ars <- alert_rules(alert_missing(extent_type = "all"))

# alert if there are 10 or more missing values in total
# or if the last 3 or more values are missing
# or if 5 or more values in a row are missing
ars <- alert_rules(
  alert_missing(extent_type = "any", extent_value = 10),
  alert_missing(extent_type = "last", extent_value = 3),
  alert_missing(extent_type = "consecutive", extent_value = 5)
)
```

The `alert_equals()`, `alert_below()`, and `alert_above()` rules work similarly, but with an extra parameter `rule_value` to compare against.

It is also possible to compare a range of values with another range of values using the `alert_difference_above_perc()` and `alert_difference_below_perc()` rules. This can be useful for checking whether recent values are lower or higher than those in a previous period by more than a given percentage. The comparison is based on the mean of the values in each of the two periods. The ranges should be contiguous, and denote positions counted from the end of the time series. E.g.

```{r, eval=FALSE}
# alert if the mean of last 3 values is more than 20% higher than
# the mean of the preceding 12 values
ars <- alert_rules(
  alert_difference_above_perc(
    current_period = 1:3,
    previous_period = 4:15,
    rule_value = 20
  )
)
```

## Custom rules

If you want to apply a more complex rule, you can specify your own using `alert_custom()`. E.g.

```{r, eval=FALSE}
ars <- alert_rules(
  alert_custom(
    short_name = "my_rule_combo",
    description = "Over 3 missing values and max value is > 10",
    function_call = quote(
      sum(is.na(value)) > 3 && max(value, na.rm = TRUE) > 10
    )
  ),
  alert_custom(
    short_name = "my_rule_doubled",
    description = "Last value is over double the first value",
    function_call = quote(
      rev(value)[1] > 2 * value[1]
    )
  )
)
```

The `function_call` is evaluated per time series, and the expression inside `quote()` must return either `TRUE` or `FALSE`.
A return value of `TRUE` means the alert result is "FAIL".

Column names that can be used explicitly in the expression are `value` and `timepoint`, which refer to the values in the `value_col` and `timepoint_col` columns of the data respectively. Before evaluating the `function_call`, the data frame is grouped by the `item_cols` and ordered by the `timepoint_col`.

## Applying different rules to different time series

By default, the specified rules will be run on all time series in the supplied data frame. You can specify different sets of rules for different time series by using the `items` parameter, and providing it with a named list of character vectors corresponding to the `item_cols` columns and the values that the particular rule should be applied to.

## Walkthrough

Generate an interactive `mantis` report for the `example_prescription_numbers` dataset supplied with the `mantis` package.

```{r}
library(mantis)

data("example_prescription_numbers")
```

```{r, eval=FALSE}
mantis_report(
  df = example_prescription_numbers,
  file = "example_prescription_numbers_interactive.html",
  inputspec = inputspec(
    timepoint_col = "PrescriptionDate",
    item_cols = c("Antibiotic", "Spectrum", "Location"),
    value_col = "NumberOfPrescriptions",
    tab_col = "Location"
  ),
  outputspec = outputspec_interactive(
    sync_axis_range = FALSE
  ),
  report_title = "mantis report",
  dataset_description = "Antibiotic prescriptions by site",
  show_progress = TRUE
)
```

In the SITE1 tab, the Vancomycin time series contains a block of NA values, but it is not easy to distinguish between NA values and zero values unless you zoom in on the plot (by selecting a section of it with the mouse) or hover the mouse over the relevant time points and read the tooltips.

We can easily add a rule to test for any missing values. While we're here, we will also add a custom rule to try to catch the fact that Coamoxiclav prescriptions have become much higher than they were at the start.

```{r, eval=FALSE}
ars <- alert_rules(
  alert_missing(extent_type = "any", extent_value = 1),
  alert_custom(
    short_name = "my_rule_doubled",
    description = "Last 7 values are over double the first 7 values",
    function_call = quote(
      mean(rev(value)[1:7], na.rm = TRUE) >
        2 * (mean(value[1:7], na.rm = TRUE))
    )
  )
)

# create a new report
mantis_report(
  df = example_prescription_numbers,
  file = "example_prescription_numbers_interactive_alerts.html",
  inputspec = inputspec(
    timepoint_col = "PrescriptionDate",
    item_cols = c("Antibiotic", "Spectrum", "Location"),
    value_col = "NumberOfPrescriptions",
    tab_col = "Location"
  ),
  outputspec = outputspec_interactive(
    sync_axis_range = FALSE
  ),
  alertspec = alertspec(
    alert_rules = ars,
    show_tab_results = c("FAIL", "NA")
  ),
  report_title = "mantis report with alerts",
  dataset_description = "Antibiotic prescriptions by site",
  show_progress = TRUE
)
```

The report now includes a column for the alert results, and details can be seen by clicking on the triangles. The tab labels will include an exclamation mark if any items in that tab have failed an alert rule.

An additional tab is also created, which lists the alert results. In this example we have decided to exclude any alerts that have passed, by setting `show_tab_results = c("FAIL", "NA")`.

If we only wanted to apply the custom rule to broad spectrum antibiotics, we could indicate this using the `items` parameter of the rule.

```{r, eval=FALSE}
ars_restricted <- alert_rules(
  alert_missing(extent_type = "any", extent_value = 1),
  alert_custom(
    short_name = "my_rule_doubled",
    description = "Last 7 values are over double the first 7 values",
    function_call = quote(
      mean(rev(value)[1:7], na.rm = TRUE) >
        2 * (mean(value[1:7], na.rm = TRUE))
    ),
    items = list("Spectrum" = c("Broad"))
  )
)

# create a new report
mantis_report(
  df = example_prescription_numbers,
  file = "example_prescription_numbers_interactive_alerts_restricted.html",
  inputspec = inputspec(
    timepoint_col = "PrescriptionDate",
    item_cols = c("Antibiotic", "Spectrum", "Location"),
    value_col = "NumberOfPrescriptions",
    tab_col = "Location"
  ),
  outputspec = outputspec_interactive(
    sync_axis_range = FALSE
  ),
  alertspec = alertspec(
    alert_rules = ars_restricted
  ),
  report_title = "mantis report with alerts",
  dataset_description = "Antibiotic prescriptions by site",
  show_progress = TRUE
)
```

The report now shows two rules for Coamoxiclav and one rule for Metronidazole.
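
Before adding a custom rule to a report, it can help to check what its expression returns for a single series. The chunk below is a minimal sketch using base R only. The `value` vectors are made-up data rather than anything taken from `example_prescription_numbers`, and calling `eval()` directly is just one way to mimic a per-series evaluation; it is an assumption, not a description of how `mantis` evaluates `function_call` internally.

```{r, eval=FALSE}
# Hypothetical values for one time series, already ordered by timepoint
value <- c(10, 12, 11, 9, 10, 11, 10, 13, 15, 18, 22, 25, 27, 30, 31, 29, 33)

# The same expression as "my_rule_doubled" in the walkthrough
rule <- quote(
  mean(rev(value)[1:7], na.rm = TRUE) >
    2 * (mean(value[1:7], na.rm = TRUE))
)

# Evaluate the expression with `value` bound to the hypothetical series.
# Mean of the last 7 values is about 28.1, double the mean of the first 7
# is about 20.9, so the comparison is TRUE
result <- eval(rule, list(value = value))

# A series made up entirely of missing values (also hypothetical):
# both means are NaN, so the comparison is NA rather than TRUE or FALSE
result_all_missing <- eval(rule, list(value = rep(NA_real_, 14)))
```

Here `result` is `TRUE`, which would correspond to a "FAIL" for that series. The all-missing series gives `NA`, so an expression like this one may need an explicit guard if it should always return `TRUE` or `FALSE`.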