--- title: "Specifying Priors" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Specifying Priors} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} \usepackage{xcolor} \usepackage{bbding} bibliography: "`r here::here('vignettes', 'library.bib')`" --- ```{r setup, include=FALSE, message=FALSE, warning=FALSE} knitr::opts_chunk$set( collapse = TRUE, warning = FALSE, message = FALSE, fig.retina = 3, comment = "#>" ) set.seed(123) ``` Prior specifications define your assumptions about respondent preferences before collecting data. These priors serve two key purposes in **cbcTools**: optimizing experimental designs (for D-optimal methods) and simulating choices. This article shows how to create and work with prior specifications using `cbc_priors()`. # What Are Priors? Priors represent your beliefs about how attributes influence respondent choices before collecting data. They specify: - **Direction of effects**: Positive values increase utility, negative values decrease it. - **Magnitude of effects**: Larger absolute values indicate stronger preferences. - **Heterogeneity in effects**: Random parameters can be used to specify heterogeneity in preferences across the population. ## Sources of Prior Information - **Literature review**: Published studies in similar contexts. - **Expert judgment**: Domain knowledge from researchers or practitioners. - **Pilot studies**: Small preliminary studies to estimate effects. - **Previous studies**: Your own past research in related areas. - **Theoretical expectations**: Economic theory or behavioral assumptions. # Basic Prior Specification For the purposes of this article, we'll keep working with the same set of profiles about apples: ```{r} library(cbcTools) profiles <- cbc_profiles( price = c(1, 1.5, 2, 2.5, 3), type = c('Fuji', 'Gala', 'Honeycrisp'), freshness = c('Poor', 'Average', 'Excellent') ) profiles ``` ## Fixed Parameters Fixed parameters assume you know the exact coefficient values. Start with the simplest case: ```{r} # Basic fixed priors priors_fixed <- cbc_priors( profiles = profiles, price = -0.25, # Negative = prefer lower prices type = c(0.5, 1.0), # Preferences relative to reference level freshness = c(0.6, 1.2) # Preferences relative to reference level ) priors_fixed ``` ### Understanding Categorical Variables For categorical attributes, **the reference level is set by the first level defined in `cbc_profiles()`**, which in this case is `"Fuji"` for **Type** and `"Poor"` for **Freshness**. This would imply the following for the above set of priors: **Type**: - **Fuji**: coefficient = 0 (Reference level) - **Gala**: coefficient = 0.5 (preferred over Fuji) - **Honeycrisp**: coefficient = 1.0 (most preferred) **Freshness**: - **Poor**: coefficient = 0 (Reference level) - **Average**: coefficient = 0.6 - **Excellent**: coefficient = 1.2 ### Using Named Specifications You can specify the levels for categorical priors using names: ```{r} priors_named <- cbc_priors( profiles = profiles, price = -0.25, type = c("Gala" = 0.5, "Honeycrisp" = 1.0), freshness = c("Average" = 0.6, "Excellent" = 1.2) ) ``` This produces the same set of priors as the `priors_fixed` example above: ```{r} identical(priors_fixed$pars, priors_named$pars) ``` ## Random Parameters Random parameters allow for preference heterogeneity across respondents. Use `rand_spec()` to define random parameters. In the example below, we use `"n"` to specify normal distributions for the `price` and `freshness` attributes. Note that for `freshness` the `mean` and `sd` are vectors since it is a categorical attribute with three levels (the first level is the reference level): ```{r} priors_random <- cbc_priors( profiles = profiles, price = rand_spec( dist = "n", mean = -0.25, sd = 0.1 ), type = c(0.5, 1.0), freshness = rand_spec( dist = "n", mean = c(0.6, 1.2), sd = c(0.1, 0.1) ) ) priors_random ``` Three distributions are supported: - `"n"`: normal - `"ln"`: log-normal (forces positivity) - `"cn"`: censored normal (forces positivity) ## Parameter Correlations Model correlations between random parameters can be included using `cor_spec()`: ```{r} priors_correlated <- cbc_priors( profiles = profiles, price = rand_spec( dist = "n", mean = -0.1, sd = 0.05, correlations = list( cor_spec( with = "type", with_level = "Honeycrisp", value = 0.3 ) ) ), type = rand_spec( dist = "n", mean = c("Gala" = 0.1, "Honeycrisp" = 0.2), sd = c("Gala" = 0.05, "Honeycrisp" = 0.1) ), freshness = c(0.1, 0.2) ) # View the correlation matrix priors_correlated$correlation ``` ### Types of Correlations General correlation between all levels of two attributes: ```{r eval=FALSE} cor_spec( with = "type", value = -0.2 ) ``` Correlation with a specific level of a categorical attribute: ```{r eval=FALSE} cor_spec( with = "type", with_level = "Honeycrisp", value = 0.3 ) ``` Correlation from a specific level to another specific level: ```{r eval=FALSE} cor_spec( with = "freshness", level = "Gala", with_level = "Excellent", value = 0.4 ) ``` ## Interaction Effects You can include interaction terms in your priors using `int_spec()`: ```{r} # Create priors with interaction effects priors_interactions <- cbc_priors( profiles = profiles, price = -0.25, type = c("Fuji" = 0.5, "Honeycrisp" = 1.0), freshness = c("Average" = 0.6, "Excellent" = 1.2), interactions = list( # Price sensitivity varies by apple type int_spec( between = c("price", "type"), with_level = "Fuji", value = 0.1 ), int_spec( between = c("price", "type"), with_level = "Honeycrisp", value = 0.2 ), # Type preferences vary by freshness int_spec( between = c("type", "freshness"), level = "Honeycrisp", with_level = "Excellent", value = 0.3 ) ) ) priors_interactions ``` ### Interaction Types Continuous × Categorical: Must specify categorical level ```{r eval=FALSE} int_spec( between = c("price", "type"), with_level = "Fuji", value = 0.05 ) ``` Categorical × Categorical: Must specify both levels: ```{r eval=FALSE} int_spec( between = c("type", "freshness"), level = "Gala", with_level = "Excellent", value = 0.1 ) ``` Continuous × Continuous: No level specification needed: ```{r eval=FALSE} int_spec( between = c("price", "weight"), value = 0.02 ) ``` ## No-Choice Priors For designs with no-choice options, specify the no-choice utility with the `no_choice` argument: ```{r} # Fixed no-choice prior priors_nochoice_fixed <- cbc_priors( profiles = profiles, price = -0.25, type = c(0.5, 1.0), freshness = c(0.6, 1.2), no_choice = -0.5 # Negative values make no-choice less attractive ) # Random no-choice prior priors_nochoice_random <- cbc_priors( profiles = profiles, price = -0.25, type = c(0.5, 1.0), freshness = c(0.6, 1.2), no_choice = rand_spec(dist = "n", mean = -0.5, sd = 0.2) ) priors_nochoice_fixed ``` # Parameter Draws for Bayesian Analysis When you specify random parameters, `cbc_priors()` automatically generates parameter draws for Bayesian D-error calculation. You can conrol both the draw type with the `draw_type` argument (`"halton"` or `"sobol"`) and the number of draws with the `n_draws` argument, e.g.: ```{r, fig.width=6, fig.height=4, fig.alt="Histogram showing the distribution of price parameter draws. The distribution appears roughly normal, centered around -0.25, with values ranging from approximately -0.5 to 0, indicating negative price sensitivity as expected."} priors_bayesian <- cbc_priors( profiles = profiles, price = rand_spec( dist = "n", mean = -0.25, sd = 0.1 ), type = rand_spec( dist = "n", mean = c(0.5, 1.0), sd = c(0.1, 0.2) ), freshness = c(0.6, 1.2), n_draws = 500, # Default = 100 draw_type = "sobol" # Default = "halton" ) # Inspect the parameter draws price_draws <- priors_bayesian$par_draws[, 1] cat("Parameter draws dimensions:", dim(priors_bayesian$par_draws), "\n") cat("Mean of price draws:", mean(price_draws), "\n") cat("SD of price draws:", sd(price_draws), "\n") # Plot distribution of one parameter hist( price_draws, main = "Distribution of Price Parameter Draws", xlab = "Price Coefficient" ) ``` # Common Pitfalls ## Mismatched Scales Ensure prior magnitudes match your attribute scales: ```{r eval=FALSE} # Problem: Price in dollars, prior assumes price in cents profiles_dollars <- cbc_profiles(price = c(1.00, 2.00, 3.00), ...) priors_cents <- cbc_priors(profiles_dollars, price = -10, ...) # Too large! # Solution: Match scales priors_dollars <- cbc_priors(profiles_dollars, price = -0.10, ...) # Appropriate ``` ## Wrong Reference Levels Remember the first level is always the reference: ```{r} # If you want "Excellent" as reference, reorder profiles profiles_reordered <- cbc_profiles( price = c(1, 1.5, 2, 2.5, 3), type = c('Fuji', 'Gala', 'Honeycrisp'), freshness = c('Excellent', 'Average', 'Poor') # Excellent now reference ) priors_reordered <- cbc_priors( profiles_reordered, price = -0.1, type = c(0.1, 0.2), freshness = c(-0.1, -0.2) # Negative = worse than excellent ) ``` ## Incompatible Restrictions Ensure your priors are compatible with restricted profiles: ```{r eval=FALSE} # If you've restricted certain profile combinations, # make sure your priors don't assume those combinations are common restricted_profiles <- cbc_restrict( profiles, type == "Fuji" & price > 2.5 ) # Prior should reflect that expensive Fuji combinations don't exist priors_compatible <- cbc_priors(restricted_profiles, ...) ``` # Using Priors in Practice Once created, priors are used in: 1. **Design optimization**: Pass to `cbc_design()` for D-optimal methods. See the [Generating Designs](design.html) article for more details. 2. **Choice simulation**: Pass to `cbc_choices()` for realistic choice patterns. See the [Simulating Choices](choices.html) article for more details.