--- title: "multiScaleR User Guide" author: "Bill Peterman" output: rmarkdown::html_vignette: fig_width: 6.5 fig_height: 4.25 number_sections: true toc: true vignette: > %\VignetteIndexEntry{multiScaleR User Guide} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE, echo=FALSE} library(multiScaleR) # Define the base URL for the extdata directory base_url <- "https://raw.githubusercontent.com/wpeterman/multiScaleR/master/inst/extdata/" f <- c("opt_exp.RData", "opt_expow.RData", "opt_fixed.RData", "opt_umf_counts.RData", "opt_umf_occ.RData", "opt1.RData", "opt2.RData", "opt3.RData", "opt4.RData", "opt5.RData", "opt6.RData", "sim_opt.RData", "zinb.RData") # Download each file for (file in f) { download.file( url = paste0(base_url, file), destfile = file.path(tempdir(), file), mode = "wb" ) } rdat <- list.files(tempdir(), pattern = "*.RData", full.names = T) invisible(lapply(rdat, load, .GlobalEnv)) ``` # Background ## The scale-of-effect problem Many abiotic and ecological processes are shaped by variables that operate at multiple spatial scales. A central challenge in landscape ecology is determining the *scale of effect*: the spatial extent over which a landscape variable influences a biological response measured at a sampling location. Consider a practical example. You are modeling songbird abundance across a forested region, and you suspect that the proportion of forest cover in the surrounding landscape matters. But how large should that surrounding area be? Should you measure forest cover within 100 m of each survey point? 500 m? 2 km? The answer is not known in advance. It depends on how far an individual moves across the landscape, how far away habitat features can affect local conditions, and on the strength of the biological relationship itself. Traditional approaches test a fixed set of candidate buffer radii, fit a separate model for each, and compare them. This approach has two key limitations. 
First, the investigator must guess plausible scales in advance, and the true scale might fall between the tested values. Second, a fixed-width buffer treats every landscape cell within the radius equally, as if a forest patch 50 m away matters just as much as one at 499 m.

`multiScaleR` addresses both limitations by representing landscape influence as a **distance-weighted kernel**: a weighting function that assigns higher weight to nearby landscape cells and smoothly decreasing weight to cells farther away. The width of that kernel, σ (sigma, in map units), controls how rapidly influence decays with distance. Crucially, σ is estimated from the data as a parameter of the statistical model, alongside the familiar regression coefficients. This means you do not need to pre-specify the scale; the data identify it for you.

## What `multiScaleR` does

With advanced coding skills, Bayesian analytical frameworks can be used to estimate scales of effect (e.g., Stuber & Gruber 2020; Amirkhiz et al. 2023). More recently, user-friendly R packages have been developed: `Siland` (Carpentier & Martin 2021) and `Scalescape` (Lowe et al. 2022) both estimate distance-weighted landscape effects. `multiScaleR` extends this family of tools with several practical advances:

- Optimization is implemented in C++ with sparse matrix support, making it efficient for large landscapes and sample sizes.
- Standard errors and confidence intervals on σ are reported alongside point estimates.
- Raster surfaces can be smoothed at the optimized scale and used directly for spatial prediction with `terra::predict()`.
- Model selection tables (AIC, BIC) account for the additional parameters estimated during scale optimization.
- Functions to simulate rasters and response data facilitate power analysis and methods testing.

`multiScaleR` fully supports models of class `glm`, `glm.nb`, `nlme` (including `gls`), `lme4`, and `unmarked`.
It should work with any model class that has an `update` function and is a supported class in the R package [insight](https://CRAN.R-project.org/package=insight). Let me know if you encounter issues or would like to see support for other model classes. **References** - Amirkhiz, R. G., R. John, and D. L. Swanson. 2023. A Bayesian approach for multiscale modeling of the influence of seasonal and annual habitat variation on relative abundance of ring-necked pheasant roosters. Ecological Informatics 75:102003. - Carpentier, F., and O. Martin. 2021. Siland a R package for estimating the spatial influence of landscape. Scientific Reports 11:7488. - Lowe, E. B., B. Iuliano, C. Gratton, and A. R. Ives. 2022. 'Scalescape': an R package for estimating distance-weighted landscape effects on an environmental response. Landscape Ecology 37:1771--1785. - Stuber, E. F., and L. F. Gruber. 2020. Recent methodological solutions to identifying scales of effect in multi-scale modeling. Current Landscape Ecology Reports 5:127--139. ------------------------------------------------------------------------ # A Conceptual Road Map Before writing any code, it helps to see how the pieces fit together. The core workflow has four stages: ``` [Optional: simulate data] [Your field data] sim_rast() Spatial Point Data sim_dat() (GPS coordinates) sim_dat_unmarked() + | Raster Layers | (habitat covariates) | | +--------> kernel_prep() <-+ | | Computes distance-weighted raster | values at each sample point; | output feeds directly into model v [Fit initial regression model] glm() / lme4::lmer() / gls() pscl::zeroinfl() / unmarked / ... 
| v multiScale_optim() (simultaneously estimates sigma for each raster layer and regression coefficients via maximum likelihood) | +--------------+--------------+-----------+ | | | | summary() plot() plot_marginal_ aic_tab() (sigma CIs, (kernel effects() bic_tab() regression weight (covariate output) decay) effect plots) | v kernel_scale.raster() (apply optimized kernel to full raster extent; optionally scale/center and clamp values) | v terra::predict() (spatially explicit predictions across the landscape) ``` The key function is `multiScale_optim()`. It takes your fitted model and the `kernel_prep` output, then optimizes the kernel width (σ) for each spatial covariate. Everything else flows from that fitted object: model summaries, plots, model selection, and spatial projection. **Key concepts to keep in mind:** - **σ (sigma)**: The scale parameter of the kernel function (in map units, e.g., meters). It controls how quickly landscape influence decays with distance. With a Gaussian kernel and σ = 300 m, 90% of the distance-weighted landscape average at a sample point is drawn from cells within roughly 493 m; cells beyond that radius collectively account for the remaining 10%. - **`max_D`**: The maximum distance (in map units) to extract raster values for each sample point. This should exceed your expected scale of effect. If `multiScale_optim()` warns that an estimated σ is near `max_D`, re-run `kernel_prep()` with a larger value. - **Initial model**: The model you fit *before* optimization uses unsmoothed (initial σ) kernel-weighted covariate values. Its structure (formula, family, random effects) is inherited by the optimization. Get the model structure right before optimizing. # Distance-weighted Effects ## Choosing a kernel There are *many* different kernel transformations that could be used to quantify the contribution of environmental variables as a function of distance. 
The kernel function determines the *shape* of that decay: how quickly nearby cells lose influence relative to the focal point. Four kernel functions are implemented in `multiScaleR`. In each case, σ (sigma) is estimated during optimization and controls the spatial scale (width) of the kernel. All plots below show the weight assigned to a landscape cell as a function of its distance from a sample point, with a vertical line marking the distance that captures 90% of the total weight. - **Gaussian** (recommended for most uses): smooth, symmetric bell-curve decay. Governed by a single parameter σ. - **Negative Exponential**: faster initial decay than Gaussian; also governed by σ alone. - **Exponential Power**: flexible shape controlled by σ (scale) and β (shape). Experience with simulated data suggests this kernel is prone to overfitting and optimization can fail to converge; use with caution. - **Fixed buffer**: a hard-edged buffer. All cells within the buffer contribute equally; all beyond it contribute nothing. This replicates the traditional buffer-based approach. The Gaussian kernel is the default and performs well across a wide range of ecological scenarios. When in doubt, start with it. Note: Figures in this document may render at lower quality than running code locally. 1. Gaussian (`gaus`): Decay in space governed by a single parameter (sigma) ```{r echo=FALSE} plot_kernel(prob = 0.9, sigma = 100, kernel = "gaus", scale_dist = F, add_label = F) ``` 2. Negative Exponential (`exp`): Decay in space governed by a single parameter (sigma) ```{r echo=FALSE} plot_kernel(prob = 0.9, sigma = 50, kernel = "exp", scale_dist = F, add_label = F) ``` 3. Exponential Power (`expow`): Decay in space governed by a scale parameter (sigma) and a shape parameter (beta) ```{r echo=FALSE} plot_kernel(prob = 0.9, sigma = 250, beta = 5, kernel = "expow", scale_dist = F, add_label = F) ``` 4. 
Fixed width buffer (`fixed`): Effect does not decay with distance

```{r echo=FALSE}
plot_kernel(prob = 0.9,
            sigma = 300,
            kernel = "fixed",
            scale_dist = FALSE,
            add_label = FALSE)
```

# Preparing Data

Prior to using `multiScaleR` to identify distance-weighted scales of effect, data must be appropriately formatted. These steps will be demonstrated with simulated data provided with the package.

```{r}
library(multiScaleR)

## Read in data
data("landscape_counts")
dat <- landscape_counts

data("surv_pts")
pts <- vect(surv_pts)

land_rast <- terra::rast(system.file("extdata", "landscape.tif", package = 'multiScaleR'))
```

The `landscape_counts` data frame contains simulated counts from 100 spatial point locations as well as a scaled and centered site-level covariate ('site'). The `landscape.tif` file is a `SpatRaster` object consisting of three surfaces (land1 = binary habitat; land2 = continuous habitat/environment; land3 = continuous environment). The counts were simulated using the following parameters:

- Poisson process
- Intercept (alpha) of regression --\> **0.25**
- `site` - Effect --\> **0.3**
- `land1` - Effect --\> **-0.5**
  - Gaussian sigma --\> **250**
- `land2` - Effect --\> **0.7**
  - Gaussian sigma --\> **500**
- `land3` - Effect --\> **0** (does not affect counts)

## Explore Data

```{r }
summary(dat)
pts
land_rast

plot(land_rast)

## Plot with points
plot(land_rast$land1)
plot(pts, add = TRUE, pch = 19)
```

## `kernel_prep`

`kernel_prep()` is the first step of every `multiScaleR` analysis. It does two things:

1. For each sample point, it extracts raster values at every cell within `max_D` (the maximum search radius) and computes their distances. These are stored as sparse matrices for efficiency.
2. It computes an initial set of distance-weighted covariate values at each sample point using a starting σ (by default, `max_D / 3`). These initial values are used to fit the starting regression model.
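To make the weighting concrete, here is a toy base-R sketch of what a single distance-weighted value represents under a Gaussian kernel. The numbers are purely illustrative; `kernel_prep()` performs this calculation with sparse matrices across all sample points and cells:

```{r}
## Toy illustration of a Gaussian distance-weighted covariate value
sigma <- 250                        # kernel width in map units (m)
d     <- c(50, 120, 300, 450, 900)  # distances from a sample point to nearby cells
x     <- c(1, 1, 0, 1, 0)           # raster values at those cells (e.g., forest = 1)

w <- exp(-d^2 / (2 * sigma^2))      # Gaussian kernel weights
sum(w * x) / sum(w)                 # weighted average = the kernel-weighted value

## Effective distance: the radius enclosing 90% of cumulative kernel weight.
## For a Gaussian kernel this is the half-normal 90% quantile:
qnorm(0.95) * sigma                 # ~411 m for sigma = 250
```

The division by `sum(w)` is what makes the result a weighted average rather than a weighted sum. Note that `kernel_prep()` additionally scales and centers these values across sample points before model fitting, and the effective distance computed above should agree with the distances reported by `kernel_dist()`.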
**Choosing `max_D`**: Set `max_D` to a value you are confident exceeds the true scale of effect. If you are uncertain, err on the side of being too large; the optimization will find the right σ within that range. The tradeoff is computational: larger `max_D` means more cells to process, so keep it reasonable. If `multiScale_optim()` warns that a σ estimate is near `max_D`, increase `max_D` and re-run `kernel_prep()`. ```{r} kernel_inputs <- kernel_prep(pts = pts, raster_stack = land_rast, max_D = 1700, kernel = 'gaussian', verbose = FALSE) kernel_inputs ``` The `kernel_inputs` object holds the distance data and the initial kernel-weighted covariate values (in `kernel_inputs$kernel_dat`). These kernel-weighted values are scaled and centered (mean 0, SD 1 across sample points) so that regression coefficients are comparable across covariates. Next, combine the kernel-weighted covariate values with the response data to build the data frame for model fitting: ```{r} df <- data.frame(dat, kernel_inputs$kernel_dat) str(df) ``` **Fit the initial (starting) model.** This model uses the initial kernel-weighted values and establishes the model structure (formula, family, random effects) that will be inherited by `multiScale_optim()`. The coefficients from this model are only starting values; optimizing with `multiScale_optim()` will update both the regression coefficients and σ simultaneously. ```{r} mod0 <- glm(counts ~ site + land1 + land2 + land3, family = poisson(), data = df) summary(mod0) ``` # Analysis ## `multiScale_optim` We are now ready to optimize the scales of effect with `multiScale_optim()`. This function searches over σ values for each raster layer (and simultaneously updates regression coefficients) to find the combination that maximizes the likelihood of the fitted model. > **Parallelization**: For this vignette, the function is run without parallelization. 
In practice, optimization will be substantially faster when you specify `n_cores` (e.g., `n_cores = 4`). The optimization below takes approximately 50 seconds on a single core. ```{r opt1, eval=FALSE} opt1 <- multiScale_optim(fitted_mod = mod0, kernel_inputs = kernel_inputs) ``` We get two helpful warning messages: 1. The estimated σ for one raster layer is near `max_D`, meaning the optimization is pushing against the boundary of the search space. This suggests that either the true scale of effect is large, or that the variable has a weak or absent relationship with the response and optimization is poorly constrained. The warning suggests increasing `max_D` in `kernel_prep()`. 2. The standard error for one or more σ estimates is large relative to the point estimate: a sign of imprecision, often caused by weak signal or too small a sample size. Let's look at our results. ```{r} summary(opt1) ``` **How to read the `summary()` output:** The top of the output shows the estimated σ (scale parameter) for each raster layer, with standard errors and 95% confidence intervals. Below each σ estimate is the *effective distance*: the distance that captures a specified proportion (default: 90%) of the total kernel weight. This converts σ into an ecologically interpretable distance in map units (meters in this case). A Gaussian kernel with σ = 250 has 90% of its cumulative weight within approximately 412 m of the focal point. The bottom of the output is the standard model summary from your regression (GLM in this case), showing regression coefficients, standard errors, and p-values for each covariate, exactly as you would see from `summary(glm(...))`. **Confidence intervals on σ: delta method vs. profile likelihood** By default, confidence intervals on σ are computed using the **delta method**: a first-order Taylor approximation that treats the log-likelihood surface as locally quadratic near the optimum and derives the CI from the curvature (Hessian) of that surface. 
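As a sketch of the delta-method mechanics, with made-up numbers (`multiScale_optim()` derives these quantities for you from the Hessian at the optimum):

```{r}
## Delta-method 95% CI on sigma from the curvature at the optimum
## (illustrative values only, not output from a fitted model)
sigma_hat <- 250                   # optimized sigma (m)
H         <- 0.0016                # curvature (Hessian) of the negative log-likelihood
se        <- sqrt(1 / H)           # standard error from the inverse Hessian
sigma_hat + c(-1, 1) * qnorm(0.975) * se   # symmetric 95% CI: ~201 to ~299 m
```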
Delta-method CIs are fast to compute and work well when the likelihood surface is smooth and roughly symmetric around the estimate. They are reported alongside the standard error in the default `summary()` output. However, the log-likelihood surface for σ is often asymmetric, especially when the scale parameter is near `max_D`, when sample sizes are small, or when the effect size is weak. In these situations the delta-method CI can be misleading: the lower and upper bounds may not accurately reflect the true uncertainty in each direction. **Profile-likelihood CIs** address this by tracing the likelihood surface directly: for each boundary value of σ, the optimizer finds the configuration of all other parameters that maximizes the likelihood, then identifies where that profile drops by the critical chi-squared quantile (1.92 log-likelihood units for a 95% CI). Profile CIs are therefore more reliable when the surface is irregular, at the cost of additional computation. To request profile-likelihood CIs, pass `profile = TRUE` to `summary()`: ```{r eval=FALSE} summary(opt1, profile = TRUE) ``` Profile CIs are recommended whenever: - a delta-method CI is wide or asymmetric relative to the point estimate, - an estimated σ is near `max_D` (boundary constraint), - sample size is small (fewer than \~75 sites), or - you intend to report formal uncertainty bounds in a manuscript. Wide CIs on σ in either method indicate that the data do not strongly constrain the scale of effect for that variable. This may mean the variable has a weak relationship with the response, the sample size is insufficient, or the variable should be excluded from the model. For our full model: `site` has no apparent effect, `land1` has a negative effect, `land2` has a positive effect, and `land3` shows a weak positive effect. Crucially, the σ for `land3` is large *and* imprecisely estimated (wide CI). 
This is a red flag: when a variable has no true relationship with the response, the optimization often fails to find a well-defined σ and will wander toward large values. With this simulated data, we know that `land3` has no effect on counts. Let's fit a model without it. ```{r opt2} ## New model mod0_2 <- glm(counts ~ site + land1 + land2, family = poisson(), data = df) ``` ```{r eval=FALSE} ## Optimize opt2 <- multiScale_optim(fitted_mod = mod0_2, kernel_inputs = kernel_inputs) ``` No warnings! Let's look at our results. ```{r} summary(opt2) ``` The default `summary()` uses delta-method confidence intervals. Because this model is well-behaved (no boundary warnings, reasonable sample size), the delta-method CIs should be reliable here. For a more conservative check, you can compare them to profile-likelihood CIs: ```{r eval=FALSE} summary(opt2, profile = TRUE) ``` The true data-generating values were (see section 3): | Parameter | True value | Recovery? | |-----------|------------|-----------| | `land1` σ | 250 m | Check CI | | `land2` σ | 500 m | Check CI | | `land1` β | −0.5 | Check CI | | `land2` β | 0.7 | Check CI | Overall, we do a good job recovering both the regression coefficients (β) and the scales of effect (σ) within reasonable uncertainty bounds. This is reassuring but also a good reminder: with real data, you will not have ground truth to check against. Careful model selection and biological reasoning are your guides. **Visualizing the kernel decay.** The `plot()` method shows how the influence of each landscape variable falls off with distance. The vertical line marks the distance enclosing 90% of the cumulative kernel weight (the *effective distance*). ```{r} ## Kernel function plot(opt2) ## Kernel function; 99% contribution plot(opt2, prob = 0.99) ``` **Visualizing marginal covariate effects.** `plot_marginal_effects()` shows the estimated effect of each covariate on the response, holding other covariates at their mean values. 
These are the standard partial effect plots from the regression model, applied to the optimally-scaled covariate values. ```{r} plot_marginal_effects(opt2) ``` **Profiling the likelihood surface.** `profile_sigma()` provides a diagnostic view of how the model fit changes as σ varies across its search range for each covariate, while all other σ values are held at their optimized values. Rather than reporting a single point estimate with an interval, it shows the full shape of the AICc (or log-likelihood) surface across the parameter space. This is useful in several situations: - **Verifying a well-defined optimum**: A clear trough in the AICc surface centered near the optimized σ suggests that the scale of effect is precisely estimated. A flat or monotone profile indicates the data do not strongly constrain σ for that variable. - **Asymmetry diagnostics**: If the surface drops steeply on one side and gradually on the other, the likelihood is asymmetric and delta-method CIs may not capture the uncertainty accurately. Profile-likelihood CIs (via `summary(..., profile = TRUE)`) are more appropriate in this case. - **Boundary concerns**: When the trough is near `max_D`, the surface may be truncated by the search boundary rather than identified by the data. This is a visual confirmation of the warning message from `multiScale_optim()`. Intervals are log-spaced so that evaluation is denser at smaller σ values, where the surface tends to change more rapidly, and sparser at larger values where it is typically flatter. ```{r sigma-profile, fig.cap = "AICc profile across sigma for each spatial covariate. The red dashed line marks the optimized sigma."} prof2 <- profile_sigma(opt2, n_pts = 15, verbose = FALSE) plot(prof2) ``` For `opt2`, the profile confirms a clear, well-defined minimum in AICc near the optimized σ for both `land1` and `land2`, consistent with the absence of boundary warnings and the narrow confidence intervals reported by `summary()`. 
Compare this to the full model (`opt1`), which included the uninformative `land3` covariate: that profile would show a shallow or absent trough, reflecting the poor identifiability of σ when a variable has no true relationship with the response. We can also apply the optimized kernel to the full raster surfaces. This is a key step if you want to generate spatially explicit predictions across the landscape. `kernel_scale.raster()` smooths each raster layer using the estimated σ from the optimization. When `scale_center = TRUE`, the smoothed values are standardized using the mean and SD from the sample points, matching the scaling used during model fitting. This ensures that `terra::predict()` receives covariate values in the same space as the training data. ```{r kernel_raster} rast_opt <- kernel_scale.raster(raster_stack = land_rast, multiScaleR = opt2) plot(rast_opt) ``` > **Note**: Because `site` is a non-spatial covariate (a site-level variable, not a raster), a constant dummy raster is created for it. This allows the `spatRaster` object to be passed directly to `terra::predict()`, which requires all model covariates to be present. See the `Spatial Projection and Clamping` vignette for a detailed treatment of projection, including how to handle cases where projected covariate values fall outside the range of the training data. ## Kernels Data were simulated from a Gaussian kernel, but we can optimize using other kernels and compare model performance. Starting values had to be specified when using the exponential power kernel due to convergence issues. ```{r other_kernel, eval=FALSE} ## Negative Exponential ## Note: these kernel_prep objects are pre-loaded in the setup chunk. ## eval=FALSE keeps the vignette from re-running these computationally ## intensive calls during knitting. 
exp_inputs <- kernel_prep(pts = pts,
                          raster_stack = land_rast,
                          max_D = 1700,
                          kernel = 'exp',
                          verbose = FALSE)

## Exponential Power
expow_inputs <- kernel_prep(pts = pts,
                            raster_stack = land_rast,
                            max_D = 1700,
                            kernel = 'expow',
                            verbose = FALSE)

## Fixed width buffer
fixed_inputs <- kernel_prep(pts = pts,
                            raster_stack = land_rast,
                            max_D = 1700,
                            kernel = 'fixed',
                            verbose = FALSE)
```

Optimize the model with each alternative kernel:

```{r eval=FALSE}
opt_exp <- multiScale_optim(fitted_mod = mod0_2,
                            kernel_inputs = exp_inputs)

## Starting values needed
opt_expow <- multiScale_optim(fitted_mod = mod0_2,
                              kernel_inputs = expow_inputs,
                              par = c(500/expow_inputs$unit_conv,
                                      500/expow_inputs$unit_conv,
                                      5, 10))

opt_fixed <- multiScale_optim(fitted_mod = mod0_2,
                              kernel_inputs = fixed_inputs)
```

If you explore each of the outputs, you'll see that we generally arrive at similar results. There are some warnings, but we won't worry about those right now. Next, we'll compare the relative support of the models fit with the different weighting kernels.

# Model Selection

## Why standard AIC is not enough

When you optimize σ, you are estimating an additional parameter beyond the regression coefficients. Standard AIC from the base model does not account for this extra complexity. `multiScaleR` uses AICc (corrected for small samples) and BIC tables that account for the σ parameters in the penalty term. This is important: comparing models with `aic_tab()` or `bic_tab()` gives a fair comparison across models with different numbers of spatial covariates. With `multiScaleR` it is possible to create AIC(c) or BIC tables from lists of fitted models.

```{r}
mod_list <- list(opt2, opt_exp, opt_expow, opt_fixed)

## AIC table
aic_tab(mod_list)

## BIC table
bic_tab(mod_list)
```

The questionably-fitting exponential power model is best-supported by AICc ranking, while the Gaussian model is best-supported by BIC.
Because of the added complexity (an extra parameter) of the exponential power kernel, its uncertain optimization, and its general equivalence to the other models, we should probably lean toward the simpler Gaussian kernel model. This example highlights the challenges that may be encountered when trying to parse different kernels for estimating scales of effect. From observations of performance with simulated data, use of different kernels tends to result in a similar final model and interpretation of variable effects. This, in part, is why the Gaussian kernel is the default, and it will likely meet most researchers' needs.

Of greater interest is using model selection to identify the best-supported model in terms of parameterization (not kernel used).

```{r}
## Landscape only effect
mod0_3 <- glm(counts ~ land1 + land2,
              family = poisson(),
              data = df)

## Landscape 1 only effect
mod0_4 <- glm(counts ~ land1,
              family = poisson(),
              data = df)

## Landscape 2 only effect
mod0_5 <- glm(counts ~ land2,
              family = poisson(),
              data = df)

## Landscape 3 only effect
mod0_6 <- glm(counts ~ land3,
              family = poisson(),
              data = df)

## Site only effect
## No multiScaleR optimization
mod0_7 <- glm(counts ~ site,
              family = poisson(),
              data = df)
```

Optimize scale for each alternative model:

```{r eval=FALSE}
opt3 <- multiScale_optim(fitted_mod = mod0_3,
                         kernel_inputs = kernel_inputs)

opt4 <- multiScale_optim(fitted_mod = mod0_4,
                         kernel_inputs = kernel_inputs)

opt5 <- multiScale_optim(fitted_mod = mod0_5,
                         kernel_inputs = kernel_inputs)

opt6 <- multiScale_optim(fitted_mod = mod0_6,
                         kernel_inputs = kernel_inputs)
```

Put the models into a list and assess:

```{r}
mod_list2 <- list(opt1, opt2, opt3, opt4, opt5, opt6, mod0_7)

aic_tab(mod_list2)

bic_tab(mod_list2)
```

Model selection tables clearly show that the inclusion of scaled landscape variables is important; however, information criteria may not have the greatest resolution or power to differentiate among competing models.
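For reference, the small-sample correction underlying `aic_tab()` is the standard AICc penalty, with each optimized σ counted as an estimated parameter. A hand computation with illustrative numbers (the helper function here is for demonstration only; `aic_tab()` does this bookkeeping for you):

```{r}
## AICc with sigma included in the parameter count k
aicc_by_hand <- function(logLik, k, n) {
  aic <- -2 * logLik + 2 * k
  aic + (2 * k * (k + 1)) / (n - k - 1)
}

## e.g., a Poisson GLM with 3 regression coefficients plus 2 optimized
## sigma values -> k = 5, for n = 100 survey points
aicc_by_hand(logLik = -250, k = 5, n = 100)   # ~510.6
```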
Analyses will likely require a critical, holistic assessment of information criteria, parameter effect sizes, and precision in scale of effect estimates (i.e., sigma). The challenge of identifying 'significant' scale relationships was also noted by Lowe et al. (2022), and the R package `scalescape` has a bootstrap function to determine the significance of parameters within the regression model. Such procedures are not currently implemented in `multiScaleR`.

# Optimization with `unmarked`

Among the model classes that `multiScaleR` can be used to optimize are those from `unmarked`. We will use the simulation functions from the package to create data for this analysis. First, we will simulate and visualize raster surfaces.

```{r}
rs <- sim_rast(user_seed = 999, dim = 250)
plot(rs)
```

## Poisson Count Model

We'll use the `bin1` and `cont2` surfaces for our data simulation. First, we'll simulate count data with a Poisson distribution. For the `unmarked` data simulation, we need to specify the count model intercept (`alpha`), the regression coefficients that describe the effect of the raster layers (`beta`), the scale of effect (`sigma`), the number of survey points on the landscape (`n_points`), the number of replicate surveys conducted (`n_surv`), the detection probability (`det`; here, the probability of detecting an individual), and the maximum distance to consider in the analysis (`max_D`).

```{r}
rs <- terra::subset(rs, c(1,4))

s_dat <- sim_dat_unmarked(alpha = 1,
                          beta = c(0.75, -0.75),
                          kernel = 'gaussian',
                          sigma = c(75, 150),
                          n_points = 75,
                          n_surv = 5,
                          det = 0.5,
                          type = 'count',
                          raster_stack = rs,
                          max_D = 550,
                          user_seed = 555)

plot(s_dat$df$y ~ s_dat$df$bin1)
plot(s_dat$df$y ~ s_dat$df$cont2)
```

We can see that we have simulated a positive effect of `bin1` on counts and a negative effect of `cont2` on counts. We can now prepare data for `unmarked` and optimize the scale of effect with `multiScaleR`.
```{r} library(unmarked) kernel_inputs <- kernel_prep(pts = s_dat$pts, raster_stack = rs, max_D = 550, kernel = 'gaus', verbose = FALSE) umf <- unmarkedFramePCount(y = s_dat$y, siteCovs = kernel_inputs$kernel_dat) ## Base unmarked model mod0_umf.p <- unmarked::pcount(~1 ~bin1 + cont2, data = umf, K = 100) ``` ```{r eval=FALSE} opt_umf.p <- multiScale_optim(fitted_mod = mod0_umf.p, kernel_inputs = kernel_inputs, n_cores = 8) ``` Compare optimized results with simulated values. Note that the detection reported by `unmarked` is on the logit scale, so must be back-transformed. Overall, both the scale parameters and the Poisson count model parameters were accurately recovered. ```{r} summary(opt_umf.p) plogis(opt_umf.p$opt_mod@estimates@estimates$det@estimates[[1]]) ``` Assess kernel and effects plots ```{r} plot(opt_umf.p) plot_marginal_effects(opt_umf.p) ``` ### Project model Finally, we may want to project our fitted model to the landscape to make spatial predictions. To do this, we need to scale and center our raster surfaces after applying our kernel transformations. The `spatRaster` object can then be used with the `terra::predict` function. Depending upon the spatial coverage of your sample points, you may find predicted values across a broader region unrealistic. In these instances, you may need to clamp the scaled rasters to be within a reasonable range of those observed in your sample data. ```{r} ## Project model rast_scale.center <- kernel_scale.raster(raster_stack = rs, multiScaleR = opt_umf.p, scale_center = TRUE, clamp = TRUE, pct_mx = 0.00) plot(rast_scale.center) ab.mod_pred <- terra::predict(rast_scale.center, opt_umf.p$opt_mod, type = 'state') plot(ab.mod_pred) ``` ## Binomial Occurrence Model Now we'll simulate occurrence data suitable for analysis with `unmarked`. Preliminary experience has shown that simulation parameters can be harder to recover when using an occupancy model in `unmarked` with `multiScaleR`. We'll use the same raster surfaces. 
```{r}
s_dat.occ <- sim_dat_unmarked(alpha = 0.25,
                              beta = c(-0.75, 1),
                              kernel = 'gaussian',
                              sigma = c(225, 75),
                              n_points = 250,
                              n_surv = 5,
                              det = 0.5,
                              type = 'occ',
                              raster_stack = rs,
                              max_D = 800,
                              user_seed = 555)

plot(s_dat.occ$df$y ~ s_dat.occ$df$bin1)
plot(s_dat.occ$df$y ~ s_dat.occ$df$cont2)
```

Prepare inputs for use with `multiScaleR`.

```{r}
kernel_inputs <- kernel_prep(pts = s_dat.occ$pts,
                             raster_stack = rs,
                             max_D = 800,
                             kernel = 'gaus',
                             verbose = FALSE)

## Occupancy frame
umf <- unmarkedFrameOccu(y = s_dat.occ$y,
                         siteCovs = kernel_inputs$kernel_dat)

## Base unmarked model
(mod0_umf.occ <- unmarked::occu(~1 ~bin1 + cont2, data = umf))
```

```{r eval=FALSE}
opt_umf.occ <- multiScale_optim(fitted_mod = mod0_umf.occ,
                                kernel_inputs = kernel_inputs,
                                n_cores = 8)
```

```{r}
summary(opt_umf.occ)

plogis(opt_umf.occ$opt_mod@estimates@estimates$det@estimates[[1]])
```

We did a decent job of recovering the simulated values. Note that the sample size was increased for this simulation. Previous explorations suggested that smaller sample sizes (e.g., 100) did not contain enough information to optimize the scales of effect (`sigma`). There is opportunity for further exploration of the data needs and limitations for estimating scales of effect, especially for more complex models such as those fit with `unmarked`.

### Project model

```{r}
## Project model
rast_scale.center <- kernel_scale.raster(raster_stack = rs,
                                         multiScaleR = opt_umf.occ,
                                         scale_center = TRUE,
                                         clamp = TRUE)
plot(rast_scale.center)

occ.mod_pred <- terra::predict(rast_scale.center,
                               opt_umf.occ$opt_mod,
                               type = 'state')
plot(occ.mod_pred)
```

# Other model classes

`multiScaleR` has been developed to work with any model class that has an `update()` function. It has been tested with models fit using several different packages. If a particular model class does not work, please let me know so I can troubleshoot and try to incorporate functionality for it.
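If you are unsure whether a particular model class will work, two quick checks correspond to the stated requirements: the class must be recognized by the `insight` package, and `update()` must be able to refit the model. A minimal sketch, using a base `glm` purely as a stand-in for your model class:

```{r eval=FALSE}
library(insight)

mod <- glm(mpg ~ wt, data = mtcars)   # any fitted model of the class in question
insight::is_model_supported(mod)      # TRUE if insight recognizes the model class
update(mod, . ~ .)                    # should refit without error
```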
In the example below, I demonstrate fitting a zero-inflated model with the `pscl` package. Notably, the zero-inflated term is a spatial variable that is also smoothed during the analysis. This requires some manual processing to simulate data. ```{r} set.seed(12345) rs <- sim_rast(user_seed = 999, dim = 500, resolution = 30) rs <- terra::subset(rs, c(1, 3)) ## Zero-inflated parameters zi_alpha <- -0.25 zi_beta <- -1 n_points <- 400 alpha <- 0.5 beta <- 0.75 kernel <- 'gaussian' sigma <- c(400, # For main effect, bin1 200) # For zero-inflated, cont1 StDev <- 2 # Negative binomial dispersion ## Generate random points bnd <- buffer(vect(ext(rs[[1]])), -1000) pts <- spatSample(x = rs[[1]], size = n_points, method = 'random', ext = ext(bnd), replace = FALSE, as.points = TRUE) kernel_out <- kernel_prep(pts = pts, raster_stack = rs, max_D = 1500, kernel = kernel, sigma = sigma, verbose = FALSE) ## ZINB simulation zi_prob <- plogis(zi_alpha + zi_beta*kernel_out$kernel_dat$cont1) # Simulate zero-inflated counts # 1 = excess zero, 0 = from the negative binomial count process is_zero <- rbinom(length(zi_prob), size = 1, prob = zi_prob) obs <- exp(alpha + beta*kernel_out$kernel_dat$bin1) obs.zinb <- ifelse(is_zero, 0, rnbinom(length(is_zero), mu = obs, size = StDev)) dat <- data.frame(cnt = obs.zinb, kernel_out$kernel_dat) with(dat, plot(cnt ~ bin1)) ``` With the data generated, we can now fit our zero-inflated model, prepare inputs with `kernel_prep`, and optimize with `multiScaleR`. ```{r} library(pscl) zinb_mod <- pscl::zeroinfl(cnt ~ bin1 | cont1, dist = 'negbin', data = dat) kernel_inputs <- kernel_prep(pts = pts, kernel = kernel, max_D = 1500, raster_stack = rs, verbose = FALSE) ``` **NOTE: It is highly encouraged to fit your model with the namespace-qualified call, `pscl::zeroinfl()`, as done above.** ```{r eval=FALSE} zinb_opt <- multiScale_optim(zinb_mod, kernel_inputs, n_cores = 4) ``` We did a pretty good job of recovering the parameters in this simple scenario. Note the larger sample size; this was necessary to get good estimates for all parameters.
Also, while `cont1` was estimated relatively accurately, there is much greater uncertainty compared to `bin1`. ```{r} summary(zinb_opt) plot(zinb_opt) plot_marginal_effects(zinb_opt) ``` # Other Functions & Features Below is a brief walk-through of the other functions related to scales of effect that are included with the `multiScaleR` package. ## `kernel_dist` In some cases, a direct assessment of the scale-of-effect distance is desired. ```{r} ## With fitted model kernel_dist(opt2) ## Distance of 95% kernel kernel_dist(opt2, prob = 0.95) ``` It's also possible to calculate the distance encompassed by specified kernels without a fitted model. ```{r} kernel_dist(kernel = "gaussian", sigma = 100, prob = 0.9) kernel_dist(kernel = "exp", sigma = 100, prob = 0.9) kernel_dist(kernel = "expow", sigma = 100, beta = 5, prob = 0.9) ``` ## `plot_kernel` This is a generic function to visualize how kernel weight decays with distance. ```{r} plot_kernel(kernel = 'exp', sigma = 50) plot_kernel(kernel = 'expow', beta = 5, sigma = 50) plot_kernel(kernel = 'gaussian', sigma = 50) plot_kernel(kernel = 'gaussian', sigma = 50, scale_dist = FALSE) plot_kernel(kernel = 'gaussian', sigma = 50, add_label = FALSE) ``` ## `sim_rast` Raster surfaces can be simulated that are amenable for use with the `sim_dat` function. ```{r error=FALSE, message=FALSE} r_sim1 <- sim_rast(dim = 100, resolution = 30, user_seed = 555) r_sim2 <- sim_rast(dim = 100, resolution = 30, autocorr_range1 = 100, autocorr_range2 = 1, sill = 10, user_seed = 555) plot(r_sim1) plot(r_sim2) ``` ## `sim_dat` This function will simulate data from scaled raster surfaces.
```{r} s_dat <- sim_dat(alpha = 0.25, beta = c(0.75, -0.75), kernel = 'gaussian', sigma = c(350, 200), type = 'count', n_points = 100, raster_stack = terra::subset(r_sim1, c(1, 4)), min_D = 250, user_seed = 999) ``` Look at the simulated covariate relationships with the counts. ```{r} plot(s_dat$df$y ~ s_dat$df$bin1) plot(s_dat$df$y ~ s_dat$df$cont2) ``` Use the simulated data with `multiScaleR`. ```{r} sim_mod <- glm(y ~ bin1 + cont2, family = 'poisson', data = s_dat$df) kernel_inputs <- kernel_prep(pts = s_dat$pts, raster_stack = r_sim1, max_D = 1500, kernel = 'gaussian', verbose = FALSE) ``` ```{r eval=FALSE} sim_opt <- multiScale_optim(fitted_mod = sim_mod, kernel_inputs = kernel_inputs) ``` Check the results. ```{r} summary(sim_opt) plot(sim_opt) plot(kernel_scale.raster(raster_stack = r_sim1, multiScaleR = sim_opt)) plot_marginal_effects(sim_opt) ```
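To make the distance-weighting idea concrete, the Gaussian kernel can be sketched in a few lines of base R. This is a minimal illustration, not the package's internal implementation: under a Gaussian kernel, the weight at distance `d` is `exp(-d^2 / (2 * sigma^2))`, and if that weight is integrated over two-dimensional area, the radius enclosing a proportion `p` of the total weight is `sigma * sqrt(-2 * log(1 - p))`.

```r
## Minimal sketch (not multiScaleR internals): Gaussian distance weighting
gaus_wt <- function(d, sigma) exp(-d^2 / (2 * sigma^2))

## Weight decays smoothly with distance from the sampling point
gaus_wt(c(0, 50, 100, 200), sigma = 100)

## Radius enclosing a proportion p of total 2-D kernel weight
gaus_d <- function(sigma, p) sigma * sqrt(-2 * log(1 - p))
gaus_d(sigma = 100, p = 0.9)  # ~214.6 map units
```

This helps explain why `sigma` alone understates the footprint of a covariate: 90% of a Gaussian kernel's influence extends to roughly 2.15 times `sigma`.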