
Deviance Residuals

Recall that the likelihood of a model is the probability of the data set given the model, $P(D|\theta)$.

The deviance of a model is defined by

$$D(\theta, D) = 2\left(\log(P(D|\theta_s)) - \log(P(D|\theta))\right)$$

where $\theta_s$ is the saturated model, which is so named because it perfectly fits the data.
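
To make the definition concrete, here is a minimal R sketch that computes the deviance of a model that simply predicts the mean, assuming normally distributed errors with a known $\sigma$ (the data values and $\sigma = 0.5$ are made up for illustration):

y <- c(1, 2, 3)
mu <- mean(y) # the model's predictions
sigma <- 0.5

# log-likelihood of the saturated model (predictions equal the data)
log_lik_saturated <- sum(dnorm(y, y, sigma, log = TRUE))
# log-likelihood of the fitted model
log_lik_model <- sum(dnorm(y, mu, sigma, log = TRUE))

2 * (log_lik_saturated - log_lik_model)
#> [1] 8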

In the case of normally distributed errors, the likelihood for a single prediction ($\mu_i$) and data point ($y_i$) is given by

$$P(y_i|\mu_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{y_i - \mu_i}{\sigma}\right)^2\right)$$

and the log-likelihood by

$$\log(P(y_i|\mu_i)) = -\log(\sigma) - \frac{1}{2}\log(2\pi) - \frac{1}{2}\left(\frac{y_i - \mu_i}{\sigma}\right)^2$$

The log-likelihood for the saturated model, which is when $\mu_i = y_i$, is therefore simply

$$\log(P(y_i|\mu_{s_i})) = -\log(\sigma) - \frac{1}{2}\log(2\pi)$$
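
Both expressions can be checked numerically against dnorm() (using the same illustrative values, $y_i = 1$, $\mu_i = 2$ and $\sigma = 0.5$, as the confirmation below):

y <- 1
mu <- 2
sigma <- 0.5

# log-likelihood of the model
dnorm(y, mu, sigma, log = TRUE)
#> [1] -2.225791
-log(sigma) - 1/2 * log(2 * pi) - 1/2 * ((y - mu) / sigma)^2
#> [1] -2.225791

# log-likelihood of the saturated model (mu_i = y_i)
dnorm(y, y, sigma, log = TRUE)
#> [1] -0.2257913
-log(sigma) - 1/2 * log(2 * pi)
#> [1] -0.2257913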

It follows that the unit deviance is

$$d_i = 2\left(\log(P(y_i|\mu_{s_i})) - \log(P(y_i|\mu_i))\right)$$

$$d_i = 2\left(\frac{1}{2}\left(\frac{y_i - \mu_i}{\sigma}\right)^2\right)$$

$$d_i = \left(\frac{y_i - \mu_i}{\sigma}\right)^2$$
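
With the same values as above, this identity between the unit deviance and the squared Pearson residual can be checked directly in base R:

((y - mu) / sigma)^2
#> [1] 4
2 * (dnorm(y, y, sigma, log = TRUE) - dnorm(y, mu, sigma, log = TRUE))
#> [1] 4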

As the deviance residual is the signed square root of the unit deviance,

$$r_i = \operatorname{sign}(y_i - \mu_i)\sqrt{d_i}$$

in the case of normally distributed errors we arrive at

$$r_i = \frac{y_i - \mu_i}{\sigma}$$

which is the Pearson residual.

To confirm this, consider a normal distribution with $\hat{\mu} = 2$ and $\sigma = 0.5$ and a value of $y = 1$.

library(extras)
mu <- 2
sigma <- 0.5
y <- 1

# Pearson residual
(y - mu) / sigma
#> [1] -2
# deviance residual via extras::dev_norm()
dev_norm(y, mu, sigma, res = TRUE)
#> [1] -2
# signed square root of the unit deviance
sign(y - mu) * sqrt(dev_norm(y, mu, sigma))
#> [1] -2
# from first principles via the saturated and model log-likelihoods
sign(y - mu) * sqrt(2 * (log(dnorm(y, y, sigma)) - log(dnorm(y, mu, sigma))))
#> [1] -2
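
The equality with the Pearson residual is specific to the normal distribution. As a sketch of the contrast, consider the same comparison for a Poisson distribution, where the saturated model has $\lambda_i = y_i$ (this assumes the extras package's dev_pois() takes the same res argument as dev_norm()):

lambda <- 2
y <- 1

# Pearson residual
(y - lambda) / sqrt(lambda)
#> [1] -0.7071068
# deviance residual from first principles (saturated model: lambda_i = y_i)
sign(y - lambda) * sqrt(2 * (dpois(y, y, log = TRUE) - dpois(y, lambda, log = TRUE)))
#> [1] -0.7833936
# deviance residual via dev_pois() (assumed API, mirroring dev_norm())
dev_pois(y, lambda, res = TRUE)
#> [1] -0.7833936

Here the deviance and Pearson residuals differ, as they do in general for non-normal error distributions.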