% File apc/vignettes/CheckPrediction.Rnw
% Part of the apc package, http://www.R-project.org
% Copyright 2015 Bent Nielsen
% Distributed under GPL 3

%\VignetteIndexEntry{Identification: illustrate and check identification used in plot fit function}

\documentclass[a4paper,twoside,12pt]{article}

\usepackage[english]{babel}
\usepackage{booktabs,rotating,graphicx,amsmath,verbatim,fancyhdr,Sweave}
\usepackage[colorlinks,linkcolor=red,urlcolor=blue]{hyperref}
\newcommand{\R}{\textsf{\bf R}}
\renewcommand{\topfraction}{0.95}
\renewcommand{\bottomfraction}{0.95}
\renewcommand{\textfraction}{0.1}
\renewcommand{\floatpagefraction}{0.9}
\DeclareGraphicsExtensions{.pdf,.jpg}
\setcounter{secnumdepth}{1}
\setcounter{tocdepth}{1}

\oddsidemargin 1mm
\evensidemargin 1mm
\textwidth 160mm
\textheight 230mm
\topmargin -5mm
\headheight 8mm
\headsep 5mm
\footskip 15mm

\begin{document}
\SweaveOpts{concordance=TRUE}

\raggedleft
\pagestyle{empty}
\vspace*{0.1\textheight}
\Huge
{\bf Identification in \texttt{apc} package}
\noindent\rule[-1ex]{\textwidth}{5pt}\\[2.5ex]
\Large
12 April 2015
\vfill
\normalsize
\begin{tabular}{rl}
 Bent Nielsen  & Department of Economics, University of Oxford \\
               & \small \& Nuffield College \\
               & \small \& Institute for Economic Modelling \\
               & \normalsize \texttt{bent.nielsen@nuffield.ox.ac.uk} \\
               & \url{http://users.ox.ac.uk/~nuff0078}
\end{tabular}
\normalsize
\newpage
\raggedright
\parindent 3ex
\parskip 0ex
\tableofcontents
\cleardoublepage
\setcounter{page}{1}
\pagestyle{fancy}
\renewcommand{\sectionmark}[1]{\markboth{\thesection #1}{\thesection \ #1}}
\fancyhead[OL,ER]{\sl Identification in \texttt{apc} package.}
%\fancyhead[ER]{\sl \rightmark}
\fancyhead[EL,OR]{\bf \thepage}
\fancyfoot{}
\renewcommand{\headrulewidth}{0.1pt}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}

The
\texttt{apc} package
works with a number of parametrisations.
The estimation is done in terms of the canonical parameter defined in
Kuang, Nielsen and Nielsen (2008).
This parametrisation is invariant to the group of transformations characterising
the identification problem.

The plots of the fit do, however, allow the display of a two ad hoc identifications of
the time effects.  These time effects are not invariant, but may nonetheless be helpful in
applications.  The purpose with the vignette is to demonstrate that these parametrisations
are equivalent. This is done by showing that they give the same fit.

The two ad hoc identifications are as follows.

The first, "sum.sum", displays the components that appear in the representation theorem in
Nielsen (2014),
which is a development of the representation in 
Kuang, Nielsen and Nielsen (2008).

The second, "detrend" is possibly more useful in practice.
This is the default option in
\texttt{apc.plot.fit}.
It displays detrended versions of
the double sums, so that they start and end in zero.  Thus, it displays the cumulated deviations from
linearity arising from the double difference parameters in canonical parameter.
Moreover, the three time effects are treated symmetrically so that the identification of one time
trend has no bearing on the other time trends. This is based on the development in
Nielsen (2014).

The two ad hoc identifications have the feature that the number of non-zero elements 
is the same as the number of elements in the canonical parameter.

We use the Belgian lung cancer data from
Clayton and Schifflers (1987a)
as an example.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Ad hoc identification of the time effects}

The age-period-cohort model is given by
\begin{equation}
    \mu_{ik}
    =\alpha_i
    +\beta_j
    +\gamma_k
    +\delta
    ,
\end{equation}
for age $i$, cohort $k$ and period $j=i+k-1$.
The time effects
$\alpha_i$,
$\beta_j$,
$\gamma_k$
are not fully identified.

\textbf{Canonical parameter.}
The
\texttt{apc} package works around the
canonical parameter
from
Kuang, Nielsen, Nielsen (2008)
\begin{equation}
    \xi = (\mu_{UU}, \mu_{U+1,U} , \mu_{U,U+1},
        \dots , \Delta^2\alpha_i , \dots ,
        \dots , \Delta^2\beta _j , \dots ,
        \dots , \Delta^2\gamma_k , \dots )'
    .    
\end{equation}
The design matrix is defined from the following representation
taken from
Nielsen (2014)
\begin{equation}
    \mu_{ik}
    =
    \nu_0
    +(i-U)\nu_a
    +(k-U)\nu_c
    +A_i
    +B_j
    +C_k
    \label{mu:xi}
\end{equation}
where
$j=i+k-1$
and
$U=\textrm{integer}\{(L+3)/2\}$
while
\begin{eqnarray}
    A_i
    &=&
        1_{(i<U)}   \sum_{t=i+2}^{U+1}\sum_{s=t}^{U+1}\Delta^2\alpha_s
    +   1_{(i>U+1)} \sum_{t=U+2}^i    \sum_{s=U+2}^t  \Delta^2\alpha_s
    \label{A}
\\  
    B_j
    &=&
        1_{(L\text{ odd \& }j=2U-2)}\Delta^2\beta_{2U}
    +   1_{(j>2U)}  \sum_{t=2U+1}^j   \sum_{s=2U+1}^t  \Delta^2\beta_s
    \label{B}
\\  
    C_k
    &=&
        1_{(k<U)}   \sum_{t=k+2}^{U+1}\sum_{s=t}^{U+1}\Delta^2\gamma_s
    +   1_{(k>U+1)} \sum_{t=U+2}^k    \sum_{s=U+2}^t  \Delta^2\gamma_s
    \label{C}
\end{eqnarray}
Six of these coefficients are zero. That is
$A_U=A_{U+1}=C_U=C_{U+1}=0$,
if $L$ is odd  then $B_{2U+1}=B_{2U+2}=0$, whereas
if $L$ is even then $B_{2U}=B_{2U+1}=0$.
We get the canonical parameter through the command
<<>>=
# attach apc library
library(apc)
# get data from precoded function
data <- data.Belgian.lung.cancer()
# Estimate APC model
model.family <- "poisson.dose.response"
model.design <- "APC"
fit <- apc.fit.model(data,model.family,model.design)
c.c <- fit$coefficients.canonical
@

\textbf{ss.dd}
or
\textbf{sum.sum}
parametrisation.
We can also parametrise in terms of the coefficients $A_i$, $B_j$, $C_k$.
These coefficients are ad hoc identified versions of 
the time effects
$\alpha_i$,
$\beta_j$,
$\gamma_k$.
We get these parameters through the command
\texttt{apc.identify}.
Note, that this command is called indirectly by
\texttt{apc.plot.fit},
so there is no need to use this command when working with plots only.
The command is
<<>>=
id <- apc.identify(fit)
c.ssdd <- id$coefficients.ssdd
@

\textbf{detrend}
parametrisation.
In practise it is easier to see deviations from linearity through a different
ad hoc identification where the time effects are constrained to start and end in zero;
see
Nielsen (2014b)
for discussion of the interpretation.
From $A_i$, $B_j$ and $C_k$ defined in
(\ref{A})-(\ref{C})
we form
\begin{eqnarray}
    A^{detrend}_i
    &=&
    A_i-A_1 - \frac{i-1}{I-1}(A_I-A_1)
    ,
    \label{Adetrend}
\\
    B^{detrend}_{L+j}
    &=&
    B_{L+j}-B_{L+1} - \frac{j-1}{J-1}(B_{L+J}-B_{L+1})
    ,
\\
    C^{detrend}_k
    &=&
    C_k-C_1 - \frac{k-1}{K-1}(C_K-C_1)
    .
    \label{Cdetrend}
\end{eqnarray}
The constants and linear slopes are corrected correspondingly giving
\begin{eqnarray}
    \nu_0^{detrend}
    &=&
    \nu_0 + A_1 + B_{L+1} - \frac{L}{J+1}(B_{L+J}-B_{L+1}) + C_1
    ,
\\    
    \nu_a^{detrend}
    &=&
        \nu_a + \frac{A_I-A_1}{I-1} + \frac{B_{L+J}-B_{L+1}}{J-1}
\\    
    \nu_c^{detrend}
    &=&
        \nu_c + \frac{B_{L+J}-B_{L+1}}{J-1} + \frac{C_K-C_1}{K-1}
\end{eqnarray}
We then get the representation
\begin{eqnarray}
    \mu_{ik}
    &=&
    \nu_0^{detrend}
    +(i-1)
    \nu_a^{detrend} 
    +(k-1)
    \nu_c^{detrend}
    +A_i^{detrend}
    +B_{L+j}^{detrend}
    +C_k^{detrend}
    \label{mu:xi_detrend}
\end{eqnarray}
These coefficients are also found using the identification command
\texttt{apc.identify},
so that
<<>>=
c.detrend <- id$coefficients.detrend
@


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Ad hoc identification for 2-factor models}

When one of the three time effects,
$\alpha_i$,
$\beta_j$
or
$\gamma_k$
is absent the linear trends can be attributed uniquely to the time effects,
whereas the level is still unidentified.
The
\texttt{apc} package
offers a parametrisation of that type.

\textbf{demean}
parametrisation.
The model design "AC" is a 
two factor model.
The representation derives from the detrended representation
(\ref{mu:xi_detrend}).
With
$\Delta^2\beta_j=0$
then $B_j=0$
and hence
$B^{detrend}_j=0$
in that formula.
Thus we can attribute the linear trends to the age and cohort time effect.
This is done so that the first element of each time effect is zero. This identification choice
is called the demeaned parameter.
\begin{eqnarray}
    \mu_{ik}
    &=&
    \nu_0^{demean}
    +A_i^{demean}
    +C_k^{demean}
    \label{mu:xi_demean}
\end{eqnarray}
where
\begin{eqnarray}
    A_i^{demean}
    &=&
    A_i^{detrend}
    +(i-1)
    \nu_a^{detrend}
    \qquad
    C_k^{demean}
    =
    C_k^{detrend}
    +(k-1)
    \nu_c^{detrend}
\end{eqnarray}
Note that
$A_1^{demean}=C_1^{demean}=0$.
We get the single sum of difference parameters through the following command
<<>>=
# fit AC model
model.design <- "AC"
fit.ac <- apc.fit.model(data,model.family,model.design)
# identify to get sums of difference parameters
id.ac <- apc.identify(fit.ac)
c.demean <- id.ac$coefficients.demean
@

We get an exactly, invariant identified parameter by taking first differences of 
$A_i^{demean},C_k^{demean}$.
This could also have been used as canonical parameter for this two factor model.
The level is as before. 
<<>>=
# get difference parameters
c.dif <- id.ac$coefficients.dif
@


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Check canonical parametrisation}

The fit of the canonical parametrisation is discussed.

In the background the generalized linear model package is used.
\texttt{apc} supplies a design matrix that is fed into 
\texttt{glm.fit}.
This delivers the following standard output including coefficients and linear predictors.
<<>>=
# Coefficients
fit$coefficients
# Arrange linear predictors as matrix in original format
# Create matrix of original dimension
m.fit <- fit$response       
m.fit[fit$index.data] <-fit$linear.predictors
m.fit
@

The coefficients are equal to the canonical parameters.
We get the summary of these and check that the coefficients are the same as for
the generalized linear model estimation.
<<>>=
# Canonical paramters
c.c
# Check canonical coefficients are the same as the standard coefficients
sum(abs(c.c[,1]-fit$coefficients))
@

We check that the canonical parameters give the same fit as the standard
generalized linear model fit.  We get the design matrix and multiply with
the canonical parameter
<<>>=
# get design matrix
m.design <- apc.get.design(fit)$design
# create matrix of original dimension
m.fit.canonical.no.dose <- fit$response     
m.fit.canonical.no.dose[fit$index.data] <- m.design %*% c.c[,1]
if(is.null(data$dose)==TRUE)
 m.fit.canonical <- m.fit.canonical.no.dose
if(is.null(data$dose)==FALSE)
 m.fit.canonical <- m.fit.canonical.no.dose + log(data$dose)
m.fit.canonical
# Check canonical coefficients give same fit as standard fit
sum(abs(m.fit-m.fit.canonical),na.rm=TRUE)
@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Check identification of double sums of double differences}

We check the ad hoc identification in terms of the double sums of double differences
(\ref{A})-(\ref{C}).

We first get the cooefficients through
<<>>=
c.ssdd
@
Compare the output for
\texttt{c.ssdd}
with that for
\texttt{c.c}
above.
\texttt{c.ssdd} has more elements as some zero elements are squeezed in according
to the indicator functions in
(\ref{mu:xi}).
The coefficients and standard errors neighbouring the zero elements can be recognised directly from
\texttt{c.ssdd}.
The number of non-zero elements is the same. 

The prediction based on these parameters is found as follows.
An age-cohort indexation is needed for the data.
This is generated by the command
\texttt{apc.get.index}.
It is automatically linked as an object for fit.
We can therefore generate the linear prediction as follows and
compare with the previous predictions.
Note, this first check requires that model design is APC.
<<>>=
age <- fit$index.trap[,1]
coh <- fit$index.trap[,2]
# From this we get the period. Need to correct for lowest period value.
per.zero <- fit$per.zero
per <- age+coh-1-per.zero
U <- fit$U
# Then we can compute the prediction as a vector
if(model.design=="APC")
{
 prediction <- c.ssdd[1,1] +
  + c.ssdd[2,1]*(age-U) +
  + c.ssdd[3,1]*(coh-U) +
  + c.ssdd[id$index.age.max[age],1] +
  + c.ssdd[id$index.per.max[per],1] +
  + c.ssdd[id$index.coh.max[coh],1]
 # Then we embed it into a matrix 
 m.fit.ssdd <- fit$response      
 m.fit.ssdd[fit$index.data] <- prediction
 # Add dose
 m.fit.ssdd  <- m.fit.ssdd + log(data$dose)
 # Check fit is correct
 sum(abs(m.fit.canonical-m.fit.ssdd),na.rm=TRUE)
} 
@

The above code will fail if
\texttt{model.design}
is not APC.
We then need to introduce various checks as in the following snippet,
that recycles
\texttt{age},
\texttt{per},
\texttt{coh},
\texttt{per.zero},
\texttt{U}
defined above.
<<>>=
# We need two further variables
slopes <- fit$slopes
difdif <- fit$difdif
# Compute the prediction as a vector
prediction <- c.ssdd[1,1]
# Add the age double differences and age slope
if(difdif[1]) # TRUE if age double differences
 prediction <- prediction + c.ssdd[id$index.age.max[age],1]
if(difdif[2]) # TRUE if period double differences
 prediction <- prediction + c.ssdd[id$index.per.max[per],1]
if(difdif[3]) # TRUE if cohort double differences
 prediction <- prediction + c.ssdd[id$index.coh.max[coh],1]
if(slopes[1]) # TRUE if age linear trend
{ prediction <- prediction + c.ssdd[2,1]*(age-U)
  if(slopes[3])
  prediction <- prediction + c.ssdd[3,1]*(coh-U)
}  
if(slopes[1]==FALSE)
{ if(slopes[2])
   prediction <- prediction + c.ssdd[2,1]*(per-1)
  if(slopes[3])
   prediction <- prediction + c.ssdd[2,1]*(coh-U)
}
# Then we embed it into a matrix 
m.fit.ssdd <- fit$response      
m.fit.ssdd[fit$index.data] <- prediction
# Add dose 
if(is.null(data$dose)==FALSE)
 m.fit.ssdd  <- m.fit.ssdd + log(data$dose)
# Check fit is correct 
sum(abs(m.fit.canonical-m.fit.ssdd),na.rm=TRUE)
@

We can plot these coefficients using
\texttt{apc.plot.fit}
and the reconcile with the prediction as follows.
<<>>=
apc.plot.fit(fit,type="sum.sum")
m.fit.canonical.no.dose
@
The anchoring point is for
$age=cohort=U=6$ and $period=L+1$,
or in real coordinates
$age="50-54"$
and
$period="1955-1959"$
corresponding to cohort
$1905$.
The predictor is $1.9574$.
This is the sum of the entries in panels
(d)-(i) of the plot, which are all zero apart from the constant.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Check identification of detrended double sums of double differences}

The double differences capture variation in the time trends over and above
linear trends. Thus, when considering double sums of double differences
only variation over and above linear trends should be interpreted.
To emphasize this variation,
Nielsen (2014)
suggests to detrend the double sums so that they start and end in zero as outlined in
(\ref{mu:xi:detrend}).
Some interpretations of those detrended double sums are also offered in that paper.

The prediction based on these parameters is found as follows.
<<>>=
# We use age, coh, per, per.zero, id, slopes, difdif defined above.
# Compute the prediction as a vector
prediction <- c.detrend[1,1]
# Add the age double differences and age slope
if(difdif[1]) # TRUE if age double differences
 prediction <- prediction + c.detrend[id$index.age.max[age],1]
if(difdif[2]) # TRUE if period double differences
 prediction <- prediction + c.detrend[id$index.per.max[per],1]
if(difdif[3]) # TRUE if cohort double differences
 prediction <- prediction + c.detrend[id$index.coh.max[coh],1]
if(slopes[1]) # TRUE if age linear trend
{ prediction <- prediction + c.detrend[2,1]*(age-1)
  if(slopes[3])
  prediction <- prediction + c.detrend[3,1]*(coh-1)
}  
if(slopes[1]==FALSE)
{ if(slopes[2])
   prediction <- prediction + c.detrend[2,1]*(per-1)
  if(slopes[3])
   prediction <- prediction + c.detrend[2,1]*(coh-1)
}
# Then we embed it into a matrix 
m.fit.detrend <- fit$response      
m.fit.detrend[fit$index.data] <- prediction
# Add dose 
if(is.null(data$dose)==FALSE)
 m.fit.detrend  <- m.fit.detrend + log(data$dose)
# Check fit is correct 
sum(abs(m.fit.canonical-m.fit.detrend),na.rm=TRUE)
@

We can plot these coefficients using
\texttt{apc.plot.fit}
and the reconcile with the prediction as follows.
<<>>=
apc.plot.fit(fit)
m.fit.canonical.no.dose
@
We consider the top left point of the prediction matrix with real coordinates
$age="25-29"$
and
$period="1970-74"$
corresponding to cohort
$1945$.
In mathematical coordinates, that is
$age=1$, $cohort=coh.max$, $period=per.max$.
Thus all double difference sums in panels (g)-(i) and age slope in panel (d) is zero.
The level in panel (e) is -2.34.
The cohort slope value in (f) is $0.68=(14-1)*0.052$.
In total this is $-1.66$.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Check identification for model design "AC"}

We now check the identification for the two factor model.
To check the parametrisation we first recalculate the fit using the canonical
parameter. Then the alternative parametrisations are checked
<<>>=
if(model.design=="AC")
{
 ################################### 
 # Get fit of canonical parameters
 # get the canonical parameters
 c.c.ac <- fit.ac$coefficients.canonical
 # Get design matrix
 m.design.ac <- apc.get.design(fit.ac)$design
 # Create matrix of original dimension
 m.fit.canonical.ac <- fit.ac$response
 m.fit.canonical.ac[fit.ac$index.data] <- m.design.ac %*% c.c.ac[,1]
 #####################################
 # Get fit of sum of difference parameters
 prediction <- c.demean[1,1] +
  + c.demean[id.ac$index.age.sub[age],1] +       
  + c.demean[id.ac$index.coh.sub[coh],1]
 # Create matrix of original dimension
 m.fit.demean.ac <- fit.ac$response
 m.fit.demean.ac[fit.ac$index.data] <- prediction 
}
@


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Summary of all checks}
<<>>=
# ALL CHECKS
sum(abs(c.c[,1]-fit$coefficients))
sum(abs(m.fit-m.fit.canonical),na.rm=TRUE)
sum(abs(m.fit.canonical-m.fit.ssdd),na.rm=TRUE)
sum(abs(m.fit.canonical-m.fit.detrend),na.rm=TRUE)
sum(abs(m.fit.canonical.ac-m.fit.demean.ac),na.rm=TRUE)
@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{References}
\begin{description}
  \item
    Clayton, D., Schifflers, E. (1987a) 
    Models for temperoral variation in cancer rates. I: age-period and age-cohort models. 
    \textit{Statistics in Medicine} 6, 449-467.
  \item 
    Kuang, D., Nielsen, B. and Nielsen, J.P. (2008)
    Identification of the age-period-cohort model and the extended chain ladder model.
    \textit{Biometrika} 95, 979-986.
    \textit{Download}:
    Earlier version:
    \url{http://www.nuffield.ox.ac.uk/economics/papers/2007/w5/KuangNielsenNielsen07.pdf}.
  \item 
    Nielsen, B. (2014)
    Deviance analysis of age-period-cohort models. 
    \textit{Download}:
    \url{http://www.nuffield.ox.ac.uk/economics/papers/2014/apc_deviance.pdf}.
\end{description}


\end{document}