% File apc/vignettes/Exploiting.apc.get.design.Rnw % Part of the apc package, http://www.R-project.org % Copyright 2015 Bent Nielsen % Distributed under GPL 3 %\VignetteIndexEntry{Generating new models from design matrix function} \documentclass[a4paper,twoside,12pt]{article} \usepackage[english]{babel} \usepackage{booktabs,rotating,graphicx,amsmath,verbatim,fancyhdr,Sweave} \usepackage[colorlinks,linkcolor=red,urlcolor=blue]{hyperref} \newcommand{\R}{\textsf{\bf R}} \renewcommand{\topfraction}{0.95} \renewcommand{\bottomfraction}{0.95} \renewcommand{\textfraction}{0.1} \renewcommand{\floatpagefraction}{0.9} \DeclareGraphicsExtensions{.pdf,.jpg} \setcounter{secnumdepth}{1} \setcounter{tocdepth}{1} \oddsidemargin 1mm \evensidemargin 1mm \textwidth 160mm \textheight 230mm \topmargin -5mm \headheight 8mm \headsep 5mm \footskip 15mm \begin{document} \SweaveOpts{concordance=TRUE} \raggedleft \pagestyle{empty} \vspace*{0.1\textheight} \Huge {\bf Generating new models from design matrix function\\ \texttt{apc} package} \noindent\rule[-1ex]{\textwidth}{5pt}\\[2.5ex] \Large 18 March 2015 \vfill \normalsize \begin{tabular}{rl} Bent Nielsen & Department of Economics, University of Oxford \\ & \small \& Nuffield College \\ & \small \& Institute for Economic Modelling \\ & \normalsize \texttt{bent.nielsen@nuffield.ox.ac.uk} \\ & \url{http://users.ox.ac.uk/~nuff0078} \end{tabular} \normalsize \newpage \raggedright \parindent 3ex \parskip 0ex \tableofcontents \cleardoublepage \setcounter{page}{1} \pagestyle{fancy} \renewcommand{\sectionmark}[1]{\markboth{\thesection #1}{\thesection \ #1}} \fancyhead[OL,ER]{\sl Generating new models with \texttt{apc} package.} %\fancyhead[ER]{\sl \rightmark} \fancyhead[EL,OR]{\bf \thepage} \fancyfoot{} \renewcommand{\headrulewidth}{0.1pt} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Introduction} The \texttt{apc} package has a set number of models and hypotheses that can be explored. This document describes how alternative hypotheses can be analysed. An important feature in the \texttt{apc} package is that it generates a design matrix. Normally this is kept in the background. The design matrix can, however, be called using the function \texttt{apc.get.design} and then modified. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Impose functional form on age effect for Belgian lung cancer data} Here we impose particular functional forms on the age effect for Belgian lung cancer data from Clayton and Schifflers (1987a). The analysis follows Nielsen (2014). Initially we apply the standard analysis to the Belgian lung cancer data, focusing on the age drift model. Subsequently we construct a sub-models of the age drift model with the age-effect is restricted to be first cubic and then quadratic. First we set up the data <<>>= # attach apc library library(apc) # get data from precoded function data.list <- data.Belgian.lung.cancer() @ The precoded deviance analysis can be run as follows. <>= # Get a deviance table apc.fit.table(data.list,"poisson.dose.response") @ We also get the fitted parameters of the age drift model. <<>>= # Estimate selected model apc.fit.ad <- apc.fit.model(data.list,"poisson.dose.response","Ad") apc.fit.ad$coefficients.canonical @ We now want to construct a new design matrix. In order to do this we start by vectorising the data. We then get the corresponding design matrix for the age drift model. <<>>= # Vectorise data index <- apc.get.index(data.list) v.response <- data.list$response[index$index.data] v.dose <- data.list$dose[index$index.data] # Get design matrix for "Ad" model get.design <- apc.get.design(index,"Ad") m.design.ad <- get.design$design p <- ncol(m.design.ad) @ As an aside we should think about the structure of the age drift design matrix. One approach is to inspect the canonical coefficient estimates above, which keep tract of parameter labels. The other approach is to think through the dimensions of the problem. The data has 11 age groups, hence we generate 9 age double differences. With the age drift model we do not generate period and cohort parameters. This is described by the object \texttt{get.design\$difdif}. The linear plane of the model is unrestricted, hence it needs a level and 2 slopes. The two slopes are set in the age and cohort direction as indicates by \texttt{get.design\$slopes}. In total we have $p=12$ parameters. <<>>= # Explore this design matrix index$age.max p get.design$difdif get.design$slopes @ We achieve a \textit{quadratic} age structure by imposing all age double difference parameters to be equal, see Nielsen and Nielsen (2014, equations 10, 80). We do this by post-multiplying the $(n\times p)$ age drift design matrix, $X$ say with the following $(p \times 4)$-matrix, $H_\perp$ say. This gives the new design matrix $X_H = X H_\perp$. In the code we use the names \texttt{m.design.ad}, \texttt{m.rest.q}, and \texttt{m.design.adq} for $X$, $H_\perp$ for $X_H$ \begin{equation} H_\perp = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 1 \end{pmatrix} \end{equation} <<>>= # Quadractic age effect: restrict double differences to be equal m.rest.q <- matrix(data=0,nrow=p,ncol=4) m.rest.q[1,1] <- 1 m.rest.q[2,2] <- 1 m.rest.q[3,3] <- 1 m.rest.q[4:p,4] <- 1 m.design.adq <- m.design.ad %*% m.rest.q @ Similarly, we achieve a \textit{cubic} age structure by imposing that the age double difference parameters should grow linearly, see Nielsen and Nielsen (2014, equations 10, 81). We do by redefining the restriction matrix $H_\perp$ as a $(p \times 5)$-matrix. \begin{equation} H_\perp = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 1 & 9 \end{pmatrix} \end{equation} <<>>= # Cubic age effect: restrict double differences to be linear m.rest.c <- matrix(data=0,nrow=p,ncol=5) m.rest.c[1,1] <- 1 m.rest.c[2,2] <- 1 m.rest.c[3,3] <- 1 m.rest.c[4:p,4] <- 1 m.rest.c[4:p,5] <- seq(1,p-3) m.design.adc <- m.design.ad %*% m.rest.c @ We can now refit the model with the new design matrices. <<>>= # Poisson regression for dose-response and with log link fit.ad <- glm.fit(m.design.ad,v.response, family=poisson(link="log"),offset=log(v.dose)) fit.adc <- glm.fit(m.design.adc,v.response, family=poisson(link="log"),offset=log(v.dose)) fit.adq <- glm.fit(m.design.adq,v.response, family=poisson(link="log"),offset=log(v.dose)) @ From this we get deviance test statistics. These are asymptotically $\chi^2$ under the Poisson assumption. So we need to find the corresponding degrees of freedom and compare with a $\chi^2$ table, and fiddle a bit with the output. <<>>= # Deviance test statistics dev.ad.c <- fit.adc$deviance - fit.ad$deviance dev.ad.q <- fit.adq$deviance - fit.ad$deviance # Degrees of freedom df.ad.c <- ncol(m.design.ad) - ncol(m.design.adc) df.ad.q <- ncol(m.design.ad) - ncol(m.design.adq) # p-values p.ad.c <- pchisq(dev.ad.c,df.ad.c,lower.tail=FALSE) p.ad.q <- pchisq(dev.ad.q,df.ad.q,lower.tail=FALSE) # Test for cubic restriction fit.tab<-matrix(nrow=2,ncol=3) colnames(fit.tab)<-c("LR.vs.Ad","df.vs.Ad","prob(>chi_sq)") rownames(fit.tab)<-c("cubic","quadratic") fit.tab[1,1:3] <- c(dev.ad.c,df.ad.c,p.ad.c) fit.tab[2,1:3] <- c(dev.ad.q,df.ad.q,p.ad.q) fit.tab @ The estimated coefficients are reported below. Note that the three first coordinates determining the linear plane are only little changed. <<>>= # Coefficients fit.ad$coefficients fit.adc$coefficients fit.adq$coefficients @ \section*{References} \begin{description} \item Clayton, D., Schifflers, E. (1987a) Models for temperoral variation in cancer rates. I: age-period and age-cohort models. \textit{Statistics in Medicine} 6, 449-467. \item Nielsen, B. (2014) Deviance analysis of age-period-cohort models. \textit{Download}: \url{http://www.nuffield.ox.ac.uk/economics/papers/2014/apc_deviance.pdf}. \item Nielsen, B., Nielsen, J.P. (2014) Identification and forecasting in mortality models. \textit{Scientific World Journal}. vol. 2014, Article ID 347043, 24 pages. doi:10.1155/2014/347043. \textit{Open access}: \url{http://www.hindawi.com/journals/tswj/2014/347043/}. \end{description} \end{document}