| Title: | Spatial Lag Model Trees | 
| Date: | 2019-04-06 | 
| Version: | 1.0-1 | 
| Description: | Model-based linear model trees adjusting for spatial correlation using a simultaneous autoregressive spatial lag, Wagner and Zeileis (2019) <doi:10.1111/geer.12146>. | 
| Depends: | R (≥ 3.3.0), partykit, spatialreg | 
| Imports: | Formula (≥ 1.2-1) | 
| License: | GPL-2 | GPL-3 | 
| NeedsCompilation: | no | 
| Packaged: | 2019-04-06 20:03:40 UTC; zeileis | 
| Author: | Martin Wagner [aut], Achim Zeileis | 
| Maintainer: | Achim Zeileis <Achim.Zeileis@R-project.org> | 
| Repository: | CRAN | 
| Date/Publication: | 2019-04-08 08:36:49 UTC | 
Determinants of Regional Economic Growth
Description
Growth regression data for NUTS2 regions in the European Union.
Usage
data("GrowthNUTS2")
Format
A data frame containing 255 observations on 58 variables.
- ggdpcap
 numeric. Average annual growth rate of real GDP per capita over the period 1995-2005.
- accessair
 numeric. Measure for potential accessibility by air.
- accessrail
 numeric. Measure for potential accessibility by rail.
- accessroad
 numeric. Measure for potential accessibility by road.
- airportdens
 numeric. Airport density (number of airports per sqkm).
- airports
 factor. Number of airports.
- arh0
 numeric. Initial activity rate, highly educated.
- arl0
 numeric. Initial activity rate, low educated.
- arm0
 numeric. Initial activity rate, medium educated.
- art0
 numeric. Initial activity rate, total.
- capital
 factor. Does the region host country capital city?
- connectair
 numeric. Connectivity to commercial airports by car of the capital or centroid of region.
- connectsea
 numeric. Connectivity to commercial seaports by car of the capital or centroid of region.
- distcap
 numeric. Distance to capital city of respective country.
- distde71
 numeric. Distance to Frankfurt.
- empdens0
 numeric. Initial employment density.
- ereh0
 numeric. Initial employment rate, highly educated.
- erel0
 numeric. Initial employment rate, low educated.
- erem0
 numeric. Initial employment rate, medium educated.
- eret0
 numeric. Initial employment rate, total.
- gdpcap0
 numeric. Real GDP per capita in logs in 1995.
- gpop
 numeric. Growth rate of population.
- hazard
 numeric. Sum of all weighted hazard values.
- hrstcore
 numeric. Human resources in science and technology (core).
- intf
 numeric. Proportion of firms with own website regression.
- outdens0
 numeric. Initial output density.
- popdens0
 numeric. Initial population density.
- raildens
 numeric. Rail density (length of railroad network in km per sqkm).
- regboarder
 factor. Border region?
- regcoast
 factor. Coastal region?
- regobj1
 factor. Is the region within an Objective 1 region?
- regpent27
 factor. Pentagon EU 27 region? (London, Paris, Munich, Milan, Hamburg.)
- roaddens
 numeric. Road density (length of road network in km per sqkm).
- seaports
 factor. Does the region have a seaport?
- settl
 factor. Settlement structure.
- shab0
 numeric. Initial share of NACE A and B (Agriculture) in GVA.
- shce0
 numeric. Initial share of NACE C to E (Mining, Manufacturing and Energy) in GVA.
- shgfcf
 numeric. Share of gross fixed capital formation in gross value added.
- shjk0
 numeric. Initial share of NACE J to K (Business services) in GVA.
- shsh
 numeric. Share of highly educated in working age population.
- shsl
 numeric. Share of low educated in working age population.
- shlll
 numeric. Life long learning.
- shsm
 numeric. Share of medium educated in working age population.
- telf
 factor. A typology of estimated levels of business telecommunications access and uptake.
- temp
 numeric. Extreme temperatures.
- urh0
 numeric. Initial unemployment rate, highly educated.
- url0
 numeric. Initial unemployment rate, low educated.
- urm0
 numeric. Initial unemployment rate, medium educated.
- urt0
 numeric. Initial unemployment rate, total.
- country
 factor. Country within which the region is located.
- cee
 factor. Is the region within a Central and Eastern European country?
- piigs
 factor. Is the region within a PIIGS country? (Portugal, Ireland, Italy, Greece, Spain.)
- de
 factor. Is the region within Germany?
- es
 factor. Is the region within Spain?
- fr
 factor. Is the region within France?
- it
 factor. Is the region within Italy?
- pl
 factor. Is the region within Poland?
- uk
 factor. Is the region within the United Kingdom?
References
Schneider U, Wagner M (2012). Catching Growth Determinants with the Adaptive Lasso. German Economic Review, 13(1), 71-85. doi: 10.1111/j.1468-0475.2011.00541.x
Examples
data("GrowthNUTS2")
summary(GrowthNUTS2)
Spatial Weights for European Union NUTS2 Regions
Description
Spatial weight matrices for NUTS2 regions in the European Union.
Usage
data("WeightsNUTS2")
Format
A list containing 40 listw weight matrices.
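A minimal sketch of how the list might be inspected, assuming the lagsarlmtree package (which ships this data set) is installed. The element name `invw` is taken from the Examples of lagsarlmtree(); the other element names are not listed here, so they are only printed, not assumed.

```r
## Assumes the lagsarlmtree package is installed.
data("WeightsNUTS2", package = "lagsarlmtree")

length(WeightsNUTS2)       # number of weight matrices in the list
names(WeightsNUTS2)        # names of the available weight matrices
class(WeightsNUTS2$invw)   # class of the element used in the lagsarlmtree() examples
```

Any single element, such as `WeightsNUTS2$invw`, can then be passed as the `listw` argument of lagsarlmtree().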
Source
Journal of Applied Econometrics Data Archive.
http://qed.econ.queensu.ca/jae/2013-v28.4/cuaresma-feldkircher/
References
Crespo Cuaresma J, Feldkircher M (2013). Spatial Filtering, Model Uncertainty and the Speed of Income Convergence in Europe. Journal of Applied Econometrics, 28(4), 720-741. doi: 10.1002/jae.2277
Spatial Lag Model Trees
Description
Model-based recursive partitioning based on linear regression adjusting for a (global) spatial simultaneous autoregressive lag.
Usage
lagsarlmtree(formula, data, listw = NULL, method = "eigen",
  zero.policy = NULL, interval = NULL, control = list(),
  rhowystart = NULL, abstol = 0.001, maxit = 100, 
  dfsplit = TRUE, verbose = FALSE, plot = FALSE, ...)
Arguments
formula | 
 formula specifying the response variable, the regressors, and the partitioning variables, e.g. y ~ x1 + x2 | z1 + z2 (see Examples).  | 
data | 
 data.frame to be used for estimating the model tree.  | 
listw | 
 a weights object for the spatial lag part of the model.  | 
method | 
 "eigen" (default) - the Jacobian is computed as the product 
of (1 - rho*eigenvalue) using   | 
zero.policy | 
 default NULL, use global option value; if TRUE assign zero to the lagged value of zones without 
neighbours, if FALSE (default) assign NA - causing   | 
interval | 
 default is NULL, search interval for autoregressive parameter  | 
control | 
 list of extra control arguments - see   | 
rhowystart | 
 numeric. A vector of length   | 
abstol | 
 numeric. The convergence criterion used for estimation of the model.
When the difference in log-likelihoods of the model from two consecutive
iterations is smaller than   | 
maxit | 
 numeric. The maximum number of iterations to be performed in estimation of the model tree.  | 
dfsplit | 
 logical or numeric.   | 
verbose | 
 Should the log-likelihood value of the estimated model be printed for every iteration of the estimation?  | 
plot | 
 Should the tree be plotted at every iteration of the estimation? Note that selecting this option slows down execution of the function.  | 
... | 
 Additional arguments to be passed to   | 
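The description of method = "eigen" above refers to the Jacobian as a product of (1 - rho*eigenvalue) terms. The following base R sketch illustrates that identity on a small, made-up symmetric contiguity matrix. The matrix `W` and the value of `rho` are assumptions for illustration only and are not taken from the package or its data.

```r
## Hypothetical toy example: binary contiguity for 4 regions on a line.
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1   # neighbour to the right
W[cbind(2:4, 1:3)] <- 1   # neighbour to the left (keeps W symmetric)

rho <- 0.4                # assumed value of the autoregressive parameter

## log-Jacobian via eigenvalues: log|I - rho * W| = sum(log(1 - rho * lambda_i))
lambda <- eigen(W, symmetric = TRUE, only.values = TRUE)$values
logdet_eigen <- sum(log(1 - rho * lambda))

## direct computation of the same log-determinant for comparison
logdet_direct <- as.numeric(determinant(diag(4) - rho * W, logarithm = TRUE)$modulus)

all.equal(logdet_eigen, logdet_direct)
```

Because the eigenvalues depend only on the weights and not on rho, they can be computed once and reused for every candidate value of rho.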
Details
Spatial lag trees learn a tree where each terminal node is associated with different regression coefficients while adjusting for a (global) spatial simultaneous autoregressive lag. This allows for detection of subgroup-specific coefficients with respect to selected covariates, while adjusting for spatial correlations in the data. The estimation algorithm iterates between (1) estimation of the tree given an offset of the spatial lag effect, and (2) estimation of the spatial lag model given the tree structure.
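One way to write the model described above in symbols is sketched below. The notation is an assumption chosen for illustration and is not copied from the reference; consult Wagner and Zeileis (2019) for the authors' own formulation.

```latex
% Illustrative notation (assumed, not taken from the reference):
%   y_i       response of region i
%   x_i       regressors,  z_i  partitioning variables
%   W         spatial weights matrix (from listw)
%   \rho      global spatial lag parameter
%   b(z_i)    terminal node of the tree that observation i falls into
\[
  y_i \;=\; \rho \,(W y)_i \;+\; x_i^\top \beta_{b(z_i)} \;+\; \varepsilon_i ,
  \qquad i = 1, \dots, n .
\]
% Step (1): with \hat\rho fixed, estimate the tree for y_i using the offset \hat\rho\,(W y)_i.
% Step (2): with the node assignment b(\cdot) fixed, re-estimate \rho and the node-specific \beta_b.
% Steps (1) and (2) alternate until the change in log-likelihood is small or maxit is reached.
```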
The code is still under development and might change in future versions.
Value
The function returns a list with the following objects:
formula | 
 The formula as specified with the   | 
call | 
 the matched call.  | 
tree | 
 The final   | 
lagsarlm | 
 The final   | 
data | 
 The dataset specified with the   | 
nobs | 
 Number of observations.  | 
loglik | 
 The log-likelihood value of the last iteration.  | 
df | 
 Degrees of freedom.  | 
dfsplit | 
 degrees of freedom per selected split as specified with the   | 
iterations | 
 The number of iterations used to estimate the   | 
maxit | 
 The maximum number of iterations specified with the   | 
rhowystart | 
 Offset in estimation of the first tree as specified in the   | 
abstol | 
 The prespecified value for the change in log-likelihood to evaluate
convergence, as specified with the   | 
listw | 
 The   | 
mob.control | 
 A list containing control parameters passed to
  | 
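As a hedged illustration of the components listed above, the following sketch refits the model from the Examples and reads a few elements off the returned list. It assumes the lagsarlmtree package and its data sets are installed; no output values are shown because they depend on the fit.

```r
## Assumes the lagsarlmtree package is installed.
library("lagsarlmtree")
data("GrowthNUTS2", package = "lagsarlmtree")
data("WeightsNUTS2", package = "lagsarlmtree")

## same model specification as in the Examples section
tr <- lagsarlmtree(ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm |
    gdpcap0 + accessrail + accessroad + capital + regboarder + regcoast + regobj1 + cee + piigs,
  data = GrowthNUTS2, listw = WeightsNUTS2$invw,
  minsize = 12, alpha = 0.05)

names(tr)       # components of the returned list
tr$nobs         # number of observations
tr$loglik       # log-likelihood of the last iteration
tr$df           # degrees of freedom
tr$iterations   # iterations actually used
tr$maxit        # maximum number of iterations allowed
```

Comparing `tr$iterations` with `tr$maxit` is a quick way to check whether the estimation stopped before reaching the iteration limit.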
References
Wagner M, Zeileis A (2019). Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach. German Economic Review, 20(1), 67–82. doi: 10.1111/geer.12146 https://eeecon.uibk.ac.at/~zeileis/papers/Wagner+Zeileis-2019.pdf
Examples
## load the package (also attaches spatialreg, which provides eigenw())
library("lagsarlmtree")
## data and spatial weights
data("GrowthNUTS2", package = "lagsarlmtree")
data("WeightsNUTS2", package = "lagsarlmtree")
## spatial lag model tree
system.time(tr <- lagsarlmtree(ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm |
    gdpcap0 + accessrail + accessroad + capital + regboarder + regcoast + regobj1 + cee + piigs,
  data = GrowthNUTS2, listw = WeightsNUTS2$invw,
  minsize = 12, alpha = 0.05))
print(tr)
plot(tr, tp_args = list(which = 1))
## query coefficients
coef(tr, model = "tree")
coef(tr, model = "rho")
coef(tr, model = "all")
## same model, reusing pre-computed eigenvalues of the weights via control = list(pre_eig = ...)
system.time({
ev <- eigenw(WeightsNUTS2$invw)
tr1 <- lagsarlmtree(ggdpcap ~ gdpcap0 + shgfcf + shsh + shsm |
    gdpcap0 + accessrail + accessroad + capital + regboarder + regcoast + regobj1 + cee + piigs,
  data = GrowthNUTS2, listw = WeightsNUTS2$invw, method = "eigen",
  control = list(pre_eig = ev), minsize = 12, alpha = 0.05)
})
coef(tr1, model = "rho")