Help for package spData

Title:

Datasets for Spatial Analysis

Version:

2.3.4

Description:

Diverse spatial datasets for demonstrating, benchmarking and teaching spatial data analysis. It includes R data of class sf (defined by the package 'sf'), Spatial ('sp'), and nb ('spdep'). Unlike other spatial data packages such as 'rnaturalearth' and 'maps', it also contains data stored in a range of file formats including GeoJSON and GeoPackage, but from version 2.3.4, no longer ESRI Shapefile - use GeoPackage instead. Some of the datasets are designed to illustrate specific analysis techniques. cycle_hire() and cycle_hire_osm(), for example, is designed to illustrate point pattern analysis techniques.

Depends:

R (≥ 3.3.0)

Imports:

Suggests:

foreign, sf (≥ 0.9-1), spDataLarge (≥ 0.4.0), spdep, spatialreg

License:

CC0

RoxygenNote:

7.3.2

LazyData:

true

URL:

https://jakubnowosad.com/spData/

BugReports:

https://github.com/Nowosad/spData/issues

Additional_repositories:

https://jakubnowosad.com/drat

Encoding:

UTF-8

NeedsCompilation:

Packaged:

2025-01-07 17:30:43 UTC; jn

Author:

Roger Bivand

[aut], Jakub Nowosad

[aut, cre], Robin Lovelace

[aut], Angelos Mimis [ctb], Mark Monmonier [ctb] (author of the state.vbm dataset), Greg Snow [ctb] (author of the state.vbm dataset)

Maintainer:

Jakub Nowosad <nowosad.jakub@gmail.com>

Repository:

CRAN

Date/Publication:

2025-01-08 12:40:02 UTC

Data for Splash Dams in western Oregon

Description

Data for Splash Dams in western Oregon

Usage

SplashDams

Format

Formal class 'SpatialPointsDataFrame with 232 obs. of 6 variables:

streamName
locationCode
height
lastDate
owner
datesUsed

Source

R. R. Miller (2010) Is the Past Present? Historical Splash-dam Mapping and Stream Disturbance Detection in the Oregon Coastal Province. MSc. thesis, Oregon State University; packaged by Jonathan Callahan

Examples

if (requireNamespace("sp", quietly = TRUE)) {
  library(sp)
  data(SplashDams)
  plot(SplashDams, axes=TRUE)
}

Spatial patterns of conflict in Africa 1966-78

Description

The afcon data frame has 42 rows and 5 columns, for 42 African countries, exclusing then South West Africa and Spanish Equatorial Africa and Spanish Sahara. The dataset is used in Anselin (1995), and downloaded from before adaptation. The neighbour list object africa.rook.nb is the SpaceStat ‘rook.GAL’, but is not the list used in Anselin (1995) - paper.nb reconstructs the list used in the paper, with inserted links between Mauritania and Morocco, South Africa and Angola and Zambia, Tanzania and Zaire, and Botswana and Zambia. afxy is the coordinate matrix for the centroids of the countries.

Usage

afcon

Format

This data frame contains the following columns:

x: an easting in decimal degrees (taken as centroid of shapefile polygon)
y: an northing in decimal degrees (taken as centroid of shapefile polygon)
totcon: index of total conflict 1966-78
name: country name
id: country id number as in paper

Note

All source data files prepared by Luc Anselin, Spatial Analysis Laboratory, Department of Agricultural and Consumer Economics, University of Illinois, Urbana-Champaign.

Source

Anselin, L. and John O'Loughlin. 1992. Geography of international conflict and cooperation: spatial dependence and regional context in Africa. In The New Geopolitics, ed. M. Ward, pp. 39-75. Philadelphia, PA: Gordon and Breach. also: Anselin, L. 1995. Local indicators of spatial association, Geographical Analysis, 27, Table 1, p. 103.

Examples

data(afcon)
if (requireNamespace("spdep", quietly = TRUE)) {
  library(spdep)
  plot(africa.rook.nb, afxy)
  plot(diffnb(paper.nb, africa.rook.nb), afxy, col="red", add=TRUE)
  text(afxy, labels=attr(africa.rook.nb, "region.id"), pos=4, offset=0.4)
  moran.test(afcon$totcon, nb2listw(africa.rook.nb))
  moran.test(afcon$totcon, nb2listw(paper.nb))
  geary.test(afcon$totcon, nb2listw(paper.nb))
}

Alaska multipolygon

Description

The object loaded is a sf object containing the state of Alaska from the US Census Bureau with a few variables from American Community Survey (ACS)

Usage

alaska

Format

Formal class 'sf' [package "sf"]; the data contains a data.frame with 1 obs. of 7 variables:

GEOID: character vector of geographic identifiers
NAME: character vector of state names
REGION: character vector of region names
AREA: area in square kilometers of units class
total_pop_10: numerical vector of total population in 2010
total_pop_15: numerical vector of total population in 2015
geometry: sfc_MULTIPOLYGON

The object is in projected coordinates using Alaska Albers (EPSG:3467).

Source

https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(alaska)

  plot(alaska["total_pop_15"])
}

Marshall's infant mortality in Auckland dataset

Description

(Use example(auckland) to load the data from shapefile and generate neighbour list on the fly). The auckland data frame has 167 rows (census area units — CAU) and 4 columns. The dataset also includes the "nb" object auckland.nb of neighbour relations based on contiguity, and the "polylist" object auckpolys of polygon boundaries for the CAU. The auckland data frame includes the following columns:

Usage

auckland

Format

This data frame contains the following columns:

Easting: a numeric vector of x coordinates in an unknown spatial reference system
Northing: a numeric vector of y coordinates in an unknown spatial reference system
M77_85: a numeric vector of counts of infant (under 5 years of age) deaths in Auckland, 1977-1985
Und5_81: a numeric vector of population under 5 years of age at the 1981 Census

Details

The contiguous neighbours object does not completely replicate results in the sources, and was reconstructed from auckpolys; examination of figures in the sources suggests that there are differences in detail, although probably not in substance.

Source

Marshall R M (1991) Mapping disease and mortality rates using Empirical Bayes Estimators, Applied Statistics, 40, 283–294; Bailey T, Gatrell A (1995) Interactive Spatial Data Analysis, Harlow: Longman — INFOMAP data set used with permission.

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  auckland <- sf::st_read(system.file("shapes/auckland.gpkg", package="spData")[1])
  plot(sf::st_geometry(auckland))
  if (requireNamespace("spdep", quietly = TRUE)) {
    auckland.nb <- spdep::poly2nb(auckland)
  }
}

House sales prices, Baltimore, MD 1978

Description

House sales price and characteristics for a spatial hedonic regression, Baltimore, MD 1978. X,Y on Maryland grid, projection type unknown.

Usage

baltimore

Format

A data frame with 211 observations on the following 17 variables.

STATION: a numeric vector
PRICE: a numeric vector
NROOM: a numeric vector
DWELL: a numeric vector
NBATH: a numeric vector
PATIO: a numeric vector
FIREPL: a numeric vector
AC: a numeric vector
BMENT: a numeric vector
NSTOR: a numeric vector
GAR: a numeric vector
AGE: a numeric vector
CITCOU: a numeric vector
LOTSZ: a numeric vector
SQFT: a numeric vector
X: a numeric vector
Y: a numeric vector

Source

Prepared by Luc Anselin. Original data made available by Robin Dubin, Weatherhead School of Management, Case Western Research University, Cleveland, OH. http://sal.agecon.uiuc.edu/datasets/baltimore.zip

References

Dubin, Robin A. (1992). Spatial autocorrelation and neighborhood quality. Regional Science and Urban Economics 22(3), 433-452.

Examples

data(baltimore)
str(baltimore)

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  baltimore_sf <- baltimore %>% st_as_sf(., coords = c("X","Y"))
  plot(baltimore_sf["PRICE"])
}

Corrected Boston Housing Data

Description

The boston.c data frame has 506 rows and 20 columns. It contains the Harrison and Rubinfeld (1978) data corrected for a few minor errors and augmented with the latitude and longitude of the observations. Gilley and Pace also point out that MEDV is censored, in that median values at or over USD 50,000 are set to USD 50,000. The original data set without the corrections is also included in package mlbench as BostonHousing. In addition, a matrix of tract point coordinates projected to UTM zone 19 is included as boston.utm, and a sphere of influence neighbours list as boston.soi.

Format

This data frame contains the following columns:

TOWN: a factor with levels given by town names
TOWNNO: a numeric vector corresponding to TOWN
TRACT: a numeric vector of tract ID numbers
LON: a numeric vector of tract point longitudes in decimal degrees
LAT: a numeric vector of tract point latitudes in decimal degrees
MEDV: a numeric vector of median values of owner-occupied housing in USD 1000
CMEDV: a numeric vector of corrected median values of owner-occupied housing in USD 1000
CRIM: a numeric vector of per capita crime
ZN: a numeric vector of proportions of residential land zoned for lots over 25000 sq. ft per town (constant for all Boston tracts)
INDUS: a numeric vector of proportions of non-retail business acres per town (constant for all Boston tracts)
CHAS: a factor with levels 1 if tract borders Charles River; 0 otherwise
NOX: a numeric vector of nitric oxides concentration (parts per 10 million) per town
RM: a numeric vector of average numbers of rooms per dwelling
AGE: a numeric vector of proportions of owner-occupied units built prior to 1940
DIS: a numeric vector of weighted distances to five Boston employment centres
RAD: a numeric vector of an index of accessibility to radial highways per town (constant for all Boston tracts)
TAX: a numeric vector full-value property-tax rate per USD 10,000 per town (constant for all Boston tracts)
PTRATIO: a numeric vector of pupil-teacher ratios per town (constant for all Boston tracts)
B: a numeric vector of 1000*(Bk - 0.63)^2 where Bk is the proportion of blacks
LSTAT: a numeric vector of percentage values of lower status population

Note

Details of the creation of the tract GPKG file: tract boundaries for 1990 (formerly at: http://www.census.gov/geo/cob/bdy/tr/tr90shp/tr25_d90_shp.zip, counties in the BOSTON SMSA http://www.census.gov/population/metro/files/lists/historical/63mfips.txt); tract conversion table 1980/1970 (formerly at : https://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/7913?q=07913&permit[0]=AVAILABLE, http://www.icpsr.umich.edu/cgi-bin/bob/zipcart2?path=ICPSR&study=7913&bundle=all&ds=1&dups=yes). The shapefile contains corrections and extra variables (tract 3592 is corrected to 3593; the extra columns are:

units: number of single family houses
cu5k: count of units under USD 5,000
c5_7_5: counts USD 5,000 to 7,500
C*_*: interval counts
co50k: count of units over USD 50,000
median: recomputed median values
BB: recomputed black population proportion
censored: whether censored or not
NOXID: NOX model zone ID
POP: tract population

Source

Previously available from http://lib.stat.cmu.edu/datasets/boston_corrected.txt

References

Harrison, David, and Daniel L. Rubinfeld, Hedonic Housing Prices and the Demand for Clean Air, Journal of Environmental Economics and Management, Volume 5, (1978), 81-102. Original data.

Gilley, O.W., and R. Kelley Pace, On the Harrison and Rubinfeld Data, Journal of Environmental Economics and Management, 31 (1996),403-405. Provided corrections and examined censoring.

Pace, R. Kelley, and O.W. Gilley, Using the Spatial Configuration of the Data to Improve Estimation, Journal of the Real Estate Finance and Economics, 14 (1997), 333-340.

Bivand, Roger. Revisiting the Boston data set - Changing the units of observation affects estimated willingness to pay for clean air. REGION, v. 4, n. 1, p. 109-127, 2017. https://openjournals.wu.ac.at/ojs/index.php/region/article/view/107.

Examples

if (requireNamespace("spdep", quietly = TRUE)) {
  data(boston)
  hr0 <- lm(log(MEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) + I(RM^2) +
                    AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT), data = boston.c)
  summary(hr0)
  logLik(hr0)
  gp0 <- lm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) + I(RM^2) +
                    AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT), data = boston.c)
  summary(gp0)
  logLik(gp0)
  spdep::lm.morantest(hr0, spdep::nb2listw(boston.soi))
}
if (requireNamespace("sf", quietly = TRUE)) {
boston.tr <- sf::st_read(system.file("shapes/boston_tracts.gpkg",
                           package="spData")[1])
  if (requireNamespace("spdep", quietly = TRUE)) {
    boston_nb <- spdep::poly2nb(boston.tr)
  }
}

World coffee production data

Description

A tiny dataset containing estimates of global coffee in thousands of 60 kg bags produced by country. Purpose: teaching **not** research.

Usage

coffee_data

Format

A data frame (tibble) with 58 for the following 12 variables:

name_long: name of country or coffee variety
coffee_production_2016: production in 2016
coffee_production_2017: production in 2017

Details

The examples section shows how this can be joined with spatial data to create a simple map.

Source

The International Coffee Organization (ICO). See http://www.ico.org/ and http://www.ico.org/prices/m1-exports.pdf

Examples

head(coffee_data)
## Not run: 
if (requireNamespace("dplyr")) {
library(dplyr)
library(sf)
# found by searching for "global coffee data"
u = "http://www.ico.org/prices/m1-exports.pdf"
download.file(u, "data.pdf", mode = "wb")
if (requireNamespace("pdftables")) { # requires api key
pdftables::convert_pdf(input_file = "data.pdf", output_file = "coffee-data-messy.csv")
d = read_csv("coffee-data-messy.csv")
file.remove("coffee-data-messy.csv")
file.remove("data.pdf")
coffee_data = slice(d, -c(1:9)) %>% 
        select(name_long = 1, coffee_production_2016 = 2, coffee_production_2017 = 3) %>% 
        filter(!is.na(coffee_production_2016)) %>% 
        mutate_at(2:3, str_replace, " ", "") %>% 
        mutate_at(2:3, as.integer)
world_coffee = left_join(world, coffee_data)
plot(world_coffee[c("coffee_production_2016", "coffee_production_2017")])
b = c(0, 500, 1000, 2000, 3000)
library(tmap)
tm_shape(world_coffee) +
  tm_fill("coffee_production_2017", title = "Thousand 60kg bags", breaks = b,
          textNA = "No data", colorNA = NULL)
tmap_mode("view") # for an interactive version
}}

## End(Not run)

Columbus OH spatial analysis data set

Description

The columbus data frame has 49 rows and 22 columns. Unit of analysis: 49 neighbourhoods in Columbus, OH, 1980 data. In addition the data set includes a polylist object polys with the boundaries of the neighbourhoods, a matrix of polygon centroids coords, and col.gal.nb, the neighbours list from an original GAL-format file. The matrix bbs is DEPRECATED, but retained for other packages using this data set.

Usage

columbus

Format

This data frame contains the following columns:

AREA: computed by ArcView
PERIMETER: computed by ArcView
COLUMBUS_: internal polygon ID (ignore)
COLUMBUS_I: another internal polygon ID (ignore)
POLYID: yet another polygon ID
NEIG: neighborhood id value (1-49); conforms to id value used in Spatial Econometrics book.
HOVAL: housing value (in 1,000 USD)
INC: household income (in 1,000 USD)
CRIME: residential burglaries and vehicle thefts per thousand households in the neighborhood
OPEN: open space in neighborhood
PLUMB: percentage housing units without plumbing
DISCBD: distance to CBD
X: x coordinate (in arbitrary digitizing units, not polygon coordinates)
Y: y coordinate (in arbitrary digitizing units, not polygon coordinates)
NSA: north-south dummy (North=1)
NSB: north-south dummy (North=1)
EW: east-west dummy (East=1)
CP: core-periphery dummy (Core=1)
THOUS: constant=1,000
NEIGNO: NEIG+1,000, alternative neighborhood id value

Details

The row names of columbus and the region.id attribute of polys are set to columbus$NEIGNO.

Note

All source data files prepared by Luc Anselin, Spatial Analysis Laboratory, Department of Agricultural and Consumer Economics, University of Illinois, Urbana-Champaign, http://sal.agecon.uiuc.edu/datasets/columbus.zip.

Source

Anselin, Luc. 1988. Spatial econometrics: methods and models. Dordrecht: Kluwer Academic, Table 12.1 p. 189.

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  columbus <- sf::st_read(system.file("shapes/columbus.gpkg", package="spData")[1])
  plot(sf::st_geometry(columbus))
}

if (requireNamespace("spdep", quietly = TRUE)) {
  library(spdep)
  col.gal.nb <- read.gal(system.file("weights/columbus.gal", package="spData")[1])
}

Datasets to illustrate the concept of spatial congruence

Description

Sample of old (incongruent) and new (congruent) administrative zones from UK statistical agencies

Usage

congruent

Format

Simple feature geographic data in a projected CRS (OSGB) with random values assigned for teaching purposes.

Source

https://en.wikipedia.org/wiki/ONS_coding_system

Examples

if(requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  plot(aggregating_zones$geometry, lwd = 5)
  plot(congruent$geometry, add = TRUE, border = "green", lwd = 2)
  plot(incongruent$geometry, add = TRUE, border = "blue", col = NA)
  rbind(congruent, incongruent)
}
# Code used to download the data:
## Not run: 
#devtools::install_github("robinlovelace/ukboundaries")
library(sf)
library(tmap)
library(dplyr)
#library(ukboundaries)
sel = grepl("003|004", msoa2011_lds$geo_label)
aggregating_zones = st_transform(msoa2011_lds[sel, ], 27700)
# find lsoas in the aggregating_zones
lsoa_touching = st_transform(lsoa2011_lds, 27700)[aggregating_zones, ]
lsoa_cents = st_centroid(lsoa_touching)
lsoa_cents = lsoa_cents[aggregating_zones, ]
sel = lsoa_touching$geo_code %in% lsoa_cents$geo_code
# same for ed zones
ed_touching = st_transform(ed1981, 27700)[aggregating_zones, ]
ed_cents = st_centroid(ed_touching)
ed_cents = ed_cents[aggregating_zones, ]
incongruent_agg_ed = ed_touching[ed_cents, ]
set.seed(2017)
incongruent_agg_ed$value = rnorm(nrow(incongruent_agg_ed), mean = 5)
congruent = aggregate(incongruent_agg_ed["value"], lsoa_touching[sel, ], mean)
congruent$level = "Congruent"
congruent = congruent[c("level", "value")]
incongruent_cents = st_centroid(incongruent_agg_ed)
aggregating_value = st_join(incongruent_cents, congruent)$value.y
incongruent_agg = aggregate(incongruent_agg_ed["value"], list(aggregating_value), FUN = mean)
incongruent_agg$level = "Incongruent"
incongruent = incongruent_agg[c("level", "value")]
summary(st_geometry_type(congruent))
summary(st_geometry_type(incongruent))
incongruent = st_cast(incongruent, "MULTIPOLYGON")
summary(st_geometry_type(incongruent))
summary(st_geometry_type(aggregating_zones))
devtools::use_data(congruent, overwrite = TRUE)
devtools::use_data(incongruent, overwrite = TRUE)
devtools::use_data(aggregating_zones, overwrite = TRUE)

## End(Not run)

Cycle hire points in London

Description

Points representing cycle hire points accross London.

Usage

cycle_hire

Format

FORMAT:

id: Id of the hire point
name: Name of the point
area: Area they are in
nbikes: The number of bikes currently parked there
nempty: The number of empty places
geometry: sfc_POINT

Source

https://www.data.gov.uk/

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(cycle_hire)
  # or
  cycle_hire <- st_read(system.file("shapes/cycle_hire.geojson", package="spData"))
  
  plot(cycle_hire)
}

## Not run: 
# Download the data
cycle_hire = readr::read_csv("http://cyclehireapp.com/cyclehirelive/cyclehire.csv", 
  col_names = FALSE, skip = TRUE)
cycle_hire = cycle_hire[c_names]
c_names = c("id", "name", "area", "lat", "lon", "nbikes", "nempty")
cycle_hire = st_sf(cycle_hire, st_multipoint(c_names[c("lon", "lat")]))

## End(Not run)

Cycle hire points in London from OSM

Description

Dataset downloaded using the osmdata package representing cycle hire points accross London.

Usage

cycle_hire_osm

Format

osm_id: The OSM ID
name: The name of the cycle point
capacity: How many bikes it can take
cyclestreets_id: The ID linked to cyclestreets' photomap
description: Additional description of points
geometry: sfc_POINT

Source

https://www.openstreetmap.org

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(cycle_hire_osm)
  # or
  cycle_hire_osm <- st_read(system.file("shapes/cycle_hire_osm.geojson", package="spData"))
  
  plot(cycle_hire_osm)
}

# Code used to download the data:
## Not run: 
library(osmdata)
library(dplyr)
library(sf)
q = add_osm_feature(opq = opq("London"), key = "network", value = "tfl_cycle_hire")
lnd_cycle_hire = osmdata_sf(q)
cycle_hire_osm = lnd_cycle_hire$osm_points
nrow(cycle_hire_osm)
plot(cycle_hire_osm)
cycle_hire_osm = dplyr::select(cycle_hire_osm, osm_id, name, capacity, 
                               cyclestreets_id, description) %>%
  mutate(capacity = as.numeric(capacity))
names(cycle_hire_osm)
nrow(cycle_hire_osm)

## End(Not run)

Municipality departments of Athens (Sf)

Description

The geographic boundaries of departments (sf) of the municipality of Athens. This is accompanied by various characteristics in these areas.

Usage

depmunic

Format

An sf object of 7 polygons with the following 7 variables.

num_dep: An unique identifier for each municipality department.
airbnb: The number of airbnb properties in 2017
museums: The number of museums
population: The population recorded in census at 2011.
pop_rest: The number of citizens that the origin is a non european country.
greensp: The area of green spaces (unit: square meters).
area: The area of the polygon (unit: square kilometers).

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(depmunic)
  
  depmunic$foreigners <- 100*depmunic$pop_rest/depmunic$population
  plot(depmunic["foreigners"], key.pos=1)
}

Eire data sets

Description

The Eire data set has been converted to shapefile format and placed in the etc/shapes directory. The initial data objects are now stored as a SpatialPolygonsDataFrame object, from which the contiguity neighbour list is recreated. For purposes of record, the original data set is retained. The eire.df data frame has 26 rows and 9 columns. In addition, polygons of the 26 counties are provided as a multipart polylist in eire.polys.utm (coordinates in km, projection UTM zone 30). Their centroids are in eire.coords.utm. The original Cliff and Ord binary contiguities are in eire.nb.

Format

This data frame contains the following columns:

A: Percentage of sample with blood group A
towns: Towns/unit area
pale: Beyond the Pale 0, within the Pale 1
size: number of blood type samples
ROADACC: arterial road network accessibility in 1961
OWNCONS: percentage in value terms of gross agricultural output of each county consumed by itself
POPCHG: 1961 population as percentage of 1926
RETSALE: value of retail sales British Pound000
INCOME: total personal income British Pound000
names: County names

Source

Upton and Fingleton 1985, - Bailey and Gatrell 1995, ch. 1 for blood group data, Cliff and Ord (1973), p. 107 for remaining variables (also after O'Sullivan, 1968). Polygon borders and Irish data sourced from Michael Tiefelsdorf's SPSS Saddlepoint bundle, originally hosted at: http://geog-www.sbs.ohio-state.edu/faculty/tiefelsdorf/GeoStat.htm.

Examples


library(spdep)
eire <- sf::st_read(system.file("shapes/eire.gpkg", package="spData")[1])
eire.nb <- poly2nb(eire)

# Eire physical anthropology blood group data
summary(eire$A)
brks <- round(fivenum(eire$A), digits=2)
cols <- rev(heat.colors(4))
plot(eire, col=cols[findInterval(eire$A, brks, all.inside=TRUE)])
title(main="Percentage with blood group A in Eire")
legend(x=c(-50, 70), y=c(6120, 6050), 
  c("under 27.91", "27.91 - 29.26", "29.26 - 31.02", "over 31.02"),
  fill=cols, bty="n")

plot(st_geometry(eire))
plot(eire.nb, st_geometry(eire), add=TRUE)

lA <- lag.listw(nb2listw(eire.nb), eire$A)
summary(lA)
moran.test(eire$A, nb2listw(eire.nb))
geary.test(eire$A, nb2listw(eire.nb))
cor(lA, eire$A)
moran.plot(eire$A, nb2listw(eire.nb), labels=eire$names)
A.lm <- lm(A ~ towns + pale, data=eire)
summary(A.lm)
res <- residuals(A.lm)
brks <- c(min(res),-2,-1,0,1,2,max(res))
cols <- rev(cm.colors(6))

plot(eire, col=cols[findInterval(res, brks, all.inside=TRUE)])
title(main="Regression residuals")
legend(x=c(-50, 70), y=c(6120, 6050),
  legend=c("under -2", "-2 - -1", "-1 - 0", "0 - 1", "1 - 2", "over 2"),
  fill=cols, bty="n")

lm.morantest(A.lm, nb2listw(eire.nb))
lm.morantest.sad(A.lm, nb2listw(eire.nb))
lm.LMtests(A.lm, nb2listw(eire.nb), test="LMerr")

# Eire agricultural data
brks <- round(fivenum(eire$OWNCONS), digits=2)
cols <- grey(4:1/5)
plot(eire, col=cols[findInterval(eire$OWNCONS, brks, all.inside=TRUE)])
title(main="Percentage own consumption of agricultural produce")
legend(x=c(-50, 70), y=c(6120, 6050),
  legend=c("under 9", "9 - 12.2", "12.2 - 19", "over 19"), fill=cols, bty="n")

moran.plot(eire$OWNCONS, nb2listw(eire.nb))
moran.test(eire$OWNCONS, nb2listw(eire.nb))
e.lm <- lm(OWNCONS ~ ROADACC, data=eire)
res <- residuals(e.lm)
brks <- c(min(res),-2,-1,0,1,2,max(res))
cols <- rev(cm.colors(6))
plot(eire, col=cols[findInterval(res, brks, all.inside=TRUE)])
title(main="Regression residuals")
legend(x=c(-50, 70), y=c(6120, 6050),
  legend=c("under -2", "-2 - -1", "-1 - 0", "0 - 1", "1 - 2", "over 2"),
  fill=cm.colors(6), bty="n")

lm.morantest(e.lm, nb2listw(eire.nb))
lm.morantest.sad(e.lm, nb2listw(eire.nb))
lm.LMtests(e.lm, nb2listw(eire.nb), test="LMerr")
print(localmoran.sad(e.lm, eire.nb, select=seq(along=eire.nb)))

1980 Presidential election results

Description

A data set for 1980 Presidential election results covering 3,107 US counties using geographical coordinates. In addition, three spatial neighbour objects, k4 not using Great Circle distances, dll using Great Circle distances, and e80_queen of Queen contiguities for equivalent County polygons taken from file co1980p020.tar.gz on the USGS National Atlas site, and a spatial weights object imported from elect.ford - a 4-nearest neighbour non-GC row-standardised object, but with coercion to symmetry.

Usage

elect80

Format

A SpatialPointsDataFrame with 3107 observations on the following 7 variables.

FIPS: a factor of county FIPS codes
long: a numeric vector of longitude values
lat: a numeric vector of latitude values
pc_turnout: Votes cast as proportion of population over age 19 eligible to vote
pc_college: Population with college degrees as proportion of population over age 19 eligible to vote
pc_homeownership: Homeownership as proportion of population over age 19 eligible to vote
pc_income: Income per capita of population over age 19 eligible to vote

Source

Pace, R. Kelley and Ronald Barry. 1997. "Quick Computation of Spatial Autoregressive Estimators", in Geographical Analysis; sourced from the data folder in the Spatial Econometrics Toolbox for Matlab, formerly available from http://www.spatial-econometrics.com/html/jplv7.zip, files elect.dat and elect.ford (with the final line dropped).

Examples

if (requireNamespace("sp", quietly = TRUE)) {
  library(sp)
  data(elect80)
  summary(elect80)
  plot(elect80)
}

Artificial elevation raster data set

Description

The raster data represents elevation in meters and uses WGS84 as a coordinate reference system.

Examples

system.file("raster/elev.tif", package = "spData")

Getis-Ord remote sensing example data

Description

The go_xyz data frame has 256 rows and 3 columns. Vectors go_x and go_y are of length 16 and give the centres of the grid rows and columns, 30m apart. The data start from the bottom left, Getis and Ord start from the top left - so their 136th grid cell is our 120th.

Format

This data frame contains the following columns:

x: grid eastings
y: grid northings
val: remote sensing values

Source

Getis, A. and Ord, J. K. 1996 Local spatial statistics: an overview. In P. Longley and M. Batty (eds) Spatial analysis: modelling in a GIS environment (Cambridge: Geoinformation International), 266.

Examples

data(getisord)
image(go_x, go_y, t(matrix(go_xyz$val, nrow = 16, ncol=16, byrow = TRUE)), asp = 1)
text(go_xyz$x, go_xyz$y, go_xyz$val, cex = 0.7)
polygon(c(195, 225, 225, 195), c(195, 195, 225, 225), lwd = 2)
title(main = "Getis-Ord 1996 remote sensing data")

Artificial raster dataset representing grain sizes

Description

The ratified raster dataset represents grain sizes with the three classes clay, silt and sand, and WGS84 as a coordinate reference system.

Examples

system.file("raster/grain.tif", package = "spData")

Hawaii multipolygon

Description

The object loaded is a sf object containing the state of Hawaii from the US Census Bureau with a few variables from American Community Survey (ACS)

Usage

hawaii

Format

Formal class 'sf' [package "sf"]; the data contains a data.frame with 1 obs. of 7 variables:

GEOID: character vector of geographic identifiers
NAME: character vector of state names
REGION: character vector of region names
AREA: area in square kilometers of units class
total_pop_10: numerical vector of total population in 2010
total_pop_15: numerical vector of total population in 2015
geometry: sfc_MULTIPOLYGON

The object is in projected coordinates using Hawaii Albers Equal Area Conic (ESRI:102007).

Source

https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(hawaii)

  plot(hawaii["total_pop_15"])
}

Hopkins burnt savanna herb remains

Description

A 20m square is divided into 40 by 40 0.5m quadrats. Observations are in tens of grams of herb remains, 0 being from 0g to less than 10g, and so on. Analysis was mostly conducted using the interior 32 by 32 grid.

Usage

hopkins

Format

num [1:40, 1:40] 0 0 0 0 0 0 0 0 0 1 ...

Source

Upton, G., Fingleton, B. 1985 Spatial data analysis by example: point pattern and quatitative data, Wiley, pp. 38–39.

References

Hopkins, B., 1965 Observations on savanna burning in the Olokemeji Forest Reserve, Nigeria. Journal of Applied Ecology, 2, 367–381.

Examples

data(hopkins)
image(1:32, 1:32, hopkins[5:36,36:5], breaks=c(-0.5, 3.5, 20),
      col=c("white", "black"))

Lucas county OH housing

Description

Data on 25,357 single family homes sold in Lucas County, Ohio, 1993-1998 from the county auditor, together with an nb neighbour object constructed as a sphere of influence graph from projected coordinates.

Usage

house

Format

Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots. The data slot is a data frame with 25357 observations on the following 24 variables.

price: a numeric vector
yrbuilt: a numeric vector
stories: a factor with levels one bilevel multilvl one+half two two+half three
TLA: a numeric vector
wall: a factor with levels stucdrvt ccbtile metlvnyl brick stone wood partbrk
beds: a numeric vector
baths: a numeric vector
halfbaths: a numeric vector
frontage: a numeric vector
depth: a numeric vector
garage: a factor with levels no garage basement attached detached carport
garagesqft: a numeric vector
rooms: a numeric vector
lotsize: a numeric vector
sdate: a numeric vector
avalue: a numeric vector
s1993: a numeric vector
s1994: a numeric vector
s1995: a numeric vector
s1996: a numeric vector
s1997: a numeric vector
s1998: a numeric vector
syear: a factor with levels 1993 1994 1995 1996 1997 1998
age: a numeric vector

Its projection is CRS(+init=epsg:2834), the Ohio North State Plane.

Source

Dataset included in the Spatial Econometrics toolbox for Matlab, formerly available from http://www.spatial-econometrics.com/html/jplv7.zip.

Examples

if (requireNamespace("sp", quietly = TRUE)) {
  library(sp)
  data(house)
  str(house)
  plot(house)
}

Prevalence of respiratory symptoms

Description

Prevalence of respiratory symptoms in 71 school catchment areas in Huddersfield, Northern England

Usage

huddersfield

Format

A data frame with 71 observations on the following 2 variables.

cases: Prevalence of at least mild conditions
total: Number of questionnaires returned

Source

Martuzzi M, Elliott P (1996) Empirical Bayes estimation of small area prevalence of non-rare conditions, Statistics in Medicine 15, 1867–1873, pp. 1870–1871.

Examples

data(huddersfield)
str(huddersfield)

Illinois 1959 county gross farm product value per acre

Description

Classic data set for the choice of class intervals for choropleth maps.

Usage

jenks71

Format

A data frame with 102 observations on the following 2 variables.

jenks71: a numeric vector: Per acre value of gross farm products in dollars by county for Illinois in #' 1959
area: a numeric vector: county area in square miles

Source

Jenks, G. F., Caspall, F. C., 1971. "Error on choroplethic maps: definition, measurement, reduction". Annals, Association of American Geographers, 61 (2), 217–244

Examples

data(jenks71)
jenks71

The boroughs of London

Description

Polygons representing large administrative zones in London

Usage

lnd

Format

NAME: Borough name
GSS_CODE: Official code
HECTARES: How many hectares
NONLD_AREA: Area outside London
ONS_INNER: Office for national statistics code
SUB_2009: Empty column
SUB_2006: Empty column
geometry: sfc_MULTIPOLYGON

Source

https://github.com/Robinlovelace/Creating-maps-in-R

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(lnd)
  summary(lnd)
  plot(st_geometry(lnd))
}

North Carolina SIDS data

Description

(Use example(nc.sids) to read the data set from shapefile, together with import of two different list of neighbours). The nc.sids data frame has 100 rows and 21 columns. It contains data given in Cressie (1991, pp. 386-9), Cressie and Read (1985) and Cressie and Chan (1989) on sudden infant deaths in North Carolina for 1974-78 and 1979-84. The data set also contains the neighbour list given by Cressie and Chan (1989) omitting self-neighbours (ncCC89.nb), and the neighbour list given by Cressie and Read (1985) for contiguities (ncCR85.nb). The data are ordered by county ID number, not alphabetically as in the source tables sidspolys is a "polylist" object of polygon boundaries, and sidscents is a matrix of their centroids.

Usage

nc.sids

Format

This data frame contains the following columns:

SP_ID: SpatialPolygons ID
CNTY_ID: county ID
east: eastings, county seat, miles, local projection
north: northings, county seat, miles, local projection
L_id: Cressie and Read (1985) L index
M_id: Cressie and Read (1985) M index
names: County names
AREA: County polygon areas in degree units
PERIMETER: County polygon perimeters in degree units
CNTY_: Internal county ID
NAME: County names
FIPS: County ID
FIPSNO: County ID
CRESS_ID: Cressie papers ID
BIR74: births, 1974-78
SID74: SID deaths, 1974-78
NWBIR74: non-white births, 1974-78
BIR79: births, 1979-84
SID79: SID deaths, 1979-84
NWBIR79: non-white births, 1979-84

Source

Cressie, N (1991), Statistics for spatial data. New York: Wiley, pp. 386–389; Cressie, N, Chan NH (1989) Spatial modelling of regional variables. Journal of the American Statistical Association, 84, 393–401; Cressie, N, Read, TRC (1985) Do sudden infant deaths come in clusters? Statistics and Decisions Supplement Issue 2, 333–349; http://sal.agecon.uiuc.edu/datasets/sids.zip.

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  if (requireNamespace("spdep", quietly = TRUE)) {
    library(spdep)
    nc.sids <- sf::st_read(system.file("shapes/sids.gpkg", package="spData")[1])
    row.names(nc.sids) <- as.character(nc.sids$FIPS)
    rn <- row.names(nc.sids)
    ncCC89_nb <- read.gal(system.file("weights/ncCC89.gal", package="spData")[1],
                          region.id=rn)
    ncCR85_nb <- read.gal(system.file("weights/ncCR85.gal", package="spData")[1],
                          region.id=rn)
                          
    plot(sf::st_geometry(nc.sids), border="grey")
    plot(ncCR85_nb, sf::st_geometry(nc.sids), add=TRUE, col="blue")
    plot(sf::st_geometry(nc.sids), border="grey")
    plot(ncCC89_nb, sf::st_geometry(nc.sids), add=TRUE, col="blue")
  }
}

New York leukemia data

Description

New York leukemia data taken from the data sets supporting Waller and Gotway 2004 (the data should be loaded by running example(NY_data) to demonstrate spatial data import techniques)

Usage

nydata

Format

A data frame with 281 observations on the following 12 variables, and the binary coded spatial weights used in the source.

AREANAME: name of census tract
AREAKEY: unique FIPS code for each tract
X: x-coordinate of tract centroid (in km)
Y: y-coordinate of tract centroid (in km)
POP8: population size (1980 U.S. Census)
TRACTCAS: number of cases 1978-1982
PROPCAS: proportion of cases per tract
PCTOWNHOME: percentage of people in each tract owning their own home
PCTAGE65P: percentage of people in each tract aged 65 or more
Z: ransformed propoprtions
AVGIDIST: average distance between centroid and TCE sites
PEXPOSURE: "exposure potential": inverse distance between each census tract centroid and the nearest TCE site, IDIST, transformed via log(100*IDIST)
Cases: as TRACTCAS with more digits
Xm: X in metres
Ym: Y in metres
Xshift: feature offset
Yshift: feature offset

Details

The examples section shows how the DBF files from the book website for Chapter 9 were converted into the nydata data frame and the listw_NY spatial weights list. The shapes directory includes the original version of the UTM18 census tract boundaries imported from BNA format (http://sedac.ciesin.columbia.edu/ftpsite/pub/census/usa/tiger/ny/bna_st/t8_36.zip) before the OGR/GDAL BNA driver was available. The NY8_utm18 shapefile was constructed using a bna2mif converter and converted to shapefile format after adding data using writeOGR. The new file NY8_bna_utm18.gpkg has been constructed from the original BNA file, but read using the OGR BNA driver with GEOS support. The NY8 shapefile and GeoPackage NY8_utm18.gpkg include invalid polygons, but because the OGR BNA driver may have GEOS support (used here), the tract polygon objects in NY8_bna_utm18.gpkg are valid.

Source

http://www.sph.emory.edu/~lwaller/ch9index.htm

References

Waller, L. and C. Gotway (2004) Applied Spatial Statistics for Public Health Data. New York: John Wiley and Sons.

Examples

## NY leukemia

if (requireNamespace("sf", quietly = TRUE)) {
library(foreign)
nydata <- read.dbf(system.file("misc/nydata.dbf", package="spData")[1])
nydata <- sf::st_as_sf(nydata, coords=c("X", "Y"), remove=FALSE)
plot(sf::st_geometry(nydata))

nyadjmat <- as.matrix(read.dbf(system.file("misc/nyadjwts.dbf",
                                           package="spData")[1])[-1])
ID <- as.character(names(read.dbf(system.file("misc/nyadjwts.dbf",
                                              package="spData")[1]))[-1])
identical(substring(ID, 2, 10), substring(as.character(nydata$AREAKEY), 2, 10))

if (requireNamespace("sf", quietly = TRUE)) {
library(spdep)
listw_NY <- mat2listw(nyadjmat, as.character(nydata$AREAKEY), style="B")
}
}

Regions in New Zealand

Description

Polygons representing the 16 regions of New Zealand (2018). See https://en.wikipedia.org/wiki/Regions_of_New_Zealand for a description of these regions and https://www.stats.govt.nz for information on the data source

Usage

nz

Format

FORMAT:

Name: Name
Island: Island
Land_area: Land area
Population: Population
Median_income: Median income (NZD)
Sex_ratio: Sex ratio (male/female)
geom: sfc_MULTIPOLYGON

Source

https://www.stats.govt.nz

https://en.wikipedia.org/wiki/Regions_of_New_Zealand

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  summary(nz)
  plot(nz)
}
## Not run: 
# Find "Regional Council 2018 Clipped (generalised)"
# select the GeoPackage option in the "Vectors/tables" dropdown
# at https://datafinder.stats.govt.nz/data/ (requires registration)
# Save the result as:
unzip("statsnzregional-council-2018-clipped-generalised-GPKG.zip")
library(sf)
library(tidyverse)
nz_full = st_read("regional-council-2018-clipped-generalised.gpkg")
print(object.size(nz_full), units = "Kb") # 14407.2 Kb
nz = rmapshaper::ms_simplify(nz_full, keep = 0.001, sys = TRUE)
print(object.size(nz), units = "Kb") # 39.9 Kb
names(nz)
nz$REGC2018_V1_00_NAME
nz = filter(nz, REGC2018_V1_00_NAME != "Area Outside Region") %>%
        select(Name = REGC2018_V1_00_NAME, `Land_area` = LAND_AREA_SQ_KM)
# regions basic info
# devtools::install_github("hadley/rvest")
library(rvest)
doc = read_html("https://en.wikipedia.org/wiki/Regions_of_New_Zealand") %>%
        html_nodes("div table")
tab = doc[[3]] %>% html_table()
tab = tab %>% select(Name = Region, Population = `Population[20]`, Island)
tab = tab %>% mutate(Population = str_replace_all(Population, ",", "")) %>%
        mutate(Population = as.numeric(Population)) %>%
        mutate(Name = str_remove_all(Name, " \\([1-9]\\)?.+"))
nz$Name = as.character(nz$Name)
nz$Name = str_remove(nz$Name, " Region")
nz$Name %in% tab$Name
# regions additional info
library(nzcensus)
nz_add_data = REGC2013 %>% 
        select(Name = REGC2013_N, Median_income = MedianIncome2013, 
               PropFemale2013, PropMale2013) %>% 
        mutate(Sex_ratio = PropMale2013 / PropFemale2013) %>% 
        mutate(Name = gsub(" Region", "", Name)) %>% 
        select(Name, Median_income, Sex_ratio)
# data join
nz = left_join(nz, tab, by = "Name") %>% 
        left_join(nz_add_data, by = "Name") %>% 
        select(Name, Island, Land_area, Population, Median_income, Sex_ratio)

## End(Not run)

High points in New Zealand

Description

Top 101 heighest points in New Zealand (2017). See https://data.linz.govt.nz/layer/50284-nz-height-points-topo-150k/ for details.

Usage

nz_height

Format

FORMAT:

t50_fid: ID
elevation: Height above sea level in m
geometry: sfc_POINT

Source

https://data.linz.govt.nz

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  summary(nz_height)
  plot(nz$geom)
  plot(nz_height$geom, add = TRUE)
}
## Not run: 
library(dplyr)
# After downloading data
unzip("lds-nz-height-points-topo-150k-SHP.zip")
nz_height = st_read("nz-height-points-topo-150k.shp") %>% 
  top_n(n = 100, wt = elevation)
library(tmap)
tmap_mode("view")
qtm(nz) +
  qtm(nz_height)
f = list.files(pattern = "*nz-height*")
file.remove(f)

## End(Not run)

Dataset of properties in the municipality of Athens (sf)

Description

A dataset of apartments in the municipality of Athens for 2017. Point location of the properties is given together with their main characteristics and the distance to the closest metro/train station.

Usage

properties

Format

An sf object of 1000 points with the following 6 variables.

id: An unique identifier for each property.
size : The size of the property (unit: square meters)
price : The asking price (unit: euros)
prpsqm : The asking price per squre meter (unit: euroes/square meter).
age : Age of property in 2017 (unit: years).
dist_metro: The distance to closest train/metro station (unit: meters).

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  if (requireNamespace("spdep", quietly = TRUE)) {
    library(sf)
    library(spdep)
    
    data(properties)
    
    summary(properties$prpsqm)
    
    pr.nb.800 <- dnearneigh(properties, 0, 800)
    pr.listw <- nb2listw(pr.nb.800)
    
    moran.test(properties$prpsqm, pr.listw)
    moran.plot(properties$prpsqm, pr.listw, xlab = "Price/m^2", ylab = "Lagged")
  }
}

Small river network in France

Description

Lines representing the Seine, Marne and Yonne rivers.

Usage

seine

Format

FORMAT:

name: name
geometry: sfc_MULTILINESTRING

The object is in the RGF93 / Lambert-93 CRS.

Source

https://www.naturalearthdata.com/

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  seine
  plot(seine)
}
## Not run: 
library(sf)
library(rnaturalearth)
library(tidyverse)

seine = ne_download(scale = 10, type = "rivers_lake_centerlines", 
                    category = "physical", returnclass = "sf") %>% 
        filter(name %in% c("Yonne", "Seine", "Marne")) %>% 
        select(name = name_en) %>% 
        st_transform(2154)

## End(Not run)

US State Visibility Based Map

Description

A SpatialPolygonsDataFrame object to plot a Visibility Based Map.

Usage

state.vbm

Format

An object of class SpatialPolygonsDataFrame with 50 rows and 2 columns.

Details

A SpatialPolygonsDataFrame object to plot a map of the US states where the sizes of the states have been adjusted to be more equal. This map can be useful for plotting state data using colors patterns without the larger states dominating and the smallest states being lost. The original map is copyrighted by Mark Monmonier. Official publications based on this map should acknowledge him. Comercial publications of maps based on this probably need permission from him to use.

Author(s)

Greg Snow greg.snow@imail.org (of this compilation)

Source

The data was converted from the maps library for S-PLUS. S-PLUS uses the map with permission from the author. This version of the data has not received permission from the author (no attempt made, not that it was refused), most of my uses I feel fall under fair use and do not violate copyright, but you will need to decide for yourself and your applications.

References

http://www.markmonmonier.com/index.htm, http://euclid.psych.yorku.ca/SCS/Gallery/bright-ideas.html

Examples

if (requireNamespace("sp", quietly = TRUE)) {
  library(sp)
  data(state.vbm)
  plot(state.vbm)

  tmp <- state.x77[, 'HS Grad']
  tmp2 <- cut(tmp, seq(min(tmp), max(tmp), length.out=11),
            include.lowest=TRUE)
  plot(state.vbm, col=cm.colors(10)[tmp2])
}

Major urban areas worldwide

Description

Dataset in a 'long' form from the United Nations population division with projections up to 2050. Includes only the top 30 largest areas by population at 5 year intervals.

Usage

urban_agglomerations

Format

Selected variables:

year: Year of population estimate
country_code: Code of country
urban_agglomeration: Name of the urban agglomeration
population_millions: Estimated human population
geometry: sfc_POINT

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  plot(urban_agglomerations)
}
# Code used to download the data:
## Not run: 
f = "WUP2018-F11b-30_Largest_Cities_in_2018_by_time.xls"
download.file(
  destfile = f,
  url = paste0("https://population.un.org/wup/Download/Files/", f)
 )
library(dplyr)
library(sf)
urban_agglomerations = readxl::read_excel(f, skip = 16) %>%
    st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)
names(urban_agglomerations)
names(urban_agglomerations) <- gsub(" |\\n", "_", tolower(names(urban_agglomerations)) ) %>% 
        gsub("\\(|\\)", "", .)
names(urban_agglomerations)
urban_agglomerations
usethis::use_data(urban_agglomerations, overwrite = TRUE)
file.remove("WUP2018-F11b-30_Largest_Cities_in_2018_by_time.xls")

## End(Not run)

US states polygons

Description

The object loaded is a sf object containing the contiguous United States data from the US Census Bureau with a few variables from American Community Survey (ACS)

Usage

us_states

Format

Formal class 'sf' [package "sf"]; the data contains a data.frame with 49 obs. of 7 variables:

GEOID: character vector of geographic identifiers
NAME: character vector of state names
REGION: character vector of region names
AREA: area in square kilometers of units class
total_pop_10: numerical vector of total population in 2010
total_pop_15: numerical vector of total population in 2015
geometry: sfc_MULTIPOLYGON

The object is in geographical coordinates using the NAD83 datum.

Source

https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(us_states)

  plot(us_states["REGION"])
}

the American Community Survey (ACS) data

Description

The object loaded is a data.frame object containing the US states data from the American Community Survey (ACS)

Usage

us_states_df

Format

Formal class 'data.frame'; the data contains a data.frame with 51 obs. of 5 variables:

state: character vector of state names
median_income_10: numerical vector of median income in 2010
median_income_15: numerical vector of median income in 2010
poverty_level_10: numerical vector of number of people with income below poverty level in 2010
poverty_level_15: numerical vector of number of people with income below poverty level in 2015

Source

https://www.census.gov/programs-surveys/acs/

Examples

data(us_states_df)

summary(us_states_df)

US 1960 used car prices

Description

The used.cars data frame has 48 rows and 2 columns. The data set includes a neighbours list for the 48 states excluding DC from poly2nb().

Usage

used.cars

Format

This data frame contains the following columns:

tax.charges: taxes and delivery charges for 1955-9 new cars
price.1960: 1960 used car prices by state

Source

Hanna, F. A. 1966 Effects of regional differences in taxes and transport charges on automobile consumption, in Ostry, S., Rhymes, J. K. (eds) Papers on regional statistical studies, Toronto: Toronto University Press, pp. 199-223.

References

Hepple, L. W. 1976 A maximum likelihood model for econometric estimation with spatial series, in Masser, I (ed) Theory and practice in regional science, London: Pion, pp. 90-104.

Examples

if (requireNamespace("spdep", quietly = TRUE)) {
  library(spdep)
  data(used.cars)
  moran.test(used.cars$price.1960, nb2listw(usa48.nb))
  moran.plot(used.cars$price.1960, nb2listw(usa48.nb),
           labels=rownames(used.cars))
  uc.lm <- lm(price.1960 ~ tax.charges, data=used.cars)
  summary(uc.lm)

  lm.morantest(uc.lm, nb2listw(usa48.nb))
  lm.morantest.sad(uc.lm, nb2listw(usa48.nb))
  lm.LMtests(uc.lm, nb2listw(usa48.nb))

  if (requireNamespace("spatialreg", quietly = TRUE)) {
    library(spatialreg)
    uc.err <- errorsarlm(price.1960 ~ tax.charges, data=used.cars,
                       nb2listw(usa48.nb), tol.solve=1.0e-13, 
                       control=list(tol.opt=.Machine$double.eps^0.3))
    summary(uc.err)
    uc.lag <- lagsarlm(price.1960 ~ tax.charges, data=used.cars,
                     nb2listw(usa48.nb), tol.solve=1.0e-13, 
                     control=list(tol.opt=.Machine$double.eps^0.3))
    summary(uc.lag)
    uc.lag1 <- lagsarlm(price.1960 ~ 1, data=used.cars,
                      nb2listw(usa48.nb), tol.solve=1.0e-13, 
                      control=list(tol.opt=.Machine$double.eps^0.3))
    summary(uc.lag1)
    uc.err1 <- errorsarlm(price.1960 ~ 1, data=used.cars,
                        nb2listw(usa48.nb), tol.solve=1.0e-13, 
                        control=list(tol.opt=.Machine$double.eps^0.3),
                        Durbin=FALSE)
    summary(uc.err1)
  }

}

Mercer and Hall wheat yield data

Description

Mercer and Hall wheat yield data, based on version in Cressie (1993), p. 455.

Usage

wheat

Format

The format of the object generated by running data(wheat) is a three column data frame made available by Hongfei Li. The example section shows how to convert this to the object used in demonstrating the aple function, and is a formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots; the data slot is a data frame with 500 observations on the following 6 variables.

lat: local coordinates northings ordered north to south
yield: Mercer and Hall wheat yield data
r: rows south to north; levels in distance units of plot centres
c: columns west to east; levels in distance units of plot centres
lon: local coordinates eastings
lat1: local coordinates northings ordered south to north

Note

The value of 4.03 was changed to 4.33 (wheat[71,]) 13 January 2014; thanks to Sandy Burden; cross-checked with http://www.itc.nl/personal/rossiter/teach/R/mhw.csv, which agrees.

Source

Cressie, N. A. C. (1993) Statistics for Spatial Data. Wiley, New York, p. 455.

References

Mercer, W. B. and Hall, A. D. (1911) The experimental error of field trials. Journal of Agricultural Science 4, 107-132.

Examples


if (requireNamespace("sp", quietly = TRUE)) {
library(sp)
data(wheat)
wheat$lat1 <- 69 - wheat$lat
wheat$r <- factor(wheat$lat1)
wheat$c <- factor(wheat$lon)
wheat_sp <- wheat
coordinates(wheat_sp) <- c("lon", "lat1")
wheat_spg <- wheat_sp

gridded(wheat_spg) <- TRUE
wheat_spl <- as(wheat_spg, "SpatialPolygons")
df <- as(wheat_spg, "data.frame")
row.names(df) <- sapply(slot(wheat_spl, "polygons"),
                        function(x) slot(x, "ID"))
wheat <- SpatialPolygonsDataFrame(wheat_spl, data=df)
}


if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  wheat <- st_read(system.file("shapes/wheat.gpkg", package="spData"))
  plot(wheat)
}

World country polygons

Description

The object loaded is a sf object containing a world map data from Natural Earth with a few variables from World Bank

Usage

world

Format

Formal class 'sf' [package "sf"]; the data contains a data.frame with 177 obs. of 11 variables:

iso_a2: character vector of ISO 2 character country codes
name_long: character vector of country names
continent: character vector of continent names
region_un: character vector of region names
subregion: character vector of subregion names
type: character vector of type names
area_km2: integer vector of area values
pop: integer vector of population in 2014
lifeExp: integer vector of life expectancy at birth in 2014
gdpPercap: integer vector of per-capita GDP in 2014
geom: sfc_MULTIPOLYGON

The object is in geographical coordinates using the WGS84 datum.

Source

https://www.naturalearthdata.com/

https://data.worldbank.org/

Examples

if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
  data(world)
  # or
  world <- st_read(system.file("shapes/world.gpkg", package="spData"))

  plot(world)
}

World Bank data

Description

The object loaded is a data.frame object containing data from World Bank

Usage

worldbank_df

Format

Formal class 'data.frame'; the data contains a data.frame with 177 obs. of 7 variables:

name: character vector of country names
iso_a2: character vector of ISO 2 character country codes
HDI: human development index (HDI)
urban_pop: urban population
unemployment: unemployment, total (% of total labor force)
pop_growth: population growth (annual %)
literacy: adult literacy rate, population 15+ years, both sexes (%)

Source

https://data.worldbank.org/

Examples

data(worldbank_df)
# or
worldbank_df <- read.csv(system.file("misc/worldbank_df.csv", package="spData"))

summary(worldbank_df)

Data for Splash Dams in western Oregon

Description

Usage

Format

Source

Examples

Spatial patterns of conflict in Africa 1966-78

Description

Usage

Format

Note

Source

Examples

Alaska multipolygon

Description

Usage

Format

Source

See Also

Examples

Marshall's infant mortality in Auckland dataset

Description

Usage

Format

Details

Source

Examples

House sales prices, Baltimore, MD 1978

Description

Usage

Format

Source

References

Examples

Corrected Boston Housing Data

Description

Format

Note

Source

References

Examples

World coffee production data

Description

Usage

Format

Details

Source

Examples

Columbus OH spatial analysis data set

Description

Usage

Format

Details

Note

Source

Examples

Datasets to illustrate the concept of spatial congruence

Description

Usage

Format

Source

Examples

Cycle hire points in London

Description

Usage

Format

Source

Examples

Cycle hire points in London from OSM

Description

Usage

Format

Source

See Also

Examples

Municipality departments of Athens (Sf)

Description

Usage

Format

See Also