Some spatial cross-validation methods require you to set an inclusion radius and a buffer distance for the folds into which you divide your data. I am working with census tracts, and I think it makes sense to base both distances on neighborhoods.
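To make the two distances concrete, here is a minimal sketch of how a single fold might be built from them. It uses toy data, not the census tracts. The random points, the UTM zone 16N projection (EPSG:26916), and the `radius` and `buffer` values are all assumptions chosen for illustration, and real cross-validation packages may build their folds differently.

```r
library(sf)

# Hypothetical toy data: 200 random points in a 10 km x 10 km square,
# in a projected CRS (EPSG:26916, UTM zone 16N) so distances are in meters.
set.seed(42)
pts <- st_as_sf(
  data.frame(
    x = runif(200, 440000, 450000),
    y = runif(200, 4630000, 4640000)
  ),
  coords = c("x", "y"),
  crs = 26916
)

radius <- 1000 # assumed inclusion radius, in meters
buffer <- 500  # assumed buffer distance, in meters

# Step 1: seed the fold with one point, then pull every point
# within `radius` of the seed into the test set.
seed_pt <- pts[1, ]
in_test <- lengths(st_is_within_distance(pts, seed_pt, dist = radius)) > 0

# Step 2: points within `buffer` of any test point (but not in the test set)
# are held out of both the training and test sets.
near_test <- lengths(st_is_within_distance(pts, pts[in_test, ], dist = buffer)) > 0
in_buffer <- near_test & !in_test

# Step 3: everything else is training data.
fold <- ifelse(in_test, "test", ifelse(in_buffer, "buffer", "train"))
table(fold)
```

A larger radius makes each test set cover more ground, while a larger buffer discards more points in exchange for less spatial leakage between the training and test sets.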
Set-Up
Load the data wrangling and mapping libraries, then import Chicago neighborhoods:
library(tidyverse)
library(sf)
library(tmap)

# Source: Chicago Data Portal - Boundaries - Neighborhoods
# URL: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Neighborhoods/bbvz-uum9
neighborhoods <-
  st_read(here::here("data", "boundaries_neighborhoods_chicago.kml"))
Reading layer `Layer0' from data source
`/home/riggins/blog/data/boundaries_neighborhoods_chicago.kml'
using driver `LIBKML'
Simple feature collection with 98 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
Geodetic CRS: WGS 84