SF package

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(sfheaders)

sf - basically a tibble, but one of the columns hold geometric / geographical information (like a shape, or point).

sfheaders - helps with transition from sf to tibble, and vice versa.

Getting data

There are tons of shapefiles online. This is one such example.

nyc = read_sf(dsn = "tl_2019_us_zcta510/", layer = 'tl_2019_us_zcta510') %>%
  mutate(borough = case_when(
    ZCTA5CE10 %in% 10001:10282 ~ "Manhattan",
    
    ZCTA5CE10 %in% 10301:10314 ~ "Staten Island",
    
    ZCTA5CE10 %in% 10451:10475 ~ "Bronx",
    
    ZCTA5CE10 %in% c(11004:11109, 11351:11697) ~ "Queens",
    
    ZCTA5CE10 %in% 11201:11256 ~ "Brooklyn",
    
    T ~ "Not NYC"
  )) %>%
  filter(borough != "Not NYC")

head(nyc, 4)
## Simple feature collection with 4 features and 10 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -73.99614 ymin: 40.75928 xmax: -73.94701 ymax: 40.78104
## Geodetic CRS:  NAD83
## # A tibble: 4 × 11
##   ZCTA5CE10 GEOID10 CLASSFP10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 
##   <chr>     <chr>   <chr>     <chr>   <chr>        <dbl>    <dbl> <chr>      
## 1 10065     10065   B5        G6350   S           984654        0 +40.7646284
## 2 10069     10069   B5        G6350   S           249050        0 +40.7759598
## 3 10075     10075   B5        G6350   S           477137        0 +40.7733630
## 4 10103     10103   B5        G6350   S            24776        0 +40.7607797
## # ℹ 3 more variables: INTPTLON10 <chr>, geometry <MULTIPOLYGON [°]>,
## #   borough <chr>

Quick map!

ggplot(data = nyc) +
  geom_sf(aes(geometry = geometry), fill = "white")

Can also filter, just like a tibble!

nyc %>%
  filter(borough != "Bronx") %>%
  ggplot() +
    geom_sf(aes(geometry = geometry))