background-image: url("img/logo_padded.001.jpeg") background-position: left background-size: 60% class: middle, center, .pull-right[ <br> ## .base_color[Data of Different Types:] ## .base_color[Spatial Data] <br> #### .navy[Kelly McConville] #### .navy[ Stat 108 | Week 5 | Spring 2023] ] --- ## Announcements * P-Set 3 is due NEXT Wednesday. * Please fill out [this form](https://forms.gle/w7Z4izaxburr9jur6) to help us create groups for Project 1. ************************ ## Week's Goals .pull-left[ **Mon Lecture** * Cancelled due to university holiday ] .pull-right[ **Wed Lecture** * More data types + Spatial data ] --- ### Spatial Data * Data that is linked to the physical world. -- * **Example**: Crash data in Cambridge! -- * Visualizing these data on a map (or map-like graph) is a great way to add useful context! -- Focus on creating the following types of maps: .pull-left[ * Choropleth maps * Cartograms * Hexbin maps (cartogram heatmaps) ] .pull-right[ * Raster maps * Interactive maps ] -- **Disclaimer**: * I know we are in COVID-19 overload and so I have intentionally minimized the number of COVID-19 related examples. * But, people have made a LOT of COVID maps so we will look at some of them today. --- ### Static, Raster Maps <img src="img/static_map.png" width="80%" style="display: block; margin: auto;" /> * Source: [JHU](https://coronavirus.jhu.edu/map.html) --- ### Choropleth Maps <img src="img/vaccine_rates.png" width="50%" style="display: block; margin: auto;" /> * Source: [The Guardian](https://www.theguardian.com/us-news/2021/feb/18/us-vaccine-distribution-tracker-by-state) --- ### Choropleth Maps <img src="img/choroplethCDC.png" width="80%" style="display: block; margin: auto;" /> * Source: [CDC](https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html) --- ### Choropleth Maps <img src="img/choropleth1.png" width="70%" style="display: block; margin: auto;" /> * Source: [NYtimes.com](https://www.nytimes.com/interactive/2020/04/02/us/coronavirus-social-distancing.html) --- ### Choropleth Maps <img src="img/choropleth2.png" width="70%" style="display: block; margin: auto;" /> * Source: [NYtimes.com](https://www.nytimes.com/interactive/2020/04/02/us/coronavirus-social-distancing.html) --- ### Cartogram <img src="img/Population-cartogram_World.png" width="90%" style="display: block; margin: auto;" /> * Source: [Our World in Data](https://ourworldindata.org/world-population-cartogram) --- ### "Hex"bin Maps <img src="img/voting_restrictions.png" width="60%" style="display: block; margin: auto;" /> * Source: [FiveThirtyEight](https://projects.fivethirtyeight.com/voting-restrictions-by-state/) --- ### Interactive Maps <img src="img/climate_change.png" width="70%" style="display: block; margin: auto;" /> * Source: [NYTimes](https://www.nytimes.com/2020/10/15/learning/whats-going-on-in-this-graph-climate-threats.html) --- ## Spatial Data Structures * Often stored in special data structures + Ex: Shapefile(s) + Consist of geometric objects like points, lines, and polygons -- * Visualizing Spatial Data + Need to draw the map + Add metadata on top of the map + Color in regions + Dots --- ### Polygon Maps .pull-left[ * Want to draw various boundaries + EXs: Countries, states, counties, etc * polygon = closed area including the boundaries making up the area ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/unnamed-chunk-9-1.png" width="576" style="display: block; margin: auto;" /> ] --- ### Polygon Maps * Let's get a polygon map from the `maps` package (that loads with `ggplot2`) ```r ma_counties <- map_data("county", "massachusetts") glimpse(ma_counties) ``` ``` ## Rows: 744 ## Columns: 6 ## $ long <dbl> -70.67435, -70.53683, -70.53683, -70.51392, -70.47954, -70.4… ## $ lat <dbl> 41.73997, 41.79727, 41.79727, 41.78008, 41.75716, 41.73425, … ## $ group <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, … ## $ order <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1… ## $ region <chr> "massachusetts", "massachusetts", "massachusetts", "massachu… ## $ subregion <chr> "barnstable", "barnstable", "barnstable", "barnstable", "bar… ``` --- ### Polygon Maps * `lat`: latitude of a corner of the polygon * `long`: longitude of a corner of the polygon * `group`: county ```r glimpse(ma_counties) ``` ``` ## Rows: 744 ## Columns: 6 ## $ long <dbl> -70.67435, -70.53683, -70.53683, -70.51392, -70.47954, -70.4… ## $ lat <dbl> 41.73997, 41.79727, 41.79727, 41.78008, 41.75716, 41.73425, … ## $ group <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, … ## $ order <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1… ## $ region <chr> "massachusetts", "massachusetts", "massachusetts", "massachu… ## $ subregion <chr> "barnstable", "barnstable", "barnstable", "barnstable", "bar… ``` --- ### Polygon Maps .pull-left[ ```r ggplot(data = ma_counties, mapping = aes(x = long, y = lat, color = factor(group))) + geom_point(show.legend = FALSE) + coord_quickmap() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Polygon Maps .pull-left[ ```r ggplot(data = ma_counties, mapping = aes(x = long, y = lat, group = group)) + geom_polygon(fill = "white", color = "olivedrab") + coord_quickmap() + theme_void() ``` * Importance of `coord_quickmap()`? ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass2-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Polygon Maps .pull-left[ ```r ggplot(data = ma_counties, mapping = aes(x = long, y = lat, group = group)) + geom_polygon(fill = "white", color = "olivedrab") + # coord_quickmap() + theme_void() ``` * Importance of `coord_quickmap()`? ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass3-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Simple Features Maps * Typically map data doesn't come in the lat-long format given in the `maps` package. -- * Often use the "simple features" standard produced by the Open Geospatial Consortium + R package: `sf` handles this type of data + Within `ggplot2`: `geom_sf()` and `coord_sf()` work with `sf` --- ### [`tidycensus`](https://walkerke.github.io/tidycensus/articles/spatial-data.html) * Data from the US Census or American Community Survey + Comes in simple features format (using `tigris`) + Need to obtain an [API key](https://api.census.gov/data/key_signup.html) ```r api_key <- "insert key" ``` ```r library(tidycensus) options(tigris_use_cache = TRUE) ma <- get_acs(state = "MA", geography = "county", variables = "B19013_001", geometry = TRUE, key = api_key) ``` --- ### [`tidycensus`](https://walkerke.github.io/tidycensus/articles/spatial-data.html) * B19013_001 = Median Household Income ```r vars <- load_variables(2021, "acs5", cache = FALSE) vars ``` ``` ## # A tibble: 27,886 × 4 ## name label concept geography ## <chr> <chr> <chr> <chr> ## 1 B01001_001 Estimate!!Total: SEX BY AGE block group ## 2 B01001_002 Estimate!!Total:!!Male: SEX BY AGE block group ## 3 B01001_003 Estimate!!Total:!!Male:!!Under 5 years SEX BY AGE block group ## 4 B01001_004 Estimate!!Total:!!Male:!!5 to 9 years SEX BY AGE block group ## 5 B01001_005 Estimate!!Total:!!Male:!!10 to 14 years SEX BY AGE block group ## 6 B01001_006 Estimate!!Total:!!Male:!!15 to 17 years SEX BY AGE block group ## 7 B01001_007 Estimate!!Total:!!Male:!!18 and 19 years SEX BY AGE block group ## 8 B01001_008 Estimate!!Total:!!Male:!!20 years SEX BY AGE block group ## 9 B01001_009 Estimate!!Total:!!Male:!!21 years SEX BY AGE block group ## 10 B01001_010 Estimate!!Total:!!Male:!!22 to 24 years SEX BY AGE block group ## # … with 27,876 more rows ``` --- ### Simple Features Object * Row for each polygon * `geometry` contains the polygon object (borders of the county) ```r ma ``` ``` ## Simple feature collection with 14 features and 5 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: -73.50814 ymin: 41.23796 xmax: -69.92839 ymax: 42.88659 ## Geodetic CRS: NAD83 ## First 10 features: ## GEOID NAME variable estimate moe ## 1 25017 Middlesex County, Massachusetts B19013_001 111790 1032 ## 2 25005 Bristol County, Massachusetts B19013_001 74290 1371 ## 3 25015 Hampshire County, Massachusetts B19013_001 76959 2504 ## 4 25025 Suffolk County, Massachusetts B19013_001 80260 1586 ## 5 25023 Plymouth County, Massachusetts B19013_001 98190 1711 ## 6 25027 Worcester County, Massachusetts B19013_001 81660 987 ## 7 25009 Essex County, Massachusetts B19013_001 86684 1333 ## 8 25001 Barnstable County, Massachusetts B19013_001 82619 2539 ## 9 25013 Hampden County, Massachusetts B19013_001 61310 1065 ## 10 25021 Norfolk County, Massachusetts B19013_001 112089 1589 ## geometry ## 1 MULTIPOLYGON (((-71.89877 4... ## 2 MULTIPOLYGON (((-70.83595 4... ## 3 MULTIPOLYGON (((-73.06577 4... ## 4 MULTIPOLYGON (((-70.93091 4... ## 5 MULTIPOLYGON (((-70.88335 4... ## 6 MULTIPOLYGON (((-72.31363 4... ## 7 MULTIPOLYGON (((-70.58029 4... ## 8 MULTIPOLYGON (((-70.68698 4... ## 9 MULTIPOLYGON (((-73.07484 4... ## 10 MULTIPOLYGON (((-70.84466 4... ``` --- ### Polygon Map .pull-left[ ```r ggplot(data = ma, mapping = aes(geometry = geometry)) + geom_sf() + coord_sf() + theme_void() ``` * `geom_sf()`: creates map from the `geometry` column * `coord_sf()`: controls the projection ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass4-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Choropleth Map .pull-left[ ```r ggplot(data = ma, mapping = aes(geometry = geometry, fill = estimate)) + geom_sf() + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass5-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Choropleth Map .pull-left[ ```r ggplot(data = ma, mapping = aes(geometry = geometry, fill = estimate)) + geom_sf() + geom_sf_label(aes(label = NAME)) + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` * How should we modify the code if we don't want the labels to be colored differently? ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass6-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Choropleth Map .pull-left[ ```r ggplot(data = ma, mapping = aes(geometry = geometry)) + geom_sf(mapping = aes(fill = estimate)) + geom_sf_label(aes(label = NAME)) + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` * Fixes for crowding? ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass7-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Use the `stringr` Package ```r ma <- ma %>% mutate(NAME = str_replace_all(NAME, "County, Massachusetts", "")) ma ``` ``` ## Simple feature collection with 14 features and 5 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: -73.50814 ymin: 41.23796 xmax: -69.92839 ymax: 42.88659 ## Geodetic CRS: NAD83 ## First 10 features: ## GEOID NAME variable estimate moe geometry ## 1 25017 Middlesex B19013_001 111790 1032 MULTIPOLYGON (((-71.89877 4... ## 2 25005 Bristol B19013_001 74290 1371 MULTIPOLYGON (((-70.83595 4... ## 3 25015 Hampshire B19013_001 76959 2504 MULTIPOLYGON (((-73.06577 4... ## 4 25025 Suffolk B19013_001 80260 1586 MULTIPOLYGON (((-70.93091 4... ## 5 25023 Plymouth B19013_001 98190 1711 MULTIPOLYGON (((-70.88335 4... ## 6 25027 Worcester B19013_001 81660 987 MULTIPOLYGON (((-72.31363 4... ## 7 25009 Essex B19013_001 86684 1333 MULTIPOLYGON (((-70.58029 4... ## 8 25001 Barnstable B19013_001 82619 2539 MULTIPOLYGON (((-70.68698 4... ## 9 25013 Hampden B19013_001 61310 1065 MULTIPOLYGON (((-73.07484 4... ## 10 25021 Norfolk B19013_001 112089 1589 MULTIPOLYGON (((-70.84466 4... ``` --- ### Choropleth Map .pull-left[ ```r ggplot(data = ma, mapping = aes(geometry = geometry)) + geom_sf(mapping = aes(fill = estimate)) + geom_sf_label(aes(label = NAME)) + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` * Fixes for crowding? ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass8-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Choropleth Map .pull-left[ ```r #devtools::install_github("yutannihilation/ggsflabel") library(ggsflabel) ggplot(data = ma, mapping = aes(geometry = geometry)) + geom_sf(mapping = aes(fill = estimate)) + geom_sf_label_repel(aes(label = NAME), force = 40) + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass9-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Can Add Information from Another Dataset ```r ma_cities <- data_frame(city = c("Boston", "Worcester", "Springfield", "Cambridge"), lat = c(42.360, 42.270, 42.107, 42.378), long = c(-71.058, -71.810, -72.558, -71.122)) ma_cities ``` ``` ## # A tibble: 4 × 3 ## city lat long ## <chr> <dbl> <dbl> ## 1 Boston 42.4 -71.1 ## 2 Worcester 42.3 -71.8 ## 3 Springfield 42.1 -72.6 ## 4 Cambridge 42.4 -71.1 ``` --- .pull-left[ ```r library(ggrepel) ggplot() + geom_sf(data = ma, mapping = aes(geometry = geometry, fill = estimate)) + coord_sf() + geom_point(data = ma_cities, mapping = aes(x = long, y = lat), shape = 21, fill = "white", color = "black") + geom_label_repel(data = ma_cities, mapping = aes(x = long, y = lat, label = city)) + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass10-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Coordinate Reference System * So far on our p-sets, we plotted latitude and longitude on the Cartesian plane. * But the Earth is round (though not a perfect sphere). * **geodetic datum**: Set of assumptions about the shape of the Earth + North American Datum (NAD83) + World Geodetic System (WGS84) * Map projections: Convert from 3-D to 2-D + Area preserving + Shape preserving * Together give us a **coordinate reference system** --- ### Coordinate Reference System Census defaults to NAD 1983 (EPSG: 4269) for its coordinate reference system ```r ma ``` ``` ## Simple feature collection with 14 features and 5 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: -73.50814 ymin: 41.23796 xmax: -69.92839 ymax: 42.88659 ## Geodetic CRS: NAD83 ## First 10 features: ## GEOID NAME variable estimate moe geometry ## 1 25017 Middlesex B19013_001 111790 1032 MULTIPOLYGON (((-71.89877 4... ## 2 25005 Bristol B19013_001 74290 1371 MULTIPOLYGON (((-70.83595 4... ## 3 25015 Hampshire B19013_001 76959 2504 MULTIPOLYGON (((-73.06577 4... ## 4 25025 Suffolk B19013_001 80260 1586 MULTIPOLYGON (((-70.93091 4... ## 5 25023 Plymouth B19013_001 98190 1711 MULTIPOLYGON (((-70.88335 4... ## 6 25027 Worcester B19013_001 81660 987 MULTIPOLYGON (((-72.31363 4... ## 7 25009 Essex B19013_001 86684 1333 MULTIPOLYGON (((-70.58029 4... ## 8 25001 Barnstable B19013_001 82619 2539 MULTIPOLYGON (((-70.68698 4... ## 9 25013 Hampden B19013_001 61310 1065 MULTIPOLYGON (((-73.07484 4... ## 10 25021 Norfolk B19013_001 112089 1589 MULTIPOLYGON (((-70.84466 4... ``` --- ### Coordinate Reference System Census defaults to NAD 1983 (EPSG: 4269) for its coordinate reference system ```r library(sf) st_crs(ma) ``` ``` ## Coordinate Reference System: ## User input: NAD83 ## wkt: ## GEOGCRS["NAD83", ## DATUM["North American Datum 1983", ## ELLIPSOID["GRS 1980",6378137,298.257222101, ## LENGTHUNIT["metre",1]]], ## PRIMEM["Greenwich",0, ## ANGLEUNIT["degree",0.0174532925199433]], ## CS[ellipsoidal,2], ## AXIS["latitude",north, ## ORDER[1], ## ANGLEUNIT["degree",0.0174532925199433]], ## AXIS["longitude",east, ## ORDER[2], ## ANGLEUNIT["degree",0.0174532925199433]], ## ID["EPSG",4269]] ``` --- ### Cartograms * Projection 26986 = Massachusetts Mainland + Preserves angles but not areas ```r library(cartogram) ma_trans <- sf::st_transform(ma, 26986) ma_trans ``` ``` ## Simple feature collection with 14 features and 5 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: 33869.7 ymin: 777514.6 xmax: 330794.3 ymax: 959715.9 ## Projected CRS: NAD83 / Massachusetts Mainland ## First 10 features: ## GEOID NAME variable estimate moe geometry ## 1 25017 Middlesex B19013_001 111790 1032 MULTIPOLYGON (((167332.1 94... ## 2 25005 Bristol B19013_001 74290 1371 MULTIPOLYGON (((255358.7 81... ## 3 25015 Hampshire B19013_001 76959 2504 MULTIPOLYGON (((71079.03 90... ## 4 25025 Suffolk B19013_001 80260 1586 MULTIPOLYGON (((246909.5 89... ## 5 25023 Plymouth B19013_001 98190 1711 MULTIPOLYGON (((250814.4 89... ## 6 25027 Worcester B19013_001 81660 987 MULTIPOLYGON (((133013 9054... ## 7 25009 Essex B19013_001 86684 1333 MULTIPOLYGON (((275433.6 93... ## 8 25001 Barnstable B19013_001 82619 2539 MULTIPOLYGON (((267854.4 80... ## 9 25013 Hampden B19013_001 61310 1065 MULTIPOLYGON (((69751.8 874... ## 10 25021 Norfolk B19013_001 112089 1589 MULTIPOLYGON (((254086.2 88... ``` ```r ma_trans <- cartogram_cont(ma_trans, weight = "estimate") ``` --- ### Cartograms .pull-left[ ```r ggplot() + geom_sf(data = ma_trans, mapping = aes(geometry = geometry, fill = estimate)) + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass11-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Cartograms * Often scaled by a population size type variable. ```r ma_pop <- get_decennial(state = "MA", geography = "county", variables = "P001001", year = 2010, api_key = api_key) ma_trans_pop <- inner_join(ma_trans, ma_pop, by = c("GEOID" = "GEOID")) ma_trans_pop ``` ``` ## Simple feature collection with 14 features and 8 fields ## Geometry type: GEOMETRY ## Dimension: XY ## Bounding box: xmin: 42627.83 ymin: 759913.3 xmax: 344391.7 ymax: 962563.7 ## Projected CRS: NAD83 / Massachusetts Mainland ## First 10 features: ## GEOID NAME.x variable.x estimate moe NAME.y ## 1 25017 Middlesex B19013_001 111790 1032 Middlesex County, Massachusetts ## 2 25005 Bristol B19013_001 74290 1371 Bristol County, Massachusetts ## 3 25015 Hampshire B19013_001 76959 2504 Hampshire County, Massachusetts ## 4 25025 Suffolk B19013_001 80260 1586 Suffolk County, Massachusetts ## 5 25023 Plymouth B19013_001 98190 1711 Plymouth County, Massachusetts ## 6 25027 Worcester B19013_001 81660 987 Worcester County, Massachusetts ## 7 25009 Essex B19013_001 86684 1333 Essex County, Massachusetts ## 8 25001 Barnstable B19013_001 82619 2539 Barnstable County, Massachusetts ## 9 25013 Hampden B19013_001 61310 1065 Hampden County, Massachusetts ## 10 25021 Norfolk B19013_001 112089 1589 Norfolk County, Massachusetts ## variable.y value geometry ## 1 P001001 1503085 POLYGON ((156148.2 928221.9... ## 2 P001001 548285 MULTIPOLYGON (((242645.3 82... ## 3 P001001 158080 POLYGON ((70011.47 905370.8... ## 4 P001001 722023 MULTIPOLYGON (((249226.6 90... ## 5 P001001 494919 MULTIPOLYGON (((228945 8660... ## 6 P001001 798552 POLYGON ((133785.2 905802.7... ## 7 P001001 743159 MULTIPOLYGON (((271933.1 93... ## 8 P001001 215888 POLYGON ((255986 819917.7, ... ## 9 P001001 463490 POLYGON ((70068.5 888831.2,... ## 10 P001001 670850 MULTIPOLYGON (((253159.7 88... ``` --- ### Cartograms * Often scaled by a population size type variable. ```r ma_trans_pop <- ma_trans_pop %>% rename(population_size = value, median_income = estimate) ma_trans_pop <- cartogram_cont(ma_trans_pop, weight = "population_size") ``` --- ### Cartograms .pull-left[ ```r ggplot() + geom_sf(data = ma_trans_pop, mapping = aes(geometry = geometry, fill = median_income)) + coord_sf() + scale_fill_viridis_c( name = "Median \nHousehold \nIncome", direction = -1) + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass11b-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Cartograms .pull-left[ <img src="stat108_wk05wed_files/figure-html/mass5-1.png" width="768" style="display: block; margin: auto;" /> ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/mass11b-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### New Dataset * State Level Information ```r library(Lock5Data) glimpse(USStates) ``` ``` ## Rows: 50 ## Columns: 22 ## $ State <fct> Alabama, Alaska, Arizona, Arkansas, California, Color… ## $ HouseholdIncome <dbl> 46.472, 76.114, 53.510, 43.813, 67.169, 65.458, 73.78… ## $ Region <fct> S, W, W, S, W, W, NE, NE, S, S, W, W, MW, MW, MW, MW,… ## $ Population <dbl> 4.875, 0.740, 7.016, 3.004, 39.537, 5.607, 3.588, 0.9… ## $ EighthGradeMath <dbl> 268.7, 274.3, 279.9, 274.4, 275.6, 284.7, 286.2, 276.… ## $ HighSchool <dbl> 87.1, 92.8, 87.1, 89.1, 87.4, 91.5, 92.4, 89.2, 89.4,… ## $ College <dbl> 26.0, 26.5, 27.4, 24.7, 34.5, 39.6, 42.7, 33.4, 28.8,… ## $ IQ <dbl> 95.7, 99.0, 97.4, 97.5, 95.5, 101.6, 103.1, 100.4, 98… ## $ GSP <dbl> 40.279, 70.936, 43.096, 38.467, 67.698, 59.057, 67.78… ## $ Vegetables <dbl> 80.7, 81.0, 79.2, 80.7, 78.6, 82.6, 83.1, 82.8, 80.6,… ## $ Fruit <dbl> 55.1, 63.1, 62.8, 55.3, 67.5, 67.0, 68.5, 64.6, 65.6,… ## $ Smokers <dbl> 20.9, 21.0, 15.6, 22.3, 11.3, 14.6, 12.7, 17.0, 16.1,… ## $ PhysicalActivity <dbl> 42.8, 58.3, 52.7, 45.4, 57.5, 58.7, 51.9, 46.3, 49.5,… ## $ Obese <dbl> 36.2, 29.5, 29.5, 37.1, 25.8, 23.0, 27.4, 33.5, 30.7,… ## $ NonWhite <dbl> 31.6, 34.7, 22.5, 22.7, 39.4, 15.8, 23.3, 30.9, 24.3,… ## $ HeavyDrinkers <dbl> 5.45, 7.33, 5.57, 5.32, 5.95, 7.30, 6.45, 6.40, 7.61,… ## $ Electoral <int> 9, 3, 11, 6, 55, 9, 7, 3, 29, 16, 4, 4, 20, 11, 6, 6,… ## $ ClintonVote <dbl> 34.36, 36.55, 45.13, 33.65, 61.73, 48.16, 54.57, 53.0… ## $ Elect2016 <fct> R, R, R, R, D, D, D, D, R, R, D, R, D, R, R, R, R, R,… ## $ TwoParents <dbl> 60.9, 71.5, 62.7, 63.3, 66.8, 71.9, 66.5, 62.6, 61.0,… ## $ StudentSpending <dbl> 9.236, 17.510, 7.613, 9.846, 11.495, 9.575, 18.958, 1… ## $ Insured <dbl> 83.7, 80.2, 83.4, 84.1, 85.2, 87.2, 91.1, 90.7, 78.2,… ``` --- ### Hexbin Maps ```r #install.packages("statebins") library(statebins) statebins(state_data = USStates, state_col = "State", value_col = "Insured", ggplot2_scale_function = viridis::scale_fill_viridis, direction = -1) + theme_statebins() ``` ``` ## Error in nchar(state_data[, state_col]): 'nchar()' requires a character vector ``` -- ```r class(USStates$State) ``` ``` ## [1] "factor" ``` ```r USStates <- mutate(USStates, State = as.character(State)) ``` --- ### Hexbin Maps .pull-left[ ```r statebins(state_data = USStates, state_col = "State", value_col = "Insured", ggplot2_scale_function = viridis::scale_fill_viridis, direction = -1, round = TRUE) + theme_statebins() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/states-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Returning to the Volcanoes ```r Eruptions <- read_csv("https://raw.githubusercontent.com/harvard-stat108s23/materials/main/psets/data/GVP_Eruption_Results.csv") select(Eruptions, VolcanoName, StartYear, Latitude, Longitude) %>% glimpse() ``` ``` ## Rows: 11,019 ## Columns: 4 ## $ VolcanoName <chr> "Bulusan", "Alaid", "Turrialba", "San Miguel", "Chill\xe0n… ## $ StartYear <dbl> 2016, 2016, 2016, 2016, 2016, 2016, 2015, 2015, 2015, 2015… ## $ Latitude <dbl> 12.770, 50.861, 10.025, 13.434, -36.863, 1.112, 11.984, 37… ## $ Longitude <dbl> 124.050, 155.565, -83.767, -88.269, -71.377, 124.737, -86.… ``` ```r world <- map_data("world") ``` --- ### Static Maps .pull-left[ ```r ggplot(data = world, mapping = aes(x = long, y = lat, group = group)) + geom_polygon(fill = "azure", color = "olivedrab4") + coord_sf() + geom_point(data = Eruptions, mapping = aes(x = Longitude, y = Latitude), inherit.aes = FALSE, color = "orange3") + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/volcano-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Raster Maps .pull-left[ * Want a specific base layer map: `ggmap` package ```r library(ggmap) aleutian_box <- c(bottom = 50.977, left = -172.164, top = 59.5617, right = -157.1507) aleutian <- get_stamenmap(aleutian_box, maptype = "terrain-background", zoom = 5) aleutian %>% ggmap() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/volcano2-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Raster Maps * Can take a long time to download so not a bad idea to save: ```r save(aleutian, file = "aleutian.RData") ``` * And then load it when you need to: ```r load("aleutian.RData") ``` --- ### Raster Maps .pull-left[ ```r aleutian %>% ggmap() + geom_point(data = Eruptions, mapping = aes(x = Longitude, y = Latitude), inherit.aes = FALSE, color = "orange3") + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/volcano3-1.png" width="768" style="display: block; margin: auto;" /> ] --- ## Raster Maps ```r eruption_count <- count(Eruptions, VolcanoName, Latitude, Longitude) %>% filter(Longitude > -172.164, Longitude < -157.1507, Latitude > 50.977, Latitude < 59.5617) %>% arrange(desc(n)) eruption_count ``` ``` ## # A tibble: 24 × 4 ## VolcanoName Latitude Longitude n ## <chr> <dbl> <dbl> <int> ## 1 Shishaldin 54.8 -164. 49 ## 2 Akutan 54.1 -166. 47 ## 3 Pavlof 55.4 -162. 46 ## 4 Cleveland 52.8 -170. 29 ## 5 Makushin 53.9 -167. 28 ## 6 Veniaminof 56.2 -159. 24 ## 7 Okmok 53.4 -168. 19 ## 8 Aniakchak 56.9 -158. 15 ## 9 Bogoslof 53.9 -168. 11 ## 10 Amukta 52.5 -171. 8 ## # … with 14 more rows ``` --- ## Raster Maps .pull-left[ ```r aleutian %>% ggmap() + geom_point(data = eruption_count, mapping = aes(x = Longitude, y = Latitude, size = n), inherit.aes = FALSE, color = "orange3", alpha = 0.5) + theme_void() ``` ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/volcano4-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Expectations of Today's Viewer .pull-left[ * But what about dynamic zooming? * What about clickable labels? ] .pull-right[ <img src="stat108_wk05wed_files/figure-html/volcano4-1.png" width="768" style="display: block; margin: auto;" /> ] --- ### Interactive Maps with `leaflet` .pull-left[ * Syntax similar to `ggplot2` and `dplyr` * Interactive zooming * Embed interactive map into RMarkdown documents (HTML output) ] .pull-right[
] --- ### Interactive Maps with `leaflet` .pull-left[ ```r library(leaflet) leaflet() %>% addTiles() %>% addCircleMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, radius = ~sqrt(n), stroke = FALSE, fillOpacity = 0.9) ``` ] .pull-right[ * First Layer: `leaflet()` + Creates the map widget * Second layer: `addTiles()` + Creates a base map using [OpenStreetMap](https://www.openstreetmap.org) * Additional layers: `addMarkers()`, `addPolygons()` ] --- ### Interactive Maps with `leaflet` .pull-left[ ```r library(leaflet) leaflet() %>% addTiles() %>% addCircleMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, radius = ~sqrt(n), stroke = FALSE, fillOpacity = 0.9) ``` ] .pull-right[
] --- ### Clusters .pull-left[ ```r leaflet() %>% setView(lng = -160, lat = 55, zoom = 6) %>% addTiles() %>% addCircleMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, clusterOptions = markerClusterOptions()) ``` ] .pull-right[
] --- ### Zooming Constraints .pull-left[ ```r leaflet(options = leafletOptions(minZoom = 3, maxZoom = 7)) %>% addTiles() %>% addCircleMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, radius = ~sqrt(n)) ``` ] .pull-right[
] --- ### Other Markers .pull-left[ ```r leaflet() %>% setView(lng = -160, lat = 55, zoom = 4) %>% addTiles() %>% addMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, label = ~as.character(n)) ``` ] .pull-right[
] --- ### Custom Icons ```r volcano <- makeIcon( iconUrl = "https://raw.githubusercontent.com/harvard-stat108s23/materials/main/img/volcano.png", iconWidth = 20, iconHeight = 20) volcano ``` ``` ## $iconUrl ## [1] "https://raw.githubusercontent.com/harvard-stat108s23/materials/main/img/volcano.png" ## ## $iconWidth ## [1] 20 ## ## $iconHeight ## [1] 20 ## ## attr(,"class") ## [1] "leaflet_icon" ``` --- ### Custom Icons .pull-left[ ```r leaflet() %>% setView(lng = -160, lat = 55, zoom = 4) %>% addTiles() %>% addMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, label = ~as.character(n), icon = volcano) ``` ] .pull-right[
] --- ### Pop-Ups ```r content <- paste("<b>", eruption_count$VolcanoName, "</b></br>", "Number of eruptions:", eruption_count$n) content ``` ``` ## [1] "<b> Shishaldin </b></br> Number of eruptions: 49" ## [2] "<b> Akutan </b></br> Number of eruptions: 47" ## [3] "<b> Pavlof </b></br> Number of eruptions: 46" ## [4] "<b> Cleveland </b></br> Number of eruptions: 29" ## [5] "<b> Makushin </b></br> Number of eruptions: 28" ## [6] "<b> Veniaminof </b></br> Number of eruptions: 24" ## [7] "<b> Okmok </b></br> Number of eruptions: 19" ## [8] "<b> Aniakchak </b></br> Number of eruptions: 15" ## [9] "<b> Bogoslof </b></br> Number of eruptions: 11" ## [10] "<b> Amukta </b></br> Number of eruptions: 8" ## [11] "<b> Westdahl </b></br> Number of eruptions: 8" ## [12] "<b> Vsevidof </b></br> Number of eruptions: 7" ## [13] "<b> Fisher </b></br> Number of eruptions: 6" ## [14] "<b> Yunaska </b></br> Number of eruptions: 6" ## [15] "<b> Isanotski </b></br> Number of eruptions: 5" ## [16] "<b> Carlisle </b></br> Number of eruptions: 4" ## [17] "<b> Amak </b></br> Number of eruptions: 3" ## [18] "<b> St. Paul Island </b></br> Number of eruptions: 2" ## [19] "<b> Black Peak </b></br> Number of eruptions: 1" ## [20] "<b> Dana </b></br> Number of eruptions: 1" ## [21] "<b> Kagamil </b></br> Number of eruptions: 1" ## [22] "<b> Kupreanof </b></br> Number of eruptions: 1" ## [23] "<b> Roundtop </b></br> Number of eruptions: 1" ## [24] "<b> Yantarni </b></br> Number of eruptions: 1" ``` --- ### Pop-Ups .pull-left[ ```r leaflet() %>% setView(lng = -160, lat = 55, zoom = 4) %>% addTiles() %>% addMarkers(lng = ~Longitude, lat = ~Latitude, data = eruption_count, popup = content, icon = volcano) ``` ] .pull-right[
] --- ### Other Tiles .pull-left[ ```r library(leaflet.extras) leaflet() %>% setView(lng = -71.1167, lat = 42.3770, zoom = 15) %>% addProviderTiles(providers$Stamen.Watercolor) ``` ] .pull-right[
] --- ### Other Tiles .pull-left[ ```r leaflet() %>% setView(lng = -71.1167, lat = 42.3770, zoom = 15) %>% addProviderTiles( providers$CartoDB.Positron) %>% addMiniMap() ``` ] .pull-right[
] --- ### Other Tiles .pull-left[ ```r leaflet() %>% setView(lng = -98.58, lat = 39.82, zoom = 3) %>% addProviderTiles( "NASAGIBS.ViirsEarthAtNight2012") ``` ] .pull-right[
] --- ### Interactive Choropleth Map ```r pal <- colorNumeric(palette = "viridis", domain = ma$estimate, reverse = TRUE) pal(ma$estimate) ``` ``` ## [1] "#482071" "#65CB5E" "#4CC26C" "#32B67A" "#31688E" "#29AF7F" "#1F9A8A" ## [8] "#25AC82" "#FDE725" "#481E70" "#D3E21B" "#48C16E" "#440154" "#E8E419" ``` --- ### Interactive Choropleth Map .pull-left[ ```r ma %>% leaflet() %>% addTiles() %>% addPolygons(popup = ~NAME, color = ~pal(estimate), stroke = FALSE, fillOpacity = 0.9) ``` * Throws a warning about the projection ] .pull-right[
] --- ### Interactive Choropleth Map .pull-left[ ```r ma %>% sf::st_transform(crs = "EPSG:4326") %>% leaflet() %>% addTiles() %>% addPolygons(popup = ~NAME, color = ~pal(estimate), stroke = FALSE, fillOpacity = 0.9) ``` ] .pull-right[
] --- ### Interactive Choropleth Map .pull-left[ ```r ma %>% sf::st_transform(crs = "EPSG:4326") %>% leaflet() %>% addTiles() %>% addPolygons(popup = ~NAME, color = ~pal(estimate), stroke = FALSE, fillOpacity = 0.9) %>% addLegend("bottomright", pal = pal, values = ~estimate, title = "Median Income", opacity = 1) ``` ] .pull-right[
] --- ### Want more user interactivity? -- We are getting there! Have to learn `shiny`. -- ### Want to learn more about spatial data? * Dig into wrangling spatial data with the `sf` package: [https://github.com/rstudio/cheatsheets/blob/master/sf.pdf](https://github.com/rstudio/cheatsheets/blob/master/sf.pdf) * All things spatial data in R: [https://geocompr.robinlovelace.net/index.html](https://geocompr.robinlovelace.net/index.html)