background-image: url("img/logo_padded.001.jpeg")
background-position: left
background-size: 60%
class: middle, center,

.pull-right[

<br>

## .base_color[Git/GitHub]
## .base_color[and]
## .base_color[Basic Interactivity with `plotly`]

#### .navy[Kelly McConville]

#### .navy[ Stat 108 | Week 3 | Spring 2023]

]

---

## Announcements

* All P-Sets are now due at **10pm** instead of **5pm** on Wednesdays so that you have time to finish up/process the help given during the Wednesday afternoon OHs.
    + **Important note**: The teaching team will NOT be actively watching the Slack #q-and-a channel from 5-10pm.

************************

## Week 3 Goals

.pull-left[

**Mon Lecture**

* Animation with `gganimate`

* Data wrangling with `dplyr`

]

.pull-right[

**Wed Lecture**

* GitHub/git

* RStudio Projects

* Basic interactivity with `plotly`
    + More advanced interactivity to come later on.

]

---

class: middle, center

## But First: A Few Student Questions

---

## How do I save an animation?

```r
library(gganimate)
anim_save("p_aleutian_an.gif", p_aleutian_an)
```

---

## Inheriting `aes` from `ggplot()`

.pull-left[

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_point() +
  geom_point(data = holidays,
             size = 3, color = "black")
```

* What `aes`thetics did the second `geom_point()` inherit? What didn't it inherit?
]

.pull-right[

<img src="stat108_wk03wed_files/figure-html/aesPlot-1.png" width="768" style="display: block; margin: auto;" />

]

---

```r
glimpse(Births2015)
```

```
## Rows: 365
## Columns: 8
## $ date         <date> 2015-01-01, 2015-01-02, 2015-01-03, 2015-01-04, 2015-01-…
## $ births       <dbl> 8068, 10850, 8328, 7065, 11892, 12425, 12141, 12094, 1186…
## $ wday         <ord> Thu, Fri, Sat, Sun, Mon, Tue, Wed, Thu, Fri, Sat, Sun, Mo…
## $ year         <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 201…
## $ month        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ day_of_year  <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17…
## $ day_of_month <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17…
## $ day_of_week  <dbl> 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, …
```

```r
glimpse(holidays)
```

```
## Rows: 7
## Columns: 9
## $ date         <date> 2015-01-01, 2015-05-25, 2015-07-04, 2015-12-25, 2015-11-…
## $ occasion     <chr> "New Year", "Memorial Day", "Independence Day", "Christma…
## $ births       <dbl> 8068, 7746, 7944, 6515, 7332, 8714, 8127
## $ wday         <ord> Thu, Mon, Sat, Fri, Thu, Thu, Mon
## $ year         <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015
## $ month        <dbl> 1, 5, 7, 12, 11, 12, 9
## $ day_of_year  <int> 1, 145, 185, 359, 330, 358, 250
## $ day_of_month <dbl> 1, 25, 4, 25, 26, 24, 7
## $ day_of_week  <dbl> 5, 2, 7, 6, 5, 5, 2
```

---

## Inheriting `aes` from `ggplot()`

* What `aes`thetics did the second `geom_point()` inherit? What didn't it inherit?

```r
holidays <- rename(holidays, Dates = date)

ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_point() +
  geom_point(data = holidays,
             size = 3, color = "black")
```

```
## Error in `geom_point()`:
## ! Problem while computing aesthetics.
## ℹ Error occurred in the 2nd layer.
## Caused by error in `compute_aesthetics()`:
## ! Aesthetics are not valid data columns.
## ✖ The following aesthetics are invalid:
## ✖ `x = date`
## ℹ Did you mistype the name of a data column or forget to add `after_stat()`?
```

---

## Inheriting `aes` from `ggplot()`

.pull-left[

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_point() +
  geom_point(data = holidays,
             size = 3, color = "black",
             mapping = aes(x = Dates))
```

* What `aes`thetics did the second `geom_point()` inherit? What didn't it inherit?

]

.pull-right[

<img src="stat108_wk03wed_files/figure-html/aesPlot2-1.png" width="768" style="display: block; margin: auto;" />

]

---

## Inheriting `aes` from `ggplot()`

.pull-left[

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_point() +
  geom_point(data = holidays,
             size = 3, color = "black",
             mapping = aes(x = Dates),
             inherit.aes = FALSE)
```

```
## Error in `geom_point()`:
## ! Problem while setting up geom.
## ℹ Error occurred in the 2nd layer.
## Caused by error in `compute_geom_1()`:
## ! `geom_point()` requires the following missing aesthetics: y
```

* What `aes`thetics did the second `geom_point()` inherit? What didn't it inherit?

]

.pull-right[

]

---

## Inheriting `aes` from `ggplot()`

.pull-left[

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_point() +
  geom_point(data = holidays,
             size = 3, color = "black",
             mapping = aes(x = Dates, y = births),
             inherit.aes = FALSE)
```

* What `aes`thetics did the second `geom_point()` inherit? What didn't it inherit?
]

.pull-right[

<img src="stat108_wk03wed_files/figure-html/aesPlot4-1.png" width="768" style="display: block; margin: auto;" />

]

---

## Inheriting `aes` from `ggplot()`

.pull-left[

```r
library(lubridate)  # provides as_date()

# Add a box around Thanksgiving to Christmas
holidays_season <- data.frame(start = as_date("2015-11-26"),
                              end = as_date("2015-12-24"))

ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_rect(data = holidays_season,
            mapping = aes(xmin = start, xmax = end,
                          ymin = 6000, ymax = 14000)) +
  geom_point()
```

```
## Error in `geom_rect()`:
## ! Problem while computing aesthetics.
## ℹ Error occurred in the 1st layer.
## Caused by error in `FUN()`:
## ! object 'births' not found
```

* Problem: non-matching `aes` arguments

]

.pull-right[

]

---

## Inheriting `aes` from `ggplot()`

.pull-left[

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births,
                     color = wday)) +
  geom_rect(data = holidays_season,
            mapping = aes(xmin = start, xmax = end,
                          ymin = 6000, ymax = 14000),
            inherit.aes = FALSE) +
  geom_point()
```

* Problem: non-matching `aes` arguments

]

.pull-right[

<img src="stat108_wk03wed_files/figure-html/aesPlot6-1.png" width="768" style="display: block; margin: auto;" />

]

---

## Safe Play: Put the mappings in the individual `geoms`

.pull-left[

```r
ggplot(data = Births2015) +
  geom_rect(data = holidays_season,
            mapping = aes(xmin = start, xmax = end,
                          ymin = 6000, ymax = 14000),
            inherit.aes = FALSE) +
  geom_point(mapping = aes(x = date, y = births,
                           color = wday))
```

* Problem: non-matching `aes` arguments

]

.pull-right[

<img src="stat108_wk03wed_files/figure-html/aesPlot7-1.png" width="768" style="display: block; margin: auto;" />

]

---

## Simple Interactivity with `ggplotly`

* We can use the `plotly` package to convert a static `ggplot` to an interactive plot.
.pull-left[

```r
p <- ggplot(data = Births2015,
            mapping = aes(x = date, y = births,
                          color = wday)) +
  geom_point()

p
```

]

.pull-right[

<img src="stat108_wk03wed_files/figure-html/plotly1-1.png" width="768" style="display: block; margin: auto;" />

]

---

## Simple Interactivity with `ggplotly`

* We can use the `plotly` package to convert a static `ggplot` to an interactive plot.

.pull-left[

```r
library(plotly)

ggplotly(p)
```

* Conversion isn't always perfect.
    + May need to spend more time tweaking `theme()` beforehand.

* Can also create graphs with `plot_ly()`

]

.pull-right[
]

---

## Simple Interactivity with `ggplotly`

.pull-left[

```r
ggplotly(p, dynamicTicks = TRUE) %>%
  rangeslider() %>%
  layout(hovermode = "x")
```

* Add a slider and make tick marks dynamic with zooming

* Change the comparisons made when you hover

]

.pull-right[
]

---

## Simple Interactivity with `ggplotly`

.pull-left[

```r
p <- ggplot(data = holidays,
            mapping = aes(text = occasion)) +
  geom_point(data = Births2015,
             mapping = aes(x = date, y = births,
                           color = wday),
             inherit.aes = FALSE) +
  geom_point(data = holidays,
             size = 3, color = "black",
             mapping = aes(x = Dates, y = births))

ggplotly(p, tooltip = "text")
```

]

.pull-right[
]

---

class: middle, center

.pull-left[

## Shift from

<img src="img/STAT108Logo_Viz.png" width="40%" style="display: block; margin: auto;" />

]

--

.pull-right[

## To

<img src="img/STAT108Logo_Sharing.png" width="40%" style="display: block; margin: auto;" />

]

---

## R Projects

.pull-left[

* Where does your analysis live?
    + Working directory
        + Where `R` looks for files you ask it to load.
        + Where `R` puts files you ask it to save.

]

.pull-right[

<img src="img/console_wd.png" width="60%" style="display: block; margin: auto;" />

]

--

```r
getwd()
```

```
## [1] "/n/academic_homes/g117625/u428252g117625/stat108s23/slides"
```

--

* For a given project, your analyses should live in the folder where you store the files associated with the project.
    + In other words, **working directory = project folder**.

* Common default: Working directory = home directory

* Can change the working directory with `setwd()`, but instead we will use RStudio Projects.

---

## RStudio Projects

**RStudio Projects**: RStudio feature that helps you organize your work.

* We will create a Stat 108 project shortly.
    + Will store course-related R Markdown documents, script files, data, figures, etc. there.

--

* For RStudio Projects, the working directory is the home directory of the project.

--

* **Question**: My RStudio Project is `stat108s23`. Why doesn't the file path end at `stat108s23` when I run the following?

```r
getwd()
```

```
## [1] "/n/academic_homes/g117625/u428252g117625/stat108s23/slides"
```

---

## RStudio Projects

* To access the project, you can do either of the following:
    + Go to the upper right and select "Open Project".
    + Click on the `___.Rproj` file.

* Notice that when I open the project, all the files and command history are still there.
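---

## RStudio Projects

The idea that **working directory = project folder** can be tried out safely with a small sketch. Everything named here is an assumption for illustration only: the folder `stat108-demo` and the file `data/cars.csv` are made up, the project is built inside a temporary folder, and `setwd()` is used just to imitate what opening an RStudio Project does for you.

```r
# Hypothetical project folder, created inside a temporary directory so that
# nothing in your real home directory is touched.
project_dir <- file.path(tempdir(), "stat108-demo")
dir.create(file.path(project_dir, "data"),
           recursive = TRUE, showWarnings = FALSE)

# Save a small built-in data set into the project's data folder.
write.csv(head(mtcars),
          file.path(project_dir, "data", "cars.csv"),
          row.names = FALSE)

# Imitate opening the project: make the project folder the working
# directory and remember the previous one so it can be restored.
old_wd <- setwd(project_dir)

# Relative paths are resolved against the working directory.
relative_ok <- file.exists("data/cars.csv")
cars <- read.csv("data/cars.csv")

# Restore the original working directory.
setwd(old_wd)

relative_ok
dim(cars)
```

Because `"data/cars.csv"` never mentions a home directory, the same line should keep working if the whole project folder is moved or cloned to another computer.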
---

## RStudio Projects

* To access files in this project, I use relative paths:

```r
knitr::include_graphics("img/STAT108Logo_Sharing.png")
```

<img src="img/STAT108Logo_Sharing.png" width="20%" style="display: block; margin: auto;" />

---

## Projects and Workflow

* Create an RStudio project for each analysis project.
    + We will have three RStudio projects for this course:
        + An individual coursework-related RStudio project
        + An RStudio project for project 1
        + An RStudio project for project 2

--

* Keep data files that you need for the project in a `data` folder.

--

* Keep your `Rmd`s and script files there.

--

* Save outputs there.

--

* Use relative paths whenever possible.

---

background-image: url("img/octocat.001.jpeg")
background-position: left
background-size: 50%

.pull-right[

## git and GitHub

* **git**: Version control system
    + Think fancier type of *Track Changes*.

]

.pull-right[

* **GitHub**: Hosting service for git projects (which are called repositories)
    + Think fancier type of *Dropbox* or *Google Drive*.

* Useful resource when getting started: [https://happygitwithr.com/](https://happygitwithr.com/)

]

---

## Manual Version Control

.pull-left[

* But I already do version control...
    + draft.Rmd, draft2.Rmd, final.Rmd, realFinal.Rmd, REALLYVERYREALFinal.Rmd...

* Issues with this version of version control:
    + Hard to know how these files relate.
    + Hard to extend this approach to working with others.

]

.pull-right[

<img src="img/phd-final.png" width="60%" style="display: block; margin: auto;" />

]

---

## Git Real

* Git is a *decentralized* version control system.
    + Each collaborator has a complete version of the repo.
    + Everyone can work offline and simultaneously.
    + GitHub holds the master copy.

--

* git is not friendly and can be frustrating.
    + BUT, the version control and collaborative rewards are big!

--

* [GitHub.com](https://github.com) is a great place to develop an online presence.
    + In Stat 108, we will default to keeping the work private.
--

* If you end up with a mess of errors, don't worry: come see one of the instructors for help.
    + It happens to [everyone](https://xkcd.com/1597/).

---

## GitHub Repo = `RStudio` Project

* A **repo**, short for repository, is the folder that contains all of the files for the project on [GitHub.com](https://github.com).

* Under the **harvard-stat108s23** GitHub Organization you currently have 1 repo:
    + `work-username`: Just you and the Stat 108 teaching team can access it.

* For each repo, you should create an `RStudio` Project (with version control).
    + We will all do this together in a moment.

---

## Steps to Get Started

**Once-Per-Computer/Server:**

* Install git on your computer.
    + Already done on the Server.

* Get your local git talking to GitHub.

**Once-Per-Project:**

* Create a repo on GitHub.
    + I did this step for us.

* Create a version-controlled RStudio Project that is synced with the GitHub repo.

---

class: middle

## Let's try to set this up **twice**:

### First on the FAS OnDemand RStudio **Server**

### Second on **your own computer** (if you have a local installation of R/RStudio)

---

## First on the FAS OnDemand RStudio Server

* Sign in to GitHub.com.

* Sign in to the Stat 108 RStudio Server.

* Let's **git** your RStudio synced with your GitHub account and then make a change to your personal repo!
    + Make sure you accepted the repo request sent to your email.

---

## Check if git is installed on the RStudio Server

In **Terminal**, type

```bash
which git
```

```bash
git --version
```

* If you receive an answer, then git is already installed! 🎉

* If the reply is `git: command not found` (or similar), then
    + Windows: Visit [https://gitforwindows.org/](https://gitforwindows.org/)
    + Macs: In the Terminal type

```bash
xcode-select --install
```

* If you had to install git, restart your RStudio Session.

---

## Introduce Yourself to Git

* Install the package `usethis`.
* In your R console, modify and run the following code to introduce yourself to git:

```r
library(usethis)
use_git_config(user.name = "mcconvil",
               user.email = "kmcconville@fas.harvard.edu")
```

---

## Personal Access Token Time

* To interact with GitHub, you need to include credentials.
    + Personal Access Token (PAT)

* In your R console, run

```r
usethis::create_github_token()
```

* Should take you to GitHub.

* Select "repo", "user", and "workflow" for the scopes.

* Click "Generate token".

* Copy the PAT to your clipboard and store it by running:

```r
library(usethis)
edit_r_environ()
```

---

## Personal Access Token Time

* Add the following to the `.Renviron` file:

```r
GITHUB_PAT=PasteYourTokenHere
```

* Also run the following in the console and, when prompted for your password, provide your PAT:

```r
library(credentials)
credential_helper_set("store", global = TRUE)
git_credential_ask('https://github.com')
```

* Also store the PAT somewhere safe in case you need it again!

---

## Sync GitHub.com repo and an RStudio Project

**In your repo on GitHub.com**:

* Click on the green clone or download button.

* Copy the given url for "Clone with HTTPS".

**In RStudio**:

* In the upper left, go to File > New Project > Version Control.

* Select Git.

* Paste in the url. It should automatically give it a name. Select where you want the project to live in your home directory. Then click okay.

---

class: middle, center

## Let's now do this for our local installation of RStudio!

---

## Check if git is installed on your computer

In **Terminal**, type

```bash
which git
```

```bash
git --version
```

* If you receive an answer, then git is already installed! 🎉

* If the reply is `git: command not found` (or similar), then
    + Windows: Visit [https://gitforwindows.org/](https://gitforwindows.org/)
    + Macs: In the Terminal type

```bash
xcode-select --install
```

* If you had to install git, restart your RStudio Session.

---

## Introduce Yourself to Git

* Install the package `usethis`.
* In your R console, modify and run the following code to introduce yourself to git:

```r
library(usethis)
use_git_config(user.name = "mcconvil",
               user.email = "kmcconville@fas.harvard.edu")
```

---

## Personal Access Token Time

* Copy your PAT (that you stored in a safe place).
    + You can also generate a new one if you want.

* Store it by running:

```r
gitcreds::gitcreds_set()
```

---

## Sync GitHub.com repo and an RStudio Project

**In your repo on GitHub.com**:

* Click on the green clone or download button.

* Copy the given url for "Clone with HTTPS".

**In RStudio**:

* In the upper left, go to File > New Project > Version Control.

* Select Git.

* Paste in the url. It should automatically give it a name. Select where you want the project to live in your home directory. Then click okay.

---

class: middle, center

## Feel free to use the RStudio Server, your local RStudio, or both.

---

## Workflow

Once your GitHub repo and RStudio project are synced, here's your workflow:

* **Pull** the most recent version of the repo from GitHub to your RStudio project.

* Do some work on your project in RStudio.

* **Commit** that work.
    + Committing takes a snapshot of all the files in the project.
    + Look over the **Diff**, which shows what has changed since your last commit.
    + Include a quick note, the **Commit Message**, to summarize the motivation for the changes.

* **Push** your commit to GitHub from RStudio.

---

class: middle, center, inverse

## Workflow Demo

---

## Ignoring Files

* There are several files that we do **NOT** want to push to GitHub.

* These include:
    + `.gitignore`
    + `.DS_Store`

* Add these files to the `.gitignore`.

---

## Test the waters: Let's go through the workflow.

* Pull. (Yes, there is nothing to pull yet but it is always good practice to start here.)

* Click on the ReadMe.

* Add something to the ReadMe.

* Click on the git tab. Check the box next to the ReadMe.md. Hit commit.

* Put in a commit message. Look over the diff.

* Push.
**Look for updates in the ReadMe on GitHub.com.**

---

## Git Collaboration: Merge conflicts

* What if my collaborators and I both make changes?
    + Scenario: Your collaborator makes changes to a file, commits, and pushes to GitHub. You also modify that file, commit, and push.
    + Result: Your push will fail because there's a commit on GitHub that you don't have.
    + Usual Solution: Pull and *usually* git will merge their work nicely with yours. Then push. If that doesn't work, you have a **merge conflict**. Let's cross that bridge when we get there.

* How to avoid merge conflicts?
    + Always pull when you are going to work on your project.
    + Always commit and push when you are done, even if you made only small changes.

---

## Collaboration: Git Style

* **Projects**: Can be used to create to-do lists and stay organized.

* **Issues**: Useful method to communicate with your group members.

---

### Reminders

* All P-Sets are now due at **10pm** instead of **5pm** on Wednesdays so that you have time to finish up/process the help given during the Wednesday afternoon OHs.
    + **Important note**: The teaching team will NOT be actively watching the Slack #q-and-a channel from 5-10pm.