background-image: url("img/DAW.png")
background-position: left
background-size: 50%
class: middle, center

.pull-right[

## .base-blue[Decisions, Decisions]

<br>

<br>

### .purple[Kelly McConville]

#### .purple[ Stat 100 | Week 11 | Fall 2022]

]

---

### Goals for Today

.pull-left[

* Undergraduate research

* More hypothesis testing with `infer`

]

.pull-right[

* **Decisions** in a hypothesis test
  + Types of errors

* Statistical inference zoom out

]

---
class: middle, center

## Let's talk about Undergraduate Research.

--

#### *"an inquiry or investigation conducted by an undergraduate student that makes an original intellectual or creative contribution to the discipline."*

-- Council on Undergraduate Research

--

Let's look at some specific examples of undergraduate research in statistics.

---

### Example 1: Estimator Comparison

My team has developed a new estimator of a population parameter. How does it compare to the usual suspects?

.pull-left[

<img src="img/minions.png" width="85%" style="display: block; margin: auto;" />

]

--

.pull-right[

<img src="img/purple_minion.png" width="55%" style="display: block; margin: auto;" />

]

---

### [Example 1: Estimator Comparison](https://www.frontiersin.org/articles/10.3389/ffgc.2021.763414/full)

My team has developed a new estimator of **the population mean**. How does it compare to the usual suspects?

.pull-left[

<img src="img/minions.png" width="85%" style="display: block; margin: auto;" />

Usual suspects:

* Sample mean
* Post-stratified estimator
* Generalized regression estimator (GREG)

]

.pull-right[

<img src="img/purple_minion.png" width="55%" style="display: block; margin: auto;" />

New estimator:

* Generalized regression estimator over resolutions of Y (GREGORY)

]

---

### [Example 1: Estimator Comparison](https://www.frontiersin.org/articles/10.3389/ffgc.2021.763414/full)

My team has developed a new estimator of the population mean. How does it compare to the usual suspects?

.pull-left[

<img src="img/tree_amigos.png" width="85%" style="display: block; margin: auto;" />

]

.pull-right[

<img src="img/gregory.png" width="110%" style="display: block; margin: auto;" />

]

---

### [Example 2: Science Question Answered with Stat Modeling](https://www.mdpi.com/1999-4907/11/8/856)

Can we estimate the loss of forested lands in North Central Georgia?

.pull-left[

<img src="img/fia_data.gif" width="95%" style="display: block; margin: auto;" />

]

.pull-right[

<img src="img/net_FIA5yr_TS1yr_TS_5yr.jpg" width="90%" style="display: block; margin: auto;" />

]

---
class:
background-image: url("img/mases.001.png")
background-position: left
background-size: contain

.pull-right2[

### [Example 3: Software Creation](https://cran.r-project.org/web/packages/mase/index.html)

<img src="img/mase_code.png" width="120%" style="display: block; margin: auto;" />

]

---

### [Example 4: (Data) Science Communication](https://mjdvl.shinyapps.io/NCASI_APP/)

How can we help forest managers understand the climate risks for their forests?

<img src="img/climate_dash.png" width="60%" style="display: block; margin: auto;" />

---

## Debunking Myths about Research

**Myth**: *I should only consider engaging in undergraduate research if I want to go into academia.*

--

* Conducting undergraduate research has **lots** of benefits:
  + Clarity on career goals
  + Deeper exposure to field
  + Increased sense of belonging
  + Skill development (communication, problem-solving)

--

**Myth**: *I am behind if I didn't start doing undergraduate research early in my undergraduate career.*

--

* Pick the time that makes sense for you.

* Common moments: senior thesis, summer after sophomore or junior year

*********************************

--

* Happy to chat about undergraduate research during Stat 100 OHs or my general topic OHs on Weds 2:30-3:30pm!

* Check out [Mally Shan's article](https://www.hodp.org/project/a-data-driven-way-to-find-your-perfect-research-opportunity-on-campus) to learn about finding undergraduate research opportunities on campus.

---
class: middle, center

## Back to Hypothesis Testing

---

## Another Hypothesis Testing Example

Let's return to the Palmer penguins and ask if flipper length varies, on average, by the sex of the penguin.

--

`\(H_o: \mu_F - \mu_M = 0\)`

`\(H_a: \mu_F - \mu_M \neq 0\)`

--

Need a null distribution for the difference in sample means.

--

**Question**: If I shuffle (permute) the `sex` column and then compute the difference in sample means, what do you expect the difference in sample means to equal?

```
## # A tibble: 333 × 2
##    flipper_length_mm sex
##                <int> <fct>
##  1               181 male
##  2               186 female
##  3               195 female
##  4               193 female
##  5               190 male
##  6               181 female
##  7               195 male
##  8               182 female
##  9               191 male
## 10               198 male
## # … with 323 more rows
```

---

## Generating a Null Distribution

Let's return to the penguins and ask if flipper length varies, on average, by the sex of the penguin.

`\(H_o: \mu_F - \mu_M = 0\)`

`\(H_a: \mu_F - \mu_M \neq 0\)`

Need a null distribution for the difference in sample means.

Steps:

1. Permute/shuffle the `sex` column.
2. Compute the difference in sample means.
3. Repeat 1 and 2 many times.

Let's go back to the "hypothesisTesting.Rmd" document and see how to implement this process with `infer`.

---

### Hypothesis Testing: Decisions, Decisions

Once you get to the end of a hypothesis test, you make one of two decisions:

--

(1) P-value is small. → I have evidence for `\(H_a\)`. Reject `\(H_o\)`.

--

(2) P-value is not small. → I don't have evidence for `\(H_a\)`. Fail to reject `\(H_o\)`.

--

Sometimes we make the correct decision. Sometimes we make a mistake.

---

### Hypothesis Testing: Decisions, Decisions

Let's create a table of potential outcomes.

<br>

<br><br><br><br><br><br><br><br>

<br>

<br>

<br>

--

`\(\alpha\)` = probability of a Type I error **under repeated sampling** = probability of rejecting `\(H_o\)` when it is true.

--

`\(\beta\)` = probability of a Type II error **under repeated sampling** = probability of failing to reject `\(H_o\)` when `\(H_a\)` is true.

---

### Hypothesis Testing: Decisions, Decisions

Typically set the `\(\alpha\)` level beforehand.

--

Use `\(\alpha\)` to determine "small" for a p-value.

--

(1) P-value ~~is~~ ~~small~~ `\(< \alpha\)`. → I have evidence for `\(H_a\)`. Reject `\(H_o\)`.

(2) P-value ~~is~~ ~~not~~ ~~small~~ `\(\geq \alpha\)`. → I don't have evidence for `\(H_a\)`. Fail to reject `\(H_o\)`.

---

### Hypothesis Testing: Decisions, Decisions

**Question**: How do I select `\(\alpha\)`?

--

* Will depend on the convention in your field.

--

* Want a small `\(\alpha\)` and a small `\(\beta\)`. But they are related.
  + How?

--

**The smaller `\(\alpha\)` is, the larger `\(\beta\)` will be.**

--

→ Choose a lower `\(\alpha\)` (e.g., 0.01, 0.001) when the Type I error is worse and a higher `\(\alpha\)` (e.g., 0.1) when the Type II error is worse.

--

* Note: Can't easily compute `\(\beta\)`. Why?

* One more important term:
  + **Power** = probability of rejecting `\(H_o\)` when the alternative is true.

---

### Example

Suppose a baseball player who has been a 0.250 career hitter suddenly improves to be a 0.333 hitter. He wants a raise but needs to convince his manager that he has genuinely improved. The manager offers to examine his performance in 20 at-bats.

.pull-left[

#### Ho:

]

.pull-right[

#### Ha:

]

--

.pull-left[

<img src="stat100_wk11mon_files/figure-html/unnamed-chunk-12-1.png" width="576" style="display: block; margin: auto;" />

]

--

.pull-right[

* When `\(\alpha\)` is set to `\(0.05\)`, he needs 9 or more hits (a batting average of at least 0.45) to get a small enough p-value to reject `\(H_o\)`.

* When `\(\alpha\)` is set to `\(0.05\)`, the power of this test is 0.18.

* Why is the power **so low**?

* What aspects of the test could the baseball player change to **increase the power** of the test?

]

---

### Example

Suppose a baseball player who has been a 0.250 career hitter suddenly improves to be a 0.333 hitter. He wants a raise but needs to convince his manager that he has genuinely improved. The manager offers to examine his performance in ~~20~~ 100 at-bats.

**What will happen to the power of the test if we increase the sample size?**

--

.pull-left[

<img src="stat100_wk11mon_files/figure-html/unnamed-chunk-13-1.png" width="576" style="display: block; margin: auto;" />

]

--

.pull-right[

* Increasing the sample size increases the power.

* When `\(\alpha\)` is set to `\(0.05\)` and the sample size is now 100, the power of this test is 0.55.

]

---

### Example

Suppose a baseball player who has been a 0.250 career hitter suddenly improves to be a 0.333 hitter. He wants a raise but needs to convince his manager that he has genuinely improved. The manager offers to examine his performance in ~~20~~ 100 at-bats.

**What will happen to the power of the test if we increase `\(\alpha\)` to 0.1?**

--

.pull-left[

<img src="stat100_wk11mon_files/figure-html/unnamed-chunk-14-1.png" width="576" style="display: block; margin: auto;" />

]

--

.pull-right[

* Increasing `\(\alpha\)` increases the power.
  + Decreases `\(\beta\)`.

* When `\(\alpha\)` is set to `\(0.1\)` and the sample size is 100, the power of this test is 0.65.

]

---

### Example

Suppose a baseball player who has been a 0.250 career hitter suddenly improves to be a ~~0.333~~ 0.400 hitter. He wants a raise but needs to convince his manager that he has genuinely improved. The manager offers to examine his performance in ~~20~~ 100 at-bats.

**What will happen to the power of the test if he is an even better player?**

--

.pull-left[

<img src="stat100_wk11mon_files/figure-html/unnamed-chunk-15-1.png" width="576" style="display: block; margin: auto;" />

]

--

.pull-right[

* **Effect size**: Difference between the true value of the parameter and the null value.
  + Often standardized.

* Increasing the effect size increases the power.

* When `\(\alpha\)` is set to `\(0.1\)`, the sample size is 100, and the true probability of hitting the ball is 0.4, the power of this test is 0.97.

]

---

## Thoughts on Power

* What aspects of the test did the player actually have control over?

--

* Why is it easier to set `\(\alpha\)` than to set `\(\beta\)` or power?

--

* Considering power before collecting data is very important!

---

### Reporting Results in Journal Articles

<img src="img/bem_honorton_1994_results.png" width="60%" style="display: block; margin: auto;" />

---
background-image: url("img/ci_diagram_sim.png")
background-position: contain
background-size: 70%

### Statistical Inference Zoom Out -- Estimation

--

**Question**: How did folks do inference before computers?

---
background-image: url("img/hyp_testing_diagram_sim.png")
background-position: contain
background-size: 80%

### Statistical Inference Zoom Out -- Testing

--

**Question**: How did folks do inference before computers?

---
background-image: url("img/ci_diagram_sim.png")
background-position: contain
background-size: 70%

### Statistical Inference Zoom Out -- Estimation

**Question**: How did folks do inference before computers?

---
background-image: url("img/ci_diagram.png")
background-position: contain
background-size: 70%

### Statistical Inference Zoom Out -- Estimation

**Question**: How did folks do inference before computers?

---
background-image: url("img/hyp_testing_diagram_sim.png")
background-position: contain
background-size: 80%

### Statistical Inference Zoom Out -- Testing

**Question**: How did folks do inference before computers?

---
background-image: url("img/hyp_testing_diagram.png")
background-position: contain
background-size: 80%

### Statistical Inference Zoom Out -- Testing

**Question**: How did folks do inference before computers?

---
class: middle, center

## This means we need to learn about probability models!
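
---

### Aside: Power from a Probability Model

As one illustration of what a probability model can do, the power values in the baseball example can be approximated without simulation. This is a minimal sketch, not the code behind the slides. It assumes the number of hits in `n` at-bats follows a Binomial model and that the test is one-sided (`\(H_a\)`: the hitting probability is greater than 0.250). The function name `binom_power` and its arguments are made up for this illustration. Results may differ slightly from the values quoted earlier, depending on how those were computed.

```r
# Hypothetical helper (not from the course materials): power of a one-sided
# exact binomial test of H_o: p = p0 versus H_a: p > p0.
binom_power <- function(n, p0, p_true, alpha) {
  hits <- 0:n
  # P(X >= k) under H_o for every possible number of hits k
  p_values <- pbinom(hits - 1, size = n, prob = p0, lower.tail = FALSE)
  # Smallest number of hits whose p-value is below alpha
  cutoff <- min(hits[p_values < alpha])
  # Power = P(X >= cutoff) when the true hitting probability is p_true
  power <- pbinom(cutoff - 1, size = n, prob = p_true, lower.tail = FALSE)
  list(cutoff = cutoff, power = power)
}

# The four scenarios from the baseball example
binom_power(n = 20,  p0 = 0.25, p_true = 0.333, alpha = 0.05)
binom_power(n = 100, p0 = 0.25, p_true = 0.333, alpha = 0.05)
binom_power(n = 100, p0 = 0.25, p_true = 0.333, alpha = 0.10)
binom_power(n = 100, p0 = 0.25, p_true = 0.400, alpha = 0.10)
```

The first call returns a cutoff of 9 hits, matching the "9 or more" threshold in the example. Comparing the four calls shows the same pattern as the slides: power goes up with a larger sample size, a larger `\(\alpha\)`, and a larger effect size.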