Decisions, Decisions

background-image: url("img/DAW.png")
background-position: left
background-size: 50%
class: middle, center

## .base-blue[Decisions, Decisions]

### .purple[Kelly McConville]

#### .purple[ Stat 100 | Week 11 | Fall 2022]


---

### Goals for Today

* Undergraduate research

* More hypothesis testing with `infer`


* **Decisions** in a hypothesis test
    + Types of errors

* Statistical inference zoom out


---

## Let's talk about Undergraduate Research.

#### *"an inquiry or investigation conducted by an undergraduate student that makes an original intellectual or creative contribution to the discipline."* -- Council on Undergraduate Research

Let's look at some specific examples of undergraduate research in statistics.

---

### Example 1: Estimator Comparison

My team has developed a new estimator of a population parameter.  How does it compare to the usual suspects?


---

### [Example 1: Estimator Comparison](https://www.frontiersin.org/articles/10.3389/ffgc.2021.763414/full)

My team has developed a new estimator of **the population mean**.  How does it compare to the usual suspects?

Usual suspects:

* Sample mean
* Post-stratified estimator
* Generalized regression estimator (GREG)


New estimator:

* Generalized regression estimator over resolutions of Y (GREGORY)


---

### [Example 1: Estimator Comparison](https://www.frontiersin.org/articles/10.3389/ffgc.2021.763414/full)

My team has developed a new estimator of the population mean.  How does it compare to the usual suspects?


---

### [Example 2: Science Question Answered with Stat Modeling](https://www.mdpi.com/1999-4907/11/8/856)

Can we estimate the loss of forested lands in North Central Georgia?


---

class: 
background-image: url("img/mases.001.png")
background-position: left
background-size: contain

### [Example 3: Software Creation](https://cran.r-project.org/web/packages/mase/index.html)


---

### [Example 4: (Data) Science Communication](https://mjdvl.shinyapps.io/NCASI_APP/)

How can we help forest managers understand the climate risks for their forests?

---

## Debunking Myths about Research

**Myth**: *I should only consider engaging in undergraduate research if I want to go into academia.*

* Conducting undergraduate research has **lots** of benefits:
    + Clarity on career goals
    + Deeper exposure to field
    + Increased sense of belonging
    + Skill development (communication, problem-solving)

**Myth**: *I am behind if I didn't start doing undergraduate research early in my undergraduate career.*

* Pick the time that makes sense for you.

* Common moments: senior thesis, summer after sophomore or junior year

*********************************

* Happy to chat about undergraduate research during Stat 100 OHs or my general topic OHs on Weds 2:30-3:30pm!

* Check out [Mally Shan's article](https://www.hodp.org/project/a-data-driven-way-to-find-your-perfect-research-opportunity-on-campus) to learn about finding UR opportunities on campus.

---

## Back to Hypothesis Testing

---

## Another Hypothesis Testing Example

Let's return to the Palmer penguins and ask if flipper length varies, on average, by the sex of the penguin.

`\(H_o: \mu_F - \mu_M = 0\)`

`\(H_a: \mu_F - \mu_M \neq 0\)`

Need a null distribution for the difference in sample means.

**Question**: If I shuffle (permute) the `sex` column and then compute the difference in sample means, what do you expect the difference in sample means to equal?

```
## # A tibble: 333 × 2
## flipper_length_mm sex 
## <int> <fct> 
## 1 181 male 
## 2 186 female
## 3 195 female
## 4 193 female
## 5 190 male 
## 6 181 female
## 7 195 male 
## 8 182 female
## 9 191 male 
## 10 198 male 
## # … with 323 more rows
```

---

## Generating a Null Distribution

Let's return to the penguins and ask if flipper length varies, on average, by the sex of the penguin.

`\(H_o: \mu_F - \mu_M = 0\)`

`\(H_a: \mu_F - \mu_M \neq 0\)`

Need a null distribution for the difference in sample means.

Steps:

1. Permute/shuffle the `sex` column.
2. Compute the difference in sample means.
3. Repeat 1 and 2 many times.

Let's go back to the "hypothesisTesting.Rmd" document and see how to implement this process with `infer`.
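The three steps above can be sketched with `infer` roughly as follows. This is a minimal sketch, not the contents of "hypothesisTesting.Rmd": it assumes the data come from the `palmerpenguins` package, and the object names, the seed, and the choice of 1000 permutations are illustrative assumptions.

```r
library(dplyr)
library(infer)
library(palmerpenguins)

# Keep only penguins with a recorded sex and flipper length
penguins_complete <- penguins %>%
  filter(!is.na(sex), !is.na(flipper_length_mm))

# Observed difference in sample means (female - male)
obs_diff <- penguins_complete %>%
  specify(flipper_length_mm ~ sex) %>%
  calculate(stat = "diff in means", order = c("female", "male"))

set.seed(100)  # arbitrary seed so the shuffles are reproducible

# Steps 1-3: shuffle sex, compute the difference in means, repeat
null_dist <- penguins_complete %>%
  specify(flipper_length_mm ~ sex) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("female", "male"))

# Two-sided p-value, matching the "not equal" alternative
p_value <- null_dist %>%
  get_p_value(obs_stat = obs_diff, direction = "both")
```

Because shuffling breaks any link between `sex` and flipper length, the permuted differences in `null_dist` should be centered near 0. Calling `visualize(null_dist)` is one way to check this.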

---

### Hypothesis Testing: Decisions, Decisions

Once you get to the end of a hypothesis test you make one of two decisions:

(1) P-value is small.

&rarr; I have evidence for `\(H_a\)`. Reject `\(H_o\)`.

(2) P-value is not small.

&rarr; I don't have evidence for `\(H_a\)`. Fail to reject `\(H_o\)`.

Sometimes we make the correct decision.  Sometimes we make a mistake.

---

### Hypothesis Testing: Decisions, Decisions

Let's create a table of potential outcomes.

`\(\alpha\)` = prob of Type I error **under repeated sampling** = prob reject `\(H_o\)` when it is true

`\(\beta\)` = prob of Type II error **under repeated sampling** = prob fail to reject `\(H_o\)` when `\(H_a\)` is true.
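To make "under repeated sampling" concrete, here is a small simulation sketch. It swaps in a one-sample t-test on simulated normal data, which is an assumption for illustration only and not one of the course examples; the sample size, true means, and seed are all hypothetical.

```r
set.seed(100)   # arbitrary seed
alpha <- 0.05
n <- 25         # hypothetical sample size
reps <- 2000    # number of repeated samples

# p-value from a one-sample t-test of H_o: mu = 0 on one simulated sample
one_p_value <- function(true_mean) {
  t.test(rnorm(n, mean = true_mean, sd = 1), mu = 0)$p.value
}

# H_o is true: any rejection is a Type I error
p_null_true <- replicate(reps, one_p_value(true_mean = 0))
type1_rate <- mean(p_null_true < alpha)

# H_a is true (mu = 0.5): any failure to reject is a Type II error
p_alt_true <- replicate(reps, one_p_value(true_mean = 0.5))
type2_rate <- mean(p_alt_true >= alpha)
```

Here `type1_rate` should land close to the chosen `\(\alpha\)`, while `type2_rate` depends on the true mean we assumed under `\(H_a\)`.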

---

### Hypothesis Testing: Decisions, Decisions

Typically set `\(\alpha\)` level beforehand.

Use `\(\alpha\)` to determine "small" for a p-value.

(1) P-value ~~is~~ ~~small~~ `\(< \alpha\)`.

&rarr; I have evidence for `\(H_a\)`. Reject `\(H_o\)`.

(2) P-value ~~is~~ ~~not~~ ~~small~~  `\(\geq \alpha\)`.

&rarr; I don't have evidence for `\(H_a\)`. Fail to reject `\(H_o\)`.

---

### Hypothesis Testing: Decisions, Decisions

**Question**: How do I select `\(\alpha\)`?

* Will depend on the convention in your field.

* Want a small `\(\alpha\)` and a small `\(\beta\)`. But they are related.  
    + How?

**For a fixed sample size, the smaller `\(\alpha\)` is, the larger `\(\beta\)` will be.**

&rarr; Choose a lower `\(\alpha\)` (e.g., 0.01, 0.001) when the Type I error is worse and a higher `\(\alpha\)` (e.g., 0.1) when the Type II error is worse.

* Note: Can't easily compute `\(\beta\)`.  Why?

* One more important term:
    + **Power** = probability of rejecting `\(H_o\)` when the alternative is true = `\(1 - \beta\)`.
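The trade-off between `\(\alpha\)` and `\(\beta\)` can be sketched numerically. The example below assumes a one-sided z-test with a known standard deviation, where `\(\beta\)` has a simple closed form; the standardized effect size and sample size are hypothetical values chosen only for illustration.

```r
alpha <- c(0.001, 0.01, 0.05, 0.1)
effect <- 0.4   # hypothetical standardized effect size under H_a
n <- 25         # hypothetical sample size

# One-sided z-test: reject when Z > qnorm(1 - alpha).
# Under H_a, Z is centered at effect * sqrt(n), so
# beta = P(fail to reject | H_a) = pnorm(qnorm(1 - alpha) - effect * sqrt(n))
beta <- pnorm(qnorm(1 - alpha) - effect * sqrt(n))

trade_off <- data.frame(alpha = alpha, beta = beta, power = 1 - beta)
trade_off
```

Note that `\(\beta\)` could only be computed here because we assumed a specific effect size under `\(H_a\)`, which is typically unknown in practice.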

---

### Example

Suppose we have a baseball player who has been a 0.250 career hitter who suddenly improves to be a 0.333 hitter.  He wants a raise but needs to convince his manager that he has genuinely improved.  The manager offers to examine his performance in 20 at-bats.

#### Ho:


#### Ha:


.pull-left[
<img src="stat100_wk11mon_files/figure-html/unnamed-chunk-12-1.png" width="576" style="display: block; margin: auto;" />

]

* When `\(\alpha\)` is set to `\(0.05\)`, he needs to get 9 or more hits (a batting average of at least 0.45) to get a small enough p-value to reject `\(H_o\)`.

* When `\(\alpha\)` is set to `\(0.05\)`, the power of this test is 0.18.

* Why is the power **so low**?

* What aspects of the test could the baseball player change to **increase the power** of the test?
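One way to check the cutoff and the power for this tryout is with exact binomial probabilities. This is a sketch under the assumption that the number of hits in 20 at-bats is binomial with independent at-bats; `binom_power()` is a hypothetical helper written for this slide, not a function from any package.

```r
# Exact power of a one-sided binomial test of H_o: p = p_null vs H_a: p > p_null
binom_power <- function(n, p_null, p_true, alpha) {
  k <- 0:(n + 1)
  # P(X >= k) if H_o is true, i.e. the p-value for observing k hits
  upper_tail <- pbinom(k - 1, size = n, prob = p_null, lower.tail = FALSE)
  # Smallest number of hits whose p-value is below alpha
  crit <- min(k[upper_tail < alpha])
  # Power: chance of reaching that cutoff if the true probability is p_true
  power <- pbinom(crit - 1, size = n, prob = p_true, lower.tail = FALSE)
  list(critical_hits = crit, power = power)
}

tryout <- binom_power(n = 20, p_null = 0.25, p_true = 0.333, alpha = 0.05)
tryout$critical_hits
tryout$power
```

The cutoff should come out to 9 hits, as stated above. The exact binomial power may differ slightly from the rounded value on the slide, depending on how that value was computed.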


---

### Example

**What will happen to the power of the test if we increase the sample size?**


* Increasing the sample size increases the power.

* When `\(\alpha\)` is set to `\(0.05\)` and the sample size is now 100, the power of this test is 0.55.


---

### Example

**What will happen to the power of the test if we increase `\(\alpha\)` to 0.1?**


* Increasing `\(\alpha\)` increases the power.
    + Decreases `\(\beta\)`.

* When `\(\alpha\)` is set to `\(0.1\)` and the sample size is 100, the power of this test is 0.65.


---

### Example

Suppose we have a baseball player who has been a 0.250 career hitter who suddenly improves to be a ~~0.333~~ 0.400 hitter.  He wants a raise but needs to convince his manager that he has genuinely improved.  The manager offers to examine his performance in ~~20~~ 100 at-bats.

**What will happen to the power of the test if he is an even better player?**


* **Effect size**: Difference between true value of the parameter and null value.
    + Often standardized.

* Increasing the effect size increases the power.

* When `\(\alpha\)` is set to `\(0.1\)`, the sample size is 100, and the true probability of hitting the ball is 0.4, the power of this test is 0.97.
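To see sample size, `\(\alpha\)`, and effect size move the power together, here is a simulation sketch that replays the tryout many times for each scenario. It assumes independent at-bats with a constant hit probability; `sim_power()`, the seed, and the number of repetitions are hypothetical choices for illustration.

```r
set.seed(100)  # arbitrary seed

# Estimate power by simulation: share of simulated tryouts with p-value < alpha
sim_power <- function(n, p_true, alpha, p_null = 0.25, reps = 10000) {
  # Number of hits in each simulated tryout when the true probability is p_true
  hits <- rbinom(reps, size = n, prob = p_true)
  # One-sided exact p-value: P(X >= hits) if H_o is true
  p_values <- pbinom(hits - 1, size = n, prob = p_null, lower.tail = FALSE)
  mean(p_values < alpha)
}

# The four scenarios from the example slides
scenarios <- data.frame(
  n      = c(20, 100, 100, 100),
  p_true = c(0.333, 0.333, 0.333, 0.4),
  alpha  = c(0.05, 0.05, 0.1, 0.1)
)
scenarios$power <- mapply(sim_power, scenarios$n, scenarios$p_true, scenarios$alpha)
scenarios
```

The estimated power should increase from one row to the next. The simulated values should be close to the ones quoted on the slides but need not match them exactly.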


---

## Thoughts on Power

* What aspects of the test did the player actually have control over?

* Why is it easier to set `\(\alpha\)` than to set `\(\beta\)` or power?

* Considering power before collecting data is very important!

---

### Reporting Results in Journal Articles

---

background-image: url("img/ci_diagram_sim.png")
background-position: contain
background-size: 70%

### Statistical Inference Zoom Out -- Estimation