Useful Summary Tables
Symbols and R Functions
Response | Explanatory | Numerical Quantity | Parameter | Statistic | Function |
---|---|---|---|---|---|
quantitative | - | mean | \(\mu\) | \(\bar{x}\) | t_test() |
categorical | - | proportion | \(p\) | \(\hat{p}\) | prop_test() |
quantitative | categorical | difference in means | \(\mu_1 - \mu_2\) | \(\bar{x}_1 - \bar{x}_2\) | t_test() |
categorical | categorical | difference in proportions | \(p_1 - p_2\) | \(\hat{p}_1 - \hat{p}_2\) | prop_test() |
quantitative | quantitative | correlation | \(\rho\) | \(r\) | cor.test() |
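
As a rough sketch of how the functions in the table might be called, the example below uses the `mtcars` data that ships with R. `t_test()` and `prop_test()` come from the infer package, while `cor.test()` is part of base R. The choice of `mtcars`, the recoding of `am`, and the null values 20 and 0.5 are illustrative assumptions, not part of the table above.

```r
library(infer)

# Illustrative data: recode transmission type (0/1) as a two-level factor.
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# One mean: H0: mu = 20 (hypothetical null value)
one_mean <- t_test(cars, response = mpg, mu = 20)

# Difference in means: mpg by transmission type, computed as manual - automatic
diff_means <- t_test(cars, formula = mpg ~ am,
                     order = c("manual", "automatic"))

# One proportion: H0: p = 0.5 (hypothetical null value) for manual cars
one_prop <- prop_test(cars, response = am, success = "manual", p = 0.5)

# Correlation between two quantitative variables (base R)
cor_res <- cor.test(cars$mpg, cars$wt)

one_mean
diff_means
one_prop
cor_res
```

Each infer call returns a one-row tibble with the test statistic, p-value, and (for `t_test()`) the point estimate and confidence limits, whereas `cor.test()` returns a base R `htest` object.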
Common Test Statistics and Approximate Distributions
Response | Explanatory | Numerical Quantity | Test Statistic | Distribution | Assumptions |
---|---|---|---|---|---|
quantitative | - | mean | \(\frac{\bar{x} - \mu_0}{s/\sqrt{n}}\) | \(t(df = n - 1)\) | \(n \geq 30\) or data are normal |
categorical | - | proportion | \(\frac{\hat{p} - p_0}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}}\) | \(N(0, 1)\) | At least 10 successes and 10 failures |
quantitative | categorical | difference in means | \(\frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}}}\) | \(t(df = \min(n_1, n_2) - 1)\) | \(n_1, n_2 \geq 30\) or data are normal |
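categorical | categorical | difference in proportions | \(\frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n_1} + \frac{\hat{p}(1 - \hat{p})}{n_2}}}\) | \(N(0, 1)\) | At least 10 successes and 10 failures in each category |
quantitative | quantitative | correlation | \(\frac{r - 0}{\sqrt{\frac{1 - r^2}{n - 2}}}\) | \(t(df = n - 2)\) | \(n \geq 30\) |

In the difference-in-proportions row, \(\hat{p}\) without a subscript is the pooled sample proportion: the total number of successes in both groups divided by \(n_1 + n_2\).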
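
To make the formulas above concrete, here is a minimal base R sketch that computes several of the test statistics by hand. The `mtcars` data, the null values `mu0 = 20` and `p0 = 0.5`, and all variable names are assumptions chosen for illustration only.

```r
x  <- mtcars$mpg
n  <- length(x)
mu0 <- 20    # hypothetical null value for the mean
p0  <- 0.5   # hypothetical null value for the proportion

# One mean: (x_bar - mu_0) / (s / sqrt(n)), compared to t(df = n - 1)
t_one <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p_one <- 2 * pt(-abs(t_one), df = n - 1)

# One proportion: share of manual cars (13 successes, 19 failures)
p_hat <- mean(mtcars$am == 1)
z_one <- (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)
p_z   <- 2 * pnorm(-abs(z_one))

# Difference in means: mpg for manual (g1) vs automatic (g2) cars
g1 <- mtcars$mpg[mtcars$am == 1]
g2 <- mtcars$mpg[mtcars$am == 0]
se_diff <- sqrt(var(g1) / length(g1) + var(g2) / length(g2))
t_diff  <- (mean(g1) - mean(g2) - 0) / se_diff
df_conservative <- min(length(g1), length(g2)) - 1
p_diff  <- 2 * pt(-abs(t_diff), df = df_conservative)

# Correlation: (r - 0) / sqrt((1 - r^2) / (n - 2)), compared to t(df = n - 2)
r     <- cor(mtcars$mpg, mtcars$wt)
t_cor <- (r - 0) / sqrt((1 - r^2) / (n - 2))
p_cor <- 2 * pt(-abs(t_cor), df = n - 2)
```

The hand-computed `t_one`, `t_diff`, and `t_cor` should agree with the statistics reported by `t.test(x, mu = mu0)`, `t.test(g1, g2)`, and `cor.test(mtcars$mpg, mtcars$wt)`. For the difference in means, `t.test()` uses the Welch approximation for the degrees of freedom rather than the conservative \(\min(n_1, n_2) - 1\) from the table, so its p-value will generally be somewhat smaller.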
Common Distribution-Based Confidence Interval Formulae
Response | Explanatory | Numerical Quantity | Confidence Interval | Distribution | Assumptions |
---|---|---|---|---|---|
quantitative | - | mean | \(\bar{x} \pm t^*s/\sqrt{n}\) | \(t(df = n - 1)\) | \(n \geq 30\) or data are normal |
categorical | - | proportion | \(\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\) | \(N(0, 1)\) | At least 10 successes and 10 failures |
quantitative | categorical | difference in means | \(\bar{x}_1 - \bar{x}_2 \pm t^* \sqrt{\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}}\) | \(t(df = \min(n_1, n_2) - 1)\) | \(n_1, n_2 \geq 30\) or data are normal |
categorical | categorical | difference in proportions | \(\hat{p}_1 - \hat{p}_2 \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}\) | \(N(0, 1)\) | At least 10 successes and 10 failures in each category |
quantitative | quantitative | correlation | \(r \pm t^* \sqrt{\frac{1 - r^2}{n - 2}}\) | \(t(df = n - 2)\) | \(n \geq 30\) |
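
The interval formulas can be sketched in base R as follows, where the critical values \(t^*\) and \(z^*\) come from `qt()` and `qnorm()`. The 95% confidence level, the `mtcars` data, and the variable names are illustrative assumptions.

```r
conf_level <- 0.95                      # assumed confidence level
upper_tail <- 1 - (1 - conf_level) / 2  # 0.975 for a 95% interval

# One mean: x_bar +/- t* s / sqrt(n), with t* from t(df = n - 1)
x <- mtcars$mpg
n <- length(x)
t_star  <- qt(upper_tail, df = n - 1)
ci_mean <- mean(x) + c(-1, 1) * t_star * sd(x) / sqrt(n)

# One proportion: p_hat +/- z* sqrt(p_hat (1 - p_hat) / n)
p_hat   <- mean(mtcars$am == 1)         # share of manual cars
z_star  <- qnorm(upper_tail)
ci_prop <- p_hat + c(-1, 1) * z_star * sqrt(p_hat * (1 - p_hat) / n)

# Difference in means, using the conservative df = min(n1, n2) - 1
g1 <- mtcars$mpg[mtcars$am == 1]
g2 <- mtcars$mpg[mtcars$am == 0]
se_diff     <- sqrt(var(g1) / length(g1) + var(g2) / length(g2))
t_star_diff <- qt(upper_tail, df = min(length(g1), length(g2)) - 1)
ci_diff     <- (mean(g1) - mean(g2)) + c(-1, 1) * t_star_diff * se_diff

ci_mean
ci_prop
ci_diff
```

Here `ci_mean` should match the interval reported by `t.test(x)`. The `ci_diff` interval is expected to be at least as wide as the one from `t.test(g1, g2)`, because the conservative degrees of freedom give a larger \(t^*\) than the Welch approximation used there.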