For all the talk about artificial intelligence and machine learning, plain ol’ statistics still powers the way we think about data, and there’s no better way to think through statistics than with Excel. That’s why this workshop is built to cover some of the most important statistical concepts in Excel in a single day. You are welcome to use it in whatever way you find helpful.
Multiple groups and points in time
I tend to start teaching inferential statistics with the independent samples t-test, which lets users compare the means of two groups.
That may sound too constraining to be of much use in the real world, but quite the opposite is true. In fact, procedures like the t-test power some of the most common data tasks done today, such as the A/B test.
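To make the A/B connection concrete, here is a minimal sketch of the statistic behind an equal-variance independent samples t-test. It is written in Python purely for illustration, since the workshop itself stays in Excel, where the `T.TEST` function returns the corresponding p-value directly. The function name `pooled_t_statistic` and the two small samples are hypothetical, made-up values, not workshop data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance: each group's variance weighted by its degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


# Hypothetical order values for two page variants (illustrative numbers only)
variant_a = [12.0, 15.0, 11.0, 14.0, 13.0]
variant_b = [16.0, 18.0, 15.0, 17.0, 19.0]

t_stat, df = pooled_t_statistic(variant_a, variant_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

A t statistic far from zero, judged against the t distribution with the given degrees of freedom, suggests the gap between the two variants is unlikely to be chance alone.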
That said, there are probably quite a few other things you’d like to test: what about the differences across multiple groups? Or the differences in the same subjects across multiple points in time?
There are also some assumptions to the t-test which don’t always hold up. What do you do then?
This workshop discusses these next-level statistical considerations, then caps with the queen of data analysis: linear regression.
Correlation and causation
We’ve all heard “correlation is not causation,” but what does that really mean? The second half of this workshop breaks it down right from Excel. Students will learn how to use a combination of visualizations and statistics to infer causality and make predictions.
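As a rough illustration of the statistics involved, the sketch below computes a Pearson correlation and a least-squares line by hand. It uses Python only for illustration, since the workshop works in Excel, where `CORREL`, `SLOPE` and `INTERCEPT` give the same quantities. The helper names `pearson_r` and `fit_line` and the data are hypothetical, and a strong correlation here would still say nothing on its own about causation.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from a single x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical training hours and performance scores (illustrative numbers only)
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(hours, score)
slope, intercept = fit_line(hours, score)
print(f"r = {r:.3f}, predicted score = {intercept:.2f} + {slope:.2f} * hours")
```

The fitted line supports prediction within the range of the observed data. Whether the relationship is causal is a separate question that the numbers alone cannot settle.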
The deeper the intuition students build from these tests, the easier it will be for them to unpack more advanced tests and algorithms. By using Excel, this workshop keeps the focus off technology and coding, and on the more important statistical foundations.
Lesson 1: Comparing categories
Objective: Student can compare the expected values of two categories
Description:
- T-tests, continued
- Chi-square test of independence
Time: 35 minutes
Assets needed: A/B test results dataset
Lesson 2: Comparing repeated measures
Objective: Student can compare the means of dependent samples
Description:
- Repeated measures in statistics
- Dependent samples t-test
Time: 35 minutes
Assets needed: Patient records dataset
Lesson 3: Comparing multiple groups
Objective: Student can compare the means of more than two groups
Description:
- One-way ANOVA
- Visualizing & interpreting results
- Post-hoc tests and Type II error
Time: 75 minutes
Assets needed: Abalone snails data
Lesson 4: Parametric and non-parametric tests
Objective: Student can compare groups using non-parametric methods
Description:
- Parametric versus non-parametric tests
- Statistically testing for normality
- Wilcoxon signed-rank test
Time: 75 minutes
Assets needed: Patient records dataset
Lesson 5: Correlations
Objective: Student can correlate two or more variables and visualize the results
Description:
- Correlations and covariances
- Testing for correlations
- Correlations and visualizations
- Spurious correlations
- From correlation to causation
Time: 90 minutes
Assets needed: Athlete records dataset
Lesson 6: Linear regression
Objective: Student can conduct and interpret a univariate linear regression
Description:
- Checking assumptions
- Conducting a regression
- Model interpretation & diagnostics
Time: 120 minutes
Assets needed: Athlete records dataset