Part 4. `dplyr`: `mutate()`, `group_by()`/`summarize()`/`across()`
Class Video
Slides
Open the slides in a separate window: https://sph-r-programming.netlify.com/slides/04-dplyr-ggplot2-part2#1
Midterm will be assigned next week
For this week, identify one dataset from the tidytuesday datasets and a research question that you think the data can answer. Be curious!
Your dataset should have:
- categorical variables
- continuous variables
https://github.com/rfordatascience/tidytuesday
We will spend part of next class in short individual meetings to discuss the midterm and your datasets.
Post-Class
Please fill out the following survey and we will discuss the results during the next lecture. All responses will be anonymous.
- Clearest Point: What was the clearest part of the lecture?
- Muddiest Point: What was the most unclear part of the lecture to you?
- Anything Else: Is there something you’d like me to know?
Muddiest Points
How do the functions that we saw used with across(), such as where() and starts_with(), work? (Are they for across() only, or do they work in other areas too?)
I don’t have more time to cover this in class, but I will show you how to get started with the tidyowl tutorial.
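As a short answer to the second half of the question: where() and starts_with() are tidyselect helpers, so they also work in other column-selecting verbs such as select(), not only inside across(). A minimal sketch follows; the `scores` data frame and its column names are made up for illustration and are not from the class materials.

```r
library(dplyr)

# Hypothetical toy data, invented for this example
scores <- tibble(
  id        = c("a", "b", "c"),
  test_math = c(90, 80, 70),
  test_read = c(85, 75, 65),
  age       = c(30, 40, 50)
)

# starts_with() inside select(): keep columns whose names begin with "test_"
test_cols <- scores %>% select(starts_with("test_"))

# where() inside select(): keep columns for which is.numeric() returns TRUE
numeric_cols <- scores %>% select(where(is.numeric))

# The same helper inside across(): take the mean of every "test_" column
test_means <- scores %>%
  summarize(across(starts_with("test_"), mean))
```

The helper only decides which columns are picked; the surrounding verb (select(), across(), and so on) decides what happens to them.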
I’m still not clear why we create a factor. I did the gender code without factoring and the table was the same when I included the factoring code. Why would I add this work?
Stay tuned!
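In the meantime, here is one possible illustration, not necessarily the example we will use in class. With a character vector, table() orders the categories alphabetically and cannot show a category that never occurs. With a factor, you choose the order of the levels, and unused levels are still counted. The `dose` values below are made up for illustration.

```r
# Hypothetical responses, invented for this example
dose <- c("low", "high", "medium", "low")

# Character vector: table() orders the categories alphabetically
tab_chr <- table(dose)

# Factor: we choose the order of the levels ourselves, and a level
# with no observations ("none") still appears, with a count of 0
dose_f  <- factor(dose, levels = c("none", "low", "medium", "high"))
tab_fct <- table(dose_f)
```

Here `names(tab_chr)` comes out as high, low, medium, while `names(tab_fct)` follows the order given in `levels`. When the categories happen to be in alphabetical order already and all occur in the data, the two tables look the same, which may be what happened with the gender variable.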
scale_x/y_continuous
I will try to come up with more examples. This is a very handy reference: http://www.sthda.com/english/wiki/ggplot2-axis-ticks-a-guide-to-customize-tick-marks-and-labels
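As a starting point, here is a minimal sketch using the built-in mtcars dataset (my choice for illustration, not a dataset from class). It sets the axis title, the tick positions, and the axis limits through the `name`, `breaks`, and `limits` arguments.

```r
library(ggplot2)

# mtcars ships with R: wt is weight in 1000 lbs, mpg is miles per gallon
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  scale_x_continuous(
    name   = "Weight (1000 lbs)",   # axis title
    breaks = seq(2, 5, by = 1),     # where the tick marks go
    limits = c(1, 6)                # range shown on the axis
  ) +
  scale_y_continuous(
    name   = "Miles per gallon",
    breaks = seq(10, 35, by = 5)
  )

p
```

Note that points falling outside `limits` are dropped from the plot with a warning, so it is worth checking the range of your data first.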
I struggled a bit with the case_when() function and understanding how it is used.
I have put together a few more examples and we will try to go over them in class.
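As a preview, here is a minimal sketch of the general pattern. The `patients` data and the age cutoffs are made up for illustration. Each line inside case_when() has the form `condition ~ value`; conditions are checked from top to bottom, and each row gets the value of the first condition that is TRUE for it.

```r
library(dplyr)

# Hypothetical data, invented for this example
patients <- tibble(age = c(5, 17, 40, 70, NA))

patients <- patients %>%
  mutate(
    age_group = case_when(
      age < 18  ~ "child",        # checked first
      age < 65  ~ "adult",        # only reached when age is 18 or more
      age >= 65 ~ "older adult",
      TRUE      ~ "unknown"       # catch-all, here for the missing age
    )
  )
```

Because order matters, the second condition does not need to say `age >= 18 & age < 65`: any row that reaches it has already failed `age < 18`. The final `TRUE ~ "unknown"` line catches everything that matched nothing above it, including the row where age is NA.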