Part 2: Loading Data, data.frames, and ggplot2
Class Video
Slides
For Loops / Projects / ggplot2
Open the slides in a separate window: https://sph-r-programming.netlify.com/slides/02-for_loops#1
Function of the Week Assignment
Please refer to the function_assignment project in the RStudio Cloud workspace. We will go over this in class.
Post-Class
Please fill out the following survey and we will discuss the results during the next lecture. All responses will be anonymous.
- Clearest Point: What was the most clear part of the lecture?
- Muddiest Point: What was the most unclear part of the lecture to you?
- Anything Else: Is there something you’d like me to know?
Muddiest Points
little bit confused about how you loaded data. I used a different method when using Rstudio.
There are multiple routes to loading data in R.
There is the option to load data using the file loading wizard, which you may find a little easier to use. But it’s worth talking about all of the different ways loading data can go wrong, which the wizard might not be able to help you with.
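As a rough sketch of the code route, the example below writes a tiny CSV file and reads it back with base R's read.csv(). The file name, column names, and values are all made up for illustration, and read.csv() is only one of several loading functions; it may not be the one used in class.

```r
# Write a tiny example file so the sketch is self-contained.
# The file name and its contents are invented for illustration.
csv_path <- file.path(tempdir(), "example_visits.csv")
writeLines(
  c("patient_id,age,sex",
    "1,34,female",
    "2,NA,male",
    "3,58,female"),
  csv_path
)

# read.csv() is part of base R; by default it treats the string "NA" as missing
visits <- read.csv(csv_path)

# Check that the columns were read with the types you expect
str(visits)
```

Running str() right after loading is a quick way to catch the usual problems, such as a numeric column that was read as text.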
It wasn’t unclear but I’d like to learn more about ggplots. Can we customize the formatting of plots we create (like change colors, text size, etc.)?
Stay tuned. We’re covering it in class today!
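As a small preview, here is a minimal sketch of changing colors and text size on a scatterplot of the penguins data. The color names, labels, and text size are arbitrary choices for illustration, not the settings used in class.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins,
            aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
  geom_point() +
  # Pick one color per species by hand (these color names are arbitrary)
  scale_color_manual(values = c(Adelie = "darkorange",
                                Chinstrap = "purple",
                                Gentoo = "cyan4")) +
  # Replace the default axis labels and add a title
  labs(title = "Bill length vs. bill depth",
       x = "Bill length (mm)",
       y = "Bill depth (mm)") +
  # Set the base text size for the whole plot
  theme(text = element_text(size = 14))

p
```

When the plot is drawn, ggplot2 warns that it dropped the rows with missing measurements; that is expected with this dataset.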
I always had trouble understanding this first comma inside the third bracket, glimpse(namcs[,1:5]), what does it do?
The comma in the brackets can be hard to wrap your head around.
We use the comma to separate the row numbers from the column numbers when subsetting data. Leaving one side of the comma empty means "all rows" or "all columns".
- The numbers before the comma refer to the rows.
- The numbers after the comma refer to the columns.
library(palmerpenguins)
## Warning: package 'palmerpenguins' was built under R version 4.0.3
data(penguins)
knitr::kable(penguins[1:10,])
species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
---|---|---|---|---|---|---|---|
Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male | 2007 |
Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female | 2007 |
Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | female | 2007 |
Adelie | Torgersen | NA | NA | NA | NA | NA | 2007 |
Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female | 2007 |
Adelie | Torgersen | 39.3 | 20.6 | 190 | 3650 | male | 2007 |
Adelie | Torgersen | 38.9 | 17.8 | 181 | 3625 | female | 2007 |
Adelie | Torgersen | 39.2 | 19.6 | 195 | 4675 | male | 2007 |
Adelie | Torgersen | 34.1 | 18.1 | 193 | 3475 | NA | 2007 |
Adelie | Torgersen | 42.0 | 20.2 | 190 | 4250 | NA | 2007 |
For example, if I wanted to refer to the first row and first column of penguins, I could use this:
penguins[1, 1]
## # A tibble: 1 x 1
## species
## <fct>
## 1 Adelie
To refer to the entire first row of penguins, I can remove the second number. Note that the comma remains.
penguins[1, ]
## # A tibble: 1 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g sex
## <fct> <fct> <dbl> <dbl> <int> <int> <fct>
## 1 Adelie Torge~ 39.1 18.7 181 3750 male
## # ... with 1 more variable: year <int>
To refer to the entire first column of penguins, I can remove the first number:
penguins[,1]
## # A tibble: 344 x 1
## species
## <fct>
## 1 Adelie
## 2 Adelie
## 3 Adelie
## 4 Adelie
## 5 Adelie
## 6 Adelie
## 7 Adelie
## 8 Adelie
## 9 Adelie
## 10 Adelie
## # ... with 334 more rows
And to get a range of rows and columns in penguins, we can put in a sequence (start:end) on each side of the comma:
penguins[5:10, 1:5]
## # A tibble: 6 x 5
## species island bill_length_mm bill_depth_mm flipper_length_mm
## <fct> <fct> <dbl> <dbl> <int>
## 1 Adelie Torgersen 36.7 19.3 193
## 2 Adelie Torgersen 39.3 20.6 190
## 3 Adelie Torgersen 38.9 17.8 181
## 4 Adelie Torgersen 39.2 19.6 195
## 5 Adelie Torgersen 34.1 18.1 193
## 6 Adelie Torgersen 42 20.2 190
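Coming back to the original question, the same pattern can be tried on penguins, since namcs is not loaded here. This is only a sketch of the idea: with nothing before the comma, every row is kept, and 1:5 after the comma keeps the first five columns.

```r
library(palmerpenguins)

# Nothing before the comma means "all rows";
# 1:5 after the comma means columns 1 through 5
first_five <- penguins[, 1:5]

dim(first_five)    # number of rows, then number of columns
names(first_five)  # the names of the columns that were kept
```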
For loops - though not bad! I think I just need to practice them more now.
Keep at it!
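For extra practice, here is one small loop to experiment with. It is only a sketch: it walks over three measurement columns of penguins and stores the mean of each. The variable names measure_cols and col_means are made up for this example.

```r
library(palmerpenguins)

# Columns to summarise; the names come from the penguins table
measure_cols <- c("bill_length_mm", "bill_depth_mm", "flipper_length_mm")

# Set up a named vector with one slot per column before the loop starts
col_means <- numeric(length(measure_cols))
names(col_means) <- measure_cols

for (col in measure_cols) {
  # penguins[[col]] pulls out a single column as a plain vector;
  # na.rm = TRUE skips the missing values in that column
  col_means[col] <- mean(penguins[[col]], na.rm = TRUE)
}

col_means
```

Try changing measure_cols or swapping mean() for another summary function such as median() to see how the loop behaves.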
the function presentation: where do we upload it?
There is a Sakai submission. Please upload both the HTML and the Rmd when you submit so I can get it up on the website.
Projects in RStudio Desktop
There were some questions about RStudio Desktop and projects, so here is a short video on how to set up projects in RStudio Desktop. We’ll have a session on installing RStudio Desktop on your own machine in the future.
Thanks for Letting me Know
I appreciate you advocating for this class to be taught before the Biostats series as per some of our requests. I appreciate the pace you are taking and your genuine want to make sure we are learning the material. Thank you very much.
Much appreciated, thank you. I’ve emailed both Rochelle and Jessica and they will be thinking about it.