purrr::walk()
Colin Hawkinson
2021-03-03
In this document, I introduce the purrr::walk() function and show what it is for.
It works just like purrr::map(), except that it is called for its side effects and returns its .x argument invisibly for further use (in a %>% pipeline, for instance).
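To make that contract concrete before bringing in real data, here is a minimal sketch. The list `x` and the two anonymous functions are made up for illustration and are not part of the analysis that follows.

```r
library(purrr)

# Hypothetical input, used only for this illustration
x <- list(a = 1, b = 2, c = 3)

# map() collects whatever .f returns into a new list
mapped <- map(x, function(v) v * 10)

# walk() calls .f for its side effect (here, printing) and discards the results
walked <- walk(x, function(v) cat("value:", v, "\n"))

# walk() hands back its input unchanged
identical(walked, x)
```

Because the result is returned invisibly, nothing is printed at the console unless you assign it or pipe it onward.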
Remember purrr::map()?
purrr::walk() doesn’t make much sense without context.
# load the tidyverse and the penguins dataset, then drop incomplete rows
library(tidyverse)
library(palmerpenguins)
penguins <- drop_na(penguins)
unique(penguins$species)
## [1] Adelie Gentoo Chinstrap
## Levels: Adelie Chinstrap Gentoo
# split it by the three species; group_split() returns a list of tibbles
list_1 <- penguins %>%
  group_by(species) %>%
  group_split(.keep = TRUE)
# this function takes one of the dataframes in the list and plots
# bill length against body mass, colored by sex, with a linear fit per sex.
by_factor_lm <- function(df) {
  ggplot(df, aes(y = bill_length_mm, x = body_mass_g, color = sex)) +
    geom_point() +
    geom_smooth(method = "lm")
}
How does purrr::map() behave?
map(.x = list_1, .f = by_factor_lm)
## [[1]]
## `geom_smooth()` using formula 'y ~ x'
##
## [[2]]
## `geom_smooth()` using formula 'y ~ x'
##
## [[3]]
## `geom_smooth()` using formula 'y ~ x'
object_1 <- map(.x = list_1, .f = by_factor_lm)
object_1
## [[1]]
## `geom_smooth()` using formula 'y ~ x'
##
## [[2]]
## `geom_smooth()` using formula 'y ~ x'
##
## [[3]]
## `geom_smooth()` using formula 'y ~ x'
class(object_1)
## [1] "list"
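What is purrr::walk() for?
purrr::walk() is for side effects such as writing to disk, especially when you want the .x argument back afterwards.
Note that the first call below shows no plots: walk() discards whatever .f returns, and a ggplot object is only drawn when it is printed.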
walk(.x = list_1, .f = by_factor_lm)
object_2 <- walk(.x = list_1, .f = by_factor_lm)
object_2
## <list_of<
## tbl_df<
## species : factor<600db>
## island : factor<6667f>
## bill_length_mm : double
## bill_depth_mm : double
## flipper_length_mm: integer
## body_mass_g : integer
## sex : factor<9a28d>
## year : integer
## >
## >[3]>
## [[1]]
## # A tibble: 146 x 8
## species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torge… 39.1 18.7 181 3750
## 2 Adelie Torge… 39.5 17.4 186 3800
## 3 Adelie Torge… 40.3 18 195 3250
## 4 Adelie Torge… 36.7 19.3 193 3450
## 5 Adelie Torge… 39.3 20.6 190 3650
## 6 Adelie Torge… 38.9 17.8 181 3625
## 7 Adelie Torge… 39.2 19.6 195 4675
## 8 Adelie Torge… 41.1 17.6 182 3200
## 9 Adelie Torge… 38.6 21.2 191 3800
## 10 Adelie Torge… 34.6 21.1 198 4400
## # … with 136 more rows, and 2 more variables: sex <fct>, year <int>
##
## [[2]]
## # A tibble: 68 x 8
## species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Chinst… Dream 46.5 17.9 192 3500
## 2 Chinst… Dream 50 19.5 196 3900
## 3 Chinst… Dream 51.3 19.2 193 3650
## 4 Chinst… Dream 45.4 18.7 188 3525
## 5 Chinst… Dream 52.7 19.8 197 3725
## 6 Chinst… Dream 45.2 17.8 198 3950
## 7 Chinst… Dream 46.1 18.2 178 3250
## 8 Chinst… Dream 51.3 18.2 197 3750
## 9 Chinst… Dream 46 18.9 195 4150
## 10 Chinst… Dream 51.3 19.9 198 3700
## # … with 58 more rows, and 2 more variables: sex <fct>, year <int>
##
## [[3]]
## # A tibble: 119 x 8
## species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Gentoo Biscoe 46.1 13.2 211 4500
## 2 Gentoo Biscoe 50 16.3 230 5700
## 3 Gentoo Biscoe 48.7 14.1 210 4450
## 4 Gentoo Biscoe 50 15.2 218 5700
## 5 Gentoo Biscoe 47.6 14.5 215 5400
## 6 Gentoo Biscoe 46.5 13.5 210 4550
## 7 Gentoo Biscoe 45.4 14.6 211 4800
## 8 Gentoo Biscoe 46.7 15.3 219 5200
## 9 Gentoo Biscoe 43.3 13.4 209 4400
## 10 Gentoo Biscoe 46.8 15.4 215 5150
## # … with 109 more rows, and 2 more variables: sex <fct>, year <int>
class(object_2)
## [1] "vctrs_list_of" "vctrs_vctr" "list"
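For the plots to become a real side effect, the function passed to walk() has to draw or save them itself. The sketch below is one way to do that, under some assumptions: it uses the built-in `iris` data as a stand-in for the penguin list, and `save_scatter()` and `plot_dir` are hypothetical names introduced only for this example.

```r
library(purrr)
library(ggplot2)

# Hypothetical stand-in data: iris split into one data frame per species
pieces <- split(iris, iris$Species)

# Hypothetical output folder inside the session's temporary directory
plot_dir <- file.path(tempdir(), "walk_plots")
dir.create(plot_dir, showWarnings = FALSE)

# Build a scatter plot with a linear fit and write it to disk.
# Saving the file is the side effect; the return value is ignored by walk().
save_scatter <- function(d) {
  p <- ggplot(d, aes(x = Sepal.Width, y = Sepal.Length)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x)
  ggsave(file.path(plot_dir, paste0(unique(d$Species), ".pdf")),
         plot = p, width = 5, height = 4)
}

returned <- walk(pieces, save_scatter)

list.files(plot_dir)          # one PDF per species
identical(returned, pieces)   # the input comes back untouched
```

Wrapping the plot in print() instead of ggsave() would be the equivalent choice if you wanted the figures shown in a knitted document rather than saved.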
What does purrr::map() do with something like write.csv()?
map(.x = list_1,
    .f = function(d) {write.csv(d, file = toString(unique(d$species)))})
## [[1]]
## NULL
##
## [[2]]
## NULL
##
## [[3]]
## NULL
class(list_1)
## [1] "vctrs_list_of" "vctrs_vctr" "list"
length(list_1)
## [1] 3
object_4 <- map(.x = list_1,
                .f = function(d) {write.csv(d, file = toString(unique(d$species)))})
class(object_4)
## [1] "list"
length(object_4)
## [1] 3
map() returns a list the same length as the .x argument, but every element is NULL, because write.csv() is called only for its side effect and itself returns NULL. Not useful.
What about purrr::walk()?
walk(.x = list_1,
     .f = function(d) {write.csv(d, file = toString(unique(d$species)))})
object_3 <- walk(.x = list_1,
                 .f = function(d) {write.csv(d, file = toString(unique(d$species)))})
object_3
## <list_of<
## tbl_df<
## species : factor<600db>
## island : factor<6667f>
## bill_length_mm : double
## bill_depth_mm : double
## flipper_length_mm: integer
## body_mass_g : integer
## sex : factor<9a28d>
## year : integer
## >
## >[3]>
## [[1]]
## # A tibble: 146 x 8
## species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torge… 39.1 18.7 181 3750
## 2 Adelie Torge… 39.5 17.4 186 3800
## 3 Adelie Torge… 40.3 18 195 3250
## 4 Adelie Torge… 36.7 19.3 193 3450
## 5 Adelie Torge… 39.3 20.6 190 3650
## 6 Adelie Torge… 38.9 17.8 181 3625
## 7 Adelie Torge… 39.2 19.6 195 4675
## 8 Adelie Torge… 41.1 17.6 182 3200
## 9 Adelie Torge… 38.6 21.2 191 3800
## 10 Adelie Torge… 34.6 21.1 198 4400
## # … with 136 more rows, and 2 more variables: sex <fct>, year <int>
##
## [[2]]
## # A tibble: 68 x 8
## species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Chinst… Dream 46.5 17.9 192 3500
## 2 Chinst… Dream 50 19.5 196 3900
## 3 Chinst… Dream 51.3 19.2 193 3650
## 4 Chinst… Dream 45.4 18.7 188 3525
## 5 Chinst… Dream 52.7 19.8 197 3725
## 6 Chinst… Dream 45.2 17.8 198 3950
## 7 Chinst… Dream 46.1 18.2 178 3250
## 8 Chinst… Dream 51.3 18.2 197 3750
## 9 Chinst… Dream 46 18.9 195 4150
## 10 Chinst… Dream 51.3 19.9 198 3700
## # … with 58 more rows, and 2 more variables: sex <fct>, year <int>
##
## [[3]]
## # A tibble: 119 x 8
## species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Gentoo Biscoe 46.1 13.2 211 4500
## 2 Gentoo Biscoe 50 16.3 230 5700
## 3 Gentoo Biscoe 48.7 14.1 210 4450
## 4 Gentoo Biscoe 50 15.2 218 5700
## 5 Gentoo Biscoe 47.6 14.5 215 5400
## 6 Gentoo Biscoe 46.5 13.5 210 4550
## 7 Gentoo Biscoe 45.4 14.6 211 4800
## 8 Gentoo Biscoe 46.7 15.3 219 5200
## 9 Gentoo Biscoe 43.3 13.4 209 4400
## 10 Gentoo Biscoe 46.8 15.4 215 5150
## # … with 109 more rows, and 2 more variables: sex <fct>, year <int>
Is it helpful?
Yes, as part of a big, automated workflow, especially one that uses other programs (like FIJI!).
It’s really only the return of .x that makes walk() uniquely useful among the purrr::map() family of functions.
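Here is a rough sketch of what that looks like inside a pipeline: write each piece to disk, then keep working with the same data in the next step. It assumes the built-in `iris` data in place of the penguins, and `out_dir` is a hypothetical folder under the session's temporary directory.

```r
library(purrr)

# Hypothetical stand-in data: iris split into one data frame per species
pieces <- split(iris, iris$Species)

# Hypothetical output folder
out_dir <- file.path(tempdir(), "walk_demo")
dir.create(out_dir, showWarnings = FALSE)

row_counts <- pieces %>%
  # side effect: one CSV per species
  walk(function(d) {
    path <- file.path(out_dir, paste0(unique(d$Species), ".csv"))
    write.csv(d, file = path, row.names = FALSE)
  }) %>%
  # walk() passed the data frames through, so the pipeline can continue
  map_int(nrow)

row_counts
list.files(out_dir)
```

With map() in place of walk(), the second step would receive a list of NULLs and map_int(nrow) would fail, which is exactly the situation walk() is meant to avoid.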
knitr::include_graphics("data/cell_tracking_figure.png")