purrr::pluck()

Function of the Week:

Pluck

In this document, I will introduce the ‘pluck’ function and show what it’s for.

What is it for?

If you’ve ever worked with lists in R, you know that the syntax for reaching nested elements is somewhat counterintuitive. Instead of chaining brackets like list[[1]][[2]] to recover specific elements, we can use purrr’s pluck function, which takes the list followed by the indices as ordinary arguments.

Here is an example list, indexed first with the bracket syntax:

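library(purrr)

# Input vectors, reconstructed from the printed output below
movies <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi",
            "Phantom Menace", "Attack of the Clones", "Revenge of the Sith",
            "The Force Awakens", "The Last Jedi", "Rise of Skywalker")
years <- c(1977, 1980, 1983, 1999, 2002, 2005, 2015, 2017, 2019)
preferences <- c(2, 8, 1, 7, 4, 5, 6, 9)

example <- list(movies, years, preferences)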
example
## [[1]]
## [1] "A New Hope"              "The Empire Strikes Back"
## [3] "Return of the Jedi"      "Phantom Menace"         
## [5] "Attack of the Clones"    "Revenge of the Sith"    
## [7] "The Force Awakens"       "The Last Jedi"          
## [9] "Rise of Skywalker"      
## 
## [[2]]
## [1] 1977 1980 1983 1999 2002 2005 2015 2017 2019
## 
## [[3]]
## [1] 2 8 1 7 4 5 6 9
example[[1]][[5]]
## [1] "Attack of the Clones"
example[[2]][[5]]
## [1] 2002
example[[3]][[5]]
## [1] 4

Pluck offers a more logical, easier-to-read approach to retrieving list elements:

pluck(example, 1)
## [1] "A New Hope"              "The Empire Strikes Back"
## [3] "Return of the Jedi"      "Phantom Menace"         
## [5] "Attack of the Clones"    "Revenge of the Sith"    
## [7] "The Force Awakens"       "The Last Jedi"          
## [9] "Rise of Skywalker"
pluck(example, 1, 5)
## [1] "Attack of the Clones"
pluck(example, 2, 5)
## [1] 2002
pluck(example, 3, 5)
## [1] 4
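
Pluck also accepts names, and names and positions can be mixed in one call. The sketch below assumes a small named list, saga, invented here purely for illustration; it is not part of the example above.

```r
library(purrr)

# Hypothetical nested list, made up for this illustration
saga <- list(
  original = list(
    titles = c("A New Hope", "The Empire Strikes Back", "Return of the Jedi"),
    years  = c(1977, 1980, 1983)
  ),
  prequel = list(
    titles = c("Phantom Menace", "Attack of the Clones", "Revenge of the Sith"),
    years  = c(1999, 2002, 2005)
  )
)

# Two names followed by a position, read left to right
by_pluck <- pluck(saga, "prequel", "titles", 2)

# The same lookup with chained brackets
by_bracket <- saga[["prequel"]][["titles"]][[2]]

by_pluck
identical(by_pluck, by_bracket)
```

Both lookups should return the same element; the pluck version simply lists the path as arguments.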

By default, pluck consistently returns NULL when an element does not exist:

pluck(example, 4)
## NULL
# The bracket version below would instead raise an error
# ("subscript out of bounds") because element 4 does not exist:
# example[[4]]
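
If NULL is not a convenient placeholder, pluck has a .default argument that supplies the value returned for a missing element. A minimal sketch, using a short stand-in list rather than the full example above:

```r
library(purrr)

# Stand-in list with only two elements (illustrative)
short_list <- list(c("a", "b"), c(1, 2))

# Missing top-level element: NULL unless .default is given
missing_null <- pluck(short_list, 4)
missing_na   <- pluck(short_list, 4, .default = NA)

# .default also applies when a deeper index is out of range
too_deep <- pluck(short_list, 1, 10, .default = "not found")

missing_null
missing_na
too_deep
```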

The ‘chuck’ function is essentially the same as ‘pluck’, except that ‘chuck’ raises an error when an element does not exist.
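
The difference can be seen side by side in a short sketch. The list and the message string in the error handler are illustrative choices, not part of purrr:

```r
library(purrr)

# Stand-in list with only two elements (illustrative)
short_list <- list(c("a", "b"), c(1, 2))

# For elements that exist, chuck behaves like pluck
found <- chuck(short_list, 1, 2)

# For a missing element, pluck returns NULL while chuck signals an error;
# tryCatch turns that error into a value so the script keeps running
plucked <- pluck(short_list, 4)
chucked <- tryCatch(
  chuck(short_list, 4),
  error = function(e) "chuck() raised an error"
)

found
plucked
chucked
```

Chuck is the safer choice when a missing element means something has gone wrong and should not pass silently.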

When the map function is given a position or a name instead of a function, it extracts that element from every item of the list, just as pluck would:

map(example, 2)
## [[1]]
## [1] "The Empire Strikes Back"
## 
## [[2]]
## [1] 1980
## 
## [[3]]
## [1] 8

This is equivalent to:

map(example, pluck, 2)
## [[1]]
## [1] "The Empire Strikes Back"
## 
## [[2]]
## [1] 1980
## 
## [[3]]
## [1] 8
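
The same shortcut works with names and with the typed variants of map. The sketch below assumes a list of records, films, invented for illustration:

```r
library(purrr)

# Hypothetical list of records, one per film
films <- list(
  list(title = "A New Hope", year = 1977),
  list(title = "The Empire Strikes Back", year = 1980),
  list(title = "Return of the Jedi", year = 1983)
)

# Extract one field from every record; map_chr and map_dbl
# return atomic vectors instead of lists
titles <- map_chr(films, "title")
years  <- map_dbl(films, "year")

# Spelling out the call to pluck gives the same result
titles_explicit <- map_chr(films, pluck, "title")

titles
years
identical(titles, titles_explicit)
```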

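You can also assign a value to a pluck location by putting pluck on the left-hand side of <-. Note that assigning a character string into a numeric vector, as done here, converts the whole vector to character: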

pluck(example, 2, 2) <- "1996"

pluck(example, 2, 2)
## [1] "1996"
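
The type of the assigned value matters when the target is an atomic vector. A small sketch of the two cases, using hypothetical lists named numeric_years and character_years:

```r
library(purrr)

# Two identical stand-in lists (illustrative)
numeric_years   <- list(release = c(1977, 1980, 1983))
character_years <- list(release = c(1977, 1980, 1983))

# Assigning a number keeps the vector numeric
pluck(numeric_years, "release", 2) <- 1996

# Assigning a string coerces every element of the vector to character
pluck(character_years, "release", 2) <- "1996"

class(pluck(numeric_years, "release"))
class(pluck(character_years, "release"))
```

If the years are needed for arithmetic or comparisons later on, assigning the number 1996 rather than the string "1996" avoids the coercion.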

Pluck supports accessor functions:

my_function <- function(x) x[[1]][[2]]

my_function(example)
## [1] "The Empire Strikes Back"

The accessor can then be passed to pluck:

pluck(example, my_function)
## [1] "The Empire Strikes Back"

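Common R functions can be used as accessors too. The result below is the string "2019" rather than a number because the earlier assignment turned the years into character values: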

pluck(example, 2, max)
## [1] "2019"
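
Positions and accessor functions can be combined in a single call: each argument is applied to the result of the one before it. A sketch with a shortened stand-in for the example list, where the anonymous function is just an illustration of a custom accessor:

```r
library(purrr)

# Shortened stand-in for the example list (illustrative)
trilogy <- list(
  c("A New Hope", "The Empire Strikes Back", "Return of the Jedi"),
  c(1977, 1980, 1983)
)

# Take element 2, then apply max to it
newest <- pluck(trilogy, 2, max)

# Take element 1, then apply a custom accessor returning the last item
last_title <- pluck(trilogy, 1, function(x) x[[length(x)]])

# Take element 1, then its 2nd item, then count the characters
n_chars <- pluck(trilogy, 1, 2, nchar)

newest
last_title
n_chars
```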

The pluck function is not restricted to plain lists. It also works on data frames, which are lists of columns:

glimpse(attacks_df)
## Rows: 25,723
## Columns: 24
## $ Case.Number            <chr> "2018.06.25", "2018.06.18", "2018.06.09", "201…
## $ Date                   <chr> "25-Jun-2018", "18-Jun-2018", "09-Jun-2018", "…
## $ Year                   <dbl> 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018…
## $ Type                   <chr> "Boating", "Unprovoked", "Invalid", "Unprovoke…
## $ Country                <chr> "USA", "USA", "USA", "AUSTRALIA", "MEXICO", "A…
## $ Area                   <chr> "California", "Georgia", "Hawaii", "New South …
## $ Location               <chr> "Oceanside, San Diego County", "St. Simon Isla…
## $ Activity               <chr> "Paddling", "Standing", "Surfing", "Surfing", …
## $ Name                   <chr> "Julie Wolfe", "Adyson\xa0McNeely ", "John Den…
## $ Sex                    <chr> "F", "F", "M", "M", "M", "M", "M", "M", "M", "…
## $ Age                    <chr> "57", "11", "48", "", "", "", "18", "52", "15"…
## $ Injury                 <chr> "No injury to occupant, outrigger canoe and pa…
## $ Fatal..Y.N.            <chr> "N", "N", "N", "N", "N", "N", "Y", "N", "N", "…
## $ Time                   <chr> "18h00", "14h00  -15h00", "07h45", "", "", "",…
## $ Species                <chr> "White shark", "", "", "2 m shark", "Tiger sha…
## $ Investigator.or.Source <chr> "R. Collier, GSAF", "K.McMurray, TrackingShark…
## $ pdf                    <chr> "2018.06.25-Wolfe.pdf", "2018.06.18-McNeely.pd…
## $ href.formula           <chr> "http://sharkattackfile.net/spreadsheets/pdf_d…
## $ href                   <chr> "http://sharkattackfile.net/spreadsheets/pdf_d…
## $ Case.Number.1          <chr> "2018.06.25", "2018.06.18", "2018.06.09", "201…
## $ Case.Number.2          <chr> "2018.06.25", "2018.06.18", "2018.06.09", "201…
## $ original.order         <int> 6303, 6302, 6301, 6300, 6299, 6298, 6297, 6296…
## $ X                      <chr> "", "", "", "", "", "", "", "", "", "", "", ""…
## $ X.1                    <chr> "", "", "", "", "", "", "", "", "", "", "", ""…

What species of shark was involved in a specific case?

# find the row index for the case number
idx <- which(attacks_df$Case.Number == '2018.06.04')

# pluck the corresponding 'Species' element by column name...
species <- pluck(attacks_df, 'Species', idx)

# ...or by column position (Species is column 15)
# species <- pluck(attacks_df, 15, idx)
species
## [1] "Tiger shark, 3m"
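
Because attacks_df is not bundled with this document, the same lookup can be tried on a tiny stand-in. The data frame below, attacks_small, and its case numbers are made up for illustration and do not come from the real data set:

```r
library(purrr)

# Hypothetical miniature data frame; values are invented
attacks_small <- data.frame(
  Case.Number = c("A-001", "A-002", "A-003"),
  Species     = c("White shark", "Tiger shark", "Bull shark"),
  stringsAsFactors = FALSE
)

# Row index of the case of interest
idx <- which(attacks_small$Case.Number == "A-002")

# Column by name, then row by position
by_name <- pluck(attacks_small, "Species", idx)

# Column by position, then row by position
by_position <- pluck(attacks_small, 2, idx)

by_name
identical(by_name, by_position)
```

Selecting the column by name is usually more robust than selecting by position, since it keeps working if the columns are reordered.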

Is it helpful?

I think that pluck is very helpful because it provides a consistent way to retrieve specific elements from a data structure, returning NULL instead of an error when an element is missing. When lists are nested, a pluck call is also more logical and easier to read than the equivalent chain of brackets. Pluck becomes especially powerful in combination with the map function, because you can retrieve the same nested element from many lists at once.