tidyr::crossing()

Function of the Week: crossing()

Claire Bradbury

Introduction

In this document, I will introduce the crossing() function and show what it’s for.

Data set

library(tidyverse)  # loads dplyr (starwars data) and tidyr (crossing())
library(knitr)      # kable()
data("starwars")
kable(head(starwars))
name height mass hair_color skin_color eye_color birth_year gender homeworld species films vehicles starships
Luke Skywalker 172 77 blond fair blue 19.0 male Tatooine Human c(“Revenge of the Sith”, “Return of the Jedi”, “The Empire Strikes Back”, “A New Hope”, “The Force Awakens”) c(“Snowspeeder”, “Imperial Speeder Bike”) c(“X-wing”, “Imperial shuttle”)
C-3PO 167 75 NA gold yellow 112.0 NA Tatooine Droid c(“Attack of the Clones”, “The Phantom Menace”, “Revenge of the Sith”, “Return of the Jedi”, “The Empire Strikes Back”, “A New Hope”) character(0) character(0)
R2-D2 96 32 NA white, blue red 33.0 NA Naboo Droid c(“Attack of the Clones”, “The Phantom Menace”, “Revenge of the Sith”, “Return of the Jedi”, “The Empire Strikes Back”, “A New Hope”, “The Force Awakens”) character(0) character(0)
Darth Vader 202 136 none white yellow 41.9 male Tatooine Human c(“Revenge of the Sith”, “Return of the Jedi”, “The Empire Strikes Back”, “A New Hope”) character(0) TIE Advanced x1
Leia Organa 150 49 brown light brown 19.0 female Alderaan Human c(“Revenge of the Sith”, “Return of the Jedi”, “The Empire Strikes Back”, “A New Hope”, “The Force Awakens”) Imperial Speeder Bike character(0)
Owen Lars 178 120 brown, grey light blue 52.0 male Tatooine Human c(“Attack of the Clones”, “Revenge of the Sith”, “A New Hope”) character(0) character(0)

For my examples I will be using the starwars data set from the dplyr package. The data come from the Star Wars API (SWAPI) at http://swapi.co/.

What is it for?

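crossing() returns every combination of the values in its inputs, with duplicates removed and the result sorted. When it is given a single data frame, as in the examples below, it returns the distinct rows of that data frame in sorted order, that is, the combinations of values that actually occur in the data.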
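Here is a small sketch of how crossing() treats a single data frame versus separate vectors. The toy tibble and its values are made up for illustration and are not part of the starwars data.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: three rows, one of them a duplicate
toy <- tibble(
  planet  = c("Tatooine", "Naboo", "Tatooine"),
  species = c("Human", "Gungan", "Human")
)

# Whole data frame as one input: distinct rows only, sorted
observed <- crossing(toy)

# Columns as separate inputs: every planet paired with every species
all_pairs <- crossing(planet = toy$planet, species = toy$species)

nrow(observed)   # 2 combinations that occur in the data
nrow(all_pairs)  # 2 planets x 2 species = 4 possible combinations
```

The second call includes pairs such as Naboo/Human that never appear in toy, which is the difference between observed and possible combinations.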

Characters’ Home World and Species

For example, if you wanted to know which species of characters come from each planet in Star Wars, you could use crossing() on the homeworld and species columns of the starwars data set.

kable(crossing(starwars[c("homeworld", "species")]))
homeworld species
Alderaan Human
Aleen Minor Aleena
Bespin Human
Bestine IV Human
Cato Neimoidia Neimodian
Cerea Cerean
Champala Chagrian
Chandrila Human
Concord Dawn Human
Corellia Human
Coruscant Human
Coruscant Tholothian
Dathomir Zabrak
Dorin Kel Dor
Endor Ewok
Eriadu Human
Geonosis Geonosian
Glee Anselm Nautolan
Haruun Kal Human
Iktotch Iktotchi
Iridonia Zabrak
Kalee Kaleesh
Kamino Human
Kamino Kaminoan
Kashyyyk Wookiee
Malastare Dug
Mirial Mirialan
Mon Cala Mon Calamari
Muunilinst Muun
Naboo Droid
Naboo Gungan
Naboo Human
Naboo NA
Nal Hutta Hutt
Ojom Besalisk
Quermia Quermian
Rodia Rodian
Ryloth Twi’lek
Serenno Human
Shili Togruta
Skako Skakoan
Socorro Human
Stewjon Human
Sullust Sullustan
Tatooine Droid
Tatooine Human
Toydaria Toydarian
Trandosha Trandoshan
Troiken Xexto
Tund Toong
Umbara NA
Utapau Pau’an
Vulpter Vulptereen
Zolan Clawdite
NA Droid
NA Human
NA Yoda’s species
NA NA

Looking at the table above, if I wanted to know which species call Tatooine home, I can see that both droids and humans do.

Characters’ Species and Home World

Another example is to list the same combinations with species first, which makes it easy to look things up by species.

kable(crossing(starwars[c("species", "homeworld")]))
species homeworld
Aleena Aleen Minor
Besalisk Ojom
Cerean Cerea
Chagrian Champala
Clawdite Zolan
Droid Naboo
Droid Tatooine
Droid NA
Dug Malastare
Ewok Endor
Geonosian Geonosis
Gungan Naboo
Human Alderaan
Human Bespin
Human Bestine IV
Human Chandrila
Human Concord Dawn
Human Corellia
Human Coruscant
Human Eriadu
Human Haruun Kal
Human Kamino
Human Naboo
Human Serenno
Human Socorro
Human Stewjon
Human Tatooine
Human NA
Hutt Nal Hutta
Iktotchi Iktotch
Kaleesh Kalee
Kaminoan Kamino
Kel Dor Dorin
Mirialan Mirial
Mon Calamari Mon Cala
Muun Muunilinst
Nautolan Glee Anselm
Neimodian Cato Neimoidia
Pau’an Utapau
Quermian Quermia
Rodian Rodia
Skakoan Skako
Sullustan Sullust
Tholothian Coruscant
Togruta Shili
Toong Tund
Toydarian Toydaria
Trandoshan Trandosha
Twi’lek Ryloth
Vulptereen Vulpter
Wookiee Kashyyyk
Xexto Troiken
Yoda’s species NA
Zabrak Dathomir
Zabrak Iridonia
NA Naboo
NA Umbara
NA NA

Notice that the returned table is sorted alphabetically by the first column (then by the second, with NA last), which makes it easy to search. Suppose I am looking for other inhabitants of Tatooine, such as Jawas, Tusken Raiders, and Hutts. The table above has no entry for Jawas or Tusken Raiders. There is at least one character of the Hutt species, but their home world is Nal Hutta.
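The same lookups can be done in code rather than by scanning the table. This is only a sketch: the object names pairs and tatooine_species are my own, and the results depend on the version of the starwars data that ships with your installed dplyr.

```r
library(dplyr)
library(tidyr)

# Distinct species/homeworld combinations, as in the table above
pairs <- crossing(starwars[c("species", "homeworld")])

# Species recorded for characters whose home world is Tatooine
tatooine_species <- pairs %>%
  filter(homeworld == "Tatooine") %>%
  pull(species)

tatooine_species

# Does a given species appear anywhere in the data?
"Jawa" %in% pairs$species
```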

Gender, Skin Color, and Species Representation

If I want a first look at how characters are represented by gender, skin color, and species, I can use crossing() again.

kable(crossing(starwars[c("species", "skin_color", "gender")]))
species skin_color gender
Aleena grey, blue male
Besalisk brown male
Cerean pale male
Chagrian blue male
Clawdite fair, green, yellow female
Droid gold NA
Droid metal none
Droid none none
Droid white, blue NA
Droid white, red NA
Dug grey, red male
Ewok brown male
Geonosian green male
Gungan green male
Gungan grey male
Gungan orange male
Human dark male
Human fair female
Human fair male
Human light female
Human light male
Human pale male
Human tan male
Human white male
Hutt green-tan, brown hermaphrodite
Iktotchi pale male
Kaleesh brown, white male
Kaminoan grey female
Kaminoan grey male
Kel Dor orange male
Mirialan yellow female
Mon Calamari brown mottle male
Muun grey male
Nautolan green male
Neimodian mottled green male
Pau’an grey male
Quermian white male
Rodian green male
Skakoan green, grey male
Sullustan grey male
Tholothian dark female
Togruta red, blue, white female
Toong grey, green, yellow male
Toydarian blue, grey male
Trandoshan green male
Twi’lek blue female
Twi’lek pale male
Vulptereen blue, grey male
Wookiee brown male
Wookiee unknown male
Xexto white, blue male
Yoda’s species green male
Zabrak brown male
Zabrak red male
NA dark male
NA fair male
NA pale female
NA silver, red female
NA unknown female

The table above only shows which combinations exist, not how many characters share each one. Still, it shows that for many species and skin colors the only characters are male. For humans, for example, female characters appear only with fair or light skin, while male characters appear with dark, fair, light, pale, tan, and white skin.

Is it helpful?

Benefits of crossing()

I think the function is useful when you want to see the combinations of a few columns, as I did for the Star Wars data set above. In this case, crossing() is most useful for looking up a particular entry or a unique combination.

crossing() can be used for preliminary analysis of data, alongside other tools, to determine which combinations of variables exist in the data set and which are missing from it.
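One way to find missing combinations is to build every possible pairing with crossing() and then remove the pairings that occur in the data. The sketch below does this for species and gender. The object names are my own, and anti_join() is just one of several ways to take the difference.

```r
library(dplyr)
library(tidyr)

# Every species/gender pairing that could occur, built from the column values
possible <- crossing(species = starwars$species, gender = starwars$gender)

# Pairings that actually occur in the data
observed <- distinct(starwars, species, gender)

# Possible pairings that no character in the data has
missing_pairs <- anti_join(possible, observed, by = c("species", "gender"))

head(missing_pairs)
```

Because every value of species is paired with every value of gender, missing_pairs can be long. It is most readable when filtered down to one species or one gender of interest.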

Downside of crossing()

However, for checking whether gender and skin color are adequately represented within each species in Star Wars, crossing() is not the best tool, because it does not say how many characters share each combination. For this purpose the group_by() and summarise() functions are preferable.

kable(starwars %>% group_by(species, skin_color, gender) %>% summarise(count = n()))
species skin_color gender count
Aleena grey, blue male 1
Besalisk brown male 1
Cerean pale male 1
Chagrian blue male 1
Clawdite fair, green, yellow female 1
Droid gold NA 1
Droid metal none 1
Droid none none 1
Droid white, blue NA 1
Droid white, red NA 1
Dug grey, red male 1
Ewok brown male 1
Geonosian green male 1
Gungan green male 1
Gungan grey male 1
Gungan orange male 1
Human dark male 4
Human fair female 3
Human fair male 13
Human light female 6
Human light male 5
Human pale male 1
Human tan male 2
Human white male 1
Hutt green-tan, brown hermaphrodite 1
Iktotchi pale male 1
Kaleesh brown, white male 1
Kaminoan grey female 1
Kaminoan grey male 1
Kel Dor orange male 1
Mirialan yellow female 2
Mon Calamari brown mottle male 1
Muun grey male 1
Nautolan green male 1
Neimodian mottled green male 1
Pau’an grey male 1
Quermian white male 1
Rodian green male 1
Skakoan green, grey male 1
Sullustan grey male 1
Tholothian dark female 1
Togruta red, blue, white female 1
Toong grey, green, yellow male 1
Toydarian blue, grey male 1
Trandoshan green male 1
Twi’lek blue female 1
Twi’lek pale male 1
Vulptereen blue, grey male 1
Wookiee brown male 1
Wookiee unknown male 1
Xexto white, blue male 1
Yoda’s species green male 1
Zabrak brown male 1
Zabrak red male 1
NA dark male 1
NA fair male 1
NA pale female 1
NA silver, red female 1
NA unknown female 1

Using group_by() and summarise() shows how many characters fall into each combination. Among humans, for instance, the table lists 26 male characters but only 9 female characters. If the goal of the analysis is statistical, it is likely more useful to have counts attached to the grouped variables.
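For a quicker comparison of gender within each species, the counts can be collapsed across skin color. The sketch below uses dplyr's count(), which is shorthand for group_by() followed by summarise(n = n()). The name rep_counts is my own.

```r
library(dplyr)

# Number of characters per species/gender pair, largest groups first
rep_counts <- starwars %>%
  count(species, gender, sort = TRUE)

head(rep_counts)
```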