Use spread() to convert table2 to table1. What is the meaning of the key and value arguments?
table2 %>%
  spread(_____)
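As a rough sketch of what the two arguments do, the example below uses a made-up toy tibble instead of table2. The names `long`, `id`, `name`, and `val` are hypothetical and not part of the exercise. The `key` column supplies the new column names, and the `value` column supplies the cell contents.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in "long" form, not part of the exercise
long <- tibble(
  id   = c(1, 1, 2, 2),
  name = c("a", "b", "a", "b"),
  val  = c(10, 20, 30, 40)
)

# key:   column whose values become the new column names
# value: column whose values fill the new columns
wide <- long %>%
  spread(key = name, value = val)

wide
```

The result has one row per `id` and one column for each distinct value of `name`.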
Use gather() to convert table1 to table2. Try an inclusive and an exclusive selection. Do you need an extra transformation to make the result fully identical? Can you reuse key and value from the previous result?
table1 %>%
  gather(_____, ___:___)
table1 %>%
  gather(_____, -___:-___)
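The difference between an inclusive and an exclusive selection can be sketched on a toy tibble. The names `wide`, `id`, `a`, `b`, `name`, and `val` are hypothetical and chosen only for illustration. An inclusive selection lists the columns to gather, while an exclusive selection lists the columns to leave alone.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in "wide" form, not part of the exercise
wide <- tibble(id = c(1, 2), a = c(10, 30), b = c(20, 40))

# Inclusive: name the range of columns to gather
incl <- wide %>%
  gather(key = "name", value = "val", a:b)

# Exclusive: name the columns that should NOT be gathered
excl <- wide %>%
  gather(key = "name", value = "val", -id)

# Both selections describe the same set of columns here
identical(as.data.frame(incl), as.data.frame(excl))
```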
Visualize the data: plot cases, population, and both. Which of table1 or table2 is more suitable in which case?
___ %>%
  ggplot(aes(___)) +
  geom_col()
___ %>%
  ggplot(aes(___)) +
  geom_col() +
  facet_grid(___ ~ ___, scales = "free")
___ %>%
  ggplot(aes(___, ___)) +
  geom_point()
Use gather() to convert table4a and table4b to table2. Can you do the same with just one gather() call?
Hint: Use bind_rows() to combine similar tibbles.
cases_tbl <-
  table4a %>%
  gather(_____) %>%
  mutate(type = "cases")
population_tbl <-
  table4b %>%
  gather(_____) %>%
  mutate(___)
bind_rows(_____) %>%
  _____ %>%
  _____
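For reference, here is a minimal sketch of how bind_rows() stacks tibbles that share the same columns. The tibbles `x` and `y` and their columns are made up for illustration.

```r
library(dplyr)

# Two hypothetical tibbles with identical column names
x <- tibble(id = 1:2, type = "x")
y <- tibble(id = 3:4, type = "y")

# Rows of y are appended below the rows of x; columns are matched by name
both <- bind_rows(x, y)

both
```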
Create a scatterplot from the mpg dataset that shows both highway and city fuel economy against engine displacement with two different colors using only one geom_point() call.
mpg %>%
  _____ %>%
  ggplot(aes(x = displ, y = ___)) +
  geom_point()
Find more exercises in Section 12.3.3 of r4ds.
Convert table3 to table1 and table2.
table3 %>%
  separate(
    ___,
    into = c("___", "___"),
    convert = TRUE
  ) %>%
  _____ %>%
  _____
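A small sketch of separate() with convert = TRUE on made-up data. The names `df`, `ratio`, `num`, and `den` are hypothetical. By default, separate() splits at any non-alphanumeric character, and convert = TRUE turns the resulting strings into numbers where possible.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two numbers stored in one character column
df <- tibble(id = c("a", "b"), ratio = c("3/10", "7/20"))

parts <- df %>%
  separate(
    ratio,
    into = c("num", "den"),
    convert = TRUE  # without this, num and den stay character
  )

parts
```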
Convert table2 to table3.
table2 %>%
  _____ %>%
  unite(
    ___,
    ___, ___,
    sep = "/"
  )
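unite() is the inverse of separate(). A minimal sketch on made-up data follows; the names `parts`, `num`, `den`, and `ratio` are hypothetical. The first argument is the name of the new column, followed by the columns to paste together.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two numeric columns to be combined
parts <- tibble(id = c("a", "b"), num = c(3L, 7L), den = c(10L, 20L))

joined <- parts %>%
  unite(
    ratio,     # name of the new column
    num, den,  # columns to combine (removed from the result by default)
    sep = "/"
  )

joined
```

The combined column is always character, even if the inputs were numeric.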
Count the flights for each relation (route) in the flights dataset, using just one grouping variable.
flights %>%
  unite(
    relation,
    ___, ___,
    sep = " -> "
  ) %>%
  count(___)
Find more exercises in Section 12.4.3 of r4ds.
How are the flights, airlines, and airports datasets connected? Which are primary keys, and which are foreign keys?
Hint: Use count() to support your hypothesis.
flights %>%
  count(carrier) %>%
  count(n)
airlines %>%
  count(_____) %>%
  _____
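The idea behind the hint can be sketched on toy data: a column can only be a primary key if every value occurs exactly once. The tibbles `lookup` and `facts` and their columns are made up for illustration and are not part of nycflights13.

```r
library(dplyr)

# Hypothetical lookup table: one row per code
lookup <- tibble(
  code  = c("A", "B", "C"),
  label = c("Alpha", "Beta", "Gamma")
)

# Hypothetical fact table: codes repeat
facts <- tibble(
  code  = c("A", "A", "B", "C", "C", "C"),
  value = 1:6
)

# Values that occur more than once rule out a primary key
dup_lookup <- lookup %>% count(code) %>% filter(n > 1)
dup_facts  <- facts %>% count(code) %>% filter(n > 1)

nrow(dup_lookup)  # 0: code is unique in lookup
nrow(dup_facts)   # 2: "A" and "C" repeat in facts
```

Here `code` could serve as a primary key of `lookup`, while in `facts` it can at most be a foreign key pointing to `lookup`.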
Plot a heat map of destination by airline for all flights shorter than 300 miles. Use explicit names for the carriers and the destinations. Does the result change if you use a full join? Do you use geom_raster() or geom_bin2d()?
Hint: Use by = c("dest" = "faa").
flights %>%
  filter(distance < 300) %>%
  count(dest, carrier) %>%
  left_join(airlines) %>%
  left_join(airports, by = c(___))
# The name of the `name` variable isn't very useful,
# need to rename it before plotting
flights %>%
  filter(distance < 300) %>%
  count(dest, carrier) %>%
  left_join(_____) %>%
  rename(___) %>%
  left_join(_____) %>%
  rename(___) %>%
  ggplot() +
  geom_raster(aes(___))
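The named `by` syntax from the hint can be sketched on toy data where the key columns have different names in the two tables. The tibbles `facts` and `lookup` and the names `code`, `id`, `name`, and `label` are hypothetical.

```r
library(dplyr)

# Hypothetical tables whose key columns are named differently
facts  <- tibble(code = c("A", "B", "Z"), value = 1:3)
lookup <- tibble(id = c("A", "B"), name = c("Alpha", "Beta"))

joined <- facts %>%
  # left-hand name refers to facts, right-hand name to lookup
  left_join(lookup, by = c("code" = "id")) %>%
  # rename() takes new_name = old_name
  rename(label = name)

joined
```

A left join keeps every row of the left table, so the unmatched code "Z" stays in the result with a missing `label`.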
Find more exercises in Section 13.4.6 of r4ds.
Find the airports that are serviced by at least one flight. Which airports did not have direct connections in 2013?
airports %>%
  semi_join(flights, by = c(_____))
airports %>%
  anti_join(flights, by = c(_____))
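As a minimal sketch of the two filtering joins, using made-up tibbles `lookup` and `facts` with hypothetical columns: semi_join() keeps the rows of the left table that have a match in the right table, and anti_join() keeps those that do not. Neither adds columns or duplicates rows.

```r
library(dplyr)

# Hypothetical tables, not part of nycflights13
lookup <- tibble(id = c("A", "B", "C"))
facts  <- tibble(code = c("A", "A", "B"))

# Rows of lookup with at least one match in facts
used <- lookup %>%
  semi_join(facts, by = c("id" = "code"))

# Rows of lookup without any match in facts
unused <- lookup %>%
  anti_join(facts, by = c("id" = "code"))

used
unused
```

Although "A" occurs twice in `facts`, it appears only once in `used`.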
Find more exercises in Section 13.5.1 of r4ds.
Copyright © 2018 Kirill Müller. Licensed under CC BY-NC 4.0.