Spreading and gathering

  1. Use spread() to convert table2 to table1. What is the meaning of the key and value arguments?

    table2 %>%
      spread(_____)
  2. Use gather() to convert table1 to table2. Try an inclusive and an exclusive selection. Do you need an extra transformation to make the result fully identical? Can you reuse key and value from the previous result?

    table1 %>%
      gather(_____, ___:___)
    
    table1 %>%
      gather(_____, -___:-___)
  3. Visualize the data: plot cases, population, and both. Which of table1 or table2 is more suitable in which case?

    ___ %>%
      ggplot(aes(___)) +
      geom_col()
    
    ___ %>%
      ggplot(aes(___)) +
      geom_col() +
      facet_grid(___ ~ ___, scales = "free")
    
    ___ %>%
      ggplot(aes(___, ___)) +
      geom_point()
  4. Use gather() to convert table4a and table4b to table2. Can you do the same with just one gather() call?

    Hint: Use bind_rows() to combine similar tibbles.

    cases_tbl <-
      table4a %>%
      gather(_____) %>%
      mutate(type = "cases")
    
    population_tbl <-
      table4a %>%
      gather(_____) %>%
      mutate(___)
    
    bind_rows(_____) %>%
      _____ %>%
      _____
  5. Create a scatterplot from the mpg dataset that shows both highway and city fuel economy against engine displacement with two different colors using only one geom_point() call.

    mpg %>%
      _____ %>%
      ggplot(aes(x = displ, y = ___)) +
      geom_point()
  6. Find more exercises in Section 12.3.3 of r4ds.

Separating and uniting

  1. Convert table3 to table1 and table2.

    table3 %>%
      separate(
        ___,
        into = c("___", "___"),
        convert = TRUE
      ) %>%
      _____ %>%
      _____
  2. Convert table2 to table3.

    table2 %>%
      _____ %>%
      unite(
        ___,
        ___, ___,
        sep = "/"
      )
  3. Count the flights for each relation in the flights dataset, using just one grouping variable.

    flights %>%
      unite(
        relation,
        ___, ___,
        sep = " -> "
      ) %>%
      count(___)
  4. Find more exercises in Section 12.4.3 of r4ds.

Keys and mutating joins

  1. How are the flights, carriers, and airports datasets connected? Which are primary, which are foreign keys?

    Hint: Use count() to support your hypothesis.

    flights %>%
      count(carrier) %>%
      count(n)
    
    airlines %>%
      count(_____) %>%
      _____
  2. Plot a heat map of destination by airline for all flights shorter than 300 miles. Use explicit names for the carriers and the destinations. Does the result change if you use a full join? Do you use geom_raster() or geom_bin2d()?

    Hint: Use by = c("dest" = "faa").

    flights %>% 
      filter(distance < 300) %>%
      count(dest, carrier) %>%
      left_join(airlines) %>%
      left_join(airports, by = c(___))
    
    # The name of the `name` variable isn't very useful,
    # need to rename it before plotting
    flights %>% 
      filter(distance < 300) %>%
      count(dest, carrier) %>%
      left_join(_____) %>%
      rename(___) %>%
      left_join(_____) %>%
      rename(___) %>% 
      ggplot() +
      geom_raster(aes(___))
  3. Find more exercises in Section 13.4.6 of r4ds.

Filtering joins

  1. Find the airports that are serviced by at least one flight. Which airports did not have direct connections in 2013?

    airports %>%
      semi_join(flights, by = c(_____))
    
    airports %>%
      anti_join(flights, by = c(_____))
  2. Find more exercises in Section 13.5.1 of r4ds.

Copyright © 2018 Kirill Müller. Licensed under CC BY-NC 4.0.