View all flights that arrived after 10:00 PM. Use an intermediate variable, a nested expression, and the pipe. Which appeals more to you?
flights_after_10 <- filter(flights, ___)
View(flights_after_10)
View(filter(flights, ___))
flights %>%
  filter(___) %>%
  View()
Extend the three solutions to view all "UA" flights that arrived after 10:00 PM.
flights_after_10 <- filter(flights, ___)
ua_flights_after_10 <- ...
View(___)
View(filter(filter(flights, ___), ___))
flights %>%
  filter(___) %>%
  filter(___) %>%
  View()
Extend the three solutions to view all "UA" flights that departed before 6:00 AM and arrived after 10:00 PM.
Extend the three solutions to view all "UA" flights that departed before 6:00 AM, arrived after 10:00 PM, and had a delay of more than two hours.
Extend the three solutions to view all "UA" flights that departed before 6:00 AM, arrived after 10:00 PM, and had a delay of more than two hours, originating in one of New York City’s airports.
Extend the three solutions to view all "UA" flights that departed before 6:00 AM, arrived after 10:00 PM, and had a delay of more than two hours, originating in one of New York City’s airports but excluding flights to Honolulu International Airport.
Hint: Consult the airports dataset and use a filter with the predicate stringr::str_detect(name, "^Honolulu").
Sort the result by distance.
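The hint can be sketched as follows. This is a minimal example, assuming the `nycflights13`, `dplyr`, and `stringr` packages are installed; the names `honolulu` and `not_honolulu` are made up for illustration and are not part of the exercise.

```r
library(dplyr)
library(nycflights13)

# Look up airports whose name starts with "Honolulu"
# (the predicate is the one given in the hint)
honolulu <-
  airports %>%
  filter(stringr::str_detect(name, "^Honolulu"))

# Keep only flights whose destination is not one of those airports;
# `faa` is the airport code column in `airports`
not_honolulu <-
  flights %>%
  filter(!dest %in% honolulu$faa)
```

Looking up the code this way avoids hard-coding it; once the code is known, a direct comparison on `dest` works just as well.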
Solution:
### Intermediate variables
Naming is hard!
early_flights <- filter(flights, dep_time <= 600)
early_late_flights <-
  filter(early_flights, arr_time >= 2200)
early_late_ua_flights <-
  filter(early_late_flights, carrier == "UA")
early_late_late_ua_flights <-
  filter(early_late_ua_flights, arr_delay > 120)
early_late_late_ua_flights_not_honolulu <-
  filter(early_late_late_ua_flights, dest != "HNL")
early_late_late_ua_flights_not_honolulu_sorted <-
  arrange(
    early_late_late_ua_flights_not_honolulu,
    distance
  )
View(early_late_late_ua_flights_not_honolulu_sorted)
## # A tibble: 0 x 19
## # ... with 19 variables: year <int>, month <int>, day <int>,
## # dep_time <int>, sched_dep_time <int>, dep_delay <dbl>, arr_time <int>,
## # sched_arr_time <int>, arr_delay <dbl>, carrier <chr>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>,
## # distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
### Nested expressions
Difficult to read.
View(
  arrange(
    filter(
      filter(
        filter(
          filter(
            filter(
              flights,
              dep_time <= 600
            ),
            arr_time >= 2200
          ),
          carrier == "UA"
        ),
        arr_delay > 120
      ),
      dest != "HNL"
    ),
    distance
  )
)
## # A tibble: 0 x 19
## # ... with 19 variables: year <int>, month <int>, day <int>,
## # dep_time <int>, sched_dep_time <int>, dep_delay <dbl>, arr_time <int>,
## # sched_arr_time <int>, arr_delay <dbl>, carrier <chr>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>,
## # distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
### Pipe
flights %>%
  filter(dep_time <= 600) %>%
  filter(arr_time >= 2200) %>%
  filter(carrier == "UA") %>%
  filter(arr_delay > 120) %>%
  filter(dest != "HNL") %>%
  arrange(distance) %>%
  View()
## # A tibble: 0 x 19
## # ... with 19 variables: year <int>, month <int>, day <int>,
## # dep_time <int>, sched_dep_time <int>, dep_delay <dbl>, arr_time <int>,
## # sched_arr_time <int>, arr_delay <dbl>, carrier <chr>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>,
## # distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
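As an alternative to chaining one `filter()` call per condition, `filter()` accepts several conditions at once and keeps only rows where all of them hold. The sketch below assumes `dplyr` and `nycflights13` are installed; the names `nyc_airports` and `ua_early_late_delayed` are invented for illustration. It also spells out the origin condition from the exercise, which the solutions above leave out because every flight in `flights` departs from one of the three New York City airports.

```r
library(dplyr)
library(nycflights13)

# Airport codes of the three New York City airports in the dataset
nyc_airports <- c("EWR", "JFK", "LGA")

ua_early_late_delayed <-
  flights %>%
  filter(
    carrier == "UA",
    dep_time <= 600,           # departed by 6:00 AM
    arr_time >= 2200,          # arrived from 10:00 PM on
    arr_delay > 120,           # more than two hours late
    origin %in% nyc_airports,  # always true for this dataset
    dest != "HNL"              # not bound for Honolulu
  ) %>%
  arrange(distance)
```

Whether one call or several reads better is a matter of taste; separate calls make it easy to comment out a single step while exploring.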
The pipe never modifies the original data! To keep the result, you still need to assign it to a variable:
late_late_ua_flights_not_honolulu <-
  flights %>%
  filter(dep_time <= 600) %>%
  filter(arr_time >= 2200) %>%
  filter(carrier == "UA") %>%
  filter(arr_delay > 120) %>%
  filter(dest != "HNL") %>%
  arrange(distance)
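A small toy example may make this behavior easier to see. The data below is invented for illustration and has nothing to do with the real `flights` table; only `dplyr` is assumed to be installed.

```r
library(dplyr)

# Toy data, made up for this sketch
toy_flights <- tibble(
  carrier = c("UA", "AA", "UA"),
  arr_time = c(2230, 2245, 1800)
)

# Without an assignment, the filtered result is printed and then discarded
toy_flights %>%
  filter(carrier == "UA")

# The input still has all three rows
nrow(toy_flights)

# With an assignment, the filtered result is kept under a new name
toy_ua <-
  toy_flights %>%
  filter(carrier == "UA")

nrow(toy_ua)
```

Here `nrow(toy_flights)` stays at 3, while `nrow(toy_ua)` is 2.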
Copyright © 2018 Kirill Müller. Licensed under CC BY-NC 4.0.