Reverse-engineering, theory

How could the authors of the 2016 WHO TB report have created the following plots? Assume that each plot is based on one or more suitably prepared datasets, i.e., that the data have been transformed in advance to support this particular plot. Answer the following questions for each plot:

  1. What layers (geoms) are used?
  2. Which variables are mapped to which aesthetics?
  3. Can you identify manual aesthetics (i.e., aesthetics that are constant across all observations)?
  4. What statistical transformations, if any, have been applied?
  5. What positional adjustments, if any, have been applied?
  6. If an image contains more than one graph, explain the mechanism.
  7. What does each observation in the plotting dataset represent?
  8. Do you notice details about the plot which you can’t explain yet?
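
As a point of reference for questions 1 to 6, the following minimal sketch shows where each concept appears in ggplot2 code. It uses the `mpg` dataset that ships with ggplot2, not the WHO data, and the choice of variables is an assumption made purely for illustration; it is not one of the plots from the report.

```r
library(ggplot2)

# Illustrative only: mpg ships with ggplot2 and is unrelated to the WHO report.
p <- ggplot(data = mpg, mapping = aes(x = class)) +
  # One layer with a bar geom.
  # `fill = drv` is a mapped aesthetic (it varies with the data);
  # `color = "black"` is a manual aesthetic (constant for all observations).
  geom_bar(
    mapping = aes(fill = drv),
    color = "black",
    stat = "count",     # statistical transformation: count rows per group
    position = "dodge"  # positional adjustment: bars side by side, not stacked
  ) +
  # Several graphs in one image: one panel per value of `year`.
  facet_wrap(~year)

p
```

Here each observation in the plotting dataset is one car model, and the bar heights are computed by the stat rather than stored in the data. A dataset that already contains the bar heights would use a different stat, which is one of the things to look for in the plots below.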

Plot 1

Plot 2

Plot 3

Plot 4

Plot 5

Plot 6

Plot 7

Plot 8

Plot 9

Plot 10

Plot 11

Plot 12

Plot 13

Plot 14

Plot 15

Plot 16

Plot 17

Plot 18

Plot 19

Plot 20

Plot 21

Reverse-engineering, practice

Recreate the following plots by filling in the dots in the corresponding code snippets. Use the high_impact_stats dataset from the gfdata package. What is the purpose of the predefined scale_...() and theme() calls?

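library(ggplot2)  # provides ggplot(), geoms, scales, and theme()
library(gfdata)   # provides the high_impact_stats dataset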
ggplot(
  data = ...,
  mapping = aes(
    x = ordered(...),
    y = ...
  )
) +
  geom_...() +
  scale_y_continuous(labels = scales::percent, limits = c(0, NA)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
## Warning: Removed 39 rows containing non-finite values (stat_boxplot).

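library(ggplot2)  # provides ggplot(), geoms, facets, and scales
library(gfdata)   # provides the high_impact_stats dataset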
ggplot(...) +
  geom_...(
    ...,
    fill = "grey"
  ) +
  geom_line(
    ...,
    color = "green"
  ) +
  facet_wrap(~..., scales = ..., ncol = 3) +
  scale_y_continuous(limits = c(0, NA))

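library(dplyr)  # provides %>%, select(), and starts_with()
library(tidyr)  # provides gather()

# Reshape to long format: one row per country, year, and TB indicator
high_impact_stats_tb_long <-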
  high_impact_stats %>%
  select(
    iso3, country, five_regions, year,
    tb_new_cases_number, tb_deaths_number
  ) %>%
  gather(indicator, number, starts_with("tb_"))

ggplot(
  data = high_impact_stats_tb_long,
  mapping = aes()
) +
  geom_...() +
  facet_...(...) +
  scale_y_continuous(limits = c(0, NA))
## Warning: Removed 2 rows containing missing values (geom_path).
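
To see what the `gather()` step in the last snippet does, here is a minimal sketch on a toy table. The values are made up and `toy_wide` and `toy_long` are hypothetical names; only the `tb_*` column names mirror the snippet above.

```r
library(dplyr)
library(tidyr)

# Made-up toy values; only the column names mirror the exercise snippet.
toy_wide <- tibble(
  country = c("A", "B"),
  year = c(2015, 2015),
  tb_new_cases_number = c(100, 200),
  tb_deaths_number = c(10, 20)
)

# Each tb_* column is stacked into rows: the former column name goes into
# `indicator`, the cell value into `number`. The other columns are repeated.
toy_long <- toy_wide %>%
  gather(indicator, number, starts_with("tb_"))

toy_long
```

The result has one row per combination of country, year, and indicator, so the indicator becomes an ordinary variable in the long table. In current tidyr versions, `gather()` is superseded by `pivot_longer()`, which expresses the same reshaping with the `cols`, `names_to`, and `values_to` arguments.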

Copyright © 2017 Kirill Müller. Licensed under CC BY-NC 4.0.