Announcement

rm(list = ls()) # clean-up workspace
library("tidyverse")
## ── Attaching packages ───────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.1     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ──────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Graphics for communications | r4ds chapter 28

Label

labs()

Title

  • Figure title should be descriptive:

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth(se = FALSE) +
      labs(title = "Fuel efficiency generally decreases with engine size")

Subtitle and caption

  • subtitle adds additional detail in a smaller font beneath the title.

  • caption adds text at the bottom right of the plot, often used to describe the source of the data.

    ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth(se = FALSE) + 
      labs(
        title = "Fuel efficiency generally decreases with engine size",
        subtitle = "Two seaters (sports cars) are an exception because of their light weight",
        caption = "Data from fueleconomy.gov"
      )

Axis labels

  • ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(colour = class)) +
      geom_smooth(se = FALSE) +
      labs(
        x = "Engine displacement (L)",
        y = "Highway fuel economy (mpg)"
      )

Math equations

  • read about available options in ?plotmath

    df <- tibble(x = runif(10), y = runif(10))
    ggplot(df, aes(x, y)) + geom_point() +
      labs(
        x = quote(sum(x[i] ^ 2, i == 1, n)),
        y = quote(alpha + beta + frac(delta, theta))
      )

Annotations

  • Find the most fuel efficient car in each car class:

    best_in_class <- mpg %>%
      group_by(class) %>%
      filter(row_number(desc(hwy)) == 1)
    
    # equivalent to
    # best_in_class <- filter(group_by(mpg, class), row_number(desc(hwy)) == 1)
    best_in_class
    ## # A tibble: 7 x 11
    ## # Groups:   class [7]
    ##   manufacturer model     displ  year   cyl trans  drv     cty   hwy fl    class 
    ##   <chr>        <chr>     <dbl> <int> <int> <chr>  <chr> <int> <int> <chr> <chr> 
    ## 1 chevrolet    corvette    5.7  1999     8 manua… r        16    26 p     2seat…
    ## 2 dodge        caravan …   2.4  1999     4 auto(… f        18    24 r     miniv…
    ## 3 nissan       altima      2.5  2008     4 manua… f        23    32 r     midsi…
    ## 4 subaru       forester…   2.5  2008     4 manua… 4        20    27 r     suv   
    ## 5 toyota       toyota t…   2.7  2008     4 manua… 4        17    22 r     pickup
    ## 6 volkswagen   jetta       1.9  1999     4 manua… f        33    44 d     compa…
    ## 7 volkswagen   new beet…   1.9  1999     4 manua… f        35    44 d     subco…

  • dplyr::desc() transforms a vector so that it sorts in descending order; row_number(desc(hwy)) == 1 therefore keeps the single highest-hwy row in each class

  • dplyr::filter() subsets a data frame, retaining all rows that satisfy your conditions

  • geom_label() draws a rectangle behind the text:

    ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(colour = class)) +
      geom_label(aes(label = model), data = best_in_class, nudge_y = 2, alpha = 0.5)


  • The ggrepel package automatically adjusts labels so that they don’t overlap:

    library("ggrepel")
    ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(colour = class)) +
      geom_point(size = 3, shape = 1, data = best_in_class) +
      ggrepel::geom_label_repel(aes(label = model), data = best_in_class)
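
To see why row_number(desc(hwy)) == 1 keeps exactly one car per class, it can help to run the two helpers on a small vector. The sketch below uses a made-up vector, hwy_demo, which is an assumption for illustration and not part of mpg. Ties are broken by position, so only the first of two equally efficient cars would be kept.

```r
library(dplyr)

# Hypothetical toy vector standing in for one group's hwy values
hwy_demo <- c(26, 44, 32, 44)

# Rank the values in descending order; ties are broken by order of appearance
ranks <- row_number(desc(hwy_demo))
ranks

# The condition used inside filter(): TRUE only for the single top-ranked element
ranks == 1
```

If all tied rows should be kept instead, a comparison such as hwy == max(hwy) inside filter() is one alternative.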

Scales

  • ggplot2 automatically adds default scales, so

    ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(colour = class))

    is equivalent to

    ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(colour = class)) +
      scale_x_continuous() +
      scale_y_continuous() +
      scale_colour_discrete()


  • breaks controls the position of the axis ticks:

    ggplot(mpg, aes(displ, hwy)) +
      geom_point() +
      scale_y_continuous(breaks = seq(15, 40, by = 5))


  • labels controls the text shown at each tick; labels = NULL suppresses the labels entirely:

    ggplot(mpg, aes(displ, hwy)) +
      geom_point() +
      scale_x_continuous(labels = NULL) +
      scale_y_continuous(labels = NULL)


  • Plot the y-axis on a log scale:

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point() +
      scale_y_log10()


  • Plot the x-axis in reverse order:

    ggplot(mpg, aes(x = displ, y = hwy)) +
      geom_point() +
      scale_x_reverse()


  • Use scale_colour_manual() to supply a predefined mapping between values and colours:

    presidential %>%
      mutate(id = 33 + row_number()) %>%
      ggplot(aes(start, id, colour = party)) +
        geom_point() +
        geom_segment(aes(xend = end, yend = id)) +
        scale_colour_manual(values = c(Republican = "red", Democratic = "blue"))


  • Use scale_colour_gradient() or scale_fill_gradient() for continuous colour

  • viridis::scale_colour_viridis() and viridis::scale_fill_viridis() apply the viridis palette:

    df <- tibble(
      x = rnorm(10000),
      y = rnorm(10000)
    )
    ggplot(df, aes(x, y)) +
      geom_hex() +
      coord_fixed()

    ggplot(df, aes(x, y)) +
      geom_hex() +
      viridis::scale_fill_viridis() +
      coord_fixed()


All colour scales come in two varieties:

  • scale_colour_x() for colour aesthetics

  • scale_fill_x() for fill aesthetics
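
The continuous gradient scales are mentioned above without an example. A minimal sketch of the colour and fill variants follows; the choice of variables (cty from mpg, density from ggplot2's faithfuld dataset) and the colour names are assumptions picked for illustration.

```r
library(ggplot2)

# Colour variant: map a continuous variable (city mileage) to point colour
p_colour <- ggplot(mpg, aes(displ, hwy, colour = cty)) +
  geom_point() +
  scale_colour_gradient(low = "lightblue", high = "darkblue")

# Fill variant: applies to geoms that use the fill aesthetic, such as rasters
p_fill <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster() +
  scale_fill_gradient(low = "white", high = "darkred")

p_colour
p_fill
```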

Legends

  • Set legend position with "left", "right", "top", "bottom", or "none":

    ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(colour = class)) + 
      theme(legend.position = "left")
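
Beyond its position, the legend's layout can be adjusted with guides(). The sketch below is one possible combination, assuming a bottom legend reads better in a single row with larger keys; the specific values are illustrative only.

```r
library(ggplot2)

p_legend <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  theme(legend.position = "bottom") +
  # Lay the legend keys out in one row and enlarge the key symbols
  guides(colour = guide_legend(nrow = 1, override.aes = list(size = 4)))

p_legend
```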


Zooming

  • Without clipping (keeps unseen data points; only the visible region changes):

    ggplot(mpg, mapping = aes(displ, hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth() +
      coord_cartesian(xlim = c(5, 7), ylim = c(10, 30))


  • With clipping (removes unseen data points)

    ggplot(mpg, mapping = aes(displ, hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth() +
      xlim(5, 7) + ylim(10, 30)

    same as

    mpg %>%
      filter(displ >= 5, displ <= 7, hwy >= 10, hwy <= 30) %>%
      ggplot(aes(displ, hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth()


  • Setting limits on the individual scales clips the data in the same way as xlim() and ylim():

    ggplot(mpg, mapping = aes(displ, hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth() +
      scale_x_continuous(limits = c(5, 7)) +
      scale_y_continuous(limits = c(10, 30))
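
The practical difference between the two approaches is how many observations geom_smooth() gets to see. As a rough check, the sketch below counts the rows that fall inside the zoom window. With coord_cartesian() the smoother is still fitted to every row of mpg, whereas with xlim()/ylim() or scale limits it is fitted only to the rows inside the window. The names in_window, n_all, and n_window are introduced here for illustration.

```r
library(ggplot2)
library(dplyr)

# Rows that survive clipping to the window used in the plots above
in_window <- mpg %>%
  filter(displ >= 5, displ <= 7, hwy >= 10, hwy <= 30)

n_all <- nrow(mpg)           # rows used by geom_smooth() under coord_cartesian()
n_window <- nrow(in_window)  # rows used by geom_smooth() under xlim()/ylim()

c(all = n_all, window = n_window)
```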

Themes

  • ggplot(mpg, aes(displ, hwy)) +
      geom_point(aes(color = class)) +
      geom_smooth(se = FALSE) +
      theme_bw()

Saving plots

ggplot(mpg, aes(displ, hwy)) + geom_point()

ggsave("my-plot.pdf")
## Saving 7 x 5 in image
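
By default ggsave() saves the last plot displayed, at the size of the current graphics device. For reproducible output it may be safer to pass the plot and its dimensions explicitly. A minimal sketch, assuming a 6 x 4 inch figure and a temporary output path (both chosen for illustration):

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Hypothetical output path in a temporary directory
out_file <- file.path(tempdir(), "my-plot.pdf")

# Pass the plot explicitly and fix the size in inches
ggsave(out_file, plot = p, width = 6, height = 4)

file.exists(out_file)
```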

Cheat sheet

The RStudio cheat sheet for ggplot2 is extremely helpful.