Plot many ggplot diagrams using nest() and map()

At times, it is helpful to plot multiple related diagrams, such as a scatter plot for each subgroup. As always, there are a number of ways of doing so in R. Here, we will make use of ggplot2.

library(tidyverse)
library(glue)
data(mtcars)
d <- mtcars %>% 
  rownames_to_column(var = "car_names")

Is d a tibble?

is_tibble(d)
#> [1] FALSE

What is it?

class(d)
#> [1] "data.frame"

Okay, let’s make a tibble out of it:

d <- as_tibble(d)
class(d)
#> [1] "tbl_df"     "tbl"        "data.frame"

Way 1: using facets

One simple way is to plot several facets according to the grouping variable:

d %>% 
  ggplot() +
  aes(x = hp, y = mpg) +
  geom_point() +
  facet_wrap(~ cyl)
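
By default, the facet strips only show the bare group value (4, 6, 8). As a small sketch, assuming we want the variable name in the strip as well, ggplot2's label_both labeller can be handed to facet_wrap(). The object name p_facets is made up for this example:

```r
library(tidyverse)

# Same data preparation as above
d <- mtcars %>%
  rownames_to_column(var = "car_names") %>%
  as_tibble()

# label_both shows "cyl: 4" rather than just "4" in each facet strip
p_facets <- d %>%
  ggplot() +
  aes(x = hp, y = mpg) +
  geom_point() +
  facet_wrap(~ cyl, labeller = label_both)

p_facets
```

This yields one plot object with three panels, whereas the approaches below yield one plot object per group.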

Way 2: using nest() and map2()

First, we nest the data frame:

d_nested <- 
  d %>% 
  group_by(cyl) %>% 
  nest()

d_nested
#> # A tibble: 3 × 2
#> # Groups:   cyl [3]
#>     cyl data              
#>   <dbl> <list>            
#> 1     6 <tibble [7 × 11]> 
#> 2     4 <tibble [11 × 11]>
#> 3     8 <tibble [14 × 11]>

Note that the column data holds the mtcars data, broken down by group, i.e., one tibble per value of cyl. That’s why we have three rows.
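
To see what is stored inside the list column, one of the nested tibbles can be pulled out. This is just a sketch for inspection; the name d_cyl4 is made up for this example:

```r
library(tidyverse)

# Rebuild the nested data frame from above
d_nested <- mtcars %>%
  rownames_to_column(var = "car_names") %>%
  as_tibble() %>%
  group_by(cyl) %>%
  nest()

# Extract the nested tibble of the four-cylinder cars
d_cyl4 <- d_nested %>%
  filter(cyl == 4) %>%
  pull(data) %>%
  pluck(1)

dim(d_cyl4)
#> [1] 11 11

# The grouping variable is not part of the nested tibble
"cyl" %in% names(d_cyl4)
#> [1] FALSE

# unnest() reverses the nesting and gives back all 32 rows
d_nested %>% 
  unnest(data) %>% 
  ungroup()
```

The grouping variable lives only in the outer data frame, which matters for the next step.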

Second, we map the data to ggplot:

d_plots <- 
  d_nested %>% 
  mutate(plot = map2(
    data, cyl, 
    ~ ggplot(data = .x, aes(x = hp, y = mpg)) +
      ggtitle(glue("Number of cylinders: {.y}")) +
      geom_point()))

d_plots
#> # A tibble: 3 × 3
#> # Groups:   cyl [3]
#>     cyl data               plot  
#>   <dbl> <list>             <list>
#> 1     6 <tibble [7 × 11]>  <gg>  
#> 2     4 <tibble [11 × 11]> <gg>  
#> 3     8 <tibble [14 × 11]> <gg>

Finally, we print it:

print(d_plots$plot) 
#> [[1]]
#> 
#> [[2]]
#> 
#> [[3]]

Note that we need map2() because the nested data (column data) no longer contains the grouping variable, i.e., the number of cylinders. Hence, we need to hand over a second vector with the cylinder information. With one input vector, map() suffices; with two input vectors, we need map2().
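
The difference between the two functions is perhaps easiest to see on plain vectors. A minimal sketch, using the group sizes from the output above as toy input (the names n_cars and labels are made up for this example):

```r
library(purrr)

cyl <- c(6, 4, 8)
n_cars <- c(7, 11, 14)

# map_*(): one input vector, available as .x
map_chr(cyl, ~ paste("Cylinders:", .x))

# map2_*(): two input vectors of equal length, walked through in parallel
# as .x and .y
labels <- map2_chr(cyl, n_cars, ~ paste0(.x, " cylinders: ", .y, " cars"))
labels
#> [1] "6 cylinders: 7 cars"  "4 cylinders: 11 cars" "8 cylinders: 14 cars"
```

In the plotting code, .x plays the role of the nested tibble and .y the role of the cylinder number.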

Way 3: using the superseded do()

There are of course other ways to achieve what we just explored. For example, good ol’ for-loops are out there. However, here we can make use of R’s beautiful vectorization capabilities. In addition, dplyr::do() is a similar way to map list elements to a function. However, this function will probably get deprecated; since dplyr 1.0.0 it is marked as superseded.

In addition, nest() has the advantage that the (processed) plotting data is nicely stored in a data frame.
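
For comparison, the for-loop route mentioned above might look roughly like this. It is only a sketch; the names cyl_values, d_sub and plots_loop are made up for this example:

```r
library(tidyverse)
library(glue)

d <- mtcars %>%
  rownames_to_column(var = "car_names") %>%
  as_tibble()

# One plot per distinct number of cylinders
cyl_values <- sort(unique(d$cyl))
plots_loop <- vector("list", length(cyl_values))

for (i in seq_along(cyl_values)) {
  # Subset the data by hand, which nest() did for us before
  d_sub <- filter(d, cyl == cyl_values[i])
  plots_loop[[i]] <- ggplot(d_sub, aes(x = hp, y = mpg)) +
    geom_point() +
    ggtitle(glue("Number of cylinders: {cyl_values[i]}"))
}

# Name the list elements so that each plot can be found by its group
names(plots_loop) <- cyl_values
```

The result is a plain list of plots. Unlike with nest(), the data subsets and the plots are not kept side by side in one data frame.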

For the sake of completeness:

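d_plots2 <- d %>% 
  group_by(cyl) %>% 
  dplyr::do(plot = {
    # `.` is the data of the current group; unique() collapses the
    # repeated cyl values to a single value for the title
    ggplot(data = ., aes(x = hp, y = mpg)) + 
      geom_point() + 
      ggtitle(paste0("Cylinders: ", unique(.$cyl)))
  })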

d_plots2
#> # A tibble: 3 × 2
#> # Rowwise: 
#>     cyl plot  
#>   <dbl> <list>
#> 1     4 <gg>  
#> 2     6 <gg>  
#> 3     8 <gg>

print(d_plots2$plot)
#> [[1]]
#> 
#> [[2]]
#> 
#> [[3]]
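
If do() is to be avoided, one possible replacement is dplyr's group_map(), which applies a function to each group and returns a list. This is a sketch rather than a recommendation, and as far as I know the function is still flagged as experimental in the dplyr documentation. Inside the formula, .x is the data of the group and .y is a one-row tibble holding the group key. The name plots_gm is made up for this example:

```r
library(tidyverse)
library(glue)

d <- mtcars %>%
  rownames_to_column(var = "car_names") %>%
  as_tibble()

# group_map() calls the formula once per group:
#   .x = data of the group (without the grouping column)
#   .y = one-row tibble with the group key, here the column cyl
plots_gm <- d %>%
  group_by(cyl) %>%
  group_map(~ ggplot(.x, aes(x = hp, y = mpg)) +
              geom_point() +
              ggtitle(glue("Cylinders: {.y$cyl}")))

length(plots_gm)
#> [1] 3
```

Note that group_map() returns a plain list and not a data frame, so the tidy storage that nest() offers is lost here as well.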