Plot many ggplot diagrams using nest() and map()

At times, it is helpful to plot multiple related diagrams, such as a scatter plot for each subgroup. As always, there are a number of ways of doing so in R. Here, we will make use of ggplot2.

library(tidyverse)
library(glue)
data(mtcars)
d <- mtcars %>% 
  rownames_to_column(var = "car_names")

Is d a tibble?

is_tibble(d)
#> [1] FALSE

What is it?

class(d)
#> [1] "data.frame"

Okay, let’s make a tibble out of it:

d <- as_tibble(d)
class(d)
#> [1] "tbl_df"     "tbl"        "data.frame"

Way 1: using facets

One simple way is to plot several facets according to the grouping variable:

d %>% 
  ggplot() +
  aes(x = hp, y = mpg) +
  geom_point() +
  facet_wrap(~ cyl)
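As a small variation on the facet approach, the strips can be labelled with the variable name as well as its value, which comes close to the per-plot titles used further down. This is only a sketch; the object name p_facets is made up for illustration, while label_both is a labeller that ships with ggplot2.

```r
library(tidyverse)

# Same data as above: mtcars with the row names moved into a column
d <- mtcars %>% 
  rownames_to_column(var = "car_names") %>% 
  as_tibble()

# label_both prints "cyl: 4" instead of just "4" in each facet strip
p_facets <- d %>% 
  ggplot() +
  aes(x = hp, y = mpg) +
  geom_point() +
  facet_wrap(~ cyl, labeller = label_both)

p_facets
```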

Way 2: using nest() and map2()

First, we nest the data frame:

d_nested <- 
  d %>% 
  group_by(cyl) %>% 
  nest()

d_nested
#> # A tibble: 3 x 2
#>     cyl data              
#>   <dbl> <list>            
#> 1     6 <tibble [7 × 11]> 
#> 2     4 <tibble [11 × 11]>
#> 3     8 <tibble [14 × 11]>

Note that the column data holds the mtcars data, broken down by group. That’s why we have three rows, one per value of cyl.
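To see what the nesting did, it can help to look inside one of the nested tibbles. The following sketch rebuilds d_nested and inspects it; the names first_group and group_sizes are only illustrative.

```r
library(tidyverse)

d <- mtcars %>% 
  rownames_to_column(var = "car_names") %>% 
  as_tibble()

d_nested <- d %>% 
  group_by(cyl) %>% 
  nest()

# The first nested tibble: all columns of d except the grouping column cyl
first_group <- d_nested$data[[1]]
names(first_group)

# cyl lives only in the outer data frame, not in the nested tibbles
"cyl" %in% names(first_group)
#> [1] FALSE

# Row counts per group; together they add up to the rows of d
group_sizes <- map_int(d_nested$data, nrow)
sum(group_sizes) == nrow(d)
#> [1] TRUE
```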

Second, we map the data to ggplot:

d_plots <- 
  d_nested %>% 
  mutate(plot = map2(data, cyl, ~ ggplot(data = .x, aes(x = hp, y = mpg)) +
                      ggtitle(glue("Number of Cylinder: {.y}")) +
                      geom_point()))

d_plots
#> # A tibble: 3 x 3
#>     cyl data               plot    
#>   <dbl> <list>             <list>  
#> 1     6 <tibble [7 × 11]>  <S3: gg>
#> 2     4 <tibble [11 × 11]> <S3: gg>
#> 3     8 <tibble [14 × 11]> <S3: gg>

Finally, we print it:

print(d_plots$plot) 
#> [[1]]
#> 
#> [[2]]
#> 
#> [[3]]
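Printing the whole list column is not the only option. As a sketch, the plots can be named after their group, so that a single plot can be picked out by its cylinder count. The name plots_by_cyl is an assumption made for this example.

```r
library(tidyverse)
library(glue)

# Rebuild the nested data frame with one plot per group, as above
d_plots <- mtcars %>% 
  as_tibble() %>% 
  group_by(cyl) %>% 
  nest() %>% 
  mutate(plot = map2(data, cyl, ~ ggplot(data = .x, aes(x = hp, y = mpg)) +
                       ggtitle(glue("Number of Cylinder: {.y}")) +
                       geom_point()))

# Name each plot after its group so it can be looked up by cylinder count
plots_by_cyl <- set_names(d_plots$plot, as.character(d_plots$cyl))

# Draw only the plot for the four-cylinder cars
plots_by_cyl[["4"]]

# Draw all plots without the [[1]], [[2]], ... list indices in the output
walk(plots_by_cyl, print)
```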

Note that we need map2() because the nested data (column data) contains no information on the number of cylinders; nest() moves the grouping variable cyl out of the nested tibbles. Hence, we need to hand over a second vector with the cylinder information. When two vectors serve as input, map() is not enough and we need map2().
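For contrast, here is a sketch of a variant where a single-input map() is enough. It assumes dplyr's group_split(), which keeps the grouping column inside each piece by default; the name plots_split is made up for this example.

```r
library(tidyverse)
library(glue)

d <- mtcars %>% 
  rownames_to_column(var = "car_names") %>% 
  as_tibble()

# group_split() keeps cyl inside each piece, so the title can be built
# from the data itself and a single-input map() suffices
plots_split <- d %>% 
  group_split(cyl) %>% 
  map(~ ggplot(data = .x, aes(x = hp, y = mpg)) +
        ggtitle(glue("Number of Cylinder: {unique(.x$cyl)}")) +
        geom_point())

length(plots_split)
#> [1] 3
```

The price is that the plots end up in a plain list rather than in a column next to the data they were built from.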

Way 3: using do() (superseded)

There are of course other ways to achieve what we just explored. For example, good ol’ for-loops are out there. However, here we can make use of R’s beautiful vectorization capabilities. In addition, dplyr::do() is a similar way to map list elements to a function. However, this function will probably be deprecated; it has been marked as superseded since dplyr 1.0.0.

Moreover, nest() has the advantage that the (processed plotting) data is stored neatly in a data frame.
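For comparison, the for-loop route mentioned above might look roughly like this. It is only a sketch; cyl_values, plots_loop and d_group are names invented for this example.

```r
library(tidyverse)

d <- mtcars %>% 
  rownames_to_column(var = "car_names") %>% 
  as_tibble()

cyl_values <- sort(unique(d$cyl))

# Pre-allocate a list with one slot per group
plots_loop <- vector("list", length(cyl_values))
names(plots_loop) <- cyl_values

for (i in seq_along(cyl_values)) {
  # Subset the data by hand, then build the plot for this group
  d_group <- filter(d, cyl == cyl_values[i])
  plots_loop[[i]] <- ggplot(data = d_group, aes(x = hp, y = mpg)) +
    geom_point() +
    ggtitle(paste0("Cylinders: ", cyl_values[i]))
}

names(plots_loop)
#> [1] "4" "6" "8"
```

The loop works, but the bookkeeping (subsetting, pre-allocating, naming) has to be done by hand, and the plots are not stored alongside their data.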

For the sake of completeness:

d_plots2 <- d %>% 
  group_by(cyl) %>% 
  # .$cyl holds one value per row, so reduce it to a single label
  dplyr::do(plot = ggplot(data = ., aes(x = hp, y = mpg)) +
              geom_point() +
              ggtitle(paste0("Cylinders: ", unique(.$cyl))))

d_plots2
#> Source: local data frame [3 x 2]
#> Groups: <by row>
#> 
#> # A tibble: 3 x 2
#>     cyl plot    
#> * <dbl> <list>  
#> 1     4 <S3: gg>
#> 2     6 <S3: gg>
#> 3     8 <S3: gg>

print(d_plots2$plot)
#> [[1]]
#> 
#> [[2]]
#> 
#> [[3]]
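If do() is to be avoided, one possible replacement is dplyr's group_map(), sketched below. Note that group_map() is itself marked as experimental in the dplyr documentation, so this is an illustration rather than a recommendation; the name plots_gm is made up for this example.

```r
library(tidyverse)

d <- mtcars %>% 
  rownames_to_column(var = "car_names") %>% 
  as_tibble()

# In group_map(), .x is the data of one group (without cyl) and
# .y is a one-row tibble holding the group key
plots_gm <- d %>% 
  group_by(cyl) %>% 
  group_map(~ ggplot(data = .x, aes(x = hp, y = mpg)) +
              geom_point() +
              ggtitle(paste0("Cylinders: ", .y$cyl)))

length(plots_gm)
#> [1] 3
```

Like map2() on the nested data frame, this hands the group key over separately from the group's data, but it returns a plain list of plots.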