Generalized rowwise operations using purrr::pmap

Load packages

library(tidyverse)

Rowwise operations are quite frequent in data analysis. The R language environment is particularly strong in columnwise operations. The reason is technical: data frames are built internally as lists of columns, so columnwise operations are simple and rowwise operations are more difficult.
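To make the column-by-column point concrete, here is a small sketch using only base R. The names first_col and first_row are made up for illustration.

```r
data(iris)

# A data frame is a list with one element per column.
is.list(iris)
#> [1] TRUE
length(iris)
#> [1] 5

# Pulling out a column just returns the vector that is already stored.
first_col <- iris[["Sepal.Length"]]
is.numeric(first_col)
#> [1] TRUE

# Pulling out a row has to collect one value from every column,
# and the result is again a (one-row) data frame, not a vector.
first_row <- iris[1, ]
class(first_row)
#> [1] "data.frame"
```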

This post looks at a rather general way to compute rowwise statistics. Of course, numerous ways exist and there are quite a few tutorials around, notably by Jenny Bryan and by Emil Hvitfeldt, to name a few.

Let’s work with these data:

data(iris)

Example 1

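Assume you’d like to compute a row sum for each case. Let’s neglect that there exists a function called rowSums which does this job nicely. There might be situations where the universe did not provide a ready-to-use function, and then a more general approach will be handy. One caveat: iris also contains the factor column Species. In the output below it enters each sum as its integer code (1 for setosa), which is why the first row gives 11.2 rather than 10.2. Depending on the purrr version, the factor may instead be passed on as a factor, in which case sum fails.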

iris2 <- iris %>% 
  mutate(iris_sum = pmap(., sum))

iris2 %>% 
  pull(iris_sum) %>% 
  head()
#> [[1]]
#> [1] 11.2
#> 
#> [[2]]
#> [1] 10.5
#> 
#> [[3]]
#> [1] 10.4
#> 
#> [[4]]
#> [1] 10.4
#> 
#> [[5]]
#> [1] 11.2
#> 
#> [[6]]
#> [1] 12.4

Hang on, what data type is this column?

class(iris2$iris_sum)
#> [1] "list"

That’s a list column. Let’s unnest it.

iris2 %>% 
  unnest(iris_sum) %>% 
  pull(iris_sum) %>% class()
#> [1] "numeric"

OK.
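If only the four measurement columns should enter the sum, one option is to hand pmap just those columns. And if a plain numeric vector is wanted rather than a list column, the typed variant pmap_dbl makes the unnest step unnecessary. The following is a minimal sketch under those assumptions; the names measures and iris_num are made up for illustration.

```r
library(dplyr)
library(purrr)

data(iris)

# Keep only the four numeric measurement columns (Species is left out).
measures <- iris %>%
  select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)

# pmap_dbl() calls sum() once per row and returns a double vector,
# so no list column is created and nothing needs to be unnested.
iris_num <- iris %>%
  mutate(iris_sum = pmap_dbl(measures, sum))

class(iris_num$iris_sum)
#> [1] "numeric"

head(iris_num$iris_sum)
#> [1] 10.2  9.5  9.4  9.4 10.2 11.4
```

These values should agree with rowSums(iris[1:4]), which is a handy sanity check for the general approach.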

Example 2

Let’s compute the row means in a similar way. One complication is that mean does not take dots as input, as sum does, but takes vector input. Hence, we need to change the domain of the function from vector to dots (list); that’s what lift_vd does. As with the sums, the result below covers all five columns, with Species counted as its integer code.

iris2 <- iris %>% 
  mutate(iris_mean = pmap(., lift_vd(mean))) %>% 
  unnest(iris_mean)

iris2 %>% 
  select(iris_mean) %>% 
  head()
#>   iris_mean
#> 1      2.24
#> 2      2.10
#> 3      2.08
#> 4      2.08
#> 5      2.24
#> 6      2.48
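The same vector-to-dots conversion can also be written by hand, which may be useful because the lift_* helpers are, as far as I know, deprecated in recent purrr releases. The sketch below assumes that only the four measurement columns should be averaged; mean_dots, measures and iris_means are hypothetical names chosen for this example.

```r
library(dplyr)
library(purrr)

data(iris)

# Keep only the four numeric measurement columns (Species is left out).
measures <- iris %>%
  select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)

# Hypothetical helper: collect the dots into one vector, then call mean().
# This does by hand what lift_vd(mean) does in the example above.
mean_dots <- function(...) mean(c(...))

# pmap_dbl() returns a double vector, so no unnest step is needed.
iris_means <- iris %>%
  mutate(iris_mean = pmap_dbl(measures, mean_dots))

head(iris_means$iris_mean)
```

The result should match rowMeans(iris[1:4]).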

Bonus

Instead of pull there’s also pluck in purrr:

iris2 %>% pluck("iris_mean")
#>   [1] 2.24 2.10 2.08 2.08 2.24 2.48 2.14 2.22 1.98 2.12 2.36 2.20 2.06 1.90
#>  [15] 2.44 2.60 2.40 2.26 2.50 2.34 2.34 2.34 2.08 2.32 2.26 2.16 2.28 2.28
#>  [29] 2.24 2.14 2.14 2.34 2.38 2.46 2.14 2.12 2.30 2.20 1.98 2.24 2.22 1.88
#>  [43] 2.02 2.34 2.44 2.10 2.34 2.08 2.34 2.18 3.66 3.52 3.68 3.02 3.48 3.26
#>  [57] 3.58 2.72 3.48 3.04 2.70 3.32 3.04 3.42 3.08 3.52 3.32 3.12 3.28 3.02
#>  [71] 3.54 3.24 3.44 3.36 3.38 3.48 3.56 3.68 3.38 2.96 2.96 2.92 3.12 3.48
#>  [85] 3.28 3.50 3.60 3.26 3.20 3.06 3.14 3.42 3.12 2.72 3.16 3.22 3.22 3.34
#>  [99] 2.74 3.18 4.22 3.70 4.22 3.92 4.10 4.46 3.32 4.26 3.96 4.48 3.96 3.86
#> [113] 4.08 3.64 3.82 4.04 3.96 4.68 4.50 3.54 4.22 3.66 4.44 3.74 4.16 4.24
#> [127] 3.72 3.76 3.98 4.12 4.24 4.62 4.00 3.74 3.74 4.42 4.14 3.96 3.72 4.10
#> [141] 4.16 4.08 3.70 4.24 4.24 4.04 3.74 3.94 4.06 3.76
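For a single column of a data frame, pull and pluck return the same vector. One difference is that pluck accepts several accessors in a row and so can reach deeper into the structure. A small sketch, using the original iris columns so that it runs on its own:

```r
library(dplyr)
library(purrr)

data(iris)

# Both extract the column as a plain vector.
identical(pull(iris, Sepal.Length), pluck(iris, "Sepal.Length"))
#> [1] TRUE

# pluck() can chain accessors: first the column, then its first element.
first_value <- pluck(iris, "Sepal.Length", 1)
first_value
#> [1] 5.1
```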