Setting to NA, conditionally

1 Load packages

library(tidyverse)  # data wrangling

2 Motivation

Let’s assume we would like to change the values of multiple variables depending on the state of another variable. For the sake of concreteness, let’s say we have a variable called data_trustworthiness that indicates whether or not we can have confidence in some other variables. If this variable has the value FALSE for some cases, we would like to set the variables measure1 and measure2 to NA for those cases, thus reflecting that the data from our measurements are not reliable.

Let’s use the tidyverse ecosystem.

3 Minimal example

For the sake of simplicity, we’ll make use of the mtcars dataset.

First, we create a variable trustworthy that holds random values of 1 (trustworthy) or 0 (not trustworthy), which is enough for this minimal example.

data(mtcars)

mtcars <-
  mtcars |> 
  mutate(trustworthy = sample(c(1, 0), size = nrow(mtcars), replace = TRUE)) |> 
  relocate(trustworthy)

glimpse(mtcars)  # check
#> Rows: 32
#> Columns: 12
#> $ trustworthy <dbl> 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1…
#> $ mpg         <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2…
#> $ cyl         <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4…
#> $ disp        <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 14…
#> $ hp          <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 1…
#> $ drat        <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92…
#> $ wt          <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.…
#> $ qsec        <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22…
#> $ vs          <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ am          <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ gear        <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4…
#> $ carb        <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1…
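
Because sample() draws at random, the values of trustworthy differ from run to run. A minimal sketch of how the column could be made reproducible by setting a seed first; the seed value 123 and the object names mtcars_seeded and trustworthy_again are assumptions for illustration, and this is not the seed behind the output shown above.

```r
library(dplyr)

data(mtcars)

set.seed(123)  # arbitrary seed (an assumption), chosen only for illustration

mtcars_seeded <-
  mtcars |>
  mutate(trustworthy = sample(c(1, 0), size = n(), replace = TRUE)) |>
  relocate(trustworthy)

# Resetting the same seed and drawing again yields the same values
set.seed(123)
trustworthy_again <- sample(c(1, 0), size = nrow(mtcars), replace = TRUE)

identical(mtcars_seeded$trustworthy, trustworthy_again)  # check
```

With a fixed seed, readers who rerun the example get the same pattern of 1s and 0s, and therefore the same NAs later on.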

Next comes the actual conditional change: in every row where trustworthy is 0, we set the data variables mpg and cyl to NA and leave all other values untouched.

mtcars_new <-
  mtcars |>
  # for each listed column: NA where trustworthy is 0, otherwise keep the value
  mutate(across(.cols = c("mpg", "cyl"),
                .fns = ~ifelse(trustworthy == 0, NA, .x)))

glimpse(mtcars_new)  # check
#> Rows: 32
#> Columns: 12
#> $ trustworthy <dbl> 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1…
#> $ mpg         <dbl> 21.0, NA, NA, NA, NA, NA, NA, 24.4, NA, 19.2, 17.8, NA, 17…
#> $ cyl         <dbl> 6, NA, NA, NA, NA, NA, NA, 4, NA, 6, 6, NA, 8, NA, 8, NA, …
#> $ disp        <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 14…
#> $ hp          <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 1…
#> $ drat        <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92…
#> $ wt          <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.…
#> $ qsec        <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22…
#> $ vs          <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ am          <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ gear        <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4…
#> $ carb        <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1…
sum(is.na(mtcars_new))
#> [1] 42
head(mtcars_new)
#>                   trustworthy mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4                   1  21   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag               0  NA  NA  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710                  0  NA  NA  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive              0  NA  NA  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout           0  NA  NA  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant                     0  NA  NA  225 105 2.76 3.460 20.22  1  0    3    1
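
The same pattern can be applied to the scenario from the motivation, where the flag is a logical variable. The following is a sketch under assumptions: the toy data frame d and its values are made up for illustration, and if_else() with an explicit NA_real_ is swapped in for ifelse() as a type-strict variant, not because the approach above requires it.

```r
library(dplyr)

# Hypothetical toy data mirroring the motivation section
d <- data.frame(
  data_trustworthiness = c(TRUE, FALSE, TRUE, FALSE),
  measure1 = c(1.5, 2.5, 3.5, 4.5),
  measure2 = c(10, 20, 30, 40)
)

d_new <-
  d |>
  # keep the value where the flag is TRUE, otherwise set it to NA
  mutate(across(.cols = starts_with("measure"),
                .fns = ~ if_else(data_trustworthiness, .x, NA_real_)))

d_new  # check

colSums(is.na(d_new))  # number of NAs per column
```

Selecting the columns with starts_with("measure") avoids listing each variable by name, which may be convenient when many measurement columns share a prefix.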

4 Reproducibility

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Big Sur ... 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Berlin
#>  date     2023-11-04
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  blogdown      1.18    2023-06-19 [1] CRAN (R 4.2.0)
#>  bookdown      0.36    2023-10-16 [1] CRAN (R 4.2.0)
#>  bslib         0.5.1   2023-08-11 [1] CRAN (R 4.2.0)
#>  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.2.0)
#>  callr         3.7.3   2022-11-02 [1] CRAN (R 4.2.0)
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.2.0)
#>  codetools     0.2-19  2023-02-01 [1] CRAN (R 4.2.0)
#>  colorout    * 1.2-2   2022-06-13 [1] local
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.2.0)
#>  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.2.1)
#>  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.2.1)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.2.0)
#>  dplyr       * 1.1.3   2023-09-03 [1] CRAN (R 4.2.0)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate      0.21    2023-05-05 [1] CRAN (R 4.2.0)
#>  fansi         1.0.5   2023-10-08 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.2.0)
#>  forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.2.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.2.0)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2     * 3.4.4   2023-10-12 [1] CRAN (R 4.2.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.2.0)
#>  hms           1.1.3   2023-03-21 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.6.1 2023-10-06 [1] CRAN (R 4.2.0)
#>  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.2.0)
#>  httpuv        1.6.11  2023-05-11 [1] CRAN (R 4.2.0)
#>  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.2.0)
#>  jsonlite      1.8.7   2023-06-29 [1] CRAN (R 4.2.0)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.2.1)
#>  later         1.3.1   2023-05-02 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
#>  lubridate   * 1.9.3   2023-09-27 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.2.0)
#>  mime          0.12    2021-09-28 [1] CRAN (R 4.2.0)
#>  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.2.0)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.2.0)
#>  pkgbuild      1.4.0   2022-11-27 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
#>  pkgload       1.3.2.1 2023-07-08 [1] CRAN (R 4.2.0)
#>  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.2.0)
#>  processx      3.8.2   2023-06-30 [1] CRAN (R 4.2.0)
#>  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.2.0)
#>  promises      1.2.1   2023-08-10 [1] CRAN (R 4.2.0)
#>  ps            1.7.5   2023-04-18 [1] CRAN (R 4.2.0)
#>  purrr       * 1.0.2   2023-08-10 [1] CRAN (R 4.2.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.2.0)
#>  readr       * 2.1.4   2023-02-10 [1] CRAN (R 4.2.0)
#>  remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.2.0)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.2.0)
#>  sass          0.4.7   2023-07-15 [1] CRAN (R 4.2.0)
#>  scales        1.2.1   2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  shiny         1.7.5   2023-08-12 [1] CRAN (R 4.2.0)
#>  stringi       1.7.12  2023-01-11 [1] CRAN (R 4.2.0)
#>  stringr     * 1.5.0   2022-12-02 [1] CRAN (R 4.2.0)
#>  tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.2.0)
#>  tidyr       * 1.3.0   2023-01-24 [1] CRAN (R 4.2.0)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
#>  tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.2.0)
#>  timechange    0.2.0   2023-01-11 [1] CRAN (R 4.2.0)
#>  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.2.0)
#>  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.2.0)
#>  usethis       2.2.2   2023-07-06 [1] CRAN (R 4.2.0)
#>  utf8          1.2.3   2023-01-31 [1] CRAN (R 4.2.0)
#>  vctrs         0.6.4   2023-10-12 [1] CRAN (R 4.2.0)
#>  withr         2.5.2   2023-10-30 [1] CRAN (R 4.2.1)
#>  xfun          0.40    2023-08-09 [1] CRAN (R 4.2.0)
#>  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.2.0)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Users/sebastiansaueruser/Rlibs
#>  [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────