R/remove_empties.R
remove_empty.Rd
Removes all rows and/or columns from a data.frame or matrix that are composed entirely of NA values.
remove_empty(dat, which = c("rows", "cols"), cutoff = 1, quiet = TRUE)
dat: the input data.frame or matrix.
which: one of "rows", "cols", or c("rows", "cols"). If no value of which is provided, defaults to removing both empty rows and empty columns, declaring the behavior with a printed message.
cutoff: what fraction (>0 to <=1) of rows or columns must be empty to be removed?
quiet: should messages be suppressed (TRUE) or printed (FALSE) indicating the summary of empty columns or rows removed?
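To make the default rule concrete, here is a minimal base-R sketch of what "composed entirely of NA values" means for each choice of which. It is an illustration only, not janitor's implementation; the data frame and variable names are made up, and only the default cutoff = 1 case is covered.

```r
# Illustrative data: row 2 and column b are entirely NA.
dat <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, NA),
  c = c("x", NA, "z")
)

# A row (column) counts as empty when every one of its values is NA.
empty_rows <- rowSums(is.na(dat)) == ncol(dat)
empty_cols <- colSums(is.na(dat)) == nrow(dat)

# Rough equivalents of which = "rows", "cols", and c("rows", "cols").
rows_only <- dat[!empty_rows, , drop = FALSE]
cols_only <- dat[, !empty_cols, drop = FALSE]
both      <- dat[!empty_rows, !empty_cols, drop = FALSE]
```

Using drop = FALSE keeps the result a data.frame even when only one column survives.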
Returns the object without its missing rows or columns.
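As a small usage sketch, the calls below apply remove_empty() to a matrix, since the input may be a data.frame or a matrix. The matrix m and the result names are made up for illustration, and the comments describe what the documentation above implies rather than captured output.

```r
library(janitor)

# Illustrative matrix: row 2 and column "b" contain only NA.
m <- matrix(
  c(1, NA, 3,
    NA, NA, NA),
  nrow = 3,
  dimnames = list(NULL, c("a", "b"))
)

m_rows <- remove_empty(m, which = "rows")            # drop all-NA rows only
m_cols <- remove_empty(m, which = "cols")            # drop all-NA columns only
m_both <- remove_empty(m, which = c("rows", "cols")) # drop both
```

Passing which explicitly also avoids the message printed when it is left unspecified.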
See also: remove_constant() for removing constant columns.
Other remove functions:
remove_constant()
# not run:
# dat %>% remove_empty("rows")
# addressing a common untidy-data scenario where we have a mixture of
# blank values in some (character) columns and NAs in others:
library(dplyr)
dd <- tibble(x = c(LETTERS[1:5], NA, rep("", 2)),
             y = c(1:5, rep(NA, 3)))
# remove_empty() drops row 6 (all NA) but not rows 7 and 8 (blanks + NAs)
dd %>% remove_empty("rows")
#> # A tibble: 7 × 2
#>   x         y
#>   <chr> <int>
#> 1 "A"       1
#> 2 "B"       2
#> 3 "C"       3
#> 4 "D"       4
#> 5 "E"       5
#> 6 ""       NA
#> 7 ""       NA
# solution: preprocess to convert whitespace/empty strings to NA,
# _then_ remove empty (all-NA) rows
dd %>% mutate(across(where(is.character), ~na_if(trimws(.), ""))) %>%
  remove_empty("rows")
#> # A tibble: 5 × 2
#>   x         y
#>   <chr> <int>
#> 1 A         1
#> 2 B         2
#> 3 C         3
#> 4 D         4
#> 5 E         5
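If this preprocessing comes up often, it can be wrapped in a small helper. The sketch below is one possible way to do that; remove_blank_rows() is a hypothetical name, not a janitor function, and it assumes that blank or whitespace-only strings should be treated as missing in every character column.

```r
library(dplyr)
library(janitor)

# Hypothetical helper (not part of janitor): convert blank or
# whitespace-only strings to NA, then drop rows that are entirely NA.
remove_blank_rows <- function(dat) {
  dat %>%
    mutate(across(where(is.character), ~ na_if(trimws(.x), ""))) %>%
    remove_empty("rows")
}

# Illustrative data: row 6 is all NA, rows 7 and 8 hold only blanks and NA.
dd <- tibble(
  x = c(LETTERS[1:5], NA, "", "  "),
  y = c(1:5, rep(NA, 3))
)

cleaned <- remove_blank_rows(dd)
```

Note that the helper also changes the data it keeps: any blank string in a surviving row becomes NA, which may or may not be what a given analysis wants.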