pick() provides a way to easily select a subset of columns from your data
using select() semantics while inside a
"data-masking" function like mutate() or
summarise(). pick() returns a data frame containing the selected columns
for the current group.
pick() is complementary to across():
With pick(), you typically apply a function to the full data frame.
With across(), you typically apply a function to each column.
pick(...)Columns to pick.
You can't pick grouping columns because they are already automatically
handled by the verb (i.e. summarise() or mutate()).
A tibble containing the selected columns for the current group.
Theoretically, pick() is intended to be replaceable with an equivalent call
to tibble(). For example, pick(a, c) could be replaced with
tibble(a = a, c = c), and pick(everything()) on a data frame with cols
a, b, and c could be replaced with tibble(a = a, b = b, c = c).
pick() specially handles the case of an empty selection by returning a 1
row, 0 column tibble, so an exact replacement is more like:
df <- tibble(
x = c(3, 2, 2, 2, 1),
y = c(0, 2, 1, 1, 4),
z1 = c("a", "a", "a", "b", "a"),
z2 = c("c", "d", "d", "a", "c")
)
df
#> # A tibble: 5 × 4
#> x y z1 z2
#> <dbl> <dbl> <chr> <chr>
#> 1 3 0 a c
#> 2 2 2 a d
#> 3 2 1 a d
#> 4 2 1 b a
#> 5 1 4 a c
# `pick()` provides a way to select a subset of your columns using
# tidyselect. It returns a data frame.
df %>% mutate(cols = pick(x, y))
#> # A tibble: 5 × 5
#> x y z1 z2 cols$x $y
#> <dbl> <dbl> <chr> <chr> <dbl> <dbl>
#> 1 3 0 a c 3 0
#> 2 2 2 a d 2 2
#> 3 2 1 a d 2 1
#> 4 2 1 b a 2 1
#> 5 1 4 a c 1 4
# This is useful for functions that take data frames as inputs.
# For example, you can compute a joint rank between `x` and `y`.
df %>% mutate(rank = dense_rank(pick(x, y)))
#> # A tibble: 5 × 5
#> x y z1 z2 rank
#> <dbl> <dbl> <chr> <chr> <int>
#> 1 3 0 a c 4
#> 2 2 2 a d 3
#> 3 2 1 a d 2
#> 4 2 1 b a 2
#> 5 1 4 a c 1
# `pick()` is also useful as a bridge between data-masking functions (like
# `mutate()` or `group_by()`) and functions with tidy-select behavior (like
# `select()`). For example, you can use `pick()` to create a wrapper around
# `group_by()` that takes a tidy-selection of columns to group on. For more
# bridge patterns, see
# https://rlang.r-lib.org/reference/topic-data-mask-programming.html#bridge-patterns.
my_group_by <- function(data, cols) {
group_by(data, pick({{ cols }}))
}
df %>% my_group_by(c(x, starts_with("z")))
#> # A tibble: 5 × 4
#> # Groups: x, z1, z2 [4]
#> x y z1 z2
#> <dbl> <dbl> <chr> <chr>
#> 1 3 0 a c
#> 2 2 2 a d
#> 3 2 1 a d
#> 4 2 1 b a
#> 5 1 4 a c
# Or you can use it to dynamically select columns to `count()` by
df %>% count(pick(starts_with("z")))
#> # A tibble: 3 × 3
#> z1 z2 n
#> <chr> <chr> <int>
#> 1 a c 2
#> 2 a d 2
#> 3 b a 1