vec_chop() provides an efficient method to repeatedly slice a vector. It
captures the pattern of map(indices, vec_slice, x = x). When no indices
are supplied, it is generally equivalent to as.list().
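As a rough illustration of that equivalence, the sketch below compares vec_chop() against a hand-written loop. It assumes vctrs is attached and substitutes base lapply() for purrr's map(); the input values are made up for the example.

```r
library(vctrs)

x <- c("a", "b", "c", "d")
indices <- list(1:2, c(4, 3))

# Slice x once per index vector, using lapply() in place of map()
by_hand <- lapply(indices, function(i) vec_slice(x, i))

# The same slices produced in a single call
chopped <- vec_chop(x, indices = indices)
stopifnot(identical(chopped, by_hand))

# With no indices, a plain atomic vector is split element by element,
# which matches as.list() here
stopifnot(identical(vec_chop(x), as.list(x)))
```

The as.list() comparison is only shown for an unnamed atomic vector; the text says "generally equivalent", so other input types may differ.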
list_unchop() combines a list of vectors into a single vector, placing
elements in the output according to the locations specified by indices.
It is similar to vec_c(), but gives greater control over how the elements
are combined. When no indices are supplied, it is identical to vec_c(),
but typically a little faster.
If indices selects every value in x exactly once, in any order, then
list_unchop() is the inverse of vec_chop() and the following invariant
holds:
list_unchop(vec_chop(x, indices = indices), indices = indices) == x
vec_chop(x, ..., indices = NULL, sizes = NULL)
list_unchop(
  x,
  ...,
  indices = NULL,
  ptype = NULL,
  name_spec = NULL,
  name_repair = c("minimal", "unique", "check_unique", "universal", "unique_quiet",
    "universal_quiet"),
  error_arg = "x",
  error_call = current_env()
)
x: A vector.
...: These dots are for future extensions and must be empty.
indices: For vec_chop(), a list of positive integer vectors to
slice x with, or NULL. Can't be used if sizes is already specified.
If both indices and sizes are NULL, x is split into its individual
elements, equivalent to using indices = as.list(vec_seq_along(x)).
For list_unchop(), a list of positive integer vectors specifying the
locations to place elements of x in. Each element of x is recycled to
the size of the corresponding index vector. The size of indices must
match the size of x. If NULL, x is combined in the order it is
provided in, which is equivalent to using vec_c().
sizes: An integer vector of non-negative sizes representing sequential
indices to slice x with, or NULL. Can't be used if indices is already
specified.
For example, sizes = c(2, 4) is equivalent to indices = list(1:2, 3:6),
but is typically faster.
sum(sizes) must be equal to vec_size(x), i.e. sizes must completely
partition x, but an individual size is allowed to be 0.
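To make the partition rule concrete, here is a small sketch with a zero size in the middle. It assumes vctrs is attached; the vector and sizes are illustrative only.

```r
library(vctrs)

x <- 1:6

# sum(c(2, 0, 4)) is 6, which matches vec_size(x), so the sizes
# completely partition x
by_size <- vec_chop(x, sizes = c(2, 0, 4))

# The same slices written as explicit locations; the zero size
# corresponds to an empty integer vector
by_index <- vec_chop(x, indices = list(1:2, integer(), 3:6))

stopifnot(identical(by_size, by_index))

# Sizes of the resulting pieces
list_sizes(by_size)
```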
ptype: If NULL, the default, the output type is determined by
computing the common type across all elements of x. Alternatively, you
can supply ptype to give the output a known type.
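A minimal sketch of the difference, assuming vctrs is attached and using a made-up list of integer vectors:

```r
library(vctrs)

x <- list(1L, 2:3)

# With ptype = NULL, the common type of the elements (integer) is used
common <- list_unchop(x)

# Supplying ptype casts every element to the requested type instead
forced <- list_unchop(x, ptype = double())

typeof(common)
typeof(forced)
```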
name_spec: A name specification for combining inner and outer names.
This is relevant for inputs passed with a name, when these inputs are
themselves named, like outer = c(inner = 1), or when they have length
greater than 1: outer = 1:2. By default, these cases trigger an error.
You can resolve the error by providing a specification that describes
how to combine the names or the indices of the inner vector with the
name of the input. This specification can be:
A function of two arguments. The outer name is passed as a string to
the first argument, and the inner names or positions are passed as the
second argument.
An anonymous function as a purrr-style formula.
A glue specification of the form "{outer}_{inner}".
An rlang::zap() object, in which case both outer and inner
names are ignored and the result is unnamed.
See the name specification topic.
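The function and rlang::zap() forms can be sketched as follows. This assumes vctrs and rlang are installed; the list lst and the "." separator are invented for the example.

```r
library(vctrs)

lst <- list(x = c(a = 1, b = 2), y = c(c = 3))

# A two-argument function: `outer` is a single string, `inner` holds
# the inner names of the element being combined
dotted <- list_unchop(
  lst,
  name_spec = function(outer, inner) paste(outer, inner, sep = ".")
)
names(dotted)

# rlang::zap() drops both outer and inner names
unnamed <- list_unchop(lst, name_spec = rlang::zap())
names(unnamed)
```

Without a name_spec, this call would error, because lst has outer names and its elements have inner names.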
name_repair: How to repair names. See the repair options in
vec_as_names().
error_arg: An argument name as a string. This argument will be
mentioned in error messages as the input that is at the origin of a
problem.
error_call: The execution environment of a currently
running function, e.g. caller_env(). The function will be
mentioned in error messages as the source of the error. See the
call argument of abort() for more information.
vec_chop(): A list where each element has the same type as x. The size
of the list is equal to vec_size(indices), vec_size(sizes), or
vec_size(x) depending on whether or not indices or sizes is provided.
list_unchop(): A vector of type vec_ptype_common(!!!x), or ptype, if
specified. The size is computed as vec_size_common(!!!indices) unless
the indices are NULL, in which case the size is vec_size_common(!!!x).
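The type-preservation claim can be checked with a quick sketch. It assumes vctrs is attached and uses a made-up factor as input.

```r
library(vctrs)

f <- factor(c("lo", "hi", "lo"), levels = c("lo", "hi"))

# One list element per entry in `sizes`
pieces <- vec_chop(f, sizes = c(1, 2))
vec_size(pieces)

# Each piece is still a factor carrying the levels of the input
stopifnot(is.factor(pieces[[1]]), is.factor(pieces[[2]]))
stopifnot(identical(levels(pieces[[2]]), levels(f)))

# Combining the pieces again gives back the original values
combined <- list_unchop(pieces)
as.character(combined)
```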
vec_chop(1:5)
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
#>
#> [[3]]
#> [1] 3
#>
#> [[4]]
#> [1] 4
#>
#> [[5]]
#> [1] 5
#>
# These two are equivalent
vec_chop(1:5, indices = list(1:2, 3:5))
#> [[1]]
#> [1] 1 2
#>
#> [[2]]
#> [1] 3 4 5
#>
vec_chop(1:5, sizes = c(2, 3))
#> [[1]]
#> [1] 1 2
#>
#> [[2]]
#> [1] 3 4 5
#>
# Can also be used on data frames
vec_chop(mtcars, indices = list(1:3, 4:6))
#> [[1]]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#>
#> [[2]]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
#>
# If `indices` selects every value in `x` exactly once,
# in any order, then `list_unchop()` inverts `vec_chop()`
x <- c("a", "b", "c", "d")
indices <- list(2, c(3, 1), 4)
vec_chop(x, indices = indices)
#> [[1]]
#> [1] "b"
#>
#> [[2]]
#> [1] "c" "a"
#>
#> [[3]]
#> [1] "d"
#>
list_unchop(vec_chop(x, indices = indices), indices = indices)
#> [1] "a" "b" "c" "d"
# When unchopping, size 1 elements of `x` are recycled
# to the size of the corresponding index
list_unchop(list(1, 2:3), indices = list(c(1, 3, 5), c(2, 4)))
#> [1] 1 2 1 3 1
# Names are retained, and outer names can be combined with inner
# names through the use of a `name_spec`
lst <- list(x = c(a = 1, b = 2), y = 1)
list_unchop(lst, indices = list(c(3, 2), c(1, 4)), name_spec = "{outer}_{inner}")
#> y_1 x_b x_a y_2
#> 1 2 1 1
# An alternative implementation of `ave()` can be constructed using
# `vec_chop()` and `list_unchop()` in combination with `vec_group_loc()`
ave2 <- function(.x, .by, .f, ...) {
  indices <- vec_group_loc(.by)$loc
  chopped <- vec_chop(.x, indices = indices)
  out <- lapply(chopped, .f, ...)
  list_unchop(out, indices = indices)
}
breaks <- warpbreaks$breaks
wool <- warpbreaks$wool
ave2(breaks, wool, mean)
#> [1] 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704
#> [9] 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704
#> [17] 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704 31.03704
#> [25] 31.03704 31.03704 31.03704 25.25926 25.25926 25.25926 25.25926 25.25926
#> [33] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
#> [41] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
#> [49] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
identical(
  ave2(breaks, wool, mean),
  ave(breaks, wool, FUN = mean)
)
#> [1] TRUE
# If you know your input is sorted and you'd like to split on the groups,
# `vec_run_sizes()` can be efficiently combined with `sizes`
df <- tibble::tibble(
  g = c(2, 5, 5, 6, 6, 6, 6, 8, 9, 9),
  x = 1:10
)
vec_chop(df, sizes = vec_run_sizes(df$g))
#> [[1]]
#> # A tibble: 1 × 2
#> g x
#> <dbl> <int>
#> 1 2 1
#>
#> [[2]]
#> # A tibble: 2 × 2
#> g x
#> <dbl> <int>
#> 1 5 2
#> 2 5 3
#>
#> [[3]]
#> # A tibble: 4 × 2
#> g x
#> <dbl> <int>
#> 1 6 4
#> 2 6 5
#> 3 6 6
#> 4 6 7
#>
#> [[4]]
#> # A tibble: 1 × 2
#> g x
#> <dbl> <int>
#> 1 8 8
#>
#> [[5]]
#> # A tibble: 2 × 2
#> g x
#> <dbl> <int>
#> 1 9 9
#> 2 9 10
#>
# If you have a list of homogeneous vectors, sometimes it can be useful to
# unchop, apply a function to the flattened vector, and then rechop according
# to the original indices. This can be done efficiently with `list_sizes()`.
x <- list(c(1, 2, 1), c(3, 1), 5, double())
x_flat <- list_unchop(x)
x_flat <- x_flat + max(x_flat)
vec_chop(x_flat, sizes = list_sizes(x))
#> [[1]]
#> [1] 6 7 6
#>
#> [[2]]
#> [1] 8 6
#>
#> [[3]]
#> [1] 10
#>
#> [[4]]
#> numeric(0)
#>