Fast versions of unique(), duplicated(),
anyDuplicated() and sum(duplicated(x)) for integers.
bit_unique(x, na.rm = NA, range_na = NULL)
bit_duplicated(x, na.rm = NA, range_na = NULL, retFUN = as.bit)
bit_anyDuplicated(x, na.rm = NA, range_na = NULL)
bit_sumDuplicated(x, na.rm = NA, range_na = NULL)
x: an integer vector
na.rm: NA treats NAs like other integers, TRUE treats
all NAs as duplicates, FALSE treats no NAs as
duplicates
range_na: NULL calls range_na(); optionally, the result of range_na() can be
given here to avoid calling it again
bit_unique returns a vector of unique integers,
bit_duplicated returns a logical vector coerced by retFUN,
bit_anyDuplicated returns the position of the first duplicate (or zero if there are no
duplicates),
bit_sumDuplicated returns the number of duplicated values (as an integer).
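The three na.rm settings can be pictured in terms of base R alone. The following is a minimal sketch of the documented semantics, not the package's code; the helper first_dup is hypothetical and only mirrors the "position of the first duplicate, or zero" convention.

```r
x <- c(2L, 1L, NA, NA, 1L, 2L)

# na.rm = NA: NAs are treated like any other integer (base R behaviour)
dup_default <- duplicated(x)
# na.rm = TRUE: every NA counts as a duplicate
dup_na_true <- duplicated(x) | is.na(x)
# na.rm = FALSE: no NA counts as a duplicate
dup_na_false <- duplicated(x) & !is.na(x)

# Hypothetical helper: position of the first duplicate, or 0 if there is none
first_dup <- function(d) if (any(d)) which(d)[1L] else 0L

x[!dup_default]         # 2 1 NA
x[!dup_na_true]         # 2 1
x[!dup_na_false]        # 2 1 NA NA
first_dup(dup_default)  # 4
sum(dup_na_true)        # 4
```

With these masks, the unique values are x[!mask], the count of duplicates is sum(mask), and the first duplicate is first_dup(mask).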
Determines the range of the integers and checks whether the density justifies the use
of a bit vector; if yes, uses a bit vector for finding duplicates; if no,
falls back to unique(), duplicated(), anyDuplicated() and sum(duplicated(x)).
bit_unique(): extracts unique elements
bit_duplicated(): determines duplicate elements
bit_anyDuplicated(): checks for existence of duplicate elements
bit_sumDuplicated(): counts duplicate elements
bit_unique(c(2L, 1L, NA, NA, 1L, 2L))
#> [1] 2 1 NA
bit_unique(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
#> [1] 2 1 NA NA
bit_unique(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
#> [1] 2 1
bit_duplicated(c(2L, 1L, NA, NA, 1L, 2L))
#> bit length=6 occupying only 1 int32
#> 1 2 3 4 5 6
#> FALSE FALSE FALSE TRUE TRUE TRUE
bit_duplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
#> bit length=6 occupying only 1 int32
#> 1 2 3 4 5 6
#> FALSE FALSE FALSE FALSE TRUE TRUE
bit_duplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
#> bit length=6 occupying only 1 int32
#> 1 2 3 4 5 6
#> FALSE FALSE TRUE TRUE TRUE TRUE
bit_anyDuplicated(c(2L, 1L, NA, NA, 1L, 2L))
#> [1] 4
bit_anyDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
#> [1] 5
bit_anyDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
#> [1] 3
bit_sumDuplicated(c(2L, 1L, NA, NA, 1L, 2L))
#> [1] 3
bit_sumDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
#> [1] 2
bit_sumDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
#> [1] 4
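When several of these functions are called on the same vector, the range_na argument allows the range to be computed once and reused. The following sketch assumes the bit package is installed and that range_na() can be called directly on an integer vector, as the argument description suggests; outputs are not shown because they have not been verified here.

```r
library(bit)

x <- c(2L, 1L, NA, NA, 1L, 2L)

# Compute the range information once ...
rn <- range_na(x)

# ... and pass it on so that each call can skip its own range scan
u <- bit_unique(x, range_na = rn)
n_dup <- bit_sumDuplicated(x, range_na = rn)
first <- bit_anyDuplicated(x, range_na = rn)
```

For a short vector like this one the saving is negligible; the pattern is meant for long vectors, where scanning for the range is a noticeable share of the work.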