Hybrid Index, C-coded utilities

These C-coded utilities speed up index preprocessing considerably.

intrle(x)

intisasc(x, na.method = c("none", "break", "skip")[2])

intisdesc(x, na.method = c("none", "break", "skip")[1])

Arguments

x: an integer vector
na.method: one of "none", "break", "skip", see Details. The defaults differ between the two functions ("break" for intisasc, "none" for intisdesc); this stems from the initial usage.

Value

intrle returns an object of class "rle", or NULL if rle-compression is not efficient (compression factor < 3 or length(x) < 3).
intisasc returns one of FALSE, NA, TRUE
intisdesc returns one of FALSE, TRUE (if the input contains NAs, the output is undefined)

Details

intrle is faster than rle() by a factor of 50 and needs less RAM: 2x the size of its input vector, compared to 9x for rle(). This is achieved by letting the C code of intrle break off as soon as it turns out that rle-packing will not achieve a compression factor of 3 or better.
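The early-exit idea can be illustrated in plain R. This is only a sketch, not the C code of intrle: the function name rle_or_null and the exact cutoff (give up once there are more than length(x) %/% 3 runs) are assumptions made for illustration, and the sketch has none of the speed or memory benefits described above.

```r
# Hypothetical base-R illustration of run-length encoding with early exit.
# rle_or_null and its cutoff rule are assumptions, not part of the package.
rle_or_null <- function(x, min_factor = 3L) {
  n <- length(x)
  if (n < 3L) return(NULL)               # too short to be worth packing
  max_runs <- n %/% min_factor           # more runs than this: factor < 3
  lengths <- integer(max_runs)
  values <- integer(max_runs)
  k <- 0L                                # number of runs found so far
  i <- 1L                                # start of the current run
  while (i <= n) {
    j <- i
    # extend the run; identical() treats NA like any other value
    while (j < n && identical(x[j + 1L], x[i])) j <- j + 1L
    k <- k + 1L
    if (k > max_runs) return(NULL)       # early exit: packing will not pay off
    lengths[k] <- j - i + 1L
    values[k] <- x[i]
    i <- j + 1L
  }
  structure(list(lengths = lengths[seq_len(k)],
                 values = values[seq_len(k)]),
            class = "rle")
}

is.null(rle_or_null(sample(1:10)))       # TRUE: ten runs of length one
r <- rle_or_null(diff(1:10))             # one run: lengths 9, values 1
identical(inverse.rle(r), diff(1:10))    # TRUE: round trip restores the input
```

Because the result carries class "rle", base R's inverse.rle() can unpack it. Note that base rle() puts each NA into its own run, whereas this sketch merges adjacent NAs into one run.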

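intisasc is a faster counterpart of is.unsorted(): it checks whether x is sorted in ascending order. Note the inverted sense: intisasc returns TRUE for a sorted vector, whereas is.unsorted() returns FALSE.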

intisdesc checks whether x is sorted in descending order and by default assumes that the input x contains no NAs.

na.method="none" treats NAs (the smallest integer) like every other integer and hence returns either TRUE or FALSE.
na.method="break" checks for NAs and returns NA as soon as an NA is encountered.
na.method="skip" checks for NAs and skips over them, hence decides the return value only on the basis of non-NA values.
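The three NA policies can be mimicked in base R as follows. This is a rough sketch of the semantics only: is_desc_sketch is a hypothetical name, ties are assumed to count as sorted, and the sketch returns NA under "break" whenever any NA is present, whereas the C code may behave differently when an ordering violation precedes the first NA.

```r
# Hypothetical base-R model of the na.method semantics for a descending check.
# is_desc_sketch is an illustrative name, not a function of the package.
is_desc_sketch <- function(x, na.method = c("none", "break", "skip")) {
  na.method <- match.arg(na.method)      # first choice, "none", is the default
  if (na.method == "break" && anyNA(x)) return(NA)
  if (na.method == "skip") x <- x[!is.na(x)]
  # "none": NA is stored as the smallest integer, modelled here as -Inf
  x[is.na(x)] <- -Inf
  all(diff(x) <= 0)                      # assumption: ties count as sorted
}

is_desc_sketch(c(10:1, NA))                            # TRUE
is_desc_sketch(c(10:6, NA, 5:1))                       # FALSE
is_desc_sketch(c(10:6, NA, 5:1), na.method = "skip")   # TRUE
is_desc_sketch(c(10:6, NA, 5:1), na.method = "break")  # NA
```

Under "none", a trailing NA keeps a descending vector sorted because NA compares as the smallest value, while an NA in the middle breaks the order.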

Functions

intisasc(): check whether integer vector is ascending
intisdesc(): check whether integer vector is descending

Author

Jens Oehlschlägel

Examples


  intrle(sample(1:10))
#> NULL
  intrle(diff(1:10))
#> Run Length Encoding
#>   lengths: int 9
#>   values : int 1
  intisasc(1:10)
#> [1] TRUE
  intisasc(10:1)
#> [1] FALSE
  intisasc(c(NA, 1:10))
#> [1] NA
  intisdesc(1:10)
#> [1] FALSE
  intisdesc(c(10:1, NA))
#> [1] TRUE
  intisdesc(c(10:6, NA, 5:1))
#> [1] FALSE
  intisdesc(c(10:6, NA, 5:1), na.method="skip")
#> [1] TRUE
  intisdesc(c(10:6, NA, 5:1), na.method="break")
#> [1] NA