Functions to create a cache that accelerates many operations
hashcache(x, nunique = NULL, ...)
sortcache(x, has.na = NULL)
sortordercache(x, has.na = NULL, stable = NULL)
ordercache(x, has.na = NULL, stable = NULL, optimize = "time")
x: an atomic vector (note that currently only integer64 is supported)
nunique: the number of unique elements; giving the correct number can help reduce the size of the hashmap
...: passed to hashmap()
has.na: boolean scalar defining whether the input vector might contain
NAs. If we know there are no NAs, setting this to FALSE may speed things up. Note that you
risk a crash if there are unexpected NAs with has.na=FALSE.
stable: boolean scalar defining whether stable sorting is needed. Allowing non-stable sorting may speed things up.
optimize: by default ramsort optimizes for 'time', which requires more RAM; set to 'memory' to minimize RAM requirements at the cost of speed.
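As an illustration of the arguments described above, the following minimal sketch passes each of them explicitly. The vectors y, z and w are hypothetical data chosen to be free of NAs, which is the only reason has.na = FALSE is used here; the sketch assumes the bit64 package is installed.

```r
library(bit64)

# Hypothetical NA-free input; has.na = FALSE is only safe because of that.
y <- as.integer64(c(5, 3, 3, 8, 1, 8))
sortcache(y, has.na = FALSE)

# Request a stable ordering and prefer low memory use over speed.
z <- as.integer64(c(5, 3, 3, 8, 1, 8))
ordercache(z, has.na = FALSE, stable = TRUE, optimize = "memory")

# The data contain 4 distinct values; passing that count as nunique
# lets the hashmap be sized accordingly.
w <- as.integer64(c(5, 3, 3, 8, 1, 8))
hashcache(w, nunique = 4L)

# Record whether each vector now carries a cache.
cached <- c(y = !is.null(cache(y)),
            z = !is.null(cache(z)),
            w = !is.null(cache(w)))
```

When in doubt about NAs, leaving has.na = NULL (the default) avoids the crash risk mentioned above.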
x with a cache() that contains the results of the expensive operations,
possibly together with small derived information (such as nunique.integer64())
and previously cached results.
The results of the relatively expensive operations hashmap(), bit::ramsort(),
bit::ramsortorder(), and bit::ramorder() can be stored in a cache in
order to avoid multiple executions. Except in very specific situations, the
recommended method is hashsortorder only.
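The caching workflow described above might look like the following sketch. It assumes the bit64 package is installed and that the cache is attached to x by reference, as the call without assignment in this page's own example suggests; set.seed() is added only to make the sample reproducible.

```r
library(bit64)

set.seed(1)  # reproducible sample; not required for caching
x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))

# A freshly created vector is not expected to carry a cache.
has_cache_before <- !is.null(cache(x))

# Compute the sort and the order once and store both in the cache of x.
sortordercache(x)
has_cache_after <- !is.null(cache(x))
cached_names <- ls(cache(x))  # names of the objects held in the cache

# Later calls on x, such as unique(), can reuse the cached results
# instead of sorting again.
u <- unique(x)

# Remove the cache when the memory is needed for something else.
remcache(x)
```

Because the cached sort and order are roughly as large as x itself, removing the cache with remcache() once it is no longer needed keeps memory use under control.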
Note that we consider storing the big results from sorting and/or ordering as a relevant side-effect, and therefore storing them in the cache should require a conscious decision of the user.
cache() for caching functions and nunique.integer64() for methods benefiting
from small caches
library(bit64)
x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))
sortordercache(x)  # compute sort and order once and store them in the cache of x