Sorting: sort an ff vector – optionally in-place

ffsort(x
, aux = NULL
, has.na = TRUE
, na.last = TRUE
, decreasing = FALSE
, inplace = FALSE
, decorate = FALSE
, BATCHBYTES = getOption("ffmaxbytes")
, VERBOSE = FALSE
)

Arguments

x

an ff vector

aux

NULL or an ff vector of the same type for temporary storage

has.na

boolean scalar telling ffsort whether the vector might contain NAs. Note that you risk a crash if there are unexpected NAs with has.na=FALSE

na.last

boolean scalar telling ffsort whether to sort NAs last or first. Note that 'boolean' means that there is no third option NA as there is in sort, where na.last=NA removes the NAs

decreasing

boolean scalar telling ffsort whether to sort increasing or decreasing

inplace

boolean scalar telling ffsort whether to sort the original ff vector (TRUE) or to create a sorted copy (FALSE, the default)

decorate

boolean scalar telling ffsort whether to decorate the returned ff vector with is.sorted and na.count attributes.

BATCHBYTES

maximum number of RAM bytes ffsort should try not to exceed

VERBOSE

cat some info about the sorting
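The ordering arguments above can be tried on a tiny vector. This is a minimal sketch, assuming the ff package is installed; the data values are arbitrary illustration data. `x[]` reads the whole ff vector into RAM, which is only sensible for small vectors.

```r
library(ff)  # assumes the ff package is installed

# Small double vector with one NA (arbitrary illustration data).
x <- ff(c(3, NA, 1, 2), vmode = "double")

# Default: returns a sorted copy, increasing, NAs last; x itself is not sorted.
y <- ffsort(x)
y[]

# NAs first instead of last.
ffsort(x, na.last = FALSE)[]

# Decreasing order.
ffsort(x, decreasing = TRUE)[]

# Sort the original vector itself instead of creating a copy.
ffsort(x, inplace = TRUE)
x[]
```

With inplace=FALSE the original vector is left as it was, so the copy costs extra disk space; inplace=TRUE avoids the copy but destroys the original order.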

Details

ffsort tries to sort the vector in RAM while respecting the BATCHBYTES limit. If a fast sort is not possible, it uses a slower in-place sort (shellsort). If in-RAM sorting is not possible, it uses a (still simple) out-of-memory algorithm. As in ramsort, the in-RAM sorting method is chosen depending on context information. If a key-index sort can be used, ffsort completely avoids merging disk-based subsorts. If argument decorate=TRUE is used, then na.count(x) will return the number of NAs, and is.sorted(x) will return TRUE if the sort was done with na.last=TRUE and decreasing=FALSE.
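The general idea of sorting in batches under a byte budget and then merging the sorted batches can be sketched in base R. This is an illustration only, not ff's actual implementation: `chunked_sort` and `merge2` are hypothetical helpers, everything is held in RAM, and the assumption of 8 bytes per element applies to doubles.

```r
# Hypothetical helper (not part of ff): merge two sorted, NA-free numeric vectors.
merge2 <- function(a, b) {
  out <- numeric(length(a) + length(b))
  i <- 1; j <- 1; k <- 1
  while (i <= length(a) && j <= length(b)) {
    if (a[i] <= b[j]) {
      out[k] <- a[i]; i <- i + 1
    } else {
      out[k] <- b[j]; j <- j + 1
    }
    k <- k + 1
  }
  # Copy whatever remains of the run that is not yet exhausted.
  if (i <= length(a)) out[k:length(out)] <- a[i:length(a)]
  if (j <= length(b)) out[k:length(out)] <- b[j:length(b)]
  out
}

# Hypothetical helper (not part of ff): sort v in batches that fit
# `batchbytes`, merge the sorted batches, and put the NAs last.
chunked_sort <- function(v, batchbytes, bytes_per_elem = 8) {
  n_na <- sum(is.na(v))          # count NAs once, re-append them at the end
  v <- v[!is.na(v)]
  batchsize <- max(1, batchbytes %/% bytes_per_elem)
  if (length(v) > batchsize) {
    starts <- seq(1, length(v), by = batchsize)
    runs <- lapply(starts, function(s) {
      sort(v[s:min(s + batchsize - 1, length(v))])
    })
    v <- Reduce(merge2, runs)    # pairwise merge of the sorted batches
  } else {
    v <- sort(v)                 # everything fits into one batch
  }
  c(v, rep(NA, n_na))
}

# A budget of 16 bytes gives batches of two doubles each.
res <- chunked_sort(c(5, NA, 3, 1, 4, 2), batchbytes = 16)
```

A real out-of-memory sort would write each sorted batch to disk and stream the merge rather than keep all batches in memory; the sketch only shows how a byte budget translates into a batch size and a merge step.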

Note

The ff vector must not have a names attribute.

Value

An ff vector – optionally decorated with is.sorted and na.count, see argument 'decorate'

Author

Jens Oehlschlägel

See also

Examples

   n <- 1e6
   x <- ff(c(NA, 999999:1), vmode="double", length=n)
   x <- ffsort(x)
   x
#> ff (open) double length=1000000 (1000000)
#>       [1]       [2]       [3]       [4]       [5]       [6]       [7]       [8] 
#>         1         2         3         4         5         6         7         8 
#>            [999993]  [999994]  [999995]  [999996]  [999997]  [999998]  [999999] 
#>         :    999993    999994    999995    999996    999997    999998    999999 
#> [1000000] 
#>        NA 
   is.sorted(x)
#> [1] FALSE
   na.count(x)
#> [1] NA
   x <- ffsort(x, decorate=TRUE)
   is.sorted(x)
#> [1] TRUE
   na.count(x)
#> [1] 1
   x <- ffsort(x, BATCHBYTES=n, VERBOSE=TRUE)
#> method=shell  BATCHSIZE=125000