Function ffindexget extracts elements from an ff vector according to positive integer subscripts stored in an ff vector.
Function ffindexset performs the inverse operation: it assigns to elements of an ff vector according to positive integer subscripts stored in an ff vector.
These functions offer more control than the method dispatch of [ and [<- when an ff integer subscript is used.
An ff vector containing the elements
An ff integer vector with integer subscripts in the range from 1 to length(x).
An ff vector of the same vmode as x containing the values to be assigned
Optionally the return value of ffindexorder, see details
Optionally an ff vector of the same vmode as x in which the returned values shall be stored, see details.
Optional limit for the batchsize (see details)
Limit for the number of bytes per batch
Logical scalar that turns verbose output on or off
Accessing integer positions in an ff vector is a non-trivial task, because it can easily lead to random access to a disk file.
We avoid random access by loading batches of the subscript values into RAM, ordering them in ascending order, and only then accessing the ff values on disk.
Since ordering is expensive, it may pay to do the batched ordering once upfront with ffindexorder and then re-use the result,
similar to storing and using hybrid index information with as.hi.
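The batching idea described above can be illustrated with a minimal base R sketch. It does not use ff and is not the package's implementation: batched_get and its batchsize argument are hypothetical names, and an ordinary in-memory vector stands in for the data on disk. Each batch of subscripts is sorted, the values are read in ascending position order, and the results are written back in the order in which they were requested.

```r
# Hypothetical helper (not part of ff): batched, order-preserving extraction.
# `x` stands in for on-disk data; here it is a plain atomic vector in RAM.
batched_get <- function(x, index, batchsize = 4L) {
  n <- length(index)
  out <- vector(mode = typeof(x), length = n)
  if (n == 0L) return(out)
  for (from in seq(1L, n, by = batchsize)) {
    to <- min(from + batchsize - 1L, n)
    batch <- index[from:to]        # subscripts of this batch, held in RAM
    o <- order(batch)              # permutation that sorts the batch ascending
    sorted_vals <- x[batch[o]]     # read positions in ascending order
    out[(from:to)[o]] <- sorted_vals  # store values in the requested order
  }
  out
}

x <- letters
i <- c(9L, 2L, 5L, 26L, 1L, 7L)
res <- batched_get(x, i, batchsize = 4L)
stopifnot(identical(res, x[i]))
```

Sorting happens per batch rather than over the whole subscript vector, so only one batch of subscripts has to fit in RAM at a time. The sketch handles plain atomic vectors only; attributes such as factor levels are not carried over.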
Function ffindexget returns an ff vector with the extracted elements.
Function ffindexset returns the ff vector in which the values have been updated.
message("ff integer subscripts with ff return/assign values")
#> ff integer subscripts with ff return/assign values
x <- ff(factor(letters))
i <- ff(2:9)
xi <- x[i]
xi
#> ff (open) integer length=8 (8) levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
#> [1] [2] [3] [4] [5] [6] [7] [8]
#> b c d e : f g h i
xi[] <- NA
xi
#> ff (open) integer length=8 (8) levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
#> [1] [2] [3] [4] [5] [6] [7] [8]
#> NA NA NA NA : NA NA NA NA
x[i] <- xi
x
#> ff (open) integer length=26 (26) levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
#> [1] [2] [3] [4] [5] [6] [7] [8] [19] [20] [21] [22] [23] [24] [25]
#> a NA NA NA NA NA NA NA : s t u v w x y
#> [26]
#> z
message("ff integer subscripts: more control with ffindexget/ffindexset")
#> ff integer subscripts: more control with ffindexget/ffindexset
xi <- ffindexget(x, i, FF_RETURN=xi)
x <- ffindexset(x, i, xi)
rm(x, i, xi)
gc()
#> used (Mb) gc trigger (Mb) max used (Mb)
#> Ncells 1156080 61.8 1994352 106.6 1994352 106.6
#> Vcells 2151565 16.5 8388608 64.0 8318871 63.5