Last Observation Carried Forward
Generic function for replacing each NA with the most recent
non-NA prior to it.
Usage
na.locf(object, na.rm = TRUE, ...)
# Default S3 method
na.locf(object, na.rm = TRUE, fromLast, rev,
maxgap = Inf, rule = 2, ...)
na.locf0(object, fromLast = FALSE, maxgap = Inf, coredata = NULL)
Arguments
- object
an object.
- na.rm
logical. Should leading NAs be removed?
- fromLast
logical. Causes observations to be carried backward rather than forward. Default is FALSE. With a value of TRUE this corresponds to NOCB (next observation carried backward). It is not supported if x or xout is specified.
- rev
Use fromLast instead. This argument will be eliminated in the future in favor of fromLast.
- maxgap
Runs of more than maxgap NAs are retained, other NAs are removed and the last occurrence in the resulting series prior to each time point in xout is used as that time point's output value. (If xout is not specified this reduces to retaining runs of more than maxgap NAs while filling other NAs with the last occurrence of a non-NA.)
- rule
See approx.
- ...
further arguments passed to methods.
- coredata
logical. Should LOCF be applied to the core data of a (time series) object and then assigned to the original object again? By default, this strategy is applied to time series classes (e.g., ts, zoo, xts, etc.) where it preserves the time index.
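As a quick illustration of how fromLast and maxgap behave, here is a minimal sketch on a plain numeric vector. The vector x and the result names are made up for illustration and assume the zoo package is installed.

```r
library(zoo)

# Illustrative input: one run of two NAs and one single NA
x <- c(1, NA, NA, 4, NA, 6)

# Default: carry the last non-NA forward
forward <- na.locf(x)

# fromLast = TRUE: carry the next non-NA backward (NOCB)
backward <- na.locf(x, fromLast = TRUE)

# maxgap = 1: only runs of at most one NA are filled;
# the run of two NAs is retained
short_only <- na.locf(x, maxgap = 1)

print(forward)
print(backward)
print(short_only)
```

Here forward should be 1 1 1 4 4 6, backward 1 4 4 4 6 6, and short_only 1 NA NA 4 4 6.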
Value
An object in which each NA in the input object is replaced
by the most recent non-NA prior to it. If there are no earlier non-NAs then
the NA is omitted (if na.rm = TRUE) or it is not replaced (if na.rm = FALSE).
The arguments x and xout can be used in which case they have
the same meaning as in approx.
Note that if a multi-column zoo object has a column entirely composed of
NA then with na.rm = TRUE, the default,
the above implies that the resulting object will have
zero rows. Use na.rm = FALSE to preserve the NA values instead.
The function na.locf0 is the workhorse function underlying the default
na.locf method. It has more limited capabilities but is faster for the
special cases it covers. Implicitly, it uses na.rm=FALSE.
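The effect of na.rm, and the implicit na.rm = FALSE behaviour of na.locf0, can be sketched on a plain vector with leading NAs. The vector x and the result names are illustrative only, and the sketch assumes the zoo package is installed.

```r
library(zoo)

# Illustrative input with two leading NAs
x <- c(NA, NA, 3, NA, 5)

# na.rm = TRUE (the default): leading NAs have nothing to carry
# forward and are dropped, so the result is shorter than the input
dropped <- na.locf(x)

# na.rm = FALSE: leading NAs are kept, length is preserved
kept <- na.locf(x, na.rm = FALSE)

# na.locf0 never removes NAs, so it matches the na.rm = FALSE result
kept0 <- na.locf0(x)

print(dropped)
print(kept)
print(kept0)
```

Because na.rm = TRUE can change the length of the result, na.rm = FALSE (or na.locf0) is the safer choice when the output has to line up with the original input, for example when assigning back into a data frame column.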
Examples
library(zoo)
az <- zoo(1:6)
bz <- zoo(c(2,NA,1,4,5,2))
na.locf(bz)
#> 1 2 3 4 5 6
#> 2 2 1 4 5 2
na.locf(bz, fromLast = TRUE)
#> 1 2 3 4 5 6
#> 2 1 1 4 5 2
cz <- zoo(c(NA,9,3,2,3,2))
na.locf(cz)
#> 2 3 4 5 6
#> 9 3 2 3 2
# generate and fill in missing dates
z <- zoo(c(0.007306621, 0.007659046, 0.007681013,
0.007817548, 0.007847579, 0.007867313),
as.Date(c("1993-01-01", "1993-01-09", "1993-01-16",
"1993-01-23", "1993-01-30", "1993-02-06")))
g <- seq(start(z), end(z), "day")
na.locf(z, xout = g)
#> 1993-01-01 1993-01-02 1993-01-03 1993-01-04 1993-01-05 1993-01-06
#> 0.007306621 0.007306621 0.007306621 0.007306621 0.007306621 0.007306621
#> 1993-01-07 1993-01-08 1993-01-09 1993-01-10 1993-01-11 1993-01-12
#> 0.007306621 0.007306621 0.007659046 0.007659046 0.007659046 0.007659046
#> 1993-01-13 1993-01-14 1993-01-15 1993-01-16 1993-01-17 1993-01-18
#> 0.007659046 0.007659046 0.007659046 0.007681013 0.007681013 0.007681013
#> 1993-01-19 1993-01-20 1993-01-21 1993-01-22 1993-01-23 1993-01-24
#> 0.007681013 0.007681013 0.007681013 0.007681013 0.007817548 0.007817548
#> 1993-01-25 1993-01-26 1993-01-27 1993-01-28 1993-01-29 1993-01-30
#> 0.007817548 0.007817548 0.007817548 0.007817548 0.007817548 0.007847579
#> 1993-01-31 1993-02-01 1993-02-02 1993-02-03 1993-02-04 1993-02-05
#> 0.007847579 0.007847579 0.007847579 0.007847579 0.007847579 0.007847579
#> 1993-02-06
#> 0.007867313
# similar but use a 2 second grid
z <- zoo(1:9, as.POSIXct(c("2010-01-04 09:30:02", "2010-01-04 09:30:06",
"2010-01-04 09:30:07", "2010-01-04 09:30:08", "2010-01-04 09:30:09",
"2010-01-04 09:30:10", "2010-01-04 09:30:11", "2010-01-04 09:30:13",
"2010-01-04 09:30:14")))
g <- seq(start(z), end(z), by = "2 sec")
na.locf(z, xout = g)
#> 2010-01-04 09:30:02 2010-01-04 09:30:04 2010-01-04 09:30:06 2010-01-04 09:30:08
#> 1 1 2 4
#> 2010-01-04 09:30:10 2010-01-04 09:30:12 2010-01-04 09:30:14
#> 6 7 9
## get 5th of every month or most recent date prior to 5th if 5th missing.
## The first result is indexed by the 5th itself; the second variant
## further below has the index of the date actually used.
z <- zoo(c(1311.56, 1309.04, 1295.5, 1296.6, 1286.57, 1288.12,
1289.12, 1289.12, 1285.33, 1307.65, 1309.93, 1311.46, 1311.28,
1308.11, 1301.74, 1305.41, 1309.72, 1310.61, 1305.19, 1313.21,
1307.85, 1312.25, 1325.76), as.Date(c(13242, 13244,
13245, 13248, 13249, 13250, 13251, 13252, 13255, 13256, 13257,
13258, 13259, 13262, 13263, 13264, 13265, 13266, 13269, 13270,
13271, 13272, 13274)))
# z.na is same as z but with missing days added (with NAs)
# It is formed by merging z with a zero-width series having all the dates.
rng <- range(time(z))
z.na <- merge(z, zoo(, seq(rng[1], rng[2], by = "day")))
# use na.locf to bring values forward picking off 5th of month
na.locf(z.na)[as.POSIXlt(time(z.na))$mday == 5]
#> 2006-04-05 2006-05-05
#> 1311.56 1312.25
## this is the same as the last one except instead of always using the
## 5th of month in the result we show the date actually used
# idx has NAs wherever z.na does but has 1, 2, 3, ... instead of
# z.na's data values (so idx can be used for indexing)
idx <- coredata(na.locf(seq_along(z.na) + (0 * z.na)))
# pick off those elements of z.na that correspond to 5th
z.na[idx[as.POSIXlt(time(z.na))$mday == 5]]
#> 2006-04-04 2006-05-04
#> 1311.56 1312.25
## only fill single-day gaps
merge(z.na, filled1 = na.locf(z.na, maxgap = 1))
#> z.na filled1
#> 2006-04-04 1311.56 1311.56
#> 2006-04-05 NA 1311.56
#> 2006-04-06 1309.04 1309.04
#> 2006-04-07 1295.50 1295.50
#> 2006-04-08 NA NA
#> 2006-04-09 NA NA
#> 2006-04-10 1296.60 1296.60
#> 2006-04-11 1286.57 1286.57
#> 2006-04-12 1288.12 1288.12
#> 2006-04-13 1289.12 1289.12
#> 2006-04-14 1289.12 1289.12
#> 2006-04-15 NA NA
#> 2006-04-16 NA NA
#> 2006-04-17 1285.33 1285.33
#> 2006-04-18 1307.65 1307.65
#> 2006-04-19 1309.93 1309.93
#> 2006-04-20 1311.46 1311.46
#> 2006-04-21 1311.28 1311.28
#> 2006-04-22 NA NA
#> 2006-04-23 NA NA
#> 2006-04-24 1308.11 1308.11
#> 2006-04-25 1301.74 1301.74
#> 2006-04-26 1305.41 1305.41
#> 2006-04-27 1309.72 1309.72
#> 2006-04-28 1310.61 1310.61
#> 2006-04-29 NA NA
#> 2006-04-30 NA NA
#> 2006-05-01 1305.19 1305.19
#> 2006-05-02 1313.21 1313.21
#> 2006-05-03 1307.85 1307.85
#> 2006-05-04 1312.25 1312.25
#> 2006-05-05 NA 1312.25
#> 2006-05-06 1325.76 1325.76
## fill NAs in first column by inflating the most recent non-NA
## by the growth in second column. Note that elements of x-x
## are NA if the corresponding element of x is NA and zero otherwise
m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, NA), seq(7)^2), as.Date(1:7))
r <- na.locf(m[,1]) * m[,2] / na.locf(m[,2] + (m[,1]-m[,1]))
cbind(V1 = r, V2 = m[,2])
#> V1 V2
#> 1970-01-02 1.0 1
#> 1970-01-03 2.0 4
#> 1970-01-04 4.5 9
#> 1970-01-05 8.0 16
#> 1970-01-06 5.0 25
#> 1970-01-07 7.2 36
#> 1970-01-08 9.8 49
## repeat a quarterly value every month
## preserving NAs
zq <- zoo(c(1, NA, 3, 4), as.yearqtr(2000) + 0:3/4)
tt <- as.yearmon(start(zq)) + seq(0, len = 3 * length(zq))/12
na.locf(zq, xout = tt, maxgap = 0)
#> Jan 2000 Feb 2000 Mar 2000 Apr 2000 May 2000 Jun 2000 Jul 2000 Aug 2000
#> 1 1 1 NA NA NA 3 3
#> Sep 2000 Oct 2000 Nov 2000 Dec 2000
#> 3 4 4 4
## na.locf() can also be mimicked with ave()
x <- c(NA, 10, NA, NA, 20, NA)
f <- function(x) x[1]
ave(x, cumsum(!is.na(x)), FUN = f)
#> [1] NA 10 10 10 20 20
## by replacing f() with other functions various generalizations can be
## obtained, e.g.,
f <- function(x) if (length(x) > 3) x else x[1] # like maxgap = 2
f <- function(x) replace(x, 1:min(length(x), 3), x[1]) # replace up to 2 NAs
f <- function(x) if (!is.na(x[1]) && x[1] > 0) x[1] else x # only positive numbers
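To make the maxgap-like variant concrete, the following sketch compares it with na.locf() itself. The input x and the name fill_short are made up for illustration; the comparison uses maxgap = 2 because a group of more than 3 elements corresponds to a run of more than 2 NAs after a non-NA value.

```r
library(zoo)

# Illustrative input: a run of three NAs and a single trailing NA
x <- c(NA, 10, NA, NA, NA, 20, NA)

# Each group starts at a non-NA value (group 0 holds any leading NAs).
# Groups longer than 3 are left untouched, shorter ones are filled
# with their first element.
fill_short <- function(g) if (length(g) > 3) g else g[1]

via_ave  <- ave(x, cumsum(!is.na(x)), FUN = fill_short)
via_locf <- na.locf(x, maxgap = 2, na.rm = FALSE)

print(via_ave)
print(via_locf)
```

Both calls should leave the run of three NAs in place and fill only the trailing single NA with 20.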