Generic function for replacing each NA with the most recent
non-NA prior to it.
na.locf(object, na.rm = TRUE, ...)
# Default S3 method
na.locf(object, na.rm = TRUE, fromLast, rev,
maxgap = Inf, rule = 2, ...)
na.locf0(object, fromLast = FALSE, maxgap = Inf, coredata = NULL)
object: an object.
na.rm: logical. Should leading NAs be removed?
fromLast: logical. Causes observations to be carried backward rather
than forward. Default is FALSE. With a value of TRUE
this corresponds to NOCB (next observation carried backward).
It is not supported if x or xout is specified.
rev: Use fromLast instead. This argument will
be eliminated in the future in favor of fromLast.
maxgap: Runs of more than maxgap NAs are retained,
other NAs are removed and the last occurrence in the resulting series
prior to each time point in xout is used as that time point's output value.
(If xout is not specified this reduces to retaining runs of more than
maxgap NAs while filling other NAs with the last
occurrence of a non-NA.)
rule: See approx.
...: further arguments passed to methods.
coredata: logical. Should LOCF be applied to the core data
of a (time series) object and then assigned to the original object
again? By default, this strategy is applied to time series classes
(e.g., ts, zoo, xts, etc.) where it preserves
the time index.
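As a rough illustration of the fromLast and maxgap arguments, the following sketch applies them to a plain numeric vector. The vector x is made up for this illustration, and na.rm = FALSE is used throughout so that no elements are dropped.

```r
library(zoo)

# Hypothetical input, chosen only for illustration
x <- c(NA, 2, NA, NA, 5, NA)

# LOCF: carry the last observation forward; the leading NA has no
# earlier value to draw on and is kept because na.rm = FALSE
locf <- na.locf(x, na.rm = FALSE)

# NOCB: carry the next observation backward; here the trailing NA
# has no later value and is kept
nocb <- na.locf(x, fromLast = TRUE, na.rm = FALSE)

# Fill only runs of at most one NA; the run of two NAs is retained
short <- na.locf(x, maxgap = 1, na.rm = FALSE)
```

Comparing locf, nocb and short side by side (for example with cbind) shows which positions each setting fills.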
An object in which each NA in the input object is replaced
by the most recent non-NA prior to it. If there are no earlier non-NAs then
the NA is omitted (if na.rm = TRUE) or it is not replaced (if na.rm = FALSE).
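The effect of na.rm on leading NAs can be seen in a small sketch on a plain vector (x is a made-up input, not one of the series used in the examples):

```r
library(zoo)

# Hypothetical input with two leading NAs
x <- c(NA, NA, 1, NA, 2)

# Default na.rm = TRUE: leading NAs have no earlier value and are omitted,
# so the result is shorter than the input
trimmed <- na.locf(x)

# na.rm = FALSE: leading NAs are left in place, so the length is preserved
kept <- na.locf(x, na.rm = FALSE)

length(trimmed)
length(kept)
```

Use na.rm = FALSE whenever the result has to line up element by element with the input.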
The arguments x and xout can also be supplied, in which case they have
the same meaning as in approx.
Note that if a multi-column zoo object has a column entirely composed of
NA then with na.rm = TRUE, the default,
the above implies that the resulting object will have
zero rows. Use na.rm = FALSE to preserve the NA values instead.
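A minimal sketch of this situation, assuming a small two-column zoo object z invented for the purpose (column b is entirely NA):

```r
library(zoo)

# Hypothetical two-column series; column "b" contains only NAs
z <- zoo(cbind(a = c(1, NA, 3), b = NA_real_))

# Default na.rm = TRUE: every row still contains an NA after filling,
# so all rows are dropped
dropped <- na.locf(z)

# na.rm = FALSE: column "a" is filled and column "b" stays NA
kept <- na.locf(z, na.rm = FALSE)

NROW(dropped)
NROW(kept)
```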
The function na.locf0 is the workhorse function underlying the default
na.locf method. It has more limited capabilities but is faster for the
special cases it covers. Implicitly, it uses na.rm=FALSE.
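To illustrate the relationship between the two functions, the sketch below compares na.locf0 with na.locf(..., na.rm = FALSE) on a plain numeric vector. The input x is an arbitrary example; the comparison only demonstrates the documented na.rm = FALSE behaviour and says nothing about relative speed.

```r
library(zoo)

# Arbitrary example vector with leading, interior and trailing NAs
x <- c(NA, 10, NA, NA, 20, NA)

# Workhorse function: never drops leading NAs
fast <- na.locf0(x)

# Default method with na.rm = FALSE should give the same values
general <- na.locf(x, na.rm = FALSE)

isTRUE(all.equal(fast, general))
```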
az <- zoo(1:6)
bz <- zoo(c(2,NA,1,4,5,2))
na.locf(bz)
#> 1 2 3 4 5 6
#> 2 2 1 4 5 2
na.locf(bz, fromLast = TRUE)
#> 1 2 3 4 5 6
#> 2 1 1 4 5 2
cz <- zoo(c(NA,9,3,2,3,2))
na.locf(cz)
#> 2 3 4 5 6
#> 9 3 2 3 2
# generate and fill in missing dates
z <- zoo(c(0.007306621, 0.007659046, 0.007681013,
0.007817548, 0.007847579, 0.007867313),
as.Date(c("1993-01-01", "1993-01-09", "1993-01-16",
"1993-01-23", "1993-01-30", "1993-02-06")))
g <- seq(start(z), end(z), "day")
na.locf(z, xout = g)
#> 1993-01-01 1993-01-02 1993-01-03 1993-01-04 1993-01-05 1993-01-06
#> 0.007306621 0.007306621 0.007306621 0.007306621 0.007306621 0.007306621
#> 1993-01-07 1993-01-08 1993-01-09 1993-01-10 1993-01-11 1993-01-12
#> 0.007306621 0.007306621 0.007659046 0.007659046 0.007659046 0.007659046
#> 1993-01-13 1993-01-14 1993-01-15 1993-01-16 1993-01-17 1993-01-18
#> 0.007659046 0.007659046 0.007659046 0.007681013 0.007681013 0.007681013
#> 1993-01-19 1993-01-20 1993-01-21 1993-01-22 1993-01-23 1993-01-24
#> 0.007681013 0.007681013 0.007681013 0.007681013 0.007817548 0.007817548
#> 1993-01-25 1993-01-26 1993-01-27 1993-01-28 1993-01-29 1993-01-30
#> 0.007817548 0.007817548 0.007817548 0.007817548 0.007817548 0.007847579
#> 1993-01-31 1993-02-01 1993-02-02 1993-02-03 1993-02-04 1993-02-05
#> 0.007847579 0.007847579 0.007847579 0.007847579 0.007847579 0.007847579
#> 1993-02-06
#> 0.007867313
# similar but use a 2 second grid
z <- zoo(1:9, as.POSIXct(c("2010-01-04 09:30:02", "2010-01-04 09:30:06",
"2010-01-04 09:30:07", "2010-01-04 09:30:08", "2010-01-04 09:30:09",
"2010-01-04 09:30:10", "2010-01-04 09:30:11", "2010-01-04 09:30:13",
"2010-01-04 09:30:14")))
g <- seq(start(z), end(z), by = "2 sec")
na.locf(z, xout = g)
#> 2010-01-04 09:30:02 2010-01-04 09:30:04 2010-01-04 09:30:06 2010-01-04 09:30:08
#>                   1                   1                   2                   4
#> 2010-01-04 09:30:10 2010-01-04 09:30:12 2010-01-04 09:30:14
#>                   6                   7                   9
## get 5th of every month or most recent date prior to 5th if 5th missing.
## Result is indexed by the 5th of each month.
z <- zoo(c(1311.56, 1309.04, 1295.5, 1296.6, 1286.57, 1288.12,
1289.12, 1289.12, 1285.33, 1307.65, 1309.93, 1311.46, 1311.28,
1308.11, 1301.74, 1305.41, 1309.72, 1310.61, 1305.19, 1313.21,
1307.85, 1312.25, 1325.76), as.Date(c(13242, 13244,
13245, 13248, 13249, 13250, 13251, 13252, 13255, 13256, 13257,
13258, 13259, 13262, 13263, 13264, 13265, 13266, 13269, 13270,
13271, 13272, 13274)))
# z.na is same as z but with missing days added (with NAs)
# It is formed by merging z with a zero-width series having all the dates.
rng <- range(time(z))
z.na <- merge(z, zoo(, seq(rng[1], rng[2], by = "day")))
# use na.locf to bring values forward picking off 5th of month
na.locf(z.na)[as.POSIXlt(time(z.na))$mday == 5]
#> 2006-04-05 2006-05-05
#> 1311.56 1312.25
## this is the same as the last one except instead of always using the
## 5th of month in the result we show the date actually used
# idx has NAs wherever z.na does but has 1, 2, 3, ... instead of
# z.na's data values (so idx can be used for indexing)
idx <- coredata(na.locf(seq_along(z.na) + (0 * z.na)))
# pick off those elements of z.na that correspond to 5th
z.na[idx[as.POSIXlt(time(z.na))$mday == 5]]
#> 2006-04-04 2006-05-04
#> 1311.56 1312.25
## only fill single-day gaps
merge(z.na, filled1 = na.locf(z.na, maxgap = 1))
#> z.na filled1
#> 2006-04-04 1311.56 1311.56
#> 2006-04-05 NA 1311.56
#> 2006-04-06 1309.04 1309.04
#> 2006-04-07 1295.50 1295.50
#> 2006-04-08 NA NA
#> 2006-04-09 NA NA
#> 2006-04-10 1296.60 1296.60
#> 2006-04-11 1286.57 1286.57
#> 2006-04-12 1288.12 1288.12
#> 2006-04-13 1289.12 1289.12
#> 2006-04-14 1289.12 1289.12
#> 2006-04-15 NA NA
#> 2006-04-16 NA NA
#> 2006-04-17 1285.33 1285.33
#> 2006-04-18 1307.65 1307.65
#> 2006-04-19 1309.93 1309.93
#> 2006-04-20 1311.46 1311.46
#> 2006-04-21 1311.28 1311.28
#> 2006-04-22 NA NA
#> 2006-04-23 NA NA
#> 2006-04-24 1308.11 1308.11
#> 2006-04-25 1301.74 1301.74
#> 2006-04-26 1305.41 1305.41
#> 2006-04-27 1309.72 1309.72
#> 2006-04-28 1310.61 1310.61
#> 2006-04-29 NA NA
#> 2006-04-30 NA NA
#> 2006-05-01 1305.19 1305.19
#> 2006-05-02 1313.21 1313.21
#> 2006-05-03 1307.85 1307.85
#> 2006-05-04 1312.25 1312.25
#> 2006-05-05 NA 1312.25
#> 2006-05-06 1325.76 1325.76
## fill NAs in first column by inflating the most recent non-NA
## by the growth in second column. Note that elements of x-x
## are NA if the corresponding element of x is NA and zero otherwise
m <- zoo(cbind(c(1, 2, NA, NA, 5, NA, NA), seq(7)^2), as.Date(1:7))
r <- na.locf(m[,1]) * m[,2] / na.locf(m[,2] + (m[,1]-m[,1]))
cbind(V1 = r, V2 = m[,2])
#> V1 V2
#> 1970-01-02 1.0 1
#> 1970-01-03 2.0 4
#> 1970-01-04 4.5 9
#> 1970-01-05 8.0 16
#> 1970-01-06 5.0 25
#> 1970-01-07 7.2 36
#> 1970-01-08 9.8 49
## repeat a quarterly value every month
## preserving NAs
zq <- zoo(c(1, NA, 3, 4), as.yearqtr(2000) + 0:3/4)
tt <- as.yearmon(start(zq)) + seq(0, length.out = 3 * length(zq))/12
na.locf(zq, xout = tt, maxgap = 0)
#> Jan 2000 Feb 2000 Mar 2000 Apr 2000 May 2000 Jun 2000 Jul 2000 Aug 2000
#>        1        1        1       NA       NA       NA        3        3
#> Sep 2000 Oct 2000 Nov 2000 Dec 2000
#>        3        4        4        4
## na.locf() can also be mimicked with ave()
x <- c(NA, 10, NA, NA, 20, NA)
f <- function(x) x[1]
ave(x, cumsum(!is.na(x)), FUN = f)
#> [1] NA 10 10 10 20 20
## by replacing f() with other functions various generalizations can be
## obtained, e.g.,
f <- function(x) if (length(x) > 3) x else x[1] # like maxgap
f <- function(x) replace(x, 1:min(length(x), 3), x[1]) # replace up to 2 NAs
f <- function(x) if (!is.na(x[1]) && x[1] > 0) x[1] else x # only positive numbers