Cox Survival Estimates
Compute survival probabilities and optional confidence limits for
Cox survival models. If x=TRUE, y=TRUE were specified to cph,
confidence limits use the correct formula for any combination of
predictors. Otherwise, if surv=TRUE was specified to cph,
confidence limits are based only on standard errors of log(S(t))
at the mean value of \(X\beta\). If the model
contained only stratification factors, or if predictions are being
requested near the mean of each covariable, this approximation will be
accurate. Unless times is given, at most one observation may be
predicted.
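As a reading aid, the quantities described above can be written out under the usual Cox formulation. The notation here is an assumption and is not taken from the package source: \(\hat S_0(t)\) denotes the estimated underlying survival function at a (centered) linear predictor of zero, \(\widehat{\mathrm{se}}\) is the standard error of \(\log \hat S\), and \(\alpha = 1 - \) `conf.int`. The interval shown corresponds to limits computed on the log scale (`conf.type="log"`).

```latex
\hat S(t \mid X) = \hat S_0(t)^{\exp(X\hat\beta)},
\qquad
\exp\!\left\{ \log \hat S(t \mid X) \;\pm\; z_{1-\alpha/2}\,
      \widehat{\mathrm{se}}\!\left[\log \hat S(t \mid X)\right] \right\}
```

When only `surv=TRUE` was used, the description above implies that \(\widehat{\mathrm{se}}\) is evaluated at the mean of \(X\beta\) rather than at the requested \(X\), which is why the limits are approximate away from the mean.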
Arguments
- fit
a model fit from cph
- newdata
a data frame containing predictor variable combinations for which predictions are desired
- linear.predictors
a vector of linear predictor values (centered) for which predictions are desired. If the model is stratified, the "strata" attribute must be attached to this vector (see example).
- x
a design matrix at which to compute estimates, with any strata attached as a "strata" attribute. Only one of newdata, linear.predictors, or x may be specified. If none is specified, but times is specified, you will get survival predictions at all subjects' linear predictor and strata values.
- times
a vector of times at which to get predictions. If omitted, predictions are made at all unique failure times in the original input data.
- loglog
set to TRUE to make the log-log transformation of survival estimates and confidence limits.
- fun
any function to transform the estimates and confidence limits (loglog is a special case)
- conf.int
set to FALSE or 0 to suppress confidence limits, or e.g. .95 to cause 0.95 confidence limits to be computed
- type
see survfit.coxph
- vartype
see survfit.coxph
- conf.type
specifies the basis for computing confidence limits. "log" is the default as in the survival package.
- se.fit
set to TRUE to get standard errors of log predicted survival (no matter what conf.type is). If FALSE, confidence limits are suppressed.
- individual
set to TRUE to have survfit interpret newdata as specifying a covariable path for a single individual (represented by multiple records).
- what
Normally use what="survival" to estimate survival probabilities at times that may not correspond to the subjects' own times. what="parallel" assumes that the length of times is the number of subjects (or one), and causes survest to estimate the ith subject's survival probability at the ith value of times (or at the scalar value of times). what="parallel" is used by val.surv for example.
- ...
unused
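The difference between what="survival" and what="parallel" can be illustrated without fitting a model. The sketch below uses only base R and a made-up exponential survival function with hypothetical hazards `h`; it does not call survest and is meant only to show which subject/time pairs each option evaluates.

```r
# Hypothetical constant hazards for three subjects (illustration only)
h     <- c(0.02, 0.05, 0.10)
times <- c(2, 4, 6)

# Analogue of what = "survival": every subject at every time
# (rows = subjects, columns = times)
S_all <- outer(h, times, function(h, t) exp(-h * t))

# Analogue of what = "parallel": subject i evaluated only at times[i]
S_par <- exp(-h * times)

# The parallel estimates are the diagonal of the full matrix
all.equal(diag(S_all), S_par)
```

The parallel form avoids computing the full subjects-by-times matrix when each subject has a single time of interest.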
Value
If times is omitted, returns a list with the elements
time, n.risk, n.event, surv, call
(calling statement), and optionally std.err, upper,
lower, conf.type, conf.int. The estimates in this
case correspond to one subject. If times is specified, the
returned list has possible components time, surv,
std.err, lower, and upper. These will be matrices
(except for time) if more than one subject is being predicted,
with rows representing subjects and columns representing times.
If times has only one time, these are reduced to vectors with
the number of elements equal to the number of subjects.
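The shapes described above can be checked on a small simulated data set. This is a minimal sketch that assumes the rms and survival packages are installed; the data frame `d` and its variables are invented for illustration, and no particular numeric output is implied.

```r
library(survival)  # Surv()
library(rms)       # cph(), survest()

# Simulated data (hypothetical, for illustration only)
set.seed(1)
n    <- 200
age  <- rnorm(n, 50, 10)
dt   <- rexp(n, rate = 0.05 * exp(0.03 * (age - 50)))
cens <- runif(n, 0, 15)
d    <- data.frame(age = age,
                   dt  = pmin(dt, cens),
                   e   = as.integer(dt <= cens))

f   <- cph(Surv(dt, e) ~ age, data = d, x = TRUE, y = TRUE)
new <- data.frame(age = c(40, 50, 60))

# Several subjects, several times: expect a subjects-by-times matrix
p <- survest(f, newdata = new, times = c(2, 4))
dim(p$surv)

# Several subjects, a single time: expect one value per subject
p1 <- survest(f, newdata = new, times = 4)
length(p1$surv)
```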
Details
The result is passed through naresid if newdata,
linear.predictors, and x are not specified, to restore
placeholders for NAs.
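To show what restoring placeholders for NAs means, the sketch below applies naresid from the stats package to an ordinary lm fit. The linear model and the data frame `d` are stand-ins chosen for brevity; the example illustrates the padding behaviour only and does not reproduce what survest does internally.

```r
# One row has a missing predictor and is dropped during fitting
d <- data.frame(x = c(1, 2, NA, 4, 5),
                y = c(2.1, 3.9, 6.2, 8.1, 9.8))

# na.exclude records which rows were dropped so they can be padded back later
fit <- lm(y ~ x, data = d, na.action = na.exclude)

# The stored fitted values cover only the rows actually used
length(fit$fitted.values)

# naresid() re-inserts NA at the positions of the dropped rows
padded <- naresid(fit$na.action, fit$fitted.values)
length(padded)
is.na(padded)
```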
Examples
# Simulate data from a population model in which the log hazard
# function is linear in age and there is no age x sex interaction
# Proportional hazards holds for both variables but we
# unnecessarily stratify on sex to see what happens
require(survival)
require(rms)   # provides cph, survest, datadist, strat; attaches Hmisc for label()
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
Srv <- Surv(dt,e)
f <- cph(Srv ~ age*strat(sex), x=TRUE, y=TRUE)  # or surv=TRUE
#> Error in Design(data, formula, specials = c("strat", "strata")): dataset dd not found for options(datadist=)
survest(f, expand.grid(age=c(20,40,60),sex=c("Male","Female")),
        times=c(2,4,6), conf.int=.9)
#> Error: object 'f' not found
f <- update(f, surv=TRUE)
#> Error: object 'f' not found
lp <- c(0, .5, 1)
f$strata # check strata names
#> Error: object 'f' not found
attr(lp,'strata') <- rep(1,3) # or rep('sex=Female',3)
survest(f, linear.predictors=lp, times=c(2,4,6))
#> Error: object 'f' not found
# Test survest by comparing to survfit.coxph for a more complex model
f <- cph(Srv ~ pol(age,2)*strat(sex), x=TRUE, y=TRUE)
#> Error in Design(data, formula, specials = c("strat", "strata")): dataset dd not found for options(datadist=)
survest(f, data.frame(age=median(age), sex=levels(sex)), times=6)
#> Error: object 'f' not found
age2 <- age^2
f2 <- coxph(Srv ~ (age + age2)*strata(sex))
new <- data.frame(age=median(age), age2=median(age)^2, sex='Male')
summary(survfit(f2, new), times=6)
#> Call: survfit(formula = f2, newdata = new)
#>
#> 1
#> time n.risk n.event survival std.err lower 95% CI
#> 6.0000 285.0000 56.0000 0.8839 0.0178 0.8497
#> upper 95% CI
#> 0.9196
#>
new$sex <- 'Female'
summary(survfit(f2, new), times=6)
#> Call: survfit(formula = f2, newdata = new)
#>
#> 1
#> time n.risk n.event survival std.err lower 95% CI
#> 6.0000 206.0000 90.0000 0.7731 0.0246 0.7264
#> upper 95% CI
#> 0.8228
#>
options(datadist=NULL)
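The examples above check survest against survfit.coxph. As a further hedged cross-check that uses only the survival package, the sketch below rebuilds a predicted survival curve by hand from the baseline cumulative hazard and compares it with survfit. The simulated data and the names `d`, `new`, and `S_manual` are assumptions made for this illustration, and the model is deliberately unstratified to keep the algebra simple.

```r
library(survival)

# Simulated data (hypothetical, for illustration only)
set.seed(2)
n    <- 300
age  <- rnorm(n, 50, 10)
dt   <- rexp(n, rate = 0.05 * exp(0.03 * (age - 50)))
cens <- runif(n, 0, 15)
d    <- data.frame(age    = age,
                   time   = pmin(dt, cens),
                   status = as.integer(dt <= cens))

fit <- coxph(Surv(time, status) ~ age, data = d)
new <- data.frame(age = 60)

# Survival curve for the new subject as computed by survfit
sf <- survfit(fit, newdata = new)

# Baseline cumulative hazard at age = 0 (uncentered), then scale by exp(x * beta)
bh       <- basehaz(fit, centered = FALSE)
S_manual <- exp(-bh$hazard * exp(unname(coef(fit)) * new$age))

# The two curves should agree up to numerical tolerance
all.equal(as.numeric(sf$surv), S_manual)
```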