Parametric Survival Model
psm is a modification of Therneau's survreg function for
fitting the accelerated failure time family of parametric survival
models. psm uses the rms class for automatic
anova, fastbw, calibrate, validate, and
other functions. Hazard.psm, Survival.psm,
Quantile.psm, and Mean.psm create S functions that
evaluate the hazard, survival, quantile, and mean (expected value)
functions analytically, as functions of time or probabilities and the
linear predictor values. The Nagelkerke R^2 and adjusted
Maddala-Cox-Snell R^2 are computed. For the latter the notation is
R2(p,m) where p is the number of regression coefficients being
adjusted for and m is the effective sample size (number of uncensored
observations). See R2Measures for more information.
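The R2(p,m) notation can be illustrated with a small base R sketch. It assumes the adjusted Maddala-Cox-Snell measure has the form 1 - exp(-(LR - p)/m), where LR is the likelihood ratio chi-square, and that the Nagelkerke measure rescales the unadjusted version by its maximum attainable value; the R2Measures help page is the authoritative source. The function r2_sketch and all input numbers are hypothetical and are not part of rms.

```r
# Hypothetical helper, not part of rms: generalized R^2 measures computed
# from a likelihood ratio chi-square statistic.
#   lr   : likelihood ratio chi-square of the fitted model vs. the null model
#   p    : number of regression coefficients being adjusted for
#   n    : raw sample size
#   m    : effective sample size (number of uncensored observations)
#   dev0 : -2 log likelihood of the null (intercept-only) model
r2_sketch <- function(lr, p, n, m, dev0) {
  # Assumed form of the adjusted Maddala-Cox-Snell measure
  mcs <- function(lr, p, size) 1 - exp(-(lr - p) / size)
  c("Nagelkerke" = (1 - exp(-lr / n)) / (1 - exp(-dev0 / n)),
    "R2(p,n)"    = mcs(lr, p, n),
    "R2(p,m)"    = mcs(lr, p, m))
}

# Made-up inputs: 400 subjects, 120 uncensored, 3 coefficients
r2 <- r2_sketch(lr = 60, p = 3, n = 400, m = 120, dev0 = 900)
print(round(r2, 3))
```

Because m is smaller than n under censoring, R2(p,m) is larger than R2(p,n) for the same LR statistic.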
For the print method, the output format is controlled by running
options(prType="lang") beforehand, where
lang is "plain" (the default), "latex", or
"html".
The residuals.psm function exists mainly to compute normalized
(standardized) residuals and to censor them (i.e., return them as
Surv objects) just as the original failure time variable was
censored. These residuals are useful for checking the underlying
distributional assumption (see the examples). To get these residuals,
the fit must have specified y=TRUE. A lines method for these
residuals automatically draws a curve with the assumed standardized
survival distribution. A survplot method runs the standardized
censored residuals through npsurv to get Kaplan-Meier estimates,
with optional stratification (automatically grouping a continuous
variable into quantiles) and then through survplot.npsurv to plot
them. Then lines is invoked to show the theoretical curve. Other
types of residuals are computed by residuals using
residuals.survreg.
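The censored normalized residuals can be sketched by hand with survreg from the survival package. This is only an illustration of the definition (link(Y) - linear predictor)/scale with a log link, not the code residuals.psm runs; the simulated data and the variable names x, t0, cens, z, and max_gap are assumptions made for the example.

```r
library(survival)

set.seed(2)
n    <- 200
x    <- rnorm(n)
# Weibull failure times whose log-scale location depends on x
t0   <- rweibull(n, shape = 1.5, scale = exp(1 + 0.5 * x))
cens <- runif(n, 0, 10)              # censoring times
time   <- pmin(t0, cens)
status <- as.integer(t0 <= cens)     # 1 = event observed, 0 = censored

fit <- survreg(Surv(time, status) ~ x, dist = "weibull")

# Normalized residual: (log(Y) - linear predictor) / scale,
# censored in the same way as the original failure time
z <- (log(time) - fit$linear.predictors) / fit$scale
r <- Surv(z, status)

# Compare the Kaplan-Meier estimate for the residuals with the standardized
# survival function assumed by the Weibull model, exp(-exp(z))
km          <- survfit(r ~ 1)
theoretical <- exp(-exp(km$time))
max_gap     <- max(abs(km$surv - theoretical))
```

A small max_gap is consistent with the assumed distribution; the survplot and lines methods described above show the same comparison graphically.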
Usage
psm(formula,
data=environment(formula), weights,
subset, na.action=na.delete, dist="weibull",
init=NULL, scale=0,
control=survreg.control(),
parms=NULL,
model=FALSE, x=FALSE, y=TRUE, time.inc, ...)
# S3 method for class 'psm'
print(x, correlation=FALSE, digits=4, r2=c(0,2,4), coefs=TRUE,
pg=FALSE, title, ...)
Hazard(object, ...)
# S3 method for class 'psm'
Hazard(object, ...) # for psm fit
# E.g. lambda <- Hazard(fit)
Survival(object, ...)
# S3 method for class 'psm'
Survival(object, ...) # for psm
# E.g. survival <- Survival(fit)
# S3 method for class 'psm'
Quantile(object, ...) # for psm
# E.g. quantsurv <- Quantile(fit)
# S3 method for class 'psm'
Mean(object, ...) # for psm
# E.g. meant <- Mean(fit)
# lambda(times, lp) # get hazard function at t=times, xbeta=lp
# survival(times, lp) # survival function at t=times, lp
# quantsurv(q, lp) # quantiles of survival time
# meant(lp) # mean survival time
# S3 method for class 'psm'
residuals(object, type=c("censored.normalized",
"response", "deviance", "dfbeta",
"dfbetas", "working", "ldcase", "ldresp", "ldshape", "matrix", "score"), ...)
# S3 method for class 'residuals.psm.censored.normalized'
survplot(fit, x, g=4, col, main, ...)
# S3 method for class 'residuals.psm.censored.normalized'
lines(x, n=100, lty=1, xlim,
lwd=3, ...)
# for type="censored.normalized"
Arguments
- formula
an S statistical model formula. Interactions up to third order are supported. The left hand side must be a Surv object.
- object
a fit created by psm. For survplot with residuals from psm, object is the result of residuals.psm.
- fit
a fit created by psm
- data, subset, weights, dist, scale, init, na.action, control
see survreg.
- parms
a list of fixed parameters. For the \(t\)-distribution this is the degrees of freedom; most of the distributions have no parameters.
- model
set to TRUE to include the model frame in the returned object
- x
set to TRUE to include the design matrix in the object produced by psm. For the survplot method, x is an optional stratification variable (character, numeric, or categorical). For lines.residuals.psm.censored.normalized, x is the result of residuals.psm. For print it is the result of psm.
- y
set to TRUE to include the Surv() matrix
- time.inc
setting for default time spacing. Used in constructing the time axis in survplot, and also in making confidence bars. Default is 30 if the time variable has units="Day", 1 otherwise, unless maximum follow-up time \(< 1\), in which case max time/10 is used as time.inc. If time.inc is not given and max time/default time.inc is \(> 25\), time.inc is increased.
- correlation
set to TRUE to print the correlation matrix for parameter estimates
- digits
number of places to print to the right of the decimal point
- r2
vector of integers specifying which R^2 measures to print, with 0 for Nagelkerke R^2 and 1:4 corresponding to the 4 measures computed by R2Measures. Default is to print Nagelkerke (labeled R2) and the second and fourth R2Measures, which are the measures adjusted for the number of predictors, first for the raw sample size then for the effective sample size, which here is the number of uncensored observations.
- coefs
specify coefs=FALSE to suppress printing the table of model coefficients, standard errors, etc. Specify coefs=n to print only the first n regression coefficients in the model.
- pg
set to TRUE to print g-indexes
- title
a character string title to be passed to prModFit
- ...
other arguments to fitting routines, or to pass to survplot from survplot.residuals.psm.censored.normalized. Passed to the generic lines function for lines.
- times
a scalar or vector of times for which to evaluate survival probability or hazard
- lp
a scalar or vector of linear predictor values at which to evaluate survival probability or hazard. If both times and lp are vectors, they must be of the same length.
- q
a scalar or vector of probabilities. The default is .5, so just the median survival time is returned. If q and lp are both vectors, a matrix of quantiles is returned, with rows corresponding to lp and columns to q.
- type
type of residual desired. Default is censored normalized residuals, defined as (link(Y) - linear.predictors)/scale parameter, where the link function was usually the log function. See survreg for other types. type="score" returns the score residual matrix.
- n
number of points to evaluate theoretical standardized survival function for lines.residuals.psm.censored.normalized
- lty
line type for lines, default is 1
- xlim
range of times (or transformed times) for which to evaluate the standardized survival function. Default is range in normalized residuals.
- lwd
line width for theoretical distribution, default is 3
- g
number of quantile groups to use for stratifying continuous variables having more than 5 levels
- col
vector of colors for survplot method, corresponding to levels of x (must be a scalar if there is no x)
- main
main plot title for survplot. If omitted, is the name or label of x if x is given. Use main="" to suppress a title when you specify x.
Value
psm returns a fit object with all the information survreg would store as
well as what rms stores and units and time.inc.
Hazard, Survival, and Quantile return S-functions.
residuals.psm with type="censored.normalized" returns a
Surv object which has a special attribute "theoretical"
which is used by the lines
routine. This is the assumed standardized survival function as a function
of time or transformed time.
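To make concrete what such analytic functions look like, here is a base R sketch for the Weibull case in the survreg parameterization, log(T) = lp + scale * W with W following a standard minimum extreme value distribution. The weib_* functions are hypothetical stand-ins written for this illustration, not the functions psm generates, and q is assumed here to be a survival probability.

```r
# Hypothetical stand-ins for the kinds of functions Survival(), Hazard(),
# Quantile() and Mean() return; `scale` is the survreg-style scale parameter.
weib_survival <- function(times, lp, scale) {
  exp(-exp((log(times) - lp) / scale))
}
weib_hazard <- function(times, lp, scale) {
  exp((log(times) - lp) / scale) / (scale * times)
}
# Assumption of this sketch: q is a survival probability, so the result is
# the time t at which S(t) = q
weib_quantile <- function(q = 0.5, lp, scale) {
  exp(lp + scale * log(-log(q)))
}
weib_mean <- function(lp, scale) {
  exp(lp) * gamma(1 + scale)
}

lp    <- 2
scale <- 0.7

# Survival probability evaluated at the median survival time
med      <- weib_quantile(0.5, lp, scale)
s_at_med <- weib_survival(med, lp, scale)

# The mean survival time equals the area under the survival curve
area <- integrate(weib_survival, 0, Inf, lp = lp, scale = scale)$value
print(c(analytic = weib_mean(lp, scale), numeric = area))
```

Each function takes the linear predictor as an argument, which is why one fitted model can be evaluated at any combination of time (or probability) and covariate setting.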
Details
The object survreg.distributions contains definitions of properties
of the various survival distributions.
psm does not trap singularity errors due to the way survreg.fit
does matrix inversion. It will trap non-convergence (thus returning
fit$fail=TRUE) if you give the argument failure=2 inside the
control list which is passed to survreg.fit. For example, use
f <- psm(S ~ x, control=list(failure=2, maxiter=20)) to allow up to
20 iterations and to set f$fail=TRUE in case of non-convergence.
This is especially useful in simulation work.
Author
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
Examples
require(survival)
require(rms)   # provides psm, datadist, Predict, survplot, etc.
n <- 400
set.seed(1)
age <- rnorm(n, 50, 12)
sex <- factor(sample(c('Female','Male'),n,TRUE))
dd <- datadist(age,sex)
options(datadist='dd')
# Population hazard function:
h <- .02*exp(.06*(age-50)+.8*(sex=='Female'))
d.time <- -log(runif(n))/h
cens <- 15*runif(n)
death <- ifelse(d.time <= cens,1,0)
d.time <- pmin(d.time, cens)
f <- psm(Surv(d.time,death) ~ sex*pol(age,2),
         dist='lognormal')
# Log-normal model is a bad fit for proportional hazards data
print(f, r2=0:4, pg=TRUE)
anova(f)
fastbw(f) # if deletes sex while keeping age*sex ignore the result
f <- update(f, x=TRUE,y=TRUE) # so can validate, compute certain resids
validate(f, B=10) # ordinarily use B=300 or more
plot(Predict(f, age, sex)) # needs datadist since no explicit age values are given
# Could have used ggplot(Predict(...))
survplot(f, age=c(20,60)) # needs datadist since sex is not set here
# latex(f)
S <- Survival(f)
plot(f$linear.predictors, S(6, f$linear.predictors),
     xlab=expression(X*hat(beta)),
     ylab=expression(S(6,X*hat(beta))))
# plots 6-month survival as a function of linear predictor (X*Beta hat)
times <- seq(0,24,by=.25)
plot(times, S(times,0), type='l') # plots survival curve at X*Beta hat=0
lam <- Hazard(f)
plot(times, lam(times,0), type='l') # similarly for hazard function
med <- Quantile(f) # new function defaults to computing median only
lp <- seq(-3, 5, by=.1)
plot(lp, med(lp=lp), ylab="Median Survival Time")
med(c(.25,.5), f$linear.predictors)
# prints matrix with 2 columns
# fit a model with no predictors
f <- psm(Surv(d.time,death) ~ 1, dist="weibull")
f
pphsm(f) # print proportional hazards form
# pphsm warns that at present it does not return the correct covariance matrix
g <- survest(f)
plot(g$time, g$surv, xlab='Time', type='l',
     ylab=expression(S(t)))
f <- psm(Surv(d.time,death) ~ age,
         dist="loglogistic", y=TRUE)
r <- resid(f, 'cens') # note abbreviation
survplot(npsurv(r ~ 1), conf='none')
# plot Kaplan-Meier estimate of
# survival function of standardized residuals
survplot(npsurv(r ~ cut2(age, g=2)), conf='none')
# both strata should be n(0,1)
lines(r) # add theoretical survival function
# More simply:
survplot(r, age, g=2)
options(datadist=NULL)