
Aalen's additive regression model for censored data
Returns an object of class "aareg" that
represents an Aalen model.
Usage
aareg(formula, data, weights, subset, na.action,
qrtol=1e-07, nmin, dfbeta=FALSE, taper=1,
test = c('aalen', 'variance', 'nrisk'), cluster,
model=FALSE, x=FALSE, y=FALSE)
Arguments
- formula
a formula object, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. The response must be a Surv object. Because of the computational approach used, the model MUST include an intercept term. If "-1" is used in the model formula the program will ignore it.
- data
data frame in which to interpret the variables named in the formula, subset, and weights arguments. This may also be a single number to handle some special cases (see below for details). If data is missing, the variables in the model formula should be in the search path.
- weights
vector of observation weights. If supplied, the fitting algorithm minimizes the sum of the weights multiplied by the squared residuals (see below for additional technical details). The length of weights must be the same as the number of observations. The weights must be nonnegative, and it is recommended that they be strictly positive, since zero weights are ambiguous. To exclude particular observations from the model, use the subset argument instead of zero weights.
- subset
expression specifying which subset of observations should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of observations), a numeric vector indicating the observation numbers to be included, or a character vector of the observation names that should be included. All observations are included by default.
- na.action
a function to filter missing data. This is applied to the model.frame after any subset argument has been applied. The default is na.fail, which returns an error if any missing values are found. An alternative is na.exclude, which deletes observations that contain one or more missing values.
- qrtol
tolerance for detection of singularity in the QR decomposition
- nmin
minimum number of observations for an estimate; defaults to 3 times the number of covariates. This essentially truncates the computations near the tail of the data set, when n is small and the calculations can become numerically unstable.
- dfbeta
whether the array of dfbeta residuals should be computed. This implies computation of the sandwich variance estimate. The residuals will always be computed if there is a cluster term in the model formula.
- taper
allows for a smoothed variance estimate. Var(x), where x is the set of covariates, is an important component of the calculations for the Aalen regression model. At any given time point t, it is computed over all subjects who are still at risk at time t. The taper argument allows smoothing these estimates; for example, taper=(1:4)/4 would cause the variance estimate used at any event time to be a weighted average of the estimated variance matrices at the last 4 death times, with a weight of 1 for the current death time and decreasing to 1/4 for prior event times. The default value gives the standard Aalen model.
- test
selects the weighting used to compute an overall “average” coefficient vector over time and the subsequent test for equality to zero.
- cluster
the clustering group, optional. The variable will be searched for in the data argument.
- model, x, y
should copies of the model frame, the x matrix of predictors, or the response vector y be included in the saved result.
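As a minimal sketch of how these arguments fit together, the call below fits an additive model to the lung data set that ships with the survival package. The choice of covariates and of nmin = 1 is only illustrative, and the example assumes the session's na.action setting drops rows with missing covariate values.

```r
library(survival)

# Additive model for the lung cancer data shipped with survival.
# nmin = 1 lets the estimation continue until very few subjects remain
# at risk; the default of 3 times the number of covariates stops earlier.
lfit <- aareg(Surv(time, status) ~ age + sex + ph.ecog,
              data = lung, nmin = 1)

# Print a summary table of the fitted terms
print(lfit)
```

A larger nmin trades coverage of the tail for numerical stability, so it is worth refitting with the default before interpreting late event times.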
Value
an object of class "aareg"
representing the fit, with the following components:
- n
vector containing the number of observations in the data set, the number of event times, and the number of event times used in the computation
- times
vector of sorted event times, which may contain duplicates
- nrisk
vector containing the number of subjects at risk, of the same length as times
- coefficient
matrix of coefficients, with one row per event and one column per covariate
- test.statistic
the value of the test statistic, a vector with one element per covariate
- test.var
variance-covariance matrix for the test
- test
the type of test; a copy of the test argument above
- tweight
matrix of weights used in the computation, one row per event
- call
a copy of the call that produced this result
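The components listed above can be inspected directly on the fitted object. The sketch below assumes a small model on the lung data; the variable name fit is arbitrary.

```r
library(survival)

fit <- aareg(Surv(time, status) ~ age + sex, data = lung)

fit$n                   # observations, event times, event times used
head(fit$times)         # sorted event times (may contain duplicates)
head(fit$nrisk)         # number at risk at each of those times
dim(fit$coefficient)    # dimensions of the coefficient matrix
fit$test.statistic      # test statistic for the fitted terms
fit$test                # which weighting was used for the test
```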
Details
The Aalen model assumes that the cumulative hazard H(t) for a subject can be expressed as a(t) + X B(t), where a(t) is a time-dependent intercept term, X is the vector of covariates for the subject (possibly time-dependent), and B(t) is a time-dependent matrix of coefficients. The estimates are inherently non-parametric; a fit of the model will normally be followed by one or more plots of the estimates.
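Because the estimates are functions of time, a plot is usually more informative than a table. A minimal sketch, assuming the lung data and the same illustrative covariates as elsewhere on this page: subscripting the fit selects terms, with index 1 being the intercept, so index 4 is ph.ecog in this model. The ylim values are only a convenient choice for this data set.

```r
library(survival)

lfit <- aareg(Surv(time, status) ~ age + sex + ph.ecog,
              data = lung, nmin = 1)

# Plot the estimated coefficient function for ph.ecog only
plot(lfit[4], ylim = c(-4, 4))
```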
The estimates may become unstable near the tail of a data set, since the
increment to B at time t is based on the subjects still at risk at time
t. The tolerance (qrtol) and/or nmin parameters may act to truncate the estimate
before the last death.
The taper argument can also be used to smooth
out the tail of the curve.
In practice, the addition of a taper such as 1:10 appears to have little
effect on death times when n is still reasonably large, but can considerably
dampen wild oscillations in the tail of the plot.
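One way to see the effect of tapering is to fit the same model with and without it and compare the plots. This is a sketch under assumptions: the lung data and covariates are illustrative, and taper = 1:10 simply follows the value mentioned above.

```r
library(survival)

# Standard Aalen model (taper = 1 is the default)
fit_plain <- aareg(Surv(time, status) ~ age + sex, data = lung)

# Same model with the variance estimate smoothed over the last 10 event times
fit_taper <- aareg(Surv(time, status) ~ age + sex, data = lung,
                   taper = 1:10)

# Compare the estimated coefficient functions; differences, if any,
# should show up mainly in the tail
plot(fit_plain)
plot(fit_taper)
```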