Rank Correlation for Paired Predictors with a Possibly Censored Response, and Integrated Discrimination Index
Computes U-statistics to test whether predictor x1 is more
concordant than predictor x2, extending rcorr.cens. For
method=1, estimates the fraction of pairs for which the
x1 difference is more impressive than the x2
difference. For method=2, estimates the fraction of pairs for
which x1 is concordant with S but x2 is not.
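As an illustration of the method=2 idea only, the following base R sketch counts, among pairs with different responses, how often one predictor orders the pair the same way as the response while the other does not. It assumes an uncensored response and treats ties in a predictor as not concordant; the function name concordant_only is hypothetical, and this is not the package's compiled implementation, which also handles censoring and the outx option.

```r
# Hypothetical helper: fraction of usable pairs where only one predictor
# is concordant with an uncensored response y.
concordant_only <- function(x1, x2, y) {
  dy  <- sign(outer(y,  y,  "-"))            # ordering of each pair on y
  c1  <- sign(outer(x1, x1, "-")) == dy      # x1 orders the pair as y does
  c2  <- sign(outer(x2, x2, "-")) == dy      # x2 orders the pair as y does
  rel <- upper.tri(dy) & dy != 0             # each pair once, y not tied
  c(x1.only = sum(c1 & !c2 & rel) / sum(rel),
    x2.only = sum(c2 & !c1 & rel) / sum(rel))
}

# Small example: x1 follows y perfectly, x2 is reversed
res <- concordant_only(x1 = c(1, 2, 3, 4), x2 = c(4, 3, 2, 1), y = c(0, 0, 1, 1))
res
```

The outer() calls build n-by-n matrices, so this form is only practical for small samples.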
For binary responses the function improveProb provides several
assessments of whether one set of predicted probabilities is better
than another, using the methods described in
Pencina et al. (2007). These are the net reclassification index (NRI) and
the integrated discrimination index (IDI), which test whether
predictions from model x1 differ significantly from predictions from model x2. This is a
distinct improvement over comparing ROC areas, sensitivity, or
specificity.
Usage
rcorrp.cens(x1, x2, S, outx=FALSE, method=1)
improveProb(x1, x2, y)
# S3 method for class 'improveProb'
print(x, digits=3, conf.int=.95, ...)
Arguments
- x1
first predictor (a probability, for improveProb)
- x2
second predictor (a probability, for improveProb)
- S
a possibly right-censored Surv object. If S is a vector instead, it is converted to a Surv object and it is assumed that no observations are censored.
- outx
set to TRUE to exclude pairs tied on x1 or x2 from consideration
- method
see above
- y
a binary 0/1 outcome variable
- x
the result from improveProb
- digits
number of significant digits for use in printing the result of improveProb
- conf.int
level for confidence limits
- ...
unused
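The treatment of a plain vector S described above can be mimicked by hand. This is a minimal sketch, assuming the survival package is installed; the response values are made up for illustration.

```r
library(survival)  # provides Surv(); also loaded in the Examples

y <- c(2.3, 0.7, 1.9, 4.1)            # hypothetical uncensored response
S <- Surv(y, rep(1, length(y)))       # status 1 for every observation: none censored

# Under the documented behavior, rcorrp.cens(x1, x2, y) and
# rcorrp.cens(x1, x2, S) would be expected to give the same result.
```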
Details
If x1,x2 represent predictions from models, these
functions assume either that you are using a separate sample from the
one used to build the model, or that the amount of overfitting in
x1 equals the amount of overfitting in x2. An example
of the latter is giving both models equal opportunity to be complex so
that both models have the same number of effective degrees of freedom,
whether a predictor was included in the model or was screened out by a
variable selection scheme.
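One way to meet the separate-sample assumption is to fit both models on one part of the data and compare their predictions on the rest. The sketch below uses base R only; the variables age and marker, the simulated data, and the 50/50 split are all hypothetical choices made for illustration.

```r
set.seed(2)
n      <- 400
age    <- rnorm(n)                          # hypothetical baseline predictor
marker <- rnorm(n)                          # hypothetical new marker
y      <- rbinom(n, 1, plogis(-0.5 + age + 0.8 * marker))
dat    <- data.frame(y, age, marker)

train <- sample(n, n / 2)                   # indices of the model-building sample
test  <- dat[-train, ]                      # held-out sample used for comparison

fit1 <- glm(y ~ age,          family = binomial, data = dat[train, ])
fit2 <- glm(y ~ age + marker, family = binomial, data = dat[train, ])

# Predicted probabilities on the held-out sample
p1 <- predict(fit1, newdata = test, type = "response")
p2 <- predict(fit2, newdata = test, type = "response")

# With Hmisc loaded, the comparison would then be:
# improveProb(p1, p2, test$y)
```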
Note that in the first part of their paper, Pencina et al. presented measures that required binning the predicted probabilities. Those measures were then replaced with better continuous measures that are implemented here.
Value
a vector of statistics for rcorrp.cens, or a list with class
improveProb of statistics for improveProb:
- n
number of cases
- na
number of events
- nb
number of non-events
- pup.ev
proportion of those with events who have a pairwise difference of \(\mbox{probabilities}>0\)
- pup.ne
proportion of those without events who have a pairwise difference of \(\mbox{probabilities}>0\)
- pdown.ev
proportion of those with events who have a pairwise difference of \(\mbox{probabilities}<0\)
- pdown.ne
proportion of those without events who have a pairwise difference of \(\mbox{probabilities}<0\)
- nri
Net Reclassification Index = \((pup.ev-pdown.ev)-(pup.ne-pdown.ne)\)
- se.nri
standard error of NRI
- z.nri
Z score for NRI
- nri.ev
Net Reclassification Index for events = \(pup.ev-pdown.ev\)
- se.nri.ev
SE of NRI of events
- z.nri.ev
Z score for NRI of events
- nri.ne
Net Reclassification Index for non-events = \(pdown.ne-pup.ne\)
- se.nri.ne
SE of NRI of non-events
- z.nri.ne
Z score for NRI of non-events
- improveSens
improvement in sensitivity
- improveSpec
improvement in specificity
- idi
Integrated Discrimination Index
- se.idi
SE of IDI
- z.idi
Z score of IDI
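The point estimates listed above can be reproduced approximately in a few lines of base R. This is a sketch rather than the package code: it assumes changes are measured as x2 - x1, takes the non-event component as pdown.ne - pup.ne (matching the "(4-2)" label in the printed output in the Examples), takes the IDI as the sum of the sensitivity and specificity improvements (as the printed output suggests), and omits standard errors. The function name nri_idi_sketch and the toy data are hypothetical.

```r
# Hypothetical helper: continuous NRI and IDI point estimates
nri_idi_sketch <- function(x1, x2, y) {
  d  <- x2 - x1                 # assumed direction of change
  ev <- y == 1                  # events
  ne <- y == 0                  # non-events

  pup.ev   <- mean(d[ev] > 0);  pdown.ev <- mean(d[ev] < 0)
  pup.ne   <- mean(d[ne] > 0);  pdown.ne <- mean(d[ne] < 0)

  nri.ev <- pup.ev - pdown.ev
  nri.ne <- pdown.ne - pup.ne
  improveSens <-  mean(d[ev])   # mean increase among events
  improveSpec <- -mean(d[ne])   # mean decrease among non-events

  list(nri.ev = nri.ev, nri.ne = nri.ne, nri = nri.ev + nri.ne,
       improveSens = improveSens, improveSpec = improveSpec,
       idi = improveSens + improveSpec)
}

# Toy data: both events move up; non-events split evenly
out <- nri_idi_sketch(x1 = rep(0.5, 4),
                      x2 = c(0.7, 0.6, 0.4, 0.6),
                      y  = c(1, 1, 0, 0))
out$nri
out$idi
```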
Author
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
Scott Williams
Division of Radiation Oncology
Peter MacCallum Cancer Centre, Melbourne, Australia
scott.williams@petermac.org
References
Pencina MJ, D'Agostino Sr RB, D'Agostino Jr RB, Vasan RS (2008): Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Stat in Med 27:157-172. DOI: 10.1002/sim.2929
Pencina MJ, D'Agostino Sr RB, D'Agostino Jr RB, Vasan RS (2007): Rejoinder: Comments on Integrated discrimination and net reclassification improvements-Practical advice. Stat in Med. DOI: 10.1002/sim.3106
Pencina MJ, D'Agostino RB, Steyerberg EW (2011): Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat in Med 30:11-21. DOI: 10.1002/sim.4085
Examples
set.seed(1)
library(survival)
x1 <- rnorm(400)
x2 <- x1 + rnorm(400)
d.time <- rexp(400) + (x1 - min(x1))
cens <- runif(400,.5,2)
death <- d.time <= cens
d.time <- pmin(d.time, cens)
rcorrp.cens(x1, x2, Surv(d.time, death))
#>                Dxy               S.D. x1 more concordant x2 more concordant
#>          -8.21e-02           1.37e-01           4.59e-01           5.41e-01
#>                  n            missing         uncensored     Relevant Pairs
#>           4.00e+02           0.00e+00           1.10e+01           4.26e+03
#>          Uncertain               C X1               C X2             Dxy X1
#>           1.55e+05           9.92e-01           9.26e-01           9.84e-01
#>             Dxy X2
#>           8.52e-01
#rcorrp.cens(x1, x2, y) ## no censoring
set.seed(1)
x1 <- runif(1000)
x2 <- runif(1000)
y <- sample(0:1, 1000, TRUE)
rcorrp.cens(x1, x2, y)
#>                Dxy               S.D. x1 more concordant x2 more concordant
#>           6.00e-02           3.65e-02           5.30e-01           4.70e-01
#>                  n            missing         uncensored     Relevant Pairs
#>           1.00e+03           0.00e+00           1.00e+03           5.00e+05
#>          Uncertain               C X1               C X2             Dxy X1
#>           0.00e+00           5.10e-01           4.66e-01           1.91e-02
#>             Dxy X2
#>          -6.82e-02
improveProb(x1, x2, y)
#>
#> Analysis of Proportions of Subjects with Improvement in Predicted Probability
#>
#> Number of events: 508 Number of non-events: 492
#>
#> Proportions of Positive and Negative Changes in Probabilities
#>
#>                             Proportion
#> Increase for events     (1)      0.482
#> Increase for non-events (2)      0.528
#> Decrease for events     (3)      0.518
#> Decrease for non-events (4)      0.472
#>
#>
#> Net Reclassification Improvement
#>
#>                            Index     SE      Z    2P Lower 0.95 Upper 0.95
#> NRI            (1-3+4-2) -0.0923 0.0632 -1.462 0.144     -0.216     0.0315
#> NRI for events     (1-3) -0.0354 0.0443 -0.799 0.424     -0.122     0.0515
#> NRI for non-events (4-2) -0.0569 0.0450 -1.264 0.206     -0.145     0.0313
#>
#>
#> Analysis of Changes in Predicted Probabilities
#>
#>                                       Mean Change in Probability
#> Increase for events (sensitivity)                        -0.0308
#> Decrease for non-events (specificity)                    -0.0127
#>
#>
#> Integrated Discrimination Improvement
#> (average of sensitivity and 1-specificity over [0,1];
#> also is difference in Yates' discrimination slope)
#>
#>     IDI      SE       Z      2P Lower 0.95 Upper 0.95
#> -0.0436  0.0265 -1.6433  0.1003    -0.0956     0.0084
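The Z, 2P, and confidence limit columns in the printed tables appear to be ordinary normal-theory (Wald) quantities computed from the index and its standard error. The sketch below recomputes them for the IDI row from the rounded values printed above, so agreement is only up to rounding; that the print method uses exactly this calculation is an assumption.

```r
est      <- -0.0436      # IDI as printed (rounded)
se       <-  0.0265      # SE as printed (rounded)
conf.int <-  0.95        # same default as the print method

z      <- est / se                           # Z score
p2     <- 2 * pnorm(-abs(z))                 # two-sided P-value
crit   <- qnorm((1 + conf.int) / 2)          # normal critical value
limits <- est + c(-1, 1) * crit * se         # lower and upper limits

round(c(Z = z, "2P" = p2, Lower = limits[1], Upper = limits[2]), 4)
```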