Rank Correlation for Paired Predictors with a Possibly Censored Response, and Integrated Discrimination Index

Computes U-statistics to test whether predictor x1 is more concordant than predictor x2, extending rcorr.cens. For method=1, estimates the fraction of pairs for which the x1 difference is more impressive than the x2 difference. For method=2, estimates the fraction of pairs for which x1 is concordant with S but x2 is not.
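The idea behind method=2 can be illustrated with a small base R sketch. It is a simplified illustration only, assuming an uncensored numeric response and ignoring ties and censoring; it is not the algorithm rcorrp.cens uses internally, and the simulated data and variable names are hypothetical.

```r
# Illustrative only: uncensored response, no special handling of ties.
set.seed(2)
n  <- 50
y  <- rnorm(n)
x1 <- y + rnorm(n, sd = 0.5)   # hypothetical stronger predictor
x2 <- y + rnorm(n, sd = 2)     # hypothetical weaker predictor

idx <- t(combn(n, 2))          # all unordered pairs (i, j)
i <- idx[, 1]
j <- idx[, 2]

# A predictor is concordant for a pair when it orders the pair as y does
conc1 <- sign(x1[i] - x1[j]) == sign(y[i] - y[j])
conc2 <- sign(x2[i] - x2[j]) == sign(y[i] - y[j])

frac_x1_only <- mean(conc1 & !conc2)   # x1 concordant, x2 not
frac_x2_only <- mean(!conc1 & conc2)   # x2 concordant, x1 not
c(x1.only = frac_x1_only, x2.only = frac_x2_only)
```

Comparing the two fractions gives a rough sense of which predictor orders more pairs correctly when the other fails.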

For binary responses, the function improveProb provides several assessments of whether one set of predicted probabilities is better than another, using the methods described in Pencina et al. (2008). These include the Net Reclassification Index (NRI) and the Integrated Discrimination Index (IDI), which test whether predictions from model x1 differ significantly from predictions from model x2. This is a distinct improvement over comparing ROC areas, sensitivity, or specificity.

Usage

rcorrp.cens(x1, x2, S, outx=FALSE, method=1)

improveProb(x1, x2, y)

# S3 method for class 'improveProb'
print(x, digits=3, conf.int=.95, ...)

Arguments

x1: first predictor (a probability, for improveProb)
x2: second predictor (a probability, for improveProb)
S: a possibly right-censored Surv object. If S is a vector instead, it is converted to a Surv object and it is assumed that no observations are censored.
outx: set to TRUE to exclude pairs tied on x1 or x2 from consideration
method: 1 (the default) or 2; selects which pairwise comparison is estimated, as described above
y: a binary 0/1 outcome variable
x: the result from improveProb
digits: number of significant digits for use in printing the result of improveProb
conf.int: level for confidence limits
...: unused
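As a rough illustration of how conf.int relates to the printed limits, the sketch below rebuilds a normal-approximation (Wald) interval and a two-sided P-value from an index and its standard error. It assumes the print method uses this form, which appears consistent with the example output on this page up to rounding; the numbers are copied from that output.

```r
# Values copied from the improveProb example output (NRI row)
nri      <- -0.0923
se.nri   <-  0.0632
conf.int <-  0.95

z  <- nri / se.nri
zc <- qnorm(1 - (1 - conf.int) / 2)   # about 1.96 for conf.int = 0.95
ci <- nri + c(-1, 1) * zc * se.nri    # assumed Wald form: index +/- zc * SE
p2 <- 2 * pnorm(-abs(z))              # two-sided P-value

round(c(Z = z, P = p2, Lower = ci[1], Upper = ci[2]), 3)
```

Small discrepancies from the printed table are expected because the inputs here are already rounded.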

Details

If x1,x2 represent predictions from models, these functions assume either that you are using a separate sample from the one used to build the model, or that the amount of overfitting in x1 equals the amount of overfitting in x2. An example of the latter is giving both models equal opportunity to be complex so that both models have the same number of effective degrees of freedom, whether a predictor was included in the model or was screened out by a variable selection scheme.

Note that in the first part of their paper, Pencina et al. presented measures that required binning the predicted probabilities. Those measures were then replaced with better continuous measures, which are the ones implemented here.
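One of the continuous measures, the IDI, is described in the printed output as a difference in Yates' discrimination slopes. The sketch below checks that identity numerically on simulated data. The data-generating model, the helper slope(), and the convention that the change is x2 minus x1 are assumptions made for illustration.

```r
set.seed(3)
n  <- 200
y  <- rbinom(n, 1, 0.4)
# Hypothetical predicted probabilities from two models
x1 <- plogis(-0.5 + 0.8 * y + rnorm(n))
x2 <- plogis(-0.5 + 1.5 * y + rnorm(n))

# Discrimination slope: mean prediction in events minus mean in non-events
slope <- function(p, y) mean(p[y == 1]) - mean(p[y == 0])

d   <- x2 - x1                            # ASSUMPTION: change is x2 minus x1
idi <- mean(d[y == 1]) - mean(d[y == 0])  # IDI from per-subject changes

all.equal(idi, slope(x2, y) - slope(x1, y))
```

No binning of probabilities is needed at any step, which is the practical advantage of the continuous version.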

Value

a vector of statistics for rcorrp.cens, or, for improveProb, a list with class improveProb containing the following statistics:

n: number of cases
na: number of events
nb: number of non-events
pup.ev: proportion of subjects with events whose predicted probability increased (difference in \(\mbox{probabilities}>0\))
pup.ne: proportion of subjects without events whose predicted probability increased (difference in \(\mbox{probabilities}>0\))
pdown.ev: proportion of subjects with events whose predicted probability decreased (difference in \(\mbox{probabilities}<0\))
pdown.ne: proportion of subjects without events whose predicted probability decreased (difference in \(\mbox{probabilities}<0\))
nri: Net Reclassification Index = \((pup.ev-pdown.ev)-(pup.ne-pdown.ne)\)
se.nri: standard error of NRI
z.nri: Z score for NRI
nri.ev: Net Reclassification Index for events = \(pup.ev-pdown.ev\)
se.nri.ev: SE of NRI of events
z.nri.ev: Z score for NRI of events
nri.ne: Net Reclassification Index for non-events = \(pdown.ne-pup.ne\)
se.nri.ne: SE of NRI of non-events
z.nri.ne: Z score for NRI of non-events
improveSens: improvement in sensitivity
improveSpec: improvement in specificity
idi: Integrated Discrimination Index
se.idi: SE of IDI
z.idi: Z score of IDI
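To make the relationships among these components concrete, the following base R sketch computes the proportions, the NRI, and the IDI directly from two probability vectors. It is a sketch under stated assumptions rather than the package's implementation: the change is taken as x2 minus x1, and nri.ne is taken as pdown.ne minus pup.ne, matching the "(4-2)" label in the printed example. Standard errors are omitted.

```r
set.seed(1)
x1 <- runif(1000)
x2 <- runif(1000)
y  <- sample(0:1, 1000, TRUE)

d  <- x2 - x1            # ASSUMPTION: change measured as x2 minus x1
ev <- y == 1             # subjects with events
ne <- y == 0             # subjects without events

# Proportions of increases and decreases within each outcome group
pup.ev   <- mean(d[ev] > 0)
pdown.ev <- mean(d[ev] < 0)
pup.ne   <- mean(d[ne] > 0)
pdown.ne <- mean(d[ne] < 0)

# Net reclassification: increases are good for events, decreases for non-events
nri.ev <- pup.ev - pdown.ev
nri.ne <- pdown.ne - pup.ne
nri    <- nri.ev + nri.ne

# Mean changes and the integrated discrimination index
improveSens <-  mean(d[ev])
improveSpec <- -mean(d[ne])
idi         <- improveSens + improveSpec

round(c(nri = nri, nri.ev = nri.ev, nri.ne = nri.ne, idi = idi), 4)
```

Because x1 and x2 are independent random numbers here, all indexes should be near zero; exact values depend on the R version's random number generator.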

Author

Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com

Scott Williams
Division of Radiation Oncology
Peter MacCallum Cancer Centre, Melbourne, Australia
scott.williams@petermac.org

References

Pencina MJ, D'Agostino Sr RB, D'Agostino Jr RB, Vasan RS (2008): Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Stat in Med 27:157-172. DOI: 10.1002/sim.2929

Pencina MJ, D'Agostino Sr RB, D'Agostino Jr RB, Vasan RS (2007): Rejoinder: Comments on Integrated discrimination and net reclassification improvements-Practical advice. Stat in Med; DOI: 10.1002/sim.3106

Pencina MJ, D'Agostino RB, Steyerberg EW (2011): Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat in Med 30:11-21; DOI: 10.1002/sim.4085

Examples

set.seed(1)
library(Hmisc)     # provides rcorrp.cens and improveProb
library(survival)  # provides Surv

x1 <- rnorm(400)
x2 <- x1 + rnorm(400)
d.time <- rexp(400) + (x1 - min(x1))
cens   <- runif(400,.5,2)
death  <- d.time <= cens
d.time <- pmin(d.time, cens)
rcorrp.cens(x1, x2, Surv(d.time, death))
#>                Dxy               S.D. x1 more concordant x2 more concordant 
#>          -8.21e-02           1.37e-01           4.59e-01           5.41e-01 
#>                  n            missing         uncensored     Relevant Pairs 
#>           4.00e+02           0.00e+00           1.10e+01           4.26e+03 
#>          Uncertain               C X1               C X2             Dxy X1 
#>           1.55e+05           9.92e-01           9.26e-01           9.84e-01 
#>             Dxy X2 
#>           8.52e-01 
#rcorrp.cens(x1, x2, y) ## no censoring

set.seed(1)
x1 <- runif(1000)
x2 <- runif(1000)
y  <- sample(0:1, 1000, TRUE)
rcorrp.cens(x1, x2, y)
#>                Dxy               S.D. x1 more concordant x2 more concordant 
#>           6.00e-02           3.65e-02           5.30e-01           4.70e-01 
#>                  n            missing         uncensored     Relevant Pairs 
#>           1.00e+03           0.00e+00           1.00e+03           5.00e+05 
#>          Uncertain               C X1               C X2             Dxy X1 
#>           0.00e+00           5.10e-01           4.66e-01           1.91e-02 
#>             Dxy X2 
#>          -6.82e-02 
improveProb(x1, x2, y)
#> 
#> Analysis of Proportions of Subjects with Improvement in Predicted Probability
#> 
#> Number of events: 508 	Number of non-events: 492 
#> 
#> Proportions of Positive and Negative Changes in Probabilities
#> 
#>                             Proportion
#> Increase for events     (1)      0.482
#> Increase for non-events (2)      0.528
#> Decrease for events     (3)      0.518
#> Decrease for non-events (4)      0.472
#> 
#> 
#> Net Reclassification Improvement
#> 
#>                            Index     SE      Z    2P Lower 0.95 Upper 0.95
#> NRI            (1-3+4-2) -0.0923 0.0632 -1.462 0.144     -0.216     0.0315
#> NRI for events     (1-3) -0.0354 0.0443 -0.799 0.424     -0.122     0.0515
#> NRI for non-events (4-2) -0.0569 0.0450 -1.264 0.206     -0.145     0.0313
#> 
#> 
#> Analysis of Changes in Predicted Probabilities
#> 
#>                                       Mean Change in Probability
#> Increase for events (sensitivity)                        -0.0308
#> Decrease for non-events (specificity)                    -0.0127
#> 
#> 
#> Integrated Discrimination Improvement
#>  (average of sensitivity and 1-specificity over [0,1];
#>  also is difference in Yates' discrimination slope)
#> 
#>        IDI         SE          Z         2P Lower 0.95 Upper 0.95 
#>    -0.0436     0.0265    -1.6433     0.1003    -0.0956     0.0084