Compute the concordance statistic for data or a model

The concordance statistic computes the agreement between an observed response and a predictor. It is closely related to Kendall's tau-a and tau-b, Goodman's gamma, and Somers' d, all of which can also be calculated from the results of this function.

Usage

concordance(object, ...)
# S3 method for class 'formula'
concordance(object, data, weights, subset, na.action,
  cluster, ymin, ymax, timewt= c("n", "S", "S/G", "n/G2", "I"),
  influence=0, ranks = FALSE, reverse=FALSE, timefix=TRUE, keepstrata=10, ...)
# S3 method for class 'lm'
concordance(object, ..., newdata, cluster, ymin, ymax,
  influence=0, ranks=FALSE, timefix=TRUE, keepstrata=10)
# S3 method for class 'coxph'
concordance(object, ..., newdata, cluster, ymin, ymax,
  timewt= c("n", "S", "S/G", "n/G2", "I"), influence=0,
  ranks=FALSE, timefix=TRUE, keepstrata=10)
# S3 method for class 'survreg'
concordance(object, ..., newdata, cluster, ymin, ymax,
  timewt= c("n", "S", "S/G", "n/G2", "I"), influence=0,
  ranks=FALSE, timefix=TRUE, keepstrata=10)

Arguments

object: a fitted model or a formula. The formula should be of the form y ~x or y ~ x + strata(z) with a single numeric or survival response and a single predictor. Counts of concordant, discordant and tied pairs are computed separately per stratum, and then added.
data: a data.frame in which to interpret the variables named in the formula, or in the subset and the weights argument. Only applicable if object is a formula.
weights: optional vector of case weights. Only applicable if object is a formula.
subset: expression indicating which subset of the rows of data should be used in the fit. Only applicable if object is a formula.
na.action: a missing-data filter function. This is applied to the model.frame after any subset argument has been used. Default is options()\$na.action. Only applicable if object is a formula.
...: multiple fitted models are allowed. Only applicable if object is a model object.
newdata: optional, a new data frame in which to evaluate (but not refit) the models
cluster: optional grouping vector for calculating the robust variance
ymin, ymax: compute the concordance over the restricted range ymin <= y <= ymax. (For survival data this is a time range.)
timewt: the weighting to be applied. The overall statistic is a weighted mean over event times.
influence: 1= return the dfbeta vector, 2= return the full influence matrix, 3 = return both
ranks: if TRUE, return a data frame containing the scaled ranks that make up the overall score.
reverse: by default (FALSE) larger predictions x are associated with larger response values y; this is the expected behavior for a linear model, logistic regression, parametric survival, random forest, ... The exception to this rule is a Cox model, where a larger risk score x corresponds to shorter survival time, if x is the prediction from a coxph fit use reverse = TRUE. If concordance is called directly with the result of a coxph fit this is not necessary, as in that case the function knows this is a coxph result and will choose the correct default.
timefix: correct for possible rounding error. See the vignette on tied times for more explanation. Essentially, exact ties are an important part of the concordance computatation, but "exact" can be a subtle issue with floating point numbers.
keepstrata: either TRUE, FALSE, or an integer value. Computations are always done within stratum, then added. If the total number of strata greater than keepstrata, or keepstrata=FALSE, those subtotals are not kept in the output.

Details

The concordance is an estimate of $Pr(x_i < x_j | y_i < y_j)$, for a model fit replace $x$ with $\hat y$, the predicted response from the model. For a survival outcome some pairs of values are not comparable, e.g., censored at time 5 and a death at time 6, as we do not know if the first observation will or will not outlive the second. In this case the total number of evaluable pairs is smaller.

Relatations to other statistics: For continuous x and y, 2C- 1 is equal to Somers' d. If the response is binary, C is equal to the area under the receiver operating curve or AUROC. For a survival response and binary predictor C is the numerator of the Gehan-Wilcoxon test.

A naive compuation requires adding up over all n(n-1)/2 comparisons, which can be quite slow for large data sets. This routine uses an O(n log(n)) algorithm. At each uncensored event time y, compute the rank of x for the subject who had the event as compared to the x values for all others with a longer survival, where the rank has value between 0 and 1. The concordance is a weighted mean of these ranks, determined by the timewt option. The rank vector can be efficiently updated as subjects are added or removed from the risk set. For further details see the vignette.

The variance is based on an infinetesimal jackknife. One advantage of this approach is that it also gives a valid covariance for the covariance based on multiple different predicted values, even if those predictions come from quite different models. See for instance the example below which has a poisson and two non-nested Cox models. This has been useful to compare a machine learning model to a Cox model fit, say. It is absolutely critical, however, that the predicted values line up exactly, with the same observation in each row; otherwise the result will be nonsense. (Be alert to the impact of missing values.) The IJ variance used here is very close but not precisely identical to the U-statistics approach of DeLong, in our limited experience they have differed by .1 percent or less. Thus a comparison of two binomial models is extremely close to DeLong's test.

The timewt option is only applicable to censored data. In this case the default of n(t) corresponds to Harrell's C statistic, which is closely related to the Gehan-Wilcoxon test; timewt="S" corrsponds to the Peto-Wilcoxon, timewt="S/G" is suggested by Schemper, and timewt="n/G2" corresponds to Uno's C. It turns out that the Schemper and Uno weights are computationally identical, we have retained both options as a user convenience. The timewt= "I" option is related to the log-rank statistic.

When the number of strata is very large, such as in a conditional logistic regression for instance (clogit function), a much faster computation is available when the individual strata results are not retained; use keepstrata=FALSE or keepstrata=0 to do so. In the general case the keepstrata = 10 default simply keeps the printout managable: it retains and prints per-strata counts if the number of strata is <= 10.

Value

An object of class concordance containing the following components:

concordance: the estimated concordance value or values
count: a vector containing the number of concordant pairs, discordant, tied on x but not y, tied on y but not x, and tied on both x and y
n: the number of observations
var: a vector containing the estimated variance of the concordance based on the infinitesimal jackknife (IJ) method. If there are multiple models it contains the estimtated variance/covariance matrix.
cvar: a vector containing the estimated variance(s) of the concordance values, based on the variance formula for the associated score test from a proportional hazards model. (This was the primary variance used in the survConcordance function.)
dfbeta: optional, the vector of leverage estimates for the concordance
influence: optional, the matrix of leverage values for each of the counts, one row per observation
ranks: optional, a data frame containing the Somers' d rank at each event time, along with the time weight, and the case weight of the observation. The time weighted sum of the ranks will equal concordant pairs - discordant pairs.

Note

A coxph model that has a numeric failure may have undefined predicted values, in which case the concordance will be NULL.

Computation for an existing coxph model along with newdata has some subtleties with respect to extra arguments in the original call. These include

tt() terms in the model. This is not supported with newdata.
subset. Any subset clause in the original call is ignored, i.e., not applied to the new data.
strata() terms in the model. The new data is expected to have the strata variable(s) found in the original data set, with concordance computed within strata. The levels of the strata variable need not be the same as in the original data.
id or cluster directives. This has not yet been sorted out.

Author

Terry Therneau

References

E DeLong, D DeLong, and D Clarke-Pearson, Comparing the areas under two or more correlated reciever operating curves: a nonparametric approach, Biometics, 1988.

F Harrell, R Califf, D Pryor, K Lee and R Rosati, Evaluating the yield of medical tests, J Am Medical Assoc, 1982.

R Peto and J Peto, Asymptotically efficient rank invariant test procedures (with discussion), J Royal Stat Soc A, 1972.

M Schemper, Cox analysis of survival data with non-proportional hazard functions, The Statistician, 1992.

H Uno, T Cai, M Pencina, R D'Agnostino and Lj Wei, On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data, Statistics in Medicine, 2011.

Examples

fit1 <- coxph(Surv(ptime, pstat) ~ age + sex + mspike, mgus2)
concordance(fit1, timewt="n/G2")  # Uno's weighting
#> Call:
#> concordance.coxph(object = fit1, timewt = "n/G2")
#> 
#> n= 1373 
#> Concordance= 0.6132 se= 0.1026
#> concordant discordant     tied.x     tied.y    tied.xy 
#>  461425.07  290956.09     265.66     120.39       0.00 

# logistic regression 
tdata <- flchain
options(na.action= na.exclude) # fit2a has many more missings, this causes
 # predict to return nrow(flchain) values, but has to be set before the fit
fit2a <- glm(I(sex=='M') ~ age + log(creatinine), binomial, data= flchain)
fit2b <- glm(I(sex=='M') ~ age + kappa + lambda,  binomial, data= flchain)
tdata$eta1 <- predict(fit2a)
tdata$eta2 <- predict(fit2b)
cfit <- concordance(I(sex=="M") ~ eta1 + eta2, tdata)  # equal to the AUC
coef(cfit)  # males and females differ in creatinine, less in kappa/lambda
#>      eta1      eta2 
#> 0.8151094 0.5983134 
convec <- c(1,-1) # compute a contrast
c(test= unname(convec%*% coef(cfit)),
  std = unname(sqrt(convec %*% vcov(cfit) %*% convec)))
#>        test         std 
#> 0.216796003 0.007216991 


# compare multiple survival models 
options(na.action = na.exclude)   # predict all 1384 obs, including missing
fit3 <- glm(pstat ~ age + sex + mspike + offset(log(ptime)), 
            poisson, data= mgus2)
fit4 <- coxph(Surv(ptime, pstat) ~ age + sex + mspike, mgus2)
fit5 <- coxph(Surv(ptime, pstat) ~ age + sex + hgb + creat, mgus2)

tdata <- mgus2; tdata$ptime <- 60   # prediction at 60 months
p3 <- -predict(fit3, newdata=tdata) 
p4 <- -predict(fit4) # high risk scores predict shorter survival
p5 <- -predict(fit5)
options(na.action = na.omit)      # return to the R default

cfit <- concordance(Surv(ptime, pstat) ~p3 +  p4 + p5, mgus2)
cfit
#> Call:
#> concordance.formula(object = Surv(ptime, pstat) ~ p3 + p4 + p5, 
#>     data = mgus2)
#> 
#> n=1338 (46 observations deleted due to missingness)
#>    concordance     se
#> p3      0.6598 0.0313
#> p4      0.6618 0.0310
#> p5      0.6000 0.0293
#> 
#>    concordant discordant tied.x tied.y tied.xy
#> p3      51105      26333     74     28       0
#> p4      51258      26180     74     28       0
#> p5      46507      31003      2     28       0
round(coef(cfit), 3)
#>    p3    p4    p5 
#> 0.660 0.662 0.600 
round(cov2cor(vcov(cfit)), 3)  # high correlation
#>       [,1]  [,2]  [,3]
#> [1,] 1.000 0.994 0.236
#> [2,] 0.994 1.000 0.258
#> [3,] 0.236 0.258 1.000

test <- c(1, -1, 0)  # contrast vector for model 1 - model 2 
round(c(difference = test %*% coef(cfit),
        sd= sqrt(test %*% vcov(cfit) %*% test)), 3)
#> difference         sd 
#>     -0.002      0.003