Kendall's Tau Statistic
Computes Kendall's Tau, which is a rank-based correlation measure, between two vectors.
Arguments
- x, y
Numeric vectors. Must be of equal length. Ideally their values are continuous and not too discrete. Let length(x) be \(N\), say.
- exact
Logical. If TRUE then the exact value is computed.
- max.n
Numeric. If exact = FALSE and length(x) is more than max.n then a random sample of max.n pairs is chosen.
Details
Kendall's tau is a measure of dependency in a bivariate distribution.
Loosely, two random variables are concordant if large values of one
random variable are associated with large values of the other.
Similarly, two random variables are discordant if large values of one
random variable are associated with small values of the other.
More formally, for \(i \neq j\),
if (x[i] - x[j])*(y[i] - y[j]) > 0 then that comparison is concordant,
and if (x[i] - x[j])*(y[i] - y[j]) < 0 then that comparison is discordant.
Out of choose(N, 2) comparisons, let \(c\) and \(d\) be the
number of concordant and discordant pairs, respectively.
Then Kendall's tau can be estimated by \((c-d)/(c+d)\).
If there are \(t\) ties then half the ties are deemed concordant and
half discordant, so that \((c-d)/(c+d+t)\) is used.
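The counting rule above can be sketched in base R. This is only an illustration, not the package's implementation: tau_bruteforce is a hypothetical helper, and it assumes that a comparison counts as a tie whenever the product (x[i] - x[j])*(y[i] - y[j]) is exactly zero.

```r
# Brute-force Kendall's tau over all choose(N, 2) comparisons.
# tau_bruteforce is a hypothetical helper, not part of any package.
tau_bruteforce <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  idx <- combn(length(x), 2)             # 2 x choose(N, 2) matrix of pairs i < j
  s <- sign(x[idx[1, ]] - x[idx[2, ]]) *
       sign(y[idx[1, ]] - y[idx[2, ]])   # +1 concordant, -1 discordant, 0 tie
  n_con <- sum(s > 0)                    # c in the text
  n_dis <- sum(s < 0)                    # d in the text
  n_tie <- sum(s == 0)                   # t in the text (assumed tie rule)
  (n_con - n_dis) / (n_con + n_dis + n_tie)
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
tau_bruteforce(x, y)   # 8 concordant, 2 discordant, 0 ties: (8 - 2) / 10 = 0.6
```

Splitting the ties evenly gives \(c + t/2\) concordant and \(d + t/2\) discordant comparisons, so the difference is still \(c - d\) while the total becomes \(c + d + t\), which is where the denominator comes from.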
Warning
The exact computation examines all choose(N, 2) comparisons,
so its cost is \(O(N^2)\), which is expensive when length(x) is large.
Under these circumstances
it is not advisable to set exact = TRUE
or max.n to a very
large number.
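To give a feel for this trade-off, here is a minimal base-R sketch rather than the package's own code. tau_subsample is a hypothetical helper. It assumes the random sample is drawn over observations (x[i], y[i]) without replacement, and it relies on stats::cor(method = "kendall"), which agrees with \((c-d)/(c+d)\) only when there are no ties. The default max.n = 1000 is an arbitrary illustrative value.

```r
# Number of pairwise comparisons grows quadratically with N.
choose(5000, 2)   # 12497500 comparisons for N = 5000
choose(1000, 2)   # 499500 comparisons after subsampling to 1000

# Hypothetical helper: approximate tau from a random subsample of observations.
tau_subsample <- function(x, y, max.n = 1000) {
  stopifnot(length(x) == length(y))
  N <- length(x)
  if (N > max.n) {
    keep <- sample.int(N, size = max.n)  # sample indices without replacement
    x <- x[keep]
    y <- y[keep]
  }
  cor(x, y, method = "kendall")          # base R estimate on the retained data
}
```

Because the subsample is random, repeated calls on the same data return slightly different values whenever length(x) exceeds max.n.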
Examples
library(VGAM)  # Assumption: kendall.tau() and rbinorm() are taken to come from the VGAM package
N <- 5000; x <- 1:N; y <- runif(N)
true.rho <- -0.8
ymat <- rbinorm(N, cov12 = true.rho) # Bivariate normal, aka N_2
x <- ymat[, 1]
y <- ymat[, 2]
if (FALSE) plot(x, y, col = "blue") # \dontrun{}
kendall.tau(x, y) # A random sample is taken here
#> [1] -0.5892622
kendall.tau(x, y) # A random sample is taken here
#> [1] -0.5883819
kendall.tau(x, y, exact = TRUE) # Costly if length(x) is large
#> [1] -0.5904274
kendall.tau(x, y, max.n = N) # Same as exact = TRUE
#> [1] -0.5904274
(rhohat <- sin(kendall.tau(x, y) * pi / 2)) # Holds for N_2 actually
#> [1] -0.8030989
true.rho # rhohat should be near this value
#> [1] -0.8
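The last two calls use the relation between Kendall's tau and the correlation of a bivariate normal distribution, \(\tau = (2/\pi) \arcsin(\rho)\), inverted as \(\rho = \sin(\pi \tau / 2)\). As a rough cross-check that needs only base R, the sketch below simulates a bivariate normal sample by hand and uses stats::cor(method = "kendall") in place of kendall.tau(). The seed and sample size are arbitrary choices made for illustration.

```r
set.seed(123)                            # arbitrary seed, for reproducibility only
n <- 2000                                # arbitrary sample size
rho <- -0.8
z1 <- rnorm(n)
z2 <- rnorm(n)
x <- z1
y <- rho * z1 + sqrt(1 - rho^2) * z2     # (x, y) is bivariate normal with correlation rho

tau_hat <- cor(x, y, method = "kendall") # base R estimate of Kendall's tau
rho_hat <- sin(tau_hat * pi / 2)         # back-transform to the correlation scale
c(tau = tau_hat, rho_from_tau = rho_hat, pearson = cor(x, y))

2 / pi * asin(rho)                       # population value of tau under N_2
```

Both rho_hat and the Pearson estimate should land close to rho, and tau_hat close to the population value on the last line. The relation is specific to the bivariate normal case and need not hold for other distributions.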