Robust slope estimator
Computes the Theil-Sen median slope, Siegel's repeated median slope or the equivariant Passing-Bablok slope. The algorithms run in expected linearithmic time while requiring \(O(n)\) storage. They are based on Dillencourt et al. (1992), Matousek et al. (1998) and Raymaekers and Dufey (2022).
Usage
robslope(formula, data, subset, weights, na.action,
         type = c("TheilSen", "RepeatedMedian", "PassingBablok"),
         alpha = NULL, beta = NULL, verbose = TRUE)
Arguments
- formula
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
- data
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which robslope is called.
- subset
an optional vector specifying a subset of observations to be used in the fitting process.
- weights
an optional vector of weights to be used in the fitting process. Currently not supported.
- na.action
a function which indicates what should happen when the data contain NAs. The default na.exclude is applied and an informative message is given in case NAs were removed.
- type
the type of robust slope estimator. Should be one of "TheilSen" (default), "RepeatedMedian" or "PassingBablok".
- alpha
Determines the order statistic of the target slope. Defaults to the upper median. See below for details.
- beta
Determines the inner order statistic. Only used when type = "RepeatedMedian". See below for details.
- verbose
Whether or not to print out the progress of the algorithm. Defaults to TRUE.
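The na.action entry above mentions na.exclude. As a small illustration of what that action does in base R, the sketch below uses lm (not robslope) on a made-up data frame, df_na, with one missing value. Whether robslope pads its residuals in exactly the same way is an assumption that this page does not confirm.

```r
# Hypothetical data with one missing x value (row 3).
df_na <- data.frame(x = c(1, 2, NA, 4, 5),
                    y = c(2.1, 3.9, 6.2, 8.1, 9.8))

# na.exclude drops incomplete rows for fitting, but remembers their
# positions so that residuals and fitted values are padded with NA.
fit <- lm(y ~ x, data = df_na, na.action = na.exclude)

length(residuals(fit))   # same length as the original data
is.na(residuals(fit))    # TRUE only in the position of the dropped row
```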
Details
This function provides a wrapper around robslope.fit, which in turn calls the individual functions TheilSen, RepeatedMedian or PassingBablok. The details on changing the parameters alpha and beta can be found in the documentation of those respective functions.
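To make the targets of two of these estimators concrete, here is a minimal brute-force sketch in base R. It is not the algorithm used by the package, which runs in expected linearithmic time; it takes quadratic time and is meant only to show what is being computed. The helper names (upper_median, theil_sen_naive, repeated_median_naive) are hypothetical, and the sketch assumes distinct x values and that both the outer and the inner order statistic are the upper median.

```r
# Upper median: the middle value for odd lengths, the larger of the
# two middle values for even lengths.
upper_median <- function(v) {
  v <- sort(v)
  v[floor(length(v) / 2) + 1]
}

# Brute-force Theil-Sen slope: upper median of all pairwise slopes.
theil_sen_naive <- function(x, y) {
  idx <- combn(length(x), 2)  # all index pairs (i, j) with i < j
  slopes <- (y[idx[2, ]] - y[idx[1, ]]) / (x[idx[2, ]] - x[idx[1, ]])
  upper_median(slopes)
}

# Brute-force repeated median slope: for each point, take the upper
# median of the slopes to all other points, then take the upper
# median of those per-point values.
repeated_median_naive <- function(x, y) {
  inner <- vapply(seq_along(x), function(i) {
    upper_median((y[-i] - y[i]) / (x[-i] - x[i]))
  }, numeric(1))
  upper_median(inner)
}

# Points on the line y = 2x + 1, with one gross outlier.
x <- c(1, 2, 3, 4, 5, 6)
y <- 2 * x + 1
y[6] <- 100

theil_sen_naive(x, y)
repeated_median_naive(x, y)
```

In this toy data set both functions recover the slope of the uncontaminated points, which is the kind of robustness to outliers that motivates these estimators.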
Value
robslope returns an object of class "lm".
The generic accessor functions coefficients, fitted.values and residuals extract various useful features of the value returned by robslope.
References
Theil, H. (1950). A rank-invariant method of linear and polynomial regression analysis (Parts 1-3). Ned. Akad. Wetensch. Proc. Ser. A, 53, 386-392, 521-525, 1397-1412.
Sen, P. K. (1968). Estimates of the regression coefficient based on Kendall's tau. Journal of the American Statistical Association, 63(324), 1379-1389.
Dillencourt, M. B., Mount, D. M., & Netanyahu, N. S. (1992). A randomized algorithm for slope selection. International Journal of Computational Geometry & Applications, 2(01), 1-27.
Siegel, A. F. (1982). Robust regression using repeated medians. Biometrika, 69(1), 242-244.
Matousek, J., Mount, D. M., & Netanyahu, N. S. (1998). Efficient randomized algorithms for the repeated median line estimator. Algorithmica, 20(2), 136-150.
Passing, H., & Bablok, W. (1983). A new biometrical procedure for testing the equality of measurements from two different analytical methods. Application of linear regression procedures for method comparison studies in clinical chemistry, Part I. Journal of Clinical Chemistry and Clinical Biochemistry, 21, 709-720.
Bablok, W., Passing, H., Bender, R., & Schneider, B. (1988). A general regression procedure for method transformation. Application of linear regression procedures for method comparison studies in clinical chemistry, Part III. Journal of Clinical Chemistry and Clinical Biochemistry, 26, 783-790.
Raymaekers, J., & Dufey, F. (2022). Equivariant Passing-Bablok regression in quasilinear time.
Raymaekers, J. (2023). robslopes: Efficient Computation of the (Repeated) Median Slope. The R Journal.
Examples
set.seed(123)
df <- data.frame(cbind(rnorm(20), rnorm(20)))
colnames(df) <- c("x", "y")
robslope.out <- robslope(y ~ x, data = df,
                         type = "RepeatedMedian", verbose = TRUE)
#> Initialization finished, starting interval contraction.
#> Interval contraction ended after 0 iterations.
#> Now starting brute-force computation.
#> Algorithm finished
coef(robslope.out)
#> (intercept) x
#> 0.08928224 0.07688021
plot(fitted.values(robslope.out))
robslope.out <- robslope(y ~ x, data = df,
                         type = "TheilSen", verbose = TRUE)
#> Initialization finished, starting interval contraction.
#> Interval contraction ended after 0 iterations.
#> Now starting brute-force computation.
#> Algorithm finished
plot(residuals(robslope.out))