Check Parallelism Assumption of Ordinal Semiparametric Models
Source: R/ordParallel.r
ordParallel.Rd
orm models are refitted as a series of binary models for a sequence of cutoffs
on the dependent variable. Regression coefficients from this sequence are plotted
against cutoffs using ggplot2 with one panel per regression coefficient.
When censoring is present, it is not always possible to determine whether Y is
greater than or equal to the current cutoff, and such
observations are ignored.
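The idea of refitting an ordinal model as a series of binary models can be sketched in base R. This is only an illustration under stated assumptions, not the code ordParallel runs: the data are simulated, the variable names (x1, x2, y) and the three cutoffs are made up, and glm() stands in for the binary refits.

```r
# Simulated data in which proportional odds holds by construction.
# All names and cutoffs here are illustrative assumptions.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- round(x1 + 0.5 * x2 + rlogis(n))

cuts <- c(-1, 0, 1)

# One binary logistic fit per cutoff, modeling Pr(Y >= cut)
betas <- t(sapply(cuts, function(ct) {
  coef(glm(I(y >= ct) ~ x1 + x2, family = binomial))[c("x1", "x2")]
}))
betas <- data.frame(cut = cuts, betas)

# Under parallelism, each column of slopes should be roughly constant over cuts
print(betas)
```

Plotting each slope column against cut gives a picture in the same spirit as the panels described above.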
Usage
ordParallel(
fit,
which,
terms = onlydata,
m,
maxcuts = 75,
lp = FALSE,
onlydata = FALSE,
scale = c("iqr", "none"),
conf.int = 0.95,
alpha = 0.15
)
Arguments
- fit
a fit object from orm with x=TRUE, y=TRUE in effect
- which
specifies which columns of the design matrix are assessed. By default, all columns are analyzed.
- terms
set to TRUE to collapse all components of each predictor into a single column weighted by the original regression coefficients but scaled according to scale. This means that each predictor will have a regression coefficient of 1.0 when refitting the original model on this transformed X matrix, before any further scaling. Plots will then show the relative effects over time, i.e., the slope of these combined columns over cuts on Y, so that deviations indicate non-parallelism. But since in this case only relative effects are shown, a weak predictor may be interpreted as having an exaggerated y-dependency if scale='none'. terms defaults to TRUE when onlydata=TRUE.
- m
the lowest cutoff is chosen as the first Y value having at least m observations to its left, and the highest cutoff is chosen so that there are at least m observations to the right of it. Cutoffs are equally spaced between these values. If omitted, m is set to the minimum of 50 and one quarter of the sample size.
- maxcuts
the maximum number of cutoffs analyzed
- lp
plot the effect of the linear predictor across cutpoints instead of analyzing individual predictors
- onlydata
set to TRUE to return a data frame suitable for modeling effects of cuts, instead of constructing a graph. The returned data frame has variables Ycut, Yge_cut, obs, and the original names of the predictors. Ycut has the cutpoint on the original scale. Yge_cut is TRUE/FALSE depending on whether the Y variable is greater than or equal to Ycut, with NA if censoring prevented this determination. The obs variable is useful for passing as the cluster argument to robcov() to account for the high correlations in regression coefficients across cuts. See the example which computes Wald tests for parallelism where the Ycut dependence involves a spline function. But since terms was used in that example, each predictor is reduced to a single degree of freedom.
- scale
applies to terms=TRUE; set to 'none' to leave the predictor terms scaled by regression coefficient so the coefficient of each term in the overall fit is 1.0. The default is to scale terms by the interquartile range (Gini's mean difference if IQR is zero) of the term. This prevents changes in weak predictors over different cutoffs from being impressive.
- conf.int
confidence level for computing Wald confidence intervals for regression coefficients. Set to 0 to suppress confidence bands.
- alpha
saturation for confidence bands
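One possible reading of the cutoff rule described under m and maxcuts is sketched below in base R. The helper choose_cuts() is hypothetical and is not part of rms. It assumes that "to its left" and "to the right" mean strictly smaller and strictly larger Y values, and that the cutoffs are equally spaced on the Y scale; the package's actual implementation may differ on both points.

```r
# Hypothetical helper, not an rms function: pick cutoffs on y so that at
# least m observations lie strictly below the lowest cutoff and at least
# m lie strictly above the highest one.
choose_cuts <- function(y, m = min(50, floor(length(y) / 4)), maxcuts = 75) {
  u       <- sort(unique(y))
  n_left  <- sapply(u, function(v) sum(y < v))   # observations left of each value
  n_right <- sapply(u, function(v) sum(y > v))   # observations right of each value
  lo <- min(u[n_left  >= m])                     # first value with >= m to its left
  hi <- max(u[n_right >= m])                     # last value with >= m to its right
  seq(lo, hi, length.out = maxcuts)              # assumed: equal spacing on the Y scale
}

cuts <- choose_cuts(1:200, maxcuts = 5)
print(cuts)
```

With y = 1:200 the default m is 50, so the sketch places the lowest cutoff at 51 and the highest at 150.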
Details
Whenever a cut gives rise to an extremely high standard error for a regression coefficient,
the confidence limits are set to NA. Unreasonable standard errors are determined from
the confidence interval width exceeding 7 times the standard error at the middle Y cut.
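The screening rule in Details can be illustrated with a few made-up numbers. This is a sketch under assumptions, not the package's code: the coefficient and standard error vectors are invented, the intervals are ordinary 0.95 Wald intervals, and the "middle Y cut" is taken to be the middle element of the vector.

```r
# Invented estimates at five cuts; the last cut has an inflated standard error
beta <- c(0.50, 0.52, 0.49, 0.51, 0.60)
se   <- c(0.10, 0.10, 0.10, 0.10, 5.00)

z     <- qnorm(0.975)             # multiplier for 0.95 Wald limits
lower <- beta - z * se
upper <- beta + z * se

# Flag cuts whose interval width exceeds 7 times the SE at the middle cut
mid_se <- se[ceiling(length(se) / 2)]
bad    <- (upper - lower) > 7 * mid_se

# Blank out the limits for flagged cuts so they are not drawn
lower[bad] <- NA
upper[bad] <- NA
print(data.frame(beta, lower, upper, bad))
```

Only the fifth cut is flagged here, so its limits become NA while the point estimate is kept.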
Examples
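if (FALSE) { # \dontrun{
require(rms)      # orm, robcov, contrast, datadist; attaches Hmisc (getHdata, .q)
require(ggplot2)  # needed for the ggplot() call at the end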
f <- orm(..., x=TRUE, y=TRUE)
ordParallel(f, which=1:5) # first 5 betas
getHdata(nhgh)
set.seed(1)
nhgh$ran <- runif(nrow(nhgh))
f <- orm(gh ~ rcs(age, 4) + ran, data=nhgh, x=TRUE, y=TRUE)
ordParallel(f) # one panel per parameter (multiple parameters per predictor)
dd <- datadist(nhgh); options(datadist='dd')
ordParallel(f, terms=TRUE)
d <- ordParallel(f, maxcuts=30, onlydata=TRUE)
dd2 <- datadist(d); options(datadist='dd2') # needed for plotting
g <- orm(Yge_cut ~ (age + ran) * rcs(Ycut, 4), data=d, x=TRUE, y=TRUE)
h <- robcov(g, d$obs)
anova(h)
qu <- quantile(d$age, c(1, 3)/4)
qu
cuts <- sort(unique(d$Ycut))
cuts
z <- contrast(h, list(age=qu[2], Ycut=cuts),
                 list(age=qu[1], Ycut=cuts))
z <- as.data.frame(z[.q(Ycut, Contrast, Lower, Upper)])
ggplot(z, aes(x=Ycut, y=Contrast)) + geom_line() +
  geom_ribbon(aes(ymin=Lower, ymax=Upper), alpha=0.2)
} # }