Ordinal Regression Model Fitter
Fits ordinal cumulative probability models for continuous or ordinal
response variables, efficiently allowing for a large number of
intercepts by capitalizing on the information matrix being sparse.
Five different distribution functions are implemented, with the
default being the logistic (yielding the proportional odds
model). Penalized estimation and weights are also implemented, as in `lrm.fit()`.
The optimization method is Newton-Raphson with step-halving, or the Levenberg-Marquardt method.
The latter has been shown to converge better when there are large offsets.
Execution time is fast even for hundreds of thousands of intercepts. The limiting factor
is the number of intercepts times the number of columns of x.
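As a rough illustration of the five distribution functions mentioned above, the sketch below writes each cumulative distribution function in base R. The list name cdfs is purely illustrative and is not an object from the package. The pairing of the family strings with each distribution (loglog with the Gumbel maximum, cloglog with the Gumbel minimum) is an assumption based on the formulas given in the family argument description.

```r
# Cumulative distribution functions for the five families, in base R only.
# `cdfs` is an illustrative name, not something exported by the package.
cdfs <- list(
  logistic = plogis,                         # 1 / (1 + exp(-x)), the default
  probit   = pnorm,                          # Gaussian
  loglog   = function(x) exp(-exp(-x)),      # Gumbel maximum
  cloglog  = function(x) 1 - exp(-exp(x)),   # Gumbel minimum
  cauchit  = pcauchy                         # Cauchy
)

# Evaluate each CDF on a small grid of linear-predictor values
lp <- c(-2, 0, 2)
probs <- sapply(cdfs, function(f) f(lp))
rownames(probs) <- paste0("lp=", lp)
print(round(probs, 3))
```

The symmetric families (logistic, probit, cauchit) give 0.5 at a linear predictor of zero, while the two Gumbel families do not, which is one way to see that they imply different shapes for the model.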
Usage
orm.fit(x=NULL, y, family=c("logistic","probit","loglog","cloglog","cauchit"),
offset, initial, opt_method=c('NR', 'LM'),
maxit=30L, eps=5e-4, gradtol=0.001, abstol=1e10,
minstepsize=0.01, tol=.Machine$double.eps, trace=FALSE,
penalty.matrix=NULL, weights=NULL, normwt=FALSE, scale=FALSE, mscore=FALSE,
inclpen=TRUE, y.precision = 7, compstats=TRUE, onlydata=FALSE, ...)
Arguments
- x
design matrix with no column for an intercept
- y
response vector, numeric, factor, or character. The ordering of levels is assumed from
factor(y).
- family
a character value specifying the distribution family, corresponding to logistic (the default), Gaussian, Cauchy, Gumbel maximum (\(\exp(-\exp(-x))\); extreme value type I), and Gumbel minimum (\(1-\exp(-\exp(x))\)) distributions. These are the cumulative distribution functions assumed for \(Prob[Y \ge y | X]\). The family argument can be an unquoted or a quoted string, e.g. family=loglog or family="loglog". To use a built-in family, the string must be one of the following, corresponding in order to the previous list: logistic, probit, cauchit, loglog, cloglog.
- offset
optional numeric vector containing an offset on the logit scale
- initial
vector of initial parameter estimates, beginning with the intercepts. If initial is not specified, the function computes the overall score \(\chi^2\) test for the global null hypothesis of no regression. initial is padded to the right with zeros for the regression coefficients, if needed. When censoring is present, initial can also be a list with elements time and surv from the npsurv attribute of the y element of a previous fit. This is useful when bootstrapping, for example.
- opt_method
set to "LM" to use Levenberg-Marquardt instead of the default Newton-Raphson
- maxit
maximum number of iterations (default=30).
- eps
difference in \(-2 \log\) likelihood for declaring convergence. Default is .0005. This handles the case where the initial estimates are MLEs, to prevent endless step-halving.
- gradtol
maximum absolute gradient before convergence can be declared. gradtol is automatically scaled by n / 1000 since the gradient is proportional to the sample size.
- abstol
maximum absolute change in parameter estimates from one iteration to the next before convergence can be declared; by default has no effect
- minstepsize
used to specify when to abandon step-halving
- tol
Singularity criterion. Default is typically 2e-16
- trace
set to TRUE to print -2 log likelihood, step-halving fraction, change in -2 log likelihood, maximum absolute value of first derivative, and maximum absolute change in parameter estimates at each iteration.
- penalty.matrix
a self-contained ready-to-use penalty matrix - see lrm
- weights
a vector (same length as y) of possibly fractional case weights
- normwt
set to TRUE to scale weights so they sum to \(n\), the length of y; useful for sample surveys as opposed to the default of frequency weighting
- mscore
set to TRUE to compute the sparse score matrix and store its elements as a list mscore
- scale
set to TRUE to subtract column means and divide by column standard deviations of x before fitting, and to back-solve for the un-normalized covariance matrix and regression coefficients. This can sometimes make the model converge for very large sample sizes where, for example, spline or polynomial component variables create scaling problems leading to loss of precision when accumulating sums of squares and crossproducts.
- inclpen
set to FALSE to not include the penalty matrix in the Hessian when the Hessian is being computed on transformed x, vs. adding the penalty after back-transforming. This should not matter.
- y.precision
When y is numeric, values may need to be rounded to avoid unpredictable behavior with unique() with floating-point numbers. Default is to round to 7 decimal places.
- compstats
set to FALSE to prevent the calculation of the vector of model statistics
- onlydata
set to TRUE to return the data used in model fitting as a list, without fitting the model
- ...
ignored
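Two of the preprocessing arguments above, y.precision and normwt, can be illustrated with base R alone. This is only a sketch of the ideas described, not the function's internal code: it assumes the rounding behaves like base round(), and the response values and weights are made up for illustration.

```r
# Illustrative values: the first two and the last two differ only by floating-point noise
y <- c(0.1 + 0.2, 0.3, 1 / 3, 0.3333333333)
n_raw <- length(unique(y))               # exact comparison keeps near-duplicates apart

# Assumption: rounding behaves like base round(); 7 mirrors the documented default
y_rounded <- round(y, 7)
n_rounded <- length(unique(y_rounded))
print(c(raw = n_raw, rounded = n_rounded))

# normwt = TRUE is described as rescaling weights so they sum to n, the length of y
w <- c(0.5, 2, 1.5, 4)                   # hypothetical survey weights
w_norm <- w * length(y) / sum(w)
print(sum(w_norm))
```

Because each distinct y value contributes an intercept, spurious near-duplicates would inflate the number of intercepts, which is why rounding before calling unique() matters for a continuous response.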
Value
a list with the following components, not counting all the components produced by `orm.fit`:
- call
calling expression
- freq
table of frequencies for y in order of increasing y
- yunique
vector of sorted unique values of y
- stats
vector with the following elements: number of observations used in the fit, number of unique y values, median y from among the observations used in the fit, maximum absolute value of first derivative of log likelihood, model likelihood ratio chi-square, d.f., P-value, score chi-square and its P-value, Spearman's \(\rho\) rank correlation between linear predictor and y (if there is no censoring), Somers' \(D_{xy}\) rank correlation (if there is no censoring or only right censoring), the Nagelkerke \(R^2\) index, other \(R^2\) measures, the \(g\)-index, \(g_r\) (the \(g\)-index on the ratio scale), and \(pdm\) (the mean absolute difference between 0.5 and the estimated probability that \(y\geq\) the marginal median). When penalty.matrix is present, the \(\chi^2\), d.f., and P-value are not corrected for the effective d.f.
- fail
set to TRUE if convergence failed (and maxit>1)
- coefficients
estimated parameters
- family, famfunctions
see orm
- deviance
-2 log likelihoods. When an offset variable is present, three deviances are computed: for intercept(s) only, for intercepts+offset, and for intercepts+offset+predictors. When there is no offset variable, the vector contains deviances for the intercept(s)-only model and the model with intercept(s) and predictors.
- lpe
vector of per-observation likelihood probability elements. An observation's contribution to the log likelihood is the log of lpe.
- non.slopes
number of intercepts in model
- interceptRef
the index of the middle (median) intercept used in computing the linear predictor and var
- linear.predictors
the linear predictor using the first intercept
- penalty.matrix
see above
- info.matrix
see orm
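A minimal usage sketch follows, using simulated data and only the x, y, and family arguments from the Usage section. The data-generating model and variable names are invented for illustration, and the call is guarded so the code still runs when the rms package is not installed. It assumes the coefficient vector lists the intercepts first, as described for initial.

```r
set.seed(1)
n <- 300
# Design matrix with no intercept column (hypothetical predictors)
x <- cbind(age = rnorm(n, 50, 10), treat = rbinom(n, 1, 0.5))
# Continuous response rounded to one decimal, giving many distinct levels
y <- round(0.03 * x[, "age"] + 0.5 * x[, "treat"] + rlogis(n), 1)

if (requireNamespace("rms", quietly = TRUE)) {
  fit <- rms::orm.fit(x, y, family = "logistic")
  k <- fit$non.slopes                     # number of intercepts
  print(k)
  # Assumption: regression coefficients follow the k intercepts
  print(fit$coefficients[-seq_len(k)])
  print(fit$deviance)                     # -2 log likelihoods
  print(fit$fail)                         # TRUE if convergence failed
}
```

With a finely graded response like this one, the number of intercepts is large relative to the number of predictors, which is the situation the sparse information matrix is meant to handle.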