Ordinal Regression with Stopping Ratios
Fits a stopping ratio logit/probit/cloglog/cauchit/... regression model to an ordered (preferably) factor response.
Usage
sratio(link = "logitlink", parallel = FALSE, reverse = FALSE,
       zero = NULL, ynames = FALSE, Thresh = NULL, Trev = reverse,
       Tref = if (Trev) "M" else 1, Intercept = NULL, whitespace = FALSE)
Arguments
- link
Link function applied to the \(M\) stopping ratio probabilities. See Links for more choices.
- parallel
A logical, or formula specifying which terms have equal/unequal coefficients.
- reverse
Logical. By default, the stopping ratios used are \(\eta_j = \mathrm{logit}(P[Y=j|Y \geq j])\) for \(j=1,\dots,M\). If reverse is TRUE, then \(\eta_j = \mathrm{logit}(P[Y=j+1|Y \leq j+1])\) will be used.
- ynames
See multinomial for information.
- zero
Can be an integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. The values must be from the set {1,2,...,\(M\)}. The default value means none are modelled as intercept-only terms. See CommonVGAMffArguments for information.
- Thresh, Trev, Tref, Intercept
See cumulative for information. These arguments apply to ordinal categorical regression models.
- whitespace
See CommonVGAMffArguments for information.
Details
In this help file the response \(Y\) is assumed to be a factor with ordered values \(1,2,\dots,M+1\), so that \(M\) is the number of linear/additive predictors \(\eta_j\).
Several definitions of the continuation ratio appear in the literature. To avoid ambiguity, the VGAM package distinguishes continuation ratios (see cratio) from stopping ratios. Continuation ratios deal with quantities such as logitlink(P[Y>j|Y>=j]), whereas stopping ratios deal with quantities such as logitlink(P[Y=j|Y>=j]).
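As a sketch of how the stopping ratio probabilities determine the category probabilities (for the default reverse = FALSE), write \(\theta_j = P[Y=j|Y \geq j]\) for \(j=1,\dots,M\). The symbol \(\theta_j\) is notation introduced here for illustration only; it does not come from the package. Because \(P[Y \geq j] = \prod_{k=1}^{j-1} (1-\theta_k)\), the category probabilities follow as:

```latex
\[
\begin{aligned}
P[Y = 1]   &= \theta_1, \\
P[Y = j]   &= \theta_j \prod_{k=1}^{j-1} (1 - \theta_k), \qquad j = 2, \dots, M, \\
P[Y = M+1] &= \prod_{k=1}^{M} (1 - \theta_k).
\end{aligned}
\]
```

These \(M+1\) probabilities sum to one, since each term removes the share \(\theta_j\) from what remains of \(P[Y \geq j]\).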
Value
An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.
References
Agresti, A. (2013). Categorical Data Analysis, 3rd ed. Hoboken, NJ, USA: Wiley.
Boersch-Supan, P. H. (2021). Modeling insect phenology using ordinal regression and continuation ratio models. ReScience C, 7.1, 1–14.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Tutz, G. (2012). Regression for Categorical Data. Cambridge: Cambridge University Press.
Yee, T. W. (2010). The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1–34. doi:10.18637/jss.v032.i10.
Note
The response should be either a matrix of counts (with row sums that are all positive), or a factor. In both cases, the y slot returned by vglm/vgam/rrvglm is the matrix of counts.
For a nominal (unordered) factor response, the multinomial
logit model (multinomial) is more appropriate.
Here is an example of the usage of the parallel argument. If there are covariates x1, x2 and x3, then parallel = TRUE ~ x1 + x2 - 1 and parallel = FALSE ~ x3 are equivalent. Either form constrains the regression coefficients for x1 and x2 to be equal across the linear predictors; those of the intercepts and x3 would be different.
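The formula form of parallel can be tried on the pneumo data used in the Examples section. The following is a minimal sketch, assuming the VGAM package is installed; the object names fit_formula and fit_logical are illustrative only. With a single covariate, let, the formula form should reproduce the fit obtained with parallel = TRUE, because the intercepts are left unconstrained in both cases.

```r
library(VGAM)  # assumption: the VGAM package is installed

pneumo <- transform(pneumo, let = log(exposure.time))

# Formula form: apply the parallelism constraint to 'let' only;
# the "- 1" leaves the intercepts out of the constraint.
fit_formula <- vglm(cbind(normal, mild, severe) ~ let,
                    sratio(parallel = TRUE ~ let - 1), data = pneumo)

# Logical form, as in the Examples section.
fit_logical <- vglm(cbind(normal, mild, severe) ~ let,
                    sratio(parallel = TRUE), data = pneumo)

# A one-column constraint matrix means one shared slope for 'let'.
constraints(fit_formula)$let

# The two fits are expected to agree.
all.equal(coef(fit_formula), coef(fit_logical))
```

With more covariates, the formula form lets some terms share a coefficient while others vary across the linear predictors, which a single logical value cannot express.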
Warning
No check is made to verify that the response is ordinal if the response is a matrix; see ordered.
Boersch-Supan (2021) considers a sparse data set (called budworm) and the numerical problems encountered when fitting models such as cratio, sratio and cumulative. Although improvements to links such as clogloglink have been made, these family functions are not yet adapted to handle sparse data as well as they could be.
Examples
pneumo <- transform(pneumo, let = log(exposure.time))
(fit <- vglm(cbind(normal, mild, severe) ~ let,
             sratio(parallel = TRUE), data = pneumo))
#>
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let, family = sratio(parallel = TRUE),
#>     data = pneumo)
#>
#>
#> Coefficients:
#> (Intercept):1 (Intercept):2           let
#>      8.733797      8.051302     -2.321359
#>
#> Degrees of Freedom: 16 Total; 13 Residual
#> Residual deviance: 7.626763
#> Log-likelihood: -26.39023
coef(fit, matrix = TRUE)
#>             logitlink(P[Y=1|Y>=1]) logitlink(P[Y=2|Y>=2])
#> (Intercept)               8.733797               8.051302
#> let                      -2.321359              -2.321359
constraints(fit)
#> $`(Intercept)`
#>      [,1] [,2]
#> [1,]    1    0
#> [2,]    0    1
#>
#> $let
#>      [,1]
#> [1,]    1
#> [2,]    1
#>
predict(fit)
#>   logitlink(P[Y=1|Y>=1]) logitlink(P[Y=2|Y>=2])
#> 1              4.6531774              3.9706824
#> 2              2.4474398              1.7649448
#> 3              1.6117442              0.9292491
#> 4              1.0403809              0.3578859
#> 5              0.5822388             -0.1002563
#> 6              0.1997827             -0.4827124
#> 7             -0.1538548             -0.8363499
#> 8             -0.4160301             -1.0985252
predict(fit, untransform = TRUE)
#>   P[Y=1|Y>=1] P[Y=2|Y>=2]
#> 1   0.9905587   0.9814886
#> 2   0.9203740   0.8538279
#> 3   0.8336534   0.7169229
#> 4   0.7389235   0.5885286
#> 5   0.6415824   0.4749569
#> 6   0.5497802   0.3816118
#> 7   0.4616120   0.3023041
#> 8   0.3974671   0.2500163
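The untransformed predictions above are conditional probabilities P[Y=j|Y>=j], not category probabilities. The following base R sketch converts them by hand, using rows 1 and 8 copied from the output above. The helper sratio_to_probs is hypothetical and not part of VGAM, and it assumes the default reverse = FALSE. Its result would be expected to agree, up to rounding, with the fitted category probabilities of the model.

```r
# Stopping ratio probabilities copied from rows 1 and 8 of the
# predict(fit, untransform = TRUE) output.
theta <- rbind("1" = c(0.9905587, 0.9814886),
               "8" = c(0.3974671, 0.2500163))

# Hypothetical helper (not part of VGAM): convert stopping ratio
# probabilities P[Y=j|Y>=j] into category probabilities P[Y=j].
sratio_to_probs <- function(theta) {
  theta <- as.matrix(theta)
  # Column j holds P[Y>=j], the running product of (1 - theta_k) for k < j.
  at_risk <- t(apply(cbind(1, 1 - theta), 1, cumprod))
  # P[Y=j] = P[Y>=j] * theta_j; the last category takes what remains.
  probs <- at_risk * cbind(theta, 1)
  colnames(probs) <- paste0("P[Y=", seq_len(ncol(probs)), "]")
  probs
}

probs <- sratio_to_probs(theta)
probs
rowSums(probs)  # each row sums to 1
```

Working on the matrix of conditional probabilities keeps the sketch independent of the fitted object, so it runs without VGAM loaded.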