Cumulative link mixed models
clmmOld.RdFits cumulative link mixed models, i.e. cumulative link models with
random effects via the Laplace approximation or the standard and the
adaptive Gauss-Hermite quadrature approximation. The functionality in
clm2 is also implemented here. Currently only a single
random term is allowed in the location-part of the model.
A new implementation is available in clmm that allows
for more than one random effect.
Usage
clmm2(location, scale, nominal, random, data, weights, start, subset,
na.action, contrasts, Hess = FALSE, model = TRUE, sdFixed,
link = c("logistic", "probit", "cloglog", "loglog",
"cauchit", "Aranda-Ordaz", "log-gamma"), lambda,
doFit = TRUE, control, nAGQ = 1,
threshold = c("flexible", "symmetric", "equidistant"), ...)Arguments
- location
as in
clm2.- scale
as in
clm2.- nominal
as in
clm2.- random
a factor for the random effects in the location-part of the model.
- data
as in
clm2.- weights
as in
clm2.- start
initial values for the parameters in the format
c(alpha, beta, log(zeta), lambda, log(stDev))wherestDevis the standard deviation of the random effects.- subset
as in
clm2.- na.action
as in
clm2.- contrasts
as in
clm2.- Hess
logical for whether the Hessian (the inverse of the observed information matrix) should be computed. Use
Hess = TRUEif you intend to callsummaryorvcovon the fit andHess = FALSEin all other instances to save computing time.- model
as in
clm2.- sdFixed
If
sdFixedis specified (a positive scalar), a model is fitted where the standard deviation for the random term is fixed at the value ofsdFixed. IfsdFixedis left unspecified, the standard deviation of the random term is estimated from data.- link
as in
clm2.- lambda
as in
clm2.- doFit
as in
clm2although it can also be one ofc("no", "R" "C"), where"R"use the R-implementation for fitting,"C"(default) use C-implementation for fitting and"no"behaves asFALSEand returns the environment.- control
a call to
clmm2.control.- threshold
as in
clm2.- nAGQ
the number of quadrature points to be used in the adaptive Gauss-Hermite quadrature approximation to the marginal likelihood. Defaults to
1which leads to the Laplace approximation. An odd number of quadrature points is encouraged and 3, 5 or 7 are usually enough to achive high precision. Negative values give the standard, i.e. non-adaptive Gauss-Hermite quadrature.- ...
additional arguments are passed on to
clm2.controland possibly further on to the optimizer, which can lead to surprising error or warning messages when mistyping arguments etc.
Details
There are methods for the standard model-fitting functions, including
summary, vcov,
profile,
plot.profile,
confint,
anova, logLik,
predict
and an extractAIC method.
A Newton scheme is used to obtain the conditional modes of the random
effects for Laplace and AGQ approximations, and a non-linear
optimization is performed over the fixed parameter set to get the
maximum likelihood estimates.
The Newton
scheme uses the observed Hessian rather than the expected as is done
in e.g. glmer, so results from the Laplace
approximation for binomial fits should in general be more precise -
particularly for other links than the "logistic".
Core parts of the function are implemented in C-code for speed.
The function calls clm2 to up an
environment and to get starting values.
Value
If doFit = FALSE the result is an environment
representing the model ready to be optimized.
If doFit = TRUE the result is an
object of class "clmm2" with the following components:
- stDev
the standard deviation of the random effects.
- Niter
the total number of iterations in the Newton updates of the conditional modes of the random effects.
- grFac
the grouping factor defining the random effects.
- nAGQ
the number of quadrature points used in the adaptive Gauss-Hermite Quadrature approximation to the marginal likelihood.
- ranef
the conditional modes of the random effects, sometimes referred to as "random effect estimates".
- condVar
the conditional variances of the random effects at their conditional modes.
- beta
the parameter estimates of the location part.
- zeta
the parameter estimates of the scale part on the log scale; the scale parameter estimates on the original scale are given by
exp(zeta).- Alpha
vector or matrix of the threshold parameters.
- Theta
vector or matrix of the thresholds.
- xi
vector of threshold parameters, which, given a threshold function (e.g.
"equidistant"), and possible nominal effects define the class boundaries,Theta.- lambda
the value of lambda if lambda is supplied or estimated, otherwise missing.
- coefficients
the coefficients of the intercepts (
theta), the location (beta), the scale (zeta), and the link function parameter (lambda).- df.residual
the number of residual degrees of freedoms, calculated using the weights.
- fitted.values
vector of fitted values conditional on the values of the random effects. Use
predictto get the fitted values for a random effect of zero. An observation here is taken to be each of the scalar elements of the multinomial table and not a multinomial vector.- convergence
TRUEif the optimizer terminates wihtout error andFALSEotherwise.- gradient
vector of gradients for the unit-variance random effects at their conditional modes.
- optRes
list with results from the optimizer. The contents of the list depends on the choice of optimizer.
- logLik
the log likelihood of the model at optimizer termination.
- Hessian
if the model was fitted with
Hess = TRUE, this is the Hessian matrix of the parameters at the optimum.- scale
model.framefor the scale model.- location
model.framefor the location model.- nominal
model.framefor the nominal model.- edf
the (effective) number of degrees of freedom used by the model.
- start
the starting values.
- method
character, the optimizer.
- y
the response variable.
- lev
the names of the levels of the response variable.
- nobs
the (effective) number of observations, calculated as the sum of the weights.
- threshold
character, the threshold function used in the model.
- estimLambda
1if lambda is estimated in one of the flexible link functions and0otherwise.- link
character, the link function used in the model.
- call
the matched call.
- contrasts
contrasts applied to terms in location and scale models.
- na.action
the function used to filter missing data.
Examples
options(contrasts = c("contr.treatment", "contr.poly"))
## More manageable data set:
dat <- subset(soup, as.numeric(as.character(RESP)) <= 24)
dat$RESP <- dat$RESP[drop=TRUE]
m1 <- clmm2(SURENESS ~ PROD, random = RESP, data = dat, link="probit",
Hess = TRUE, method="ucminf", threshold = "symmetric")
m1
#> Cumulative Link Mixed Model fitted with the Laplace approximation
#>
#> Call:
#> clmm2(location = SURENESS ~ PROD, random = RESP, data = dat,
#> Hess = TRUE, link = "probit", threshold = "symmetric", method = "ucminf")
#>
#> Random effects:
#> Var Std.Dev
#> RESP 0.05947608 0.2438772
#>
#> Location coefficients:
#> PRODTest
#> 1.090036
#>
#> No Scale coefficients
#>
#> Threshold coefficients:
#> central spacing.1 spacing.2
#> 0.03098235 0.21461394 0.74509589
#>
#> Thresholds:
#> 1|2 2|3 3|4 4|5 5|6
#> -0.71411354 -0.18363158 0.03098235 0.24559629 0.77607824
#>
#> log-likelihood: -283.6719
#> AIC: 577.3439
summary(m1)
#> Cumulative Link Mixed Model fitted with the Laplace approximation
#>
#> Call:
#> clmm2(location = SURENESS ~ PROD, random = RESP, data = dat,
#> Hess = TRUE, link = "probit", threshold = "symmetric", method = "ucminf")
#>
#> Random effects:
#> Var Std.Dev
#> RESP 0.05947608 0.2438772
#>
#> Location coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> PRODTest 1.0900 0.1678 6.4953 8.2860e-11
#>
#> No scale coefficients
#>
#> Threshold coefficients:
#> Estimate Std. Error z value
#> central 0.0310 0.1302 0.2380
#> spacing.1 0.2146 0.0392 5.4682
#> spacing.2 0.7451 0.0681 10.9440
#>
#> log-likelihood: -283.6719
#> AIC: 577.3439
#> Condition number of Hessian: 239.9348
logLik(m1)
#> 'log Lik.' -283.6719 (df=5)
vcov(m1)
#> central spacing.1 spacing.2 PRODTest
#> central 0.0169459608 0.0001516004 0.0002206395 0.0140586784 -0.0005641255
#> spacing.1 0.0001516004 0.0015403682 0.0011567081 0.0008520655 0.0016377544
#> spacing.2 0.0002206395 0.0011567081 0.0046352050 0.0028731407 0.0068104462
#> PRODTest 0.0140586784 0.0008520655 0.0028731407 0.0281631367 0.0139822611
#> -0.0005641255 0.0016377544 0.0068104462 0.0139822611 0.2755472792
extractAIC(m1)
#> [1] 5.0000 577.3439
anova(m1, update(m1, location = SURENESS ~ 1, Hess = FALSE))
#> Likelihood ratio tests of cumulative link models
#>
#> Response: SURENESS
#> Model Resid. df -2logLik Test Df LR stat. Pr(Chi)
#> 1 1 | | 196 610.4108
#> 2 PROD | | 195 567.3439 1 vs 2 1 43.06696 5.289791e-11
anova(m1, update(m1, random = NULL))
#> Likelihood ratio tests of cumulative link models
#>
#> Response: SURENESS
#> Model Resid. df -2logLik Test Df LR stat. Pr(Chi)
#> 1 PROD | | 196 568.8292
#> 2 PROD | | 195 567.3439 1 vs 2 1 1.485361 0.2229376
## Use adaptive Gauss-Hermite quadrature rather than the Laplace
## approximation:
update(m1, Hess = FALSE, nAGQ = 3)
#> Cumulative Link Mixed Model fitted with the adaptive Gauss-Hermite
#> quadrature approximation with 3 quadrature points
#>
#> Call:
#> clmm2(location = SURENESS ~ PROD, random = RESP, data = dat,
#> Hess = FALSE, link = "probit", nAGQ = 3, threshold = "symmetric",
#> method = "ucminf")
#>
#> Random effects:
#> Var Std.Dev
#> RESP 0.05975237 0.244443
#>
#> Location coefficients:
#> PRODTest
#> 1.090143
#>
#> No Scale coefficients
#>
#> Threshold coefficients:
#> central spacing.1 spacing.2
#> 0.03098246 0.21462764 0.74515323
#>
#> Thresholds:
#> 1|2 2|3 3|4 4|5 5|6
#> -0.71417077 -0.18364519 0.03098246 0.24561010 0.77613569
#>
#> log-likelihood: -283.6692
#> AIC: 577.3384
## Use standard Gauss-Hermite quadrature:
update(m1, Hess = FALSE, nAGQ = -7)
#> Cumulative Link Mixed Model fitted with the Gauss-Hermite
#> quadrature approximation with 7 quadrature points
#>
#> Call:
#> clmm2(location = SURENESS ~ PROD, random = RESP, data = dat,
#> Hess = FALSE, link = "probit", nAGQ = -7, threshold = "symmetric",
#> method = "ucminf")
#>
#> Random effects:
#> Var Std.Dev
#> RESP 0.06023489 0.245428
#>
#> Location coefficients:
#> PRODTest
#> 1.090219
#>
#> No Scale coefficients
#>
#> Threshold coefficients:
#> central spacing.1 spacing.2
#> 0.0311355 0.2146261 0.7451619
#>
#> Thresholds:
#> 1|2 2|3 3|4 4|5 5|6
#> -0.7140264 -0.1834906 0.0311355 0.2457616 0.7762974
#>
#> log-likelihood: -283.6759
#> AIC: 577.3517
##################################################################
## Binomial example with the cbpp data from the lme4-package:
if(require(lme4)) {
cbpp2 <- rbind(cbpp[,-(2:3)], cbpp[,-(2:3)])
cbpp2 <- within(cbpp2, {
incidence <- as.factor(rep(0:1, each=nrow(cbpp)))
freq <- with(cbpp, c(incidence, size - incidence))
})
## Fit with Laplace approximation:
fm1 <- clmm2(incidence ~ period, random = herd, weights = freq,
data = cbpp2, Hess = 1)
summary(fm1)
## Fit with the adaptive Gauss-Hermite quadrature approximation:
fm2 <- clmm2(incidence ~ period, random = herd, weights = freq,
data = cbpp2, Hess = 1, nAGQ = 7)
summary(fm2)
}
#> Loading required package: lme4
#> Loading required package: Matrix
#> Cumulative Link Mixed Model fitted with the adaptive Gauss-Hermite
#> quadrature approximation with 7 quadrature points
#>
#> Call:
#> clmm2(location = incidence ~ period, random = herd, data = cbpp2,
#> weights = freq, Hess = 1, nAGQ = 7)
#>
#> Random effects:
#> Var Std.Dev
#> herd 0.4192826 0.6475204
#>
#> Location coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> period2 0.9914 0.3068 3.2318 0.00123027
#> period3 1.1278 0.3268 3.4514 0.00055762
#> period4 1.5795 0.4276 3.6938 0.00022089
#>
#> No scale coefficients
#>
#> Threshold coefficients:
#> Estimate Std. Error z value
#> 0|1 -1.3992 0.2335 -5.9921
#>
#> log-likelihood: -277.459
#> AIC: 564.918
#> Condition number of Hessian: 5.698143