Fitting Vector Generalized Linear Models
vglm.Rdvglm fits vector generalized linear models (VGLMs).
This very large class of models includes
generalized linear models (GLMs) as a special case.
Usage
vglm(formula,
family = stop("argument 'family' needs to be assigned"),
data = list(), weights = NULL, subset = NULL,
na.action, etastart = NULL, mustart = NULL,
coefstart = NULL, control = vglm.control(...), offset = NULL,
method = "vglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
contrasts = NULL, constraints = NULL, extra = list(),
form2 = NULL, qr.arg = TRUE, smart = TRUE, ...)Arguments
- formula
a symbolic description of the model to be fit. The RHS of the formula is applied to each linear predictor. The effect of different variables in each linear predictor can be controlled by specifying constraint matrices—see
constraintsbelow.- family
a function of class
"vglmff"(seevglmff-class) describing what statistical model is to be fitted. This is called a “VGAM family function”. SeeCommonVGAMffArgumentsfor general information about many types of arguments found in this type of function. The argument name"family"is used loosely and for the ease of existingglmusers; there is no concept of a formal “error distribution” for VGLMs. Possibly the argument name should be better"model"but unfortunately that name has already been taken.- data
an optional data frame containing the variables in the model. By default the variables are taken from
environment(formula), typically the environment from whichvglmis called.- weights
an optional vector or matrix of (prior fixed and known) weights to be used in the fitting process. If the VGAM family function handles multiple responses (\(Q > 1\) of them, say) then
weightscan be a matrix with \(Q\) columns. Each column matches the respective response. If it is a vector (the usually case) then it is recycled into a matrix with \(Q\) columns. The values ofweightsmust be positive; try setting a very small value such as1.0e-8to effectively delete an observation.Currently the
weightsargument supports sampling weights from complex sampling designs via svyVGAM. Some details can be found at https://CRAN.R-project.org/package=svyVGAM.- subset
an optional logical vector specifying a subset of observations to be used in the fitting process.
- na.action
a function which indicates what should happen when the data contain
NAs. The default is set by thena.actionsetting ofoptions, and isna.failif that is unset. The “factory-fresh” default isna.omitwhich is known as complete case analysis and applied to both sides of the formula.- etastart
optional starting values for the linear predictors. It is a \(M\)-column matrix with the same number of rows as the response. If \(M = 1\) then it may be a vector. Note that
etastartand the output ofpredict(fit)should be comparable. Here,fitis the fitted object. Almost all VGAM family functions are self-starting.- mustart
optional starting values for the fitted values. It can be a vector or a matrix; if a matrix, then it has the same number of rows as the response. Usually
mustartand the output offitted(fit)should be comparable. Most family functions do not make use of this argument because it is not possible to compute all \(M\) columns ofetafrommu.- coefstart
optional starting values for the coefficient vector. The length and order must match that of
coef(fit).- control
a list of parameters for controlling the fitting process. See
vglm.controlfor details.- offset
a vector or \(M\)-column matrix of offset values. These are a priori known and are added to the linear/additive predictors during fitting.
- method
the method to be used in fitting the model. The default (and presently only) method
vglm.fit()uses iteratively reweighted least squares (IRLS).- model
a logical value indicating whether the model frame should be assigned in the
modelslot.- x.arg, y.arg
logical values indicating whether the LM matrix and response vector/matrix used in the fitting process should be assigned in the
xandyslots. Note that the model matrix is the LM matrix; to get the VGLM matrix typemodel.matrix(vglmfit)wherevglmfitis avglmobject.- contrasts
an optional list. See the
contrasts.argofmodel.matrix.default.- constraints
an optional
listof constraint matrices. The components of the list must be named (labelled) with the term it corresponds to (and it must match in character format exactly—see below). There are two types of input:"lm"-type and"vlm"-type. The former is a subset of the latter. The former has a matrix for each term of the LM matrix. The latter has a matrix for each column of the big VLM matrix. After fitting, theconstraintsextractor function may be applied; it returns the"vlm"-type list of constraint matrices by default. If"lm"-type are returned byconstraintsthen these can be fed into this argument and it should give the same model as before.If the
constraintsargument is used then the family function'szeroargument (if it exists) needs to be set toNULL. This avoids what could be a probable contradiction. Sometimes setting other arguments related to constraint matrices toFALSEis also a good idea, e.g.,parallel = FALSE,exchangeable = FALSE.Properties: each constraint matrix must have \(M\) rows, and be of full-column rank. By default, constraint matrices are the \(M\) by \(M\) identity matrix unless arguments in the family function itself override these values, e.g.,
parallel(seeCommonVGAMffArguments). Ifconstraintsis used then it must contain all the terms; an incomplete list is not accepted.As mentioned above, the labelling of each constraint matrix must match exactly, e.g.,
list("s(x2,df=3)"=diag(2))will fail asas.character(~ s(x2,df=3))produces white spaces:"s(x2, df = 3)". Thuslist("s(x2, df = 3)" = diag(2))is needed. See Example 6 below. More details are given in Yee (2015; Section 3.3.1.3) which is on p.101. Note that the label for the intercept is"(Intercept)".- extra
an optional list with any extra information that might be needed by the VGAM family function.
- form2
the second (optional) formula. If argument
xijis used (seevglm.control) thenform2needs to have all terms in the model. Also, some VGAM family functions such asmicmenuse this argument to input the regressor variable. If given, the slots@Xm2and@Ym2may be assigned. Note that smart prediction applies to terms inform2too.- qr.arg
logical value indicating whether the slot
qr, which returns the QR decomposition of the VLM model matrix, is returned on the object.- smart
logical value indicating whether smart prediction (
smartpred) will be used.- ...
further arguments passed into
vglm.control.
Details
A vector generalized linear model (VGLM) is loosely defined as a statistical model that is a function of \(M\) linear predictors and can be estimated by Fisher scoring. The central formula is given by $$\eta_j = \beta_j^T x$$ where \(x\) is a vector of explanatory variables (sometimes just a 1 for an intercept), and \(\beta_j\) is a vector of regression coefficients to be estimated. Here, \(j=1,\ldots,M\), where \(M\) is finite. Then one can write \(\eta=(\eta_1,\ldots,\eta_M)^T\) as a vector of linear predictors.
Most users will find vglm similar in flavour to
glm.
The function vglm.fit actually does the work.
Value
An object of class "vglm", which has the
following slots. Some of these may not be assigned to save
space, and will be recreated if necessary later.
- extra
the list
extraat the end of fitting.- family
the family function (of class
"vglmff").- iter
the number of IRLS iterations used.
- predictors
a \(M\)-column matrix of linear predictors.
- assign
a named list which matches the columns and the (LM) model matrix terms.
- call
the matched call.
- coefficients
a named vector of coefficients.
- constraints
a named list of constraint matrices used in the fitting.
- contrasts
the contrasts used (if any).
- control
list of control parameter used in the fitting.
- criterion
list of convergence criterion evaluated at the final IRLS iteration.
- df.residual
the residual degrees of freedom.
- df.total
the total degrees of freedom.
- dispersion
the scaling parameter.
- effects
the effects.
- fitted.values
the fitted values, as a matrix. This is often the mean but may be quantiles, or the location parameter, e.g., in the Cauchy model.
- misc
a list to hold miscellaneous parameters.
- model
the model frame.
- na.action
a list holding information about missing values.
- offset
if non-zero, a \(M\)-column matrix of offsets.
- post
a list where post-analysis results may be put.
- preplot
used by
plotvgam, the plotting parameters may be put here.- prior.weights
initially supplied weights (the
weightsargument). Also seeweightsvglm.- qr
the QR decomposition used in the fitting.
- R
the R matrix in the QR decomposition used in the fitting.
- rank
numerical rank of the fitted model.
- residuals
the working residuals at the final IRLS iteration.
- ResSS
residual sum of squares at the final IRLS iteration with the adjusted dependent vectors and weight matrices.
- smart.prediction
a list of data-dependent parameters (if any) that are used by smart prediction.
- terms
the
termsobject used.- weights
the working weight matrices at the final IRLS iteration. This is in matrix-band form.
- x
the model matrix (linear model LM, not VGLM).
- xlevels
the levels of the factors, if any, used in fitting.
- y
the response, in matrix form.
This slot information is repeated at vglm-class.
References
Yee, T. W. (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. New York, USA: Springer.
Yee, T. W. and Hastie, T. J. (2003). Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
Yee, T. W. and Wild, C. J. (1996). Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.
Yee, T. W. (2014). Reduced-rank vector generalized linear models with two linear predictors. Computational Statistics and Data Analysis, 71, 889–902.
Yee, T. W. (2008).
The VGAM Package.
R News, 8, 28–39.
Note
This function can fit a wide variety of
statistical models. Some of
these are harder to fit than others because
of inherent numerical
difficulties associated with some of them.
Successful model fitting
benefits from cumulative experience.
Varying the values of arguments
in the VGAM family function itself
is a good first step if
difficulties arise, especially if initial
values can be inputted.
A second, more general step, is to vary the
values of arguments in
vglm.control.
A third step is to make use of arguments such
as etastart,
coefstart and mustart.
Some VGAM family functions end in "ff" to avoid
interference with other functions, e.g.,
binomialff,
poissonff.
This is because VGAM family
functions are incompatible with glm
(and also gam() in gam and
gam in the mgcv library).
The smart prediction (smartpred)
library is incorporated
within the VGAM library.
The theory behind the scaling parameter is currently being made more rigorous, but it it should give the same value as the scale parameter for GLMs.
In Example 5 below, the xij argument to
illustrate covariates
that are specific to a linear predictor.
Here, lop/rop
are
the ocular pressures of the left/right eye
(artificial data).
Variables leye and reye might be
the presence/absence of
a particular disease on the LHS/RHS eye respectively.
See
vglm.control
and
fill1
for more details and examples.
WARNING
See warnings in vglm.control.
Also, see warnings under weights above regarding
sampling weights from complex sampling designs.
See also
vglm.control,
vglm-class,
vglmff-class,
smartpred,
vglm.fit,
fill1,
rrvglm,
vgam.
Methods functions include
add1.vglm,
anova.vglm,
AICvlm,
coefvlm,
confintvglm,
constraints.vlm,
drop1.vglm,
fittedvlm,
hatvaluesvlm,
hdeff.vglm,
Influence.vglm,
linkfunvlm,
lrt.stat.vlm,
score.stat.vlm,
wald.stat.vlm,
nobs.vlm,
npred.vlm,
plotvglm,
predictvglm,
residualsvglm,
step4vglm,
summaryvglm,
lrtest_vglm,
update,
TypicalVGAMfamilyFunction,
etc.
Examples
# Example 1. See help(glm)
(d.AD <- data.frame(treatment = gl(3, 3),
outcome = gl(3, 1, 9),
counts = c(18,17,15,20,10,20,25,13,12)))
#> treatment outcome counts
#> 1 1 1 18
#> 2 1 2 17
#> 3 1 3 15
#> 4 2 1 20
#> 5 2 2 10
#> 6 2 3 20
#> 7 3 1 25
#> 8 3 2 13
#> 9 3 3 12
vglm.D93 <- vglm(counts ~ outcome + treatment, poissonff,
data = d.AD, trace = TRUE)
#> Iteration 1: deviance = 5.181115
#> Iteration 2: deviance = 5.129147
#> Iteration 3: deviance = 5.129141
#> Iteration 4: deviance = 5.129141
summary(vglm.D93)
#>
#> Call:
#> vglm(formula = counts ~ outcome + treatment, family = poissonff,
#> data = d.AD, trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 3.045e+00 1.709e-01 17.815 <2e-16 ***
#> outcome2 -4.543e-01 2.022e-01 -2.247 0.0246 *
#> outcome3 -2.930e-01 1.927e-01 -1.520 0.1285
#> treatment2 5.532e-16 2.000e-01 0.000 1.0000
#> treatment3 3.790e-16 2.000e-01 0.000 1.0000
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: loglink(lambda)
#>
#> Residual deviance: 5.1291 on 4 degrees of freedom
#>
#> Log-likelihood: -23.3807 on 4 degrees of freedom
#>
#> Number of Fisher scoring iterations: 4
#>
# Example 2. Multinomial logit model
pneumo <- transform(pneumo, let = log(exposure.time))
vglm(cbind(normal, mild, severe) ~ let, multinomial, pneumo)
#>
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let, family = multinomial,
#> data = pneumo)
#>
#>
#> Coefficients:
#> (Intercept):1 (Intercept):2 let:1 let:2
#> 11.9750920 3.0390622 -3.0674665 -0.9020936
#>
#> Degrees of Freedom: 16 Total; 12 Residual
#> Residual deviance: 5.347382
#> Log-likelihood: -25.25054
#>
#> This is a multinomial logit model with 3 levels
# Example 3. Proportional odds model
fit3 <- vglm(cbind(normal, mild, severe) ~ let, propodds, pneumo)
coef(fit3, matrix = TRUE)
#> logitlink(P[Y>=2]) logitlink(P[Y>=3])
#> (Intercept) -9.676093 -10.581725
#> let 2.596807 2.596807
constraints(fit3)
#> $`(Intercept)`
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 0 1
#>
#> $let
#> [,1]
#> [1,] 1
#> [2,] 1
#>
model.matrix(fit3, type = "lm") # LM model matrix
#> (Intercept) let
#> 1 1 1.757858
#> 2 1 2.708050
#> 3 1 3.068053
#> 4 1 3.314186
#> 5 1 3.511545
#> 6 1 3.676301
#> 7 1 3.828641
#> 8 1 3.941582
#> attr(,"assign")
#> attr(,"assign")$`(Intercept)`
#> [1] 1
#>
#> attr(,"assign")$let
#> [1] 2
#>
#> attr(,"orig.assign.lm")
#> [1] 0 1
model.matrix(fit3) # Larger VGLM (or VLM) matrix
#> (Intercept):1 (Intercept):2 let
#> 1:1 1 0 1.757858
#> 1:2 0 1 1.757858
#> 2:1 1 0 2.708050
#> 2:2 0 1 2.708050
#> 3:1 1 0 3.068053
#> 3:2 0 1 3.068053
#> 4:1 1 0 3.314186
#> 4:2 0 1 3.314186
#> 5:1 1 0 3.511545
#> 5:2 0 1 3.511545
#> 6:1 1 0 3.676301
#> 6:2 0 1 3.676301
#> 7:1 1 0 3.828641
#> 7:2 0 1 3.828641
#> 8:1 1 0 3.941582
#> 8:2 0 1 3.941582
#> attr(,"assign")
#> attr(,"assign")$`(Intercept)`
#> [1] 1 2
#>
#> attr(,"assign")$let
#> [1] 3
#>
#> attr(,"vassign")
#> attr(,"vassign")$`(Intercept):1`
#> [1] 1
#>
#> attr(,"vassign")$`(Intercept):2`
#> [1] 2
#>
#> attr(,"vassign")$let
#> [1] 3
#>
#> attr(,"constraints")
#> attr(,"constraints")$`(Intercept)`
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 0 1
#>
#> attr(,"constraints")$let
#> [,1]
#> [1,] 1
#> [2,] 1
#>
#> attr(,"orig.assign.lm")
#> [1] 0 1
# Example 4. Bivariate logistic model
fit4 <- vglm(cbind(nBnW, nBW, BnW, BW) ~ age, binom2.or, coalminers)
coef(fit4, matrix = TRUE)
#> logitlink(mu1) logitlink(mu2) loglink(oratio)
#> (Intercept) -6.5858766 -4.23347003 2.832535
#> age 0.1029391 0.06534392 0.000000
depvar(fit4) # Response are proportions
#> nBnW nBW BnW BW
#> 1 0.9431352 0.04866803 0.003586066 0.004610656
#> 2 0.9235064 0.05862647 0.005025126 0.012841988
#> 3 0.8816848 0.08376716 0.008991955 0.025556081
#> 4 0.8469278 0.09234639 0.017247575 0.043478261
#> 5 0.7818821 0.12005277 0.023746702 0.074318382
#> 6 0.7154200 0.13539490 0.036773924 0.112411199
#> 7 0.6334928 0.11722488 0.055980861 0.193301435
#> 8 0.5525714 0.12857143 0.086857143 0.232000000
#> 9 0.4630282 0.11619718 0.093309859 0.327464789
weights(fit4, type = "prior")
#> [,1]
#> 1 1952
#> 2 1791
#> 3 2113
#> 4 2783
#> 5 2274
#> 6 2393
#> 7 2090
#> 8 1750
#> 9 1136
# Example 5. The use of the xij argument (simple case).
# The constraint matrix for 'op' has one column.
nn <- 1000
eyesdat <- round(data.frame(lop = runif(nn),
rop = runif(nn),
op = runif(nn)), digits = 2)
eyesdat <- transform(eyesdat, eta1 = -1 + 2 * lop,
eta2 = -1 + 2 * lop)
eyesdat <- transform(eyesdat,
leye = rbinom(nn, 1, prob = logitlink(eta1, inv = TRUE)),
reye = rbinom(nn, 1, prob = logitlink(eta2, inv = TRUE)))
head(eyesdat)
#> lop rop op eta1 eta2 leye reye
#> 1 0.75 0.91 0.33 0.50 0.50 1 0
#> 2 0.70 0.02 0.47 0.40 0.40 1 1
#> 3 0.10 0.19 0.73 -0.80 -0.80 0 0
#> 4 0.61 0.38 0.74 0.22 0.22 1 0
#> 5 0.02 0.31 0.89 -0.96 -0.96 0 0
#> 6 0.65 0.53 0.30 0.30 0.30 0 1
fit5 <- vglm(cbind(leye, reye) ~ op,
binom2.or(exchangeable = TRUE, zero = 3),
data = eyesdat, trace = TRUE,
xij = list(op ~ lop + rop + fill1(lop)),
form2 = ~ op + lop + rop + fill1(lop))
#> Iteration 1: deviance = 2727.2353
#> Iteration 2: deviance = 2726.0386
#> Iteration 3: deviance = 2726.0015
#> Iteration 4: deviance = 2726.0004
#> Iteration 5: deviance = 2726.0004
#> Iteration 6: deviance = 2726.0004
coef(fit5)
#> (Intercept):1 (Intercept):2 op
#> -0.52717612 0.08909758 1.01465047
coef(fit5, matrix = TRUE)
#> logitlink(mu1) logitlink(mu2) loglink(oratio)
#> (Intercept) -0.5271761 -0.5271761 0.08909758
#> op 1.0146505 1.0146505 0.00000000
constraints(fit5)
#> $`(Intercept)`
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 1 0
#> [3,] 0 1
#>
#> $op
#> [,1]
#> [1,] 1
#> [2,] 1
#> [3,] 0
#>
fit5@control$xij
#> [[1]]
#> op ~ lop + rop + fill1(lop)
#> <environment: 0x6141fb8805e8>
#>
head(model.matrix(fit5))
#> (Intercept):1 (Intercept):2 op
#> 1:1 1 0 0.75
#> 1:2 1 0 0.91
#> 1:3 0 1 0.00
#> 2:1 1 0 0.70
#> 2:2 1 0 0.02
#> 2:3 0 1 0.00
# Example 6. The use of the 'constraints' argument.
as.character(~ bs(year,df=3)) # Get the white spaces right
#> [1] "~" "bs(year, df = 3)"
clist <- list("(Intercept)" = diag(3),
"bs(year, df = 3)" = rbind(1, 0, 0))
fit1 <- vglm(r1 ~ bs(year,df=3), gev(zero = NULL),
data = venice, constraints = clist, trace = TRUE)
#> Iteration 1: loglikelihood = -217.53311
#> Iteration 2: loglikelihood = -215.92358
#> Iteration 3: loglikelihood = -215.79841
#> Iteration 4: loglikelihood = -215.79279
#> Iteration 5: loglikelihood = -215.79236
#> Iteration 6: loglikelihood = -215.79233
#> Iteration 7: loglikelihood = -215.79232
#> Iteration 8: loglikelihood = -215.79232
coef(fit1, matrix = TRUE) # Check
#> location loglink(scale) logofflink(shape, offset = 0.5)
#> (Intercept) 93.45545 2.664918 -0.7177728
#> bs(year, df = 3)1 25.60150 0.000000 0.0000000
#> bs(year, df = 3)2 10.42492 0.000000 0.0000000
#> bs(year, df = 3)3 36.32781 0.000000 0.0000000