Summarizing Vector Generalized Linear Model Fits

These functions are all methods for class vglm or summary.vglm objects.

Usage

summaryvglm(object, correlation = FALSE, dispersion = NULL,
     digits = NULL, presid = FALSE,
     HDEtest = FALSE, hde.NA = TRUE, threshold.hde = 0.001,
     signif.stars = getOption("show.signif.stars"),
     nopredictors = FALSE,
     lrt0.arg = FALSE, score0.arg = FALSE, wald0.arg = FALSE,
     values0 = 0, subset = NULL, omit1s = TRUE,
     wsdm.arg = FALSE, hdiff = 0.005,
     retry = TRUE, mux.hdiff = 1, eps.wsdm = 0.15,
     Mux.div = 3, doffset.wsdm = NULL, ...)
# S3 method for class 'summary.vglm'
show(x, digits = max(3L, getOption("digits") - 3L),
     quote = TRUE, prefix = "", presid = length(x@pearson.resid) > 0,
     HDEtest = TRUE, hde.NA = TRUE, threshold.hde = 0.001,
     signif.stars = NULL, nopredictors = NULL,
     top.half.only = FALSE, ...)

Arguments

object

an object of class "vglm", usually, a result of a call to vglm.

x

an object of class "summary.vglm", usually, a result of a call to summaryvglm().

dispersion

used mainly for GLMs. See summary.glm. This argument should not be used because VGAM now steers away from quasi-likelihood models.

correlation

logical; if TRUE, the correlation matrix of the estimated parameters is returned and printed.

digits

the number of significant digits to use when printing.

signif.stars: logical; if TRUE, ‘significance stars’ are printed for each coefficient.

presid

Pearson residuals; print out some summary statistics of these?

HDEtest

logical; if TRUE then a test for the HDE is performed, else all arguments related to the HDE are ignored.

hde.NA

logical; if a test for the Hauck-Donner effect is done (for each coefficient) and it is affirmative should that Wald test p-value be replaced by an NA? The default is to do so. Setting hde.NA = FALSE will print the p-value even though it will be biased upwards. Also see argument threshold.hde.

threshold.hde

numeric; used if hde.NA = TRUE and is present for some coefficients. Only p-values greater than this argument will be replaced by an NA, the reason being that small p-values will already be statistically significant. Hence setting threshold.hde = 0 will print out a NA if the HDE is present.

quote

Fed into print().

nopredictors

logical; if TRUE the names of the linear predictors are not printed out. The default is that they are.

lrt0.arg, score0.arg, wald0.arg

logical; if lrt0.arg = TRUE then the other arguments are passed into lrt.stat.vlm and the equivalent of the so-called Wald table is outputted. Similarly, if score0.arg = TRUE then the other arguments are passed into score.stat.vlm and the equivalent of the so-called Wald table is outputted. Similarly, if wald0.arg = TRUE then the other arguments are passed into wald.stat.vlm and the Wald table corresponding to that is outputted. See details below. Setting any of these will result in further IRLS iterations being performed, therefore may be computationally expensive.

values0, subset, omit1s

These arguments are used if any of the lrt0.arg, score0.arg, wald0.arg arguments are used. They are passed into the appropriate function, such as wald.stat.vlm.

top.half.only

logical; if TRUE then only print out the top half of the usual output. Used for P-VGAMs.

prefix

Not used.

wsdm.arg

logical; compute the WSDM statistics? If so, wsdm is called and they are printed as a new fifth column. Also printed is the max-WSDM statistic at the bottom. See hdiff about choosing a suitable \(h\). Note that the arguments supplied here is a subset of those of wsdm, hence a more detailed WSDM analysis should be conducted by calling wsdm directly as well.

Note: this argument might not work if lrt0 = TRUE, score0 = TRUE and/or wald0 = TRUE.

hdiff

numeric; fed into wsdm. An important argument if wsdm.arg = TRUE. If it is too small or large then the max-WSDM statistic will be described as "inaccurate" in which case trying another value is advised.

retry

logical; fed into wsdm. If TRUE then the computation will take three times longer in order to confirm the reasonable accuracy of the WSDM statistics.

mux.hdiff

fed into wsdm.

eps.wsdm, Mux.div

fed into wsdm.

doffset.wsdm

numeric; fed into wsdm. The default means the vector is searched for on object (such as logistic regression). If nothing is found, then a vector of 1s is used.

...

Not used.

Details

Originally, summaryvglm() was written to be very similar to summary.glm, however now there are a quite a few more options available. By default, show.summary.vglm() tries to be smart about formatting the coefficients, standard errors, etc. and additionally gives ‘significance stars’ if signif.stars is TRUE. The coefficients component of the result gives the estimated coefficients and their estimated standard errors, together with their ratio. This third column is labelled z value regardless of whether the dispersion is estimated or known (or fixed by the family). A fourth column gives the two-tailed p-value corresponding to the z ratio based on a Normal reference distribution. In general, the t distribution is not used, but the normal distribution is.

Correlations are printed to two decimal places (or symbolically): to see the actual correlations print summary(object)@correlation directly.

The Hauck-Donner effect (HDE) is tested for almost all models; see hdeff.vglm for details. Arguments hde.NA and threshold.hde here are meant to give some control of the output if this aberration of the Wald statistic occurs (so that the p-value is biased upwards). If the HDE is present then using lrt.stat.vlm to get a more accurate p-value is a good alternative as p-values based on the likelihood ratio test (LRT) tend to be more accurate than Wald tests and do not suffer from the HDE. Alternatively, if the HDE is present then using wald0.arg = TRUE will compute Wald statistics that are HDE-free; see wald.stat.

The arguments lrt0.arg and score0.arg enable the so-called Wald table to be replaced by the equivalent LRT and Rao score test table; see lrt.stat.vlm, score.stat. Further IRLS iterations are performed for both of these, hence the computational cost might be significant.

It is possible for programmers to write a methods function to print out extra quantities when summary(vglmObject) is called. The generic function is summaryvglmS4VGAM(), and one can use the S4 function setMethod to compute the quantities needed. Also needed is the generic function is showsummaryvglmS4VGAM() to actually print the quantities out.

Value

summaryvglm returns an object of class "summary.vglm"; see summary.vglm-class.

Author

T. W. Yee.

Warning

Currently the SE column is deleted when lrt0 = TRUE because SEs are not so meaningful with the LRT. In the future an SE column may be inserted (with NA values) so that it has 4-column output like the other tests. In the meantime, the columns of this matrix should be accessed by name and not number.

Examples

## For examples see example(glm)
pneumo <- transform(pneumo, let = log(exposure.time))
(afit <- vglm(cbind(normal, mild, severe) ~ let, acat, pneumo))
#> 
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let, family = acat, 
#>     data = pneumo)
#> 
#> 
#> Coefficients:
#> (Intercept):1 (Intercept):2         let:1         let:2 
#>    -8.9360297    -3.0390622     2.1653729     0.9020936 
#> 
#> Degrees of Freedom: 16 Total; 12 Residual
#> Residual deviance: 5.347382 
#> Log-likelihood: -25.25054 
#> 
#> This is an adjacent categories model with 3 levels
coef(afit, matrix = TRUE)
#>             loglink(P[Y=2]/P[Y=1]) loglink(P[Y=3]/P[Y=2])
#> (Intercept)              -8.936030             -3.0390622
#> let                       2.165373              0.9020936
summary(afit)  # Might suffer from the HDE?
#> 
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let, family = acat, 
#>     data = pneumo)
#> 
#> Coefficients: 
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1  -8.9360     1.5804  -5.654 1.57e-08 ***
#> (Intercept):2  -3.0391     2.3761  -1.279    0.201    
#> let:1           2.1654     0.4575   4.733 2.21e-06 ***
#> let:2           0.9021     0.6690   1.348    0.178    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: loglink(P[Y=2]/P[Y=1]), loglink(P[Y=3]/P[Y=2])
#> 
#> Residual deviance: 5.3474 on 12 degrees of freedom
#> 
#> Log-likelihood: -25.2505 on 12 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 4 
#> 
#> 
#> Exponentiated coefficients:
#>    let:1    let:2 
#> 8.717852 2.464758 
coef(summary(afit))
#>                 Estimate Std. Error   z value     Pr(>|z|)
#> (Intercept):1 -8.9360297  1.5804347 -5.654159 1.566108e-08
#> (Intercept):2 -3.0390622  2.3760683 -1.279030 2.008866e-01
#> let:1          2.1653729  0.4574859  4.733202 2.210057e-06
#> let:2          0.9020936  0.6689816  1.348458 1.775111e-01
summary(afit, lrt0 = TRUE, score0 = TRUE, wald0 = TRUE)
#> 
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let, family = acat, 
#>     data = pneumo)
#> 
#> Likelihood ratio test coefficients: 
#>       Estimate z value Pr(>|z|)    
#> let:1   2.1654   6.445 1.16e-10 ***
#> let:2   0.9021   1.364    0.173    
#> 
#> Rao score test coefficients: 
#>       Estimate Std. Error z value Pr(>|z|)    
#> let:1   2.1654     0.2260   5.654 1.57e-08 ***
#> let:2   0.9021     0.6576   1.364    0.172    
#> 
#> Wald (modified by IRLS iterations) coefficients: 
#>       Estimate Std. Error z value Pr(>|z|)    
#> let:1   2.1654     0.2260   9.583   <2e-16 ***
#> let:2   0.9021     0.6576   1.372     0.17    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: loglink(P[Y=2]/P[Y=1]), loglink(P[Y=3]/P[Y=2])
#> 
#> Residual deviance: 5.3474 on 12 degrees of freedom
#> 
#> Log-likelihood: -25.2505 on 12 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 4 
#> 
#> 
#> Exponentiated coefficients:
#>    let:1    let:2 
#> 8.717852 2.464758 
summary(afit, HDEtest = TRUE)
#> 
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let, family = acat, 
#>     data = pneumo)
#> 
#> Coefficients: 
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1  -8.9360     1.5804  -5.654 1.57e-08 ***
#> (Intercept):2  -3.0391     2.3761  -1.279    0.201    
#> let:1           2.1654     0.4575   4.733 2.21e-06 ***
#> let:2           0.9021     0.6690   1.348    0.178    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: loglink(P[Y=2]/P[Y=1]), loglink(P[Y=3]/P[Y=2])
#> 
#> Residual deviance: 5.3474 on 12 degrees of freedom
#> 
#> Log-likelihood: -25.2505 on 12 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 4 
#> 
#> Warning: Hauck-Donner effect detected in the following estimate(s):
#> '(Intercept):1'
#> 
#> 
#> Exponentiated coefficients:
#>    let:1    let:2 
#> 8.717852 2.464758