Wald Test Statistics Evaluated at the Null Values

Generic function that computes Wald test statistics evaluated at the null values (consequently they do not suffer from the Hauck-Donner effect).

Usage

wald.stat(object, ...)
wald.stat.vlm(object, values0 = 0, subset = NULL, omit1s = TRUE,
          all.out = FALSE, orig.SE = FALSE, iterate.SE = TRUE,
          trace = FALSE, ...)

Arguments

object

A vglm fit.

values0

Numeric vector. The null values corresponding to the null hypotheses. Recycled if necessary.

subset

Same as in hdeff.

omit1s

Logical. Does one omit the intercepts? Because the default would be to test that each intercept is equal to 0, which often does not make sense or is unimportant, the intercepts are not tested by default. If they are tested then each linear predictor must have at least one coefficient (from another variable) to be estimated.

all.out

Logical. If TRUE then a list is returned containing various quantities such as the SEs, instead of just the Wald statistics.

orig.SE

Logical. If TRUE then the standard errors are computed at the MLE (of the original object). In practice, the (usual or unmodified) Wald statistics etc. are extracted from summary(object) because it was computed there. These may suffer from the HDE since all the SEs are evaluated at the MLE of the original object. If TRUE then argument iterate.SE may be ignored or overwritten. If orig.SE = FALSE then the \(k\)th SE uses the \(k\)th value of values0 in its computation and iterate.SE specifies the choice of the other coefficients.

This argument was previously called as.summary because if TRUE then the Wald statistics are the same as summary(glm()).

For one-parameter models setting orig.SE = FALSE results in what is called the null Wald (NW) statistic by some people, e.g., Laskar and King (1997) and Goh and King (1999). The NW statistic does not suffer from the HDE.

iterate.SE

Logical, for the standard error computations. If TRUE then IRLS iterations are performed to get MLEs of the other regression coefficients, subject to one coefficient being equal to the appropriate values0 value. If FALSE then the other regression coefficients have values obtained at the original fit. It is recommended that a TRUE be used as the answer tends to be more accurate. If the large (VLM) model matrix only has one column and iterate.SE = TRUE then an error will occur because there are no other regression coefficients to estimate.

trace

Logical. If TRUE then some output is produced as the IRLS iterations proceed. The value NULL means to use the trace value of the fitted object; see vglm.control.

...

Ignored for now.

Details

By default, summaryvglm and most regression modelling functions such as summary.glm compute all the standard errors (SEs) of the estimates at the MLE and not at 0. This corresponds to orig.SE = TRUE and it is vulnerable to the Hauck-Donner effect (HDE; see hdeff). One solution is to compute the SEs at 0 (or more generally, at the values of the argument values0). This function does that. The two variants of Wald statistics are asymptotically equivalent; however in small samples there can be an appreciable difference, and the difference can be large if the estimates are near to the boundary of the parameter space.

None of the tests here are joint, hence the degrees of freedom is always unity. For a factor with more than 2 levels one can use anova.vglm to test for the significance of the factor. If orig.SE = FALSE and iterate.SE = FALSE then one retains the MLEs of the original fit for the values of the other coefficients, and replaces one coefficient at a time by the value 0 (or whatever specified by values0). One alternative would be to recompute the MLEs of the other coefficients after replacing one of the values; this is the default because iterate.SE = TRUE and orig.SE = FALSE. Just like with the original IRLS iterations, the iterations here are not guaranteed to converge.

Almost all VGAM family functions use the EIM and not the OIM; this affects the resulting standard errors. Also, regularity conditions are assumed for the Wald, likelihood ratio and score tests; some VGAM family functions such as alaplace1 are experimental and do not satisfy such conditions, therefore naive inference is hazardous.

The default output of this function can be seen by setting wald0.arg = TRUE in summaryvglm.

Value

By default the signed square root of the Wald statistics whose SEs are computed at one each of the null values. If all.out = TRUE then a list is returned with the following components: wald.stat the Wald statistic, SE0 the standard error of that coefficient, values0 the null values. Approximately, the default Wald statistics output are standard normal random variates if each null hypothesis is true.

Altogether, by the four combinations of iterate.SE and orig.SE, there are three different variants of the Wald statistic that can be returned.

References

Laskar, M. R. and M. L. King (1997). Modified Wald test for regression disturbances. Economics Letters, 56, 5–11.

Goh, K.-L. and M. L. King (1999). A correction for local biasedness of the Wald and null Wald tests. Oxford Bulletin of Economics and Statistics 61, 435–450.

Author

Thomas W. Yee

Warning

This function has been tested but not thoroughly. Convergence failure is possible for some models applied to certain data sets; it is a good idea to set trace = TRUE to monitor convergence. For example, for a particular explanatory variable, the estimated regression coefficients of a non-parallel cumulative logit model (see cumulative) are ordered, and perturbing one coefficient might disrupt the order and create numerical problems.

Examples

set.seed(1)
pneumo <- transform(pneumo, let = log(exposure.time),
                            x3 = rnorm(nrow(pneumo)))
(fit <- vglm(cbind(normal, mild, severe) ~ let + x3, propodds, pneumo))
#> 
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let + x3, family = propodds, 
#>     data = pneumo)
#> 
#> 
#> Coefficients:
#> (Intercept):1 (Intercept):2           let            x3 
#>   -9.66744415  -10.57344562    2.58865720    0.08444356 
#> 
#> Degrees of Freedom: 16 Total; 12 Residual
#> Residual deviance: 4.763992 
#> Log-likelihood: -24.95885 
wald.stat(fit)  # No HDE here
#>        let         x3 
#> 13.1890479  0.5144113 
summary(fit, wald0 = TRUE)  # See them here
#> 
#> Call:
#> vglm(formula = cbind(normal, mild, severe) ~ let + x3, family = propodds, 
#>     data = pneumo)
#> 
#> Wald (modified by IRLS iterations) coefficients: 
#>     Estimate Std. Error z value Pr(>|z|)    
#> let  2.58866    0.19627  13.189   <2e-16 ***
#> x3   0.08444    0.16416   0.514    0.607    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: logitlink(P[Y>=2]), logitlink(P[Y>=3])
#> 
#> Residual deviance: 4.764 on 12 degrees of freedom
#> 
#> Log-likelihood: -24.9588 on 12 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 4 
#> 
#> 
#> Exponentiated coefficients:
#>       let        x3 
#> 13.311884  1.088111 
coef(summary(fit))  # Usual Wald statistics evaluated at the MLE
#>                   Estimate Std. Error    z value     Pr(>|z|)
#> (Intercept):1  -9.66744415  1.3377918 -7.2264190 4.958962e-13
#> (Intercept):2 -10.57344562  1.3588755 -7.7810263 7.193855e-15
#> let             2.58865720  0.3850708  6.7225483 1.785736e-11
#> x3              0.08444356  0.1632077  0.5173993 6.048774e-01
wald.stat(fit, orig.SE = TRUE)  # Same as previous line
#>       let        x3 
#> 6.7225483 0.5173993