Hauck-Donner Effects: A Detection Test for Wald Tests
A detection test for the Hauck-Donner effect on each regression coefficient of a fitted VGLM or a 2 x 2 table.
Usage
hdeff(object, ...)
hdeff.vglm(object, derivative = NULL, se.arg = FALSE,
subset = NULL, theta0 = 0, hstep = 0.005,
fd.only = FALSE, ...)
hdeff.numeric(object, byrow = FALSE, ...)
hdeff.matrix(object, ...)

Arguments
- object

Usually a vglm object. Although only a limited number of family functions have an analytical solution to the HDE detection test (binomialff, borel.tanner, cumulative, erlang, felix, lindley, poissonff, topple, uninormal, zipoissonff, and zipoisson; hopefully more will be implemented in the near future!), the finite-differences (FD) method can be applied to almost all VGAM family functions to obtain a numerical solution.

Alternatively, object may represent a 2 x 2 table of positive counts. If so, then the first row corresponds to \(x2=0\) (the baseline group) and the second row to \(x2=1\). The first column corresponds to \(y=0\) (failure) and the second column to \(y=1\) (success).

Another alternative is that object is a numeric vector of length 4, representing a 2 x 2 table of positive counts. If so, it is fed into hdeff.matrix using the argument byrow, which matches matrix. See the examples below.

- derivative
Numeric. Either 1 or 2. Currently only a few models having one linear predictor are handled analytically for derivative = 2, e.g., binomialff and poissonff. However, the numerical method can return the first two derivatives for almost all models.

- se.arg
Logical. If TRUE then the derivatives of the standard errors are returned as well, because usually the derivatives of the Wald statistics are of central interest. Requires derivative to be assigned the value 1 or 2 for this argument to operate.

- subset
Logical, or a vector of indices, to select the regression coefficients of interest. The default is to select all coefficients. Recycled if necessary if logical. If numeric then the indices should comprise elements of 1:length(coef(object)). This argument can be useful for computing the derivatives of a Cox regression (coxph) fitted using artificially created Poisson data; then there are many coefficients that are effectively nuisance parameters.

- theta0
Numeric. A vector recycled to the necessary length, which is the number of regression coefficients. The null hypotheses for the regression coefficients are that they equal those respective values, and the alternative hypotheses are all two-sided. It is not recommended that argument subset be used if a vector of values is assigned here, because theta0[subset] is implied and might not work.

- hstep
Positive numeric, recycled to length 2; it is the so-called step size when using finite-differences, often called \(h\) in the calculus literature, e.g., \(f'(x)\) is approximately \((f(x+h) - f(x)) / h\). For the 2nd-order partial derivatives there are two step sizes, hence this argument is recycled to length 2; by default both values are equal. The 1st-order derivatives use the first value only. It is recommended that a few values of this argument be tried, because the values of the first and second derivatives can vary accordingly. If any values are too large then the derivatives may be inaccurate; and if too small then the derivatives may be unstable and subject to too much round-off/cancellation error (in fact it may create an error or an NA).

- fd.only
Logical; if TRUE then finite-differences are used to estimate the derivatives, even if an analytical solution has been coded. By default, finite-differences are used only when an analytical solution has not been implemented.

It is possible that NAs are returned. If so, and if fd.only = FALSE, then a warning is issued and a recursive call is made with fd.only = TRUE; this is more likely to return an answer without any NAs.

- byrow
Logical; fed into matrix if object is a vector of length 4, so that there are two choices for the order of the elements.

- ...
Currently unused, but may be used in the future for further arguments passed into the other methods functions.
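The role of hstep can be seen in a minimal base-R sketch (this is generic finite-difference arithmetic, not VGAM internals): a forward-difference quotient with one step size for the first derivative, and two step sizes for the second.

```r
# Forward-difference approximation f'(x) ~ (f(x + h) - f(x)) / h,
# the quantity that hstep's first value controls.
fd1 <- function(f, x, h) (f(x + h) - f(x)) / h
# Second-order version using two step sizes h1 and h2, mirroring the
# recycling of hstep to length 2.
fd2 <- function(f, x, h1, h2)
  (fd1(f, x + h2, h1) - fd1(f, x, h1)) / h2

f <- function(x) exp(x)  # f' = f'' = exp(x), so errors are easy to see
x0 <- 1
for (h in c(1e-1, 1e-3, 1e-5, 1e-11)) {
  cat(sprintf("h = %g: abs error of f'(1) = %.3g\n",
              h, abs(fd1(f, x0, h) - exp(x0))))
}
# A moderately small h (say 1e-5) is accurate; a very small h (1e-11)
# suffers from cancellation error. This is why trying a few values of
# hstep is recommended.
```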
Details
Almost all of statistical inference based on the likelihood assumes that the parameter estimates are located in the interior of the parameter space. The nonregular case of being located on the boundary is not considered very much and leads to very different results from the regular case. Practically, an important question is: how close is close to the boundary? One might answer this as: the parameter estimates are too close to the boundary when the Hauck-Donner effect (HDE) is present, whereby the Wald statistic becomes aberrant.
Hauck and Donner (1977) first observed an aberration of the
Wald test statistic not monotonically increasing as a function
of increasing distance between the parameter estimate and the
null value. This "disturbing" and "undesirable" underappreciated
effect has since been observed in other regression models by
various authors. This function computes the first, and possibly
second, derivative of the Wald statistic for each regression
coefficient. A negative value of the first derivative is
indicative of the HDE being present. More information can be
obtained from hdeffsev regarding HDE severity:
there may be none, faint, weak, moderate, strong and extreme
amounts of HDE present.
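The aberration is easy to exhibit with a base-R sketch (illustrative only; it uses the closed-form Wald statistic for a 2 x 2 table rather than any VGAM code, with made-up counts). The baseline group is held fixed while the second group is pushed towards the boundary: the estimated log odds ratio keeps growing, yet the Wald statistic eventually shrinks.

```r
# Baseline group fixed at 25 successes out of 100; second group has k
# successes out of 100. For a 2 x 2 table the Wald statistic has the
# closed form z = log(OR) / sqrt(1/a + 1/b + 1/c + 1/d).
wald.z <- function(k, R0 = 25, N0 = 100, N1 = 100) {
  beta <- log((k / (N1 - k)) / (R0 / (N0 - R0)))  # estimated log odds ratio
  se   <- sqrt(1/R0 + 1/(N0 - R0) + 1/k + 1/(N1 - k))
  beta / se
}
ks <- c(80, 90, 95, 99)
round(sapply(ks, wald.z), 3)
# The estimate log(OR) increases with k, but the Wald statistic rises
# and then falls: the statistic is not monotone in the distance between
# the estimate and the null value.
```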
In general, most models have derivatives that are computed
numerically using finite-difference
approximations, because it takes a lot of work
to program in the analytical solution.
The exceptions include a few very common models, such as
poissonff and
binomialff,
where the first two derivatives have been implemented.
Value
By default this function returns a labelled logical vector;
a TRUE means the HDE is affirmative for that coefficient
(negative slope).
Hence ideally all values are FALSE.
Any TRUE values suggest that the MLE is
too near the boundary of the parameter space,
and that the p-value for that regression coefficient
is biased upwards.
When present,
a highly significant variable might be deemed nonsignificant,
and thus the HDE can create havoc for variable selection.
If the HDE is present then more accurate
p-values can generally be obtained by conducting a
likelihood ratio test
(see lrt.stat.vlm)
or Rao's score test
(see score.stat.vlm);
indeed the default of
wald.stat.vlm
does not suffer from the HDE.
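The effect on p-values can be seen in a hedged base-R illustration (it uses stats::glm and anova rather than VGAM's lrt.stat.vlm, and the counts are made up): when one group sits near the boundary, the Wald p-value exceeds the likelihood-ratio p-value.

```r
# Two groups: 5/10 vs 9/10 successes; the second group is close to the
# boundary, so the Wald p-value is inflated relative to the LRT one.
succ <- c(5, 9); fail <- c(5, 1); grp <- factor(0:1)
fit <- glm(cbind(succ, fail) ~ grp, family = binomial)
wald.p <- summary(fit)$coefficients["grp1", "Pr(>|z|)"]
lrt.p  <- anova(fit, test = "Chisq")[["Pr(>Chi)"]][2]
c(Wald = wald.p, LRT = lrt.p)  # the Wald p-value is the larger of the two
```

Here the two tests even disagree at the conventional 5% level, so which statistic is used genuinely matters.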
Setting deriv = 1 returns a numerical vector of first
derivatives of the Wald statistics.
Setting deriv = 2 returns a 2-column matrix of first
and second derivatives of the Wald statistics.
Then setting se.arg = TRUE returns an additional 1 or
2 columns.
Some 2nd derivatives are NA if only a partial analytic
solution has been programmed in.
For those VGAM family functions whose HDE test has not
yet been implemented explicitly (the vast majority of them),
finite-difference approximations to the derivatives will
be used—see the arguments hstep and fd.only
for getting some control on them.
References
Hauck, W. W., Jr. and A. Donner (1977). Wald's test as applied to hypotheses in logit analysis. Journal of the American Statistical Association, 72, 851–853.
Yee, T. W. (2022). On the Hauck-Donner effect in Wald tests: Detection, tipping points and parameter space characterization, Journal of the American Statistical Association, 117, 1763–1774. doi:10.1080/01621459.2021.1886936 .
Note
The function summaryvglm conducts the HDE
detection test if possible and prints out a line at the bottom
if the HDE is detected for some regression coefficients.
By “if possible” it is meant that only a few family functions are exempt;
these have an infos slot with component hadof = FALSE,
such as
normal.vcm and
rec.normal, because they
use the BFGS-IRLS method for computing the working weights.
For these few, a NULL is returned by hdeff.
If the second derivatives are of interest then
it is recommended that crit = "c" be added to the
fitting so that a slightly more accurate model results
(usually one more IRLS iteration).
This is because the FD approximation is very sensitive to
values of the working weights, so they need to be computed
accurately.
Occasionally, if the coefficient is close to 0,
then its Wald statistic's
second derivative may be unusually large in magnitude
(this could be due to something such as roundoff error).
This function is currently under development
and may change a little in the near future.
For HDE severity measures see
hdeffsev.
Examples
pneumo <- transform(pneumo, let = log(exposure.time))
fit <- vglm(cbind(normal, mild, severe) ~ let, data = pneumo,
trace = TRUE, crit = "c", # Get some more accuracy
cumulative(reverse = TRUE, parallel = TRUE))
#> Iteration 1: coefficients =
#> -9.5689529, -10.4389146, 2.5663064
#> Iteration 2: coefficients =
#> -9.6770689, -10.5823198, 2.5971219
#> Iteration 3: coefficients =
#> -9.6761243, -10.5817555, 2.5968162
#> Iteration 4: coefficients =
#> -9.6760928, -10.5817253, 2.5968065
#> Iteration 5: coefficients =
#> -9.6760926, -10.5817251, 2.5968065
cumulative()@infos()$hadof # Analytical solution implemented
#> [1] TRUE
hdeff(fit)
#> (Intercept):1 (Intercept):2 let
#> TRUE FALSE FALSE
hdeff(fit, deriv = 1) # Analytical solution
#> (Intercept):1 (Intercept):2 let
#> -1.6226841 0.6352025 7.3861886
hdeff(fit, deriv = 2) # It is a partial analytical solution
#> deriv1 deriv2
#> (Intercept):1 -1.6226869 0.2690279
#> (Intercept):2 0.6351971 0.2034829
#> let 7.3861953 -1.6727407
hdeff(fit, deriv = 2, se.arg = TRUE,
fd.only = TRUE) # All derivatives solved numerically by FDs
#> deriv1 deriv2 SE.deriv1 SE.deriv2
#> (Intercept):1 -1.6226869 0.2690279 -0.43084724 0.24008144
#> (Intercept):2 0.6351971 0.2034829 -0.01848513 0.03182355
#> let 7.3861953 -1.6727407 -0.26634826 0.67098783
# 2 x 2 table of counts
R0 <- 25; N0 <- 100 # Hauck Donner (1977) data set
mymat <- c(N0-R0, R0, 8, 92) # HDE present
(mymat <- matrix(mymat, 2, 2, byrow = TRUE))
#> [,1] [,2]
#> [1,] 75 25
#> [2,] 8 92
hdeff(mymat)
#> [1] TRUE
hdeff(c(mymat)) # Input is a vector
#> [1] TRUE
hdeff(c(t(mymat)), byrow = TRUE) # Reordering of the data
#> [1] TRUE