
Returns a distance measure for how far the regression coefficients of a fitted VGLM are from the parameter space interior, so that a warning can be given if they are too close to the parameter space boundary.

Usage

copsd(object, ...)
copsdvglm(object, doffset = 0.1, ...)

Arguments

object

A vglm object, e.g., representing a logistic regression.

doffset

Numeric, positive and of unit length. Called the denominator offset.

...

Further arguments fed into copsvglm.

Details

The values returned by this function can be thought of as scaled distances. They are a diagnostic for assessing whether there are boundary problems in the regression. The model must have a Centre of the Parameter Space (COPS) in the interior, which excludes, e.g., ordinary Poisson regression. If the computed COPS is too large relative to the regression coefficient, that is indicative of boundary problems. A value of 5 is the recommended threshold for concluding that statistical inference is fraught; higher values indicate a worsening problem. Common reasons for trouble are linearly separable data and outliers.
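As a rough illustration of this rule of thumb, the base-R sketch below flags coefficients whose scaled distance reaches 5. The helper name flag_copsd is hypothetical and not part of the package, and treating a value of exactly 5 as a failure is an assumption. The input vectors reuse the values printed in the Examples section.

```r
# Hypothetical helper (not part of the package): flag coefficients whose
# COPSD reaches the recommended threshold of 5.
# Assumption: a value exactly equal to the threshold counts as flagged.
flag_copsd <- function(d, threshold = 5) {
  stopifnot(is.numeric(d), length(threshold) == 1, threshold > 0)
  d >= threshold   # named logical vector with the same names as 'd'
}

# Values copied from the Example 2 output on this page.
ok     <- c("(Intercept)" = 0.9890603, x = 0.5713343)
not_ok <- c("(Intercept)" = 635.2094,  x = 633.2777)

flag_copsd(ok)           # no coefficient is flagged
flag_copsd(not_ok)       # both coefficients are flagged
any(flag_copsd(not_ok))  # a single yes/no summary for the whole model
```

In practice the argument would be the named vector returned by copsd() or copsd3() rather than hand-typed values.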

An S3 version of this function for glm objects is available, called copsd3.glm, because the convergence criteria for vglm and glm differ. The latter generally iterates closer to the parameter space boundary, so its COPSDs can be higher.

The COPSD is probably a better alternative to the WSDM. Both should ideally be used to assess a fitted model.

Value

A named vector, similar to coefvlm.

References

Yee, T. W. (2025). Mapping the parameter space by the WSDM function: A diagnostic for logistic regression and beyond. In preparation.

Author

Thomas W. Yee.

Note

Some minor changes might occur in the short- to medium-term future.

Initially, “d” stood for “divergence” because regression coefficients start exploding during IRLS iterations. However, the values returned by this function are not divergence measures in the strict sense because there are no probability distributions as such. They do nevertheless tend to Inf as violations of the regularity conditions worsen.
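The exploding coefficients can be reproduced with base R alone. The sketch below refits the quasi-separated data from Example 2 with an increasing cap on the number of IRLS iterations and records the slope estimate each time. The object names maxits and slopes are illustrative only. glm() is expected to emit convergence warnings on these data, so they are suppressed.

```r
# Quasi-separated data, constructed as in Example 2 on this page.
Nmax <- 25
data1 <- data.frame(
  y = c(rep(0, Nmax), 1, rep(1, Nmax - 1)),
  x = c(seq(0, 0.5, length.out = Nmax), seq(0.5, 1, length.out = Nmax)))

# Slope estimate after at most 'm' IRLS iterations (illustrative caps).
maxits <- c(3, 6, 12, 25)
slopes <- sapply(maxits, function(m) {
  fit <- suppressWarnings(
    glm(y ~ x, family = binomial, data = data1,
        control = glm.control(maxit = m)))
  unname(coef(fit)["x"])
})
names(slopes) <- maxits

slopes  # the slope keeps growing with the iteration cap
```

This only shows the raw coefficients drifting towards the boundary; the COPSD itself still has to be computed with copsd3() or copsd().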

Examples

# Example 1: flour beetles
library("VGAM")
copsd3(glm(cbind(dead, n-dead) ~ logdose, binomial  , fbeetle))  # Output not shown
copsd(vglm(cbind(dead, n-dead) ~ logdose, binomialff, fbeetle))
#> (Intercept)     logdose 
#>   1.0236986   0.9669587 

# Example 2: quasi-separation
Nmax <- 25
data1 <- data.frame(y = c(rep(0, Nmax), 1, rep(1, Nmax - 1)),
   x = c(seq(0, 0.5, length.out = Nmax), seq(0.5, 1, length.out = Nmax)))
copsd3(glm(y ~ x, binomial, data1, maxit = 3, trace = TRUE))  # OK
#> Deviance = 22.81182 Iterations - 1
#> Deviance = 14.72133 Iterations - 2
#> Deviance = 10.18121 Iterations - 3
#> (Intercept)           x 
#>   0.9890603   0.5713343 
copsd3(glm(y ~ x, binomial, data1, trace = TRUE))  # Not OK
#> Deviance = 22.81182 Iterations - 1
#> Deviance = 14.72133 Iterations - 2
#> Deviance = 10.18121 Iterations - 3
#> Deviance = 7.306668 Iterations - 4
#> Deviance = 5.431766 Iterations - 5
#> Deviance = 4.22934 Iterations - 6
#> Deviance = 3.494252 Iterations - 7
#> Deviance = 3.088289 Iterations - 8
#> Deviance = 2.897501 Iterations - 9
#> Deviance = 2.819719 Iterations - 10
#> Deviance = 2.790078 Iterations - 11
#> Deviance = 2.779043 Iterations - 12
#> Deviance = 2.774966 Iterations - 13
#> Deviance = 2.773463 Iterations - 14
#> Deviance = 2.772911 Iterations - 15
#> Deviance = 2.772707 Iterations - 16
#> Deviance = 2.772632 Iterations - 17
#> Deviance = 2.772605 Iterations - 18
#> Deviance = 2.772595 Iterations - 19
#> Deviance = 2.772591 Iterations - 20
#> Deviance = 2.77259 Iterations - 21
#> Deviance = 2.772589 Iterations - 22
#> Deviance = 2.772589 Iterations - 23
#> Deviance = 2.772589 Iterations - 24
#> Deviance = 2.772589 Iterations - 25
#> (Intercept)           x 
#>    635.2094    633.2777 