Trimmed Constraint Matrices
trim.constraints.Rd
Deletes statistically nonsignificant regression coefficients via their constraint matrices, for future refitting.
Arguments
- object
Some VGAM object, especially having class vglmff-class. It has not yet been tested on non-"vglm" objects.
- sig.level
Significance levels, with values in \([0, 1]\). Columns of constraint matrices whose p-values are larger than this argument are deleted. With terms that generate more than one column of the "lm" model matrix, all p-values must be greater than this argument for deletion. This argument is recycled to the total number of regression coefficients of object.
- max.num
Numeric, positive and integer-valued. Maximum number of regression coefficients allowable for deletion. This allows one to limit the number of deleted coefficients. For example, if max.num = 1 then only the largest p-value is used for the deletion, provided it is larger than sig.level. The default is to delete all those coefficients whose p-values are greater than sig.level. With a finite value, this argument will probably not work properly when there are terms that generate more than one column of the LM model matrix. Having a value greater than unity might be unsuitable in the presence of multicollinearity because all correlated variables might be eliminated at once.
- intercepts
Logical. Trim the intercept term? If FALSE then the constraint matrix for the "(Intercept)" term is left unchanged.
- ...
Unused but for provision in the future.
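The interplay of sig.level and max.num described above can be illustrated with a small base-R sketch. It does not call VGAM and is not the package's internal code; the p-values and the names pvals, candidates and candidates1 are hypothetical, chosen only to show the selection rule as stated in the argument descriptions.

```r
# Hypothetical p-values for four regression coefficients (not from a real fit).
pvals <- c(x2.1 = 0.62, x2.2 = 0.03, x3.1 = 0.48, x3.2 = 0.91)

# sig.level is recycled to the total number of regression coefficients.
sig.level <- rep_len(0.05, length(pvals))

# Default behaviour: every coefficient whose p-value exceeds sig.level
# is a candidate for deletion.
candidates <- names(pvals)[pvals > sig.level]

# max.num = 1: only the largest p-value is considered, and it is used
# only if it is larger than sig.level.
max.num <- 1
ord <- order(pvals, decreasing = TRUE)
top <- ord[seq_len(min(max.num, length(ord)))]
candidates1 <- names(pvals)[top][pvals[top] > sig.level[top]]

print(candidates)
print(candidates1)
```

In this made-up case three coefficients qualify by default, whereas max.num = 1 restricts deletion to the single coefficient with the largest p-value.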
Details
This utility function is intended to simplify an existing
vglm object having
variables (terms) that affect unnecessary parameters.
Suppose the explanatory variables in the formula
include a simple numeric covariate called x2.
This variable will affect every linear predictor if
zero = NULL in the VGAM family function.
This situation may correspond to the constraint matrices having
unnecessary columns because their regression coefficients are
statistically nonsignificant.
This function attempts to delete those columns and
return a possibly simplified list of constraint matrices
that can make refitting a simpler model easy to do.
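As a rough picture of what deleting a column means, the following base-R sketch assumes \(M = 2\) linear predictors and a covariate x2 whose constraint matrix is the identity (the zero = NULL situation). The choice of which column to drop is hypothetical; in practice it would come from the p-values.

```r
M <- 2

# With zero = NULL, x2 affects every linear predictor: its constraint
# matrix is the M x M identity, one column per regression coefficient.
cm.x2 <- diag(M)

# Hypothetical outcome: the coefficient of x2 in the second linear
# predictor is nonsignificant, so the second column is removed.
drop.col <- 2
cm.x2.trim <- cm.x2[, -drop.col, drop = FALSE]

print(cm.x2.trim)  # x2 now affects the first linear predictor only
```

The trimmed matrix has a single column, so the refitted model estimates one coefficient for x2 instead of two.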
P-values obtained from summaryvglm
(with HDEtest = FALSE for increased speed)
are compared to sig.level to test for
statistical significance.
For terms that generate more than one column of the
"lm" model matrix,
such as bs and poly,
the column is deleted if all regression coefficients
are statistically nonsignificant.
Incidentally, users should instead use
sm.bs,
sm.ns,
sm.poly,
etc.,
for smart and safe prediction.
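The rule for terms that generate several columns of the "lm" model matrix can be sketched in base R as follows. This is one reading of the rule stated above, not the function's actual implementation, and the p-values are hypothetical: rows are the basis columns of a term such as bs(age), and columns correspond to the columns of that term's constraint matrix when \(M = 2\).

```r
# Hypothetical p-values: 3 basis columns by 2 constraint-matrix columns.
pvals <- matrix(c(0.40, 0.01, 0.75,    # constraint-matrix column 1
                  0.62, 0.33, 0.58),   # constraint-matrix column 2
                nrow = 3,
                dimnames = list(paste0("bs(age)", 1:3), c("col1", "col2")))

sig.level <- 0.05

# A constraint-matrix column is deletable only if all of its
# regression coefficients are statistically nonsignificant.
deletable <- apply(pvals > sig.level, 2, all)

print(deletable)
```

Here the first column is retained because one of its coefficients has a p-value below sig.level, while the second column would be deleted.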
One can think of this function as facilitating
backward elimination for variable selection,
especially if max.num = 1 and \(M=1\);
however, usually more than one regression coefficient is deleted
here by default.
Value
A list of possibly simpler constraint matrices
that can be fed back into the model using the
constraints argument
(usually zero = NULL is needed to avoid a warning).
Consequently, they are required to be of the "term"-type.
After the model is refitted, applying
summaryvglm should result in
regression coefficients that are 'all' statistically
significant.
Warning
This function has not been tested thoroughly.
One extreme is that a term is totally deleted because
none of its regression coefficients are needed,
and that situation has not yet been finalized.
Ideally, object only contains terms where at least
one regression coefficient has a p-value less than
sig.level.
For ordered factors and other situations, deleting
certain columns may not make sense and may destroy interpretability.
As stated above, max.num may not work properly
when there are terms that
generate more than one column of the LM model matrix.
However, this limitation may change in the future.
Note
This function is experimental and may be replaced by some other function in the future. This function does not use S4 object-oriented programming but may be converted to such in the future.
Examples
# Not run by default: requires the xs.nz data from the VGAMdata package
if (FALSE) {
library("VGAM")  # Provides vglm(), binom2.or() and trim.constraints()
data("xs.nz", package = "VGAMdata")
fit1 <-
  vglm(cbind(worry, worrier) ~ bs(age) + sex + ethnicity + cat + dog,
       binom2.or(zero = NULL), data = xs.nz, trace = TRUE)
summary(fit1, HDEtest = FALSE)  # 'cat' is not significant at all
dim(constraints(fit1, matrix = TRUE))
(tclist1 <- trim.constraints(fit1))  # No 'cat'
fit2 <-  # Delete 'cat' manually from the formula:
  vglm(cbind(worry, worrier) ~ bs(age) + sex + ethnicity + dog,
       binom2.or(zero = NULL), data = xs.nz,
       constraints = tclist1, trace = TRUE)
summary(fit2, HDEtest = FALSE)  # A simplified model
dim(constraints(fit2, matrix = TRUE))  # Fewer regression coefficients
}