General Linear Hypotheses — glht • multcomp

General linear hypotheses and multiple comparisons for parametric models, including generalized linear models, linear mixed effects models, and survival models.

Usage

# S3 method for class 'matrix'
glht(model, linfct, 
    alternative = c("two.sided", "less", "greater"), 
    rhs = 0, ...)
# S3 method for class 'character'
glht(model, linfct, ...)
# S3 method for class 'expression'
glht(model, linfct, ...)
# S3 method for class 'mcp'
glht(model, linfct, ...)
# S3 method for class 'mlf'
glht(model, linfct, ...)
mcp(..., interaction_average = FALSE, covariate_average = FALSE)

Arguments

model: a fitted model, for example an object returned by lm, glm, or aov etc. It is assumed that coef and vcov methods are available for model. For multiple comparisons of means, methods model.matrix, model.frame and terms are expected to be available for model as well.
linfct: a specification of the linear hypotheses to be tested. Linear functions can be specified by either the matrix of coefficients or by symbolic descriptions of one or more linear hypotheses. Multiple comparisons in AN(C)OVA models are specified by objects returned from function mcp.

alternative: a character string specifying the alternative hypothesis, must be one of '"two.sided"' (default), '"greater"' or '"less"'. You can specify just the initial letter.
rhs: an optional numeric vector specifying the right hand side of the hypothesis.
interaction_average: logical indicating if comparisons are averaging over interaction terms. Experimental!
covariate_average: logical indicating if comparisons are averaging over additional covariates. Experimental!
...: additional arguments to function modelparm in all glht methods. For function mcp, multiple comparisons are defined by matrices or symbolic descriptions specifying contrasts of factor levels where the arguments correspond to factor names.

Details

A general linear hypothesis refers to null hypotheses of the form \(H_0: K \theta = m\) for some parametric model model with parameter estimates coef(model).

The null hypothesis is specified by a linear function \(K \theta\), the direction of the alternative and the right hand side \(m\). Here, alternative equal to "two.sided" refers to a null hypothesis \(H_0: K \theta = m\), whereas "less" corresponds to \(H_0: K \theta \ge m\) and "greater" refers to \(H_0: K \theta \le m\). The right hand side vector \(m\) can be defined via the rhs argument.

The generic method glht dispatches on its second argument (linfct). There are three ways, and thus methods, to specify linear functions to be tested:

1) The matrix of coefficients \(K\) can be specified directly via the linfct argument. In this case, the number of columns of this matrix needs to correspond to the number of parameters estimated by model. It is assumed that appropriate coef and vcov methods are available for model (modelparm deals with some exceptions).

2) A symbolic description, either a character or expression vector passed to glht via its linfct argument, can be used to define the null hypothesis. A symbolic description must be interpretable as a valid R expression consisting of both the left and right hand side of a linear hypothesis. Only the names of coef(model) must be used as variable names. The alternative is given by the direction under the null hypothesis (= or == refer to "two.sided", <= means "greater" and >= indicates "less"). Numeric vectors of length one are valid values for the right hand side.

3) Multiple comparisons of means are defined by objects of class mcp as returned by the mcp function. For each factor, which is included in model as independent variable, a contrast matrix or a symbolic description of the contrasts can be specified as arguments to mcp. A symbolic description may be a character or expression where the factor levels are only used as variables names. In addition, the type argument to the contrast generating function contrMat may serve as a symbolic description of contrasts as well.

4) The lsm function in package lsmeans offers a symbolic interface for the definition of least-squares means for factor combinations which is very helpful when more complex contrasts are of special interest.

The mcp function must be used with care when defining parameters of interest in two-way ANOVA or ANCOVA models. Here, the definition of treatment differences (such as Tukey's all-pair comparisons or Dunnett's comparison with a control) might be problem specific. Because it is impossible to determine the parameters of interest automatically in this case, mcp in multcomp version 1.0-0 and higher generates comparisons for the main effects only, ignoring covariates and interactions (older versions automatically averaged over interaction terms). A warning is given. We refer to multcomp::hsu1996, Chapter 7, and multcomp::searle1971, Chapter 7.3, for further discussions and examples on this issue.

glht extracts the number of degrees of freedom for models of class lm (via modelparm) and the exact multivariate t distribution is evaluated. For all other models, results rely on the normal approximation. Alternatively, the degrees of freedom to be used for the evaluation of multivariate t distributions can be given by the additional df argument to modelparm specified via ....

glht methods return a specification of the null hypothesis \(H_0: K \theta = m\). The value of the linear function \(K \theta\) can be extracted using the coef method and the corresponding covariance matrix is available from the vcov method. Various simultaneous and univariate tests and confidence intervals are available from summary.glht and confint.glht methods, respectively.

A more detailed description of the underlying methodology is available from hothorn2008b and Bretz+Hothorn+Westfall_2010.

Value

An object of class glht, more specifically a list with elements

model: a fitted model, used in the call to glht
linfct: the matrix of linear functions
rhs: the vector of right hand side values \(m\)
coef: the values of the linear functions
vcov: the covariance matrix of the values of the linear functions
df: optionally, the degrees of freedom when the exact t distribution is used for inference
alternative: a character string specifying the alternative hypothesis
type: optionally, a character string giving the name of the specific procedure

with print, summary, confint, coef and vcov methods being available. When called with linfct being an mcp object, an additional element focus is available storing the names of the factors under test.

References

Examples


  ### multiple linear model, swiss data
  lmod <- lm(Fertility ~ ., data = swiss)

  ### test of H_0: all regression coefficients are zero 
  ### (ignore intercept)

  ### define coefficients of linear function directly
  K <- diag(length(coef(lmod)))[-1,]
  rownames(K) <- names(coef(lmod))[-1]
  K
#>                  [,1] [,2] [,3] [,4] [,5] [,6]
#> Agriculture         0    1    0    0    0    0
#> Examination         0    0    1    0    0    0
#> Education           0    0    0    1    0    0
#> Catholic            0    0    0    0    1    0
#> Infant.Mortality    0    0    0    0    0    1

  ### set up general linear hypothesis
  glht(lmod, linfct = K)
#> 
#> 	 General Linear Hypotheses
#> 
#> Linear Hypotheses:
#>                       Estimate
#> Agriculture == 0       -0.1721
#> Examination == 0       -0.2580
#> Education == 0         -0.8709
#> Catholic == 0           0.1041
#> Infant.Mortality == 0   1.0770
#> 

  ### alternatively, use a symbolic description 
  ### instead of a matrix 
  glht(lmod, linfct = c("Agriculture = 0",
                        "Examination = 0",
                        "Education = 0",
                        "Catholic = 0",
                        "Infant.Mortality = 0"))
#> 
#> 	 General Linear Hypotheses
#> 
#> Linear Hypotheses:
#>                       Estimate
#> Agriculture == 0       -0.1721
#> Examination == 0       -0.2580
#> Education == 0         -0.8709
#> Catholic == 0           0.1041
#> Infant.Mortality == 0   1.0770
#> 


  ### multiple comparison procedures
  ### set up a one-way ANOVA
  amod <- aov(breaks ~ tension, data = warpbreaks)

  ### set up all-pair comparisons for factor `tension'
  ### using a symbolic description (`type' argument 
  ### to `contrMat()')
  glht(amod, linfct = mcp(tension = "Tukey"))
#> 
#> 	 General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: Tukey Contrasts
#> 
#> 
#> Linear Hypotheses:
#>            Estimate
#> M - L == 0  -10.000
#> H - L == 0  -14.722
#> H - M == 0   -4.722
#> 

  ### alternatively, describe differences symbolically
  glht(amod, linfct = mcp(tension = c("M - L = 0", 
                                      "H - L = 0",
                                      "H - M = 0")))
#> 
#> 	 General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: User-defined Contrasts
#> 
#> 
#> Linear Hypotheses:
#>            Estimate
#> M - L == 0  -10.000
#> H - L == 0  -14.722
#> H - M == 0   -4.722
#> 

  ### alternatively, define contrast matrix directly
  contr <- rbind("M - L" = c(-1, 1, 0),
                 "H - L" = c(-1, 0, 1), 
                 "H - M" = c(0, -1, 1))
  glht(amod, linfct = mcp(tension = contr))
#> 
#> 	 General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: User-defined Contrasts
#> 
#> 
#> Linear Hypotheses:
#>            Estimate
#> M - L == 0  -10.000
#> H - L == 0  -14.722
#> H - M == 0   -4.722
#> 

  ### alternatively, define linear function for coef(amod)
  ### instead of contrasts for `tension'
  ### (take model contrasts and intercept into account)
  glht(amod, linfct = cbind(0, contr %*% contr.treatment(3)))
#> 
#> 	 General Linear Hypotheses
#> 
#> Linear Hypotheses:
#>            Estimate
#> M - L == 0  -10.000
#> H - L == 0  -14.722
#> H - M == 0   -4.722
#> 


  ### mix of one- and two-sided alternatives
  warpbreaks.aov <- aov(breaks ~ wool + tension,
                      data = warpbreaks)

  ### contrasts for `tension'
  K <- rbind("L - M" = c( 1, -1,  0),     
             "M - L" = c(-1,  1,  0),       
             "L - H" = c( 1,  0, -1),     
             "M - H" = c( 0,  1, -1))

  warpbreaks.mc <- glht(warpbreaks.aov, 
                        linfct = mcp(tension = K),
                        alternative = "less")

  ### correlation of first two tests is -1
  cov2cor(vcov(warpbreaks.mc))
#>       L - M M - L L - H M - H
#> L - M   1.0  -1.0   0.5  -0.5
#> M - L  -1.0   1.0  -0.5   0.5
#> L - H   0.5  -0.5   1.0   0.5
#> M - H  -0.5   0.5   0.5   1.0

  ### use smallest of the two one-sided
  ### p-value as two-sided p-value -> 0.0232
  summary(warpbreaks.mc)
#> 
#> 	 Simultaneous Tests for General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: User-defined Contrasts
#> 
#> 
#> Fit: aov(formula = breaks ~ wool + tension, data = warpbreaks)
#> 
#> Linear Hypotheses:
#>            Estimate Std. Error t value Pr(<t)  
#> L - M >= 0   10.000      3.872   2.582 1.0000  
#> M - L >= 0  -10.000      3.872  -2.582 0.0231 *
#> L - H >= 0   14.722      3.872   3.802 1.0000  
#> M - H >= 0    4.722      3.872   1.219 1.0000  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> (Adjusted p values reported -- single-step method)
#> 

  ### more complex models: Continuous outcome logistic
  ### regression; parameters are log-odds ratios
  if (require("tram", quietly = TRUE, warn.conflicts = FALSE)) {
      confint(glht(Colr(breaks ~ wool + tension, 
                        data = warpbreaks), 
                   linfct = mcp("tension" = "Tukey")))
  }
#> 
#> 	 Simultaneous Confidence Intervals
#> 
#> Multiple Comparisons of Means: Tukey Contrasts
#> 
#> 
#> Fit: Colr(formula = breaks ~ wool + tension, data = warpbreaks)
#> 
#> Quantile = 2.3425
#> 95% family-wise confidence level
#>  
#> 
#> Linear Hypotheses:
#>            Estimate lwr     upr    
#> M - L == 0  1.1730  -0.2504  2.5965
#> H - L == 0  2.1108   0.6370  3.5846
#> H - M == 0  0.9378  -0.4461  2.3217
#>