Marginal Homogeneity Tests — MarginalHomogeneityTests • coin

Testing the marginal homogeneity of a repeated measurements factor in a complete block design.

Usage

# S3 method for class 'formula'
mh_test(formula, data, subset = NULL, ...)
# S3 method for class 'table'
mh_test(object, ...)
# S3 method for class 'SymmetryProblem'
mh_test(object, ...)

Arguments

formula: a formula of the form y ~ x | block where y and x are factors and block is an optional factor (which is generated automatically if omitted).
data: an optional data frame containing the variables in the model formula.
subset: an optional vector specifying a subset of observations to be used. Defaults to NULL.
object: an object inheriting from classes "table" (with identical dimnames components) or "SymmetryProblem".
...: further arguments to be passed to symmetry_test().

Details

mh_test() provides the McNemar test, the Cochran \(Q\) test, the Stuart(-Maxwell) test and the Madansky test of interchangeability. A general description of these methods is given by Agresti (2002).

The null hypothesis of marginal homogeneity is tested. The response variable and the measurement conditions are given by y and x, respectively, and block is a factor where each level corresponds to exactly one subject with repeated measurements.
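The correspondence between the table interface and the formula interface can be illustrated in base R. The following is a minimal sketch, not coin's internal code: the names tab, wide and long are hypothetical, and it assumes that each cell count of a square table stands for that many subjects, each measured once under every condition.

```r
## A square table of paired responses: rows = first measurement,
## columns = second measurement (counts from the prime minister example)
tab <- as.table(matrix(
    c(794, 150,
       86, 570),
    nrow = 2, byrow = TRUE,
    dimnames = list(
         "First" = c("Approve", "Disapprove"),
        "Second" = c("Approve", "Disapprove")
    )
))

## Expand to one row per subject ...
wide <- as.data.frame(tab)                 # columns First, Second, Freq
wide <- wide[rep(seq_len(nrow(wide)), wide$Freq), c("First", "Second")]
n <- nrow(wide)

## ... and then to one row per subject and condition (y ~ x | block)
long <- data.frame(
    y     = factor(c(as.character(wide$First), as.character(wide$Second))),
    x     = factor(rep(c("First", "Second"), each = n)),
    block = factor(rep(seq_len(n), times = 2))
)

## Each block (subject) contributes exactly one observation per condition
stopifnot(all(table(long$block, long$x) == 1))

## The marginal distributions that are compared under the null hypothesis
margins <- table(long$x, long$y)
margins
```

Under these assumptions, a data frame shaped like long is what a call of the form mh_test(y ~ x | block, data = long) expects. With this many blocks such a call may be slow, so the table interface is the more practical choice here.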

The procedure goes by different names depending on y and x. It is the McNemar test (McNemar, 1947) when both y and x are binary factors; the Cochran \(Q\) test (Cochran, 1950) when y is a binary factor and x is a factor with an arbitrary number of levels; the Stuart(-Maxwell) test (Stuart, 1955; Maxwell, 1970) when y is a factor with an arbitrary number of levels and x is a binary factor; and the Madansky test of interchangeability (Madansky, 1963), which implies marginal homogeneity, when both y and x are factors with an arbitrary number of levels.
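For the two binary-response cases, the statistics can be checked by hand with the textbook closed forms. The sketch below uses base R only and is not how coin computes the statistic internally; the variable names are hypothetical, and the data are the prime minister and diphtheria counts from the Examples section.

```r
## McNemar statistic from the two discordant counts of a 2 x 2 table
n12 <- 150   # Approve at first survey, Disapprove at second
n21 <- 86    # Disapprove at first survey, Approve at second
mcnemar_stat <- (n12 - n21)^2 / (n12 + n21)
round(mcnemar_stat, 3)   # 17.356

## Cochran Q statistic from a subjects-by-conditions 0/1 matrix
patterns <- rbind(c(1, 1, 1, 1),
                  c(1, 1, 0, 1),
                  c(0, 1, 1, 1),
                  c(0, 1, 0, 1),
                  c(0, 0, 0, 0))
cases <- c(4, 2, 3, 1, 59)
X <- patterns[rep(seq_len(nrow(patterns)), cases), ]

k     <- ncol(X)      # number of measurement conditions (media)
Cj    <- colSums(X)   # number of successes per condition
Ri    <- rowSums(X)   # number of successes per subject
total <- sum(X)       # overall number of successes
cochran_q <- (k - 1) * (k * sum(Cj^2) - total^2) / (k * total - sum(Ri^2))
round(cochran_q, 4)   # 8.0526

## Asymptotic p-values from the chi-squared reference distribution
pchisq(mcnemar_stat, df = 1, lower.tail = FALSE)
pchisq(cochran_q, df = k - 1, lower.tail = FALSE)
```

Both values agree with the chi-squared statistics printed by mh_test() in the Examples section.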

If y and/or x are ordered factors, the default scores, 1:nlevels(y) and 1:nlevels(x), respectively, can be altered using the scores argument (see symmetry_test()); this argument can also be used to coerce nominal factors to class "ordered". If both y and x are ordered factors, a linear-by-linear association test is computed and the direction of the alternative hypothesis can be specified using the alternative argument. This extension was given by Birch (1965) who also discussed the situation when either the response or the measurement condition is an ordered factor; see also White, Landis and Cooper (1982).

The conditional null distribution of the test statistic is used to obtain \(p\)-values and an asymptotic approximation of the exact distribution is used by default (distribution = "asymptotic"). Alternatively, the distribution can be approximated via Monte Carlo resampling or computed exactly for univariate two-sample problems by setting distribution to "approximate" or "exact", respectively. See asymptotic(), approximate() and exact() for details.
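The three kinds of reference distribution can be illustrated for the simplest case, the McNemar test. The sketch below is written in base R and does not call coin; it relies on the fact that, given the number of discordant pairs m, one discordant count follows a Binomial(m, 1/2) distribution under the null hypothesis. The variable names are hypothetical.

```r
n12 <- 150
n21 <- 86
m <- n12 + n21                       # number of discordant pairs
observed <- (n12 - n21)^2 / m        # McNemar statistic

## Asymptotic: chi-squared reference distribution with 1 degree of freedom
p_asymptotic <- pchisq(observed, df = 1, lower.tail = FALSE)

## Exact: enumerate the conditional null distribution of n12 given m
support <- 0:m
stat <- (2 * support - m)^2 / m      # statistic for every possible n12
tol <- 1e-12                         # guards against floating-point ties
p_exact <- sum(dbinom(support, m, 0.5)[stat >= observed - tol])

## Monte Carlo: draw n12 from the conditional null distribution
set.seed(1)
sim <- rbinom(10000, m, 0.5)
sim_stat <- (2 * sim - m)^2 / m
p_approx <- mean(sim_stat >= observed - tol)

c(asymptotic = p_asymptotic, exact = p_exact, approximate = p_approx)
```

For p-values this small, a Monte Carlo estimate based on 10000 draws is necessarily coarse, which is one reason to prefer the exact or asymptotic distribution in such cases.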

Value

An object inheriting from class "IndependenceTest".

Note

This function is currently computationally inefficient for data with a large number of pairs or sets.

References

Agresti, A. (2002). Categorical Data Analysis, Second Edition. Hoboken, New Jersey: John Wiley & Sons.

Birch, M. W. (1965). The detection of partial association, II: The general case. Journal of the Royal Statistical Society B 27(1), 111–124. doi:10.1111/j.2517-6161.1965.tb00593.x

Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika 37(3/4), 256–266. doi:10.1093/biomet/37.3-4.256

Madansky, A. (1963). Tests of homogeneity for correlated samples. Journal of the American Statistical Association 58(301), 97–119. doi:10.1080/01621459.1963.10500835

Maxwell, A. E. (1970). Comparing the classification of subjects by two independent judges. British Journal of Psychiatry 116(535), 651–655. doi:10.1192/bjp.116.535.651

McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12(2), 153–157. doi:10.1007/BF02295996

Stuart, A. (1955). A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 42(3/4), 412–416. doi:10.1093/biomet/42.3-4.412

White, A. A., Landis, J. R. and Cooper, M. M. (1982). A note on the equivalence of several marginal homogeneity test criteria for categorical data. International Statistical Review 50(1), 27–34. doi:10.2307/1402457

Examples

## Performance of prime minister
## Agresti (2002, p. 409)
performance <- matrix(
    c(794, 150,
       86, 570),
    nrow = 2, byrow = TRUE,
    dimnames = list(
         "First" = c("Approve", "Disapprove"),
        "Second" = c("Approve", "Disapprove")
    )
)
performance <- as.table(performance)
diag(performance) <- 0 # speed-up: only off-diagonal elements contribute

## Asymptotic McNemar Test
mh_test(performance)
#> 
#> 	Asymptotic Marginal Homogeneity Test
#> 
#> data:  response by
#> 	 conditions (First, Second) 
#> 	 stratified by block
#> chi-squared = 17.356, df = 1, p-value = 3.099e-05
#> 

## Exact McNemar Test
mh_test(performance, distribution = "exact")
#> 
#> 	Exact Marginal Homogeneity Test
#> 
#> data:  response by
#> 	 conditions (First, Second) 
#> 	 stratified by block
#> chi-squared = 17.356, p-value = 3.716e-05
#> 


## Effectiveness of different media for the growth of diphtheria
## Cochran (1950, Tab. 2)
cases <- c(4, 2, 3, 1, 59)
n <- sum(cases)
cochran <- data.frame(
    diphtheria = factor(
        unlist(rep(list(c(1, 1, 1, 1),
                        c(1, 1, 0, 1),
                        c(0, 1, 1, 1),
                        c(0, 1, 0, 1),
                        c(0, 0, 0, 0)),
                   cases))
    ),
    media = factor(rep(LETTERS[1:4], n)),
    case =  factor(rep(seq_len(n), each = 4))
)

## Asymptotic Cochran Q test (Cochran, 1950, p. 260)
mh_test(diphtheria ~ media | case, data = cochran) # Q = 8.05
#> 
#> 	Asymptotic Marginal Homogeneity Test
#> 
#> data:  diphtheria by media (A, B, C, D) 
#> 	 stratified by case
#> chi-squared = 8.0526, df = 3, p-value = 0.04494
#> 

## Approximative Cochran Q test
mt <- mh_test(diphtheria ~ media | case, data = cochran,
              distribution = approximate(nresample = 10000))
pvalue(mt)             # standard p-value
#> [1] 0.0539
#> 99 percent confidence interval:
#>  0.04824687 0.05998244 
#> 
midpvalue(mt)          # mid-p-value
#> [1] 0.0441
#> 99 percent confidence interval:
#>  0.03902513 0.04960849 
#> 
pvalue_interval(mt)    # p-value interval
#>    p_0    p_1 
#> 0.0343 0.0539 
size(mt, alpha = 0.05) # test size at alpha = 0.05 using the p-value
#> [1] 0.0343


## Opinions on Pre- and Extramarital Sex
## Agresti (2002, p. 421)
opinions <- c("Always wrong", "Almost always wrong",
              "Wrong only sometimes", "Not wrong at all")
PreExSex <- matrix(
    c(144, 33, 84, 126,
        2,  4, 14,  29,
        0,  2,  6,  25,
        0,  0,  1,   5),
    nrow = 4,
    dimnames = list(
          "Premarital Sex" = opinions,
        "Extramarital Sex" = opinions
    )
)
PreExSex <- as.table(PreExSex)

## Asymptotic Stuart test
mh_test(PreExSex)
#> 
#> 	Asymptotic Marginal Homogeneity Test
#> 
#> data:  response by
#> 	 conditions (Premarital.Sex, Extramarital.Sex) 
#> 	 stratified by block
#> chi-squared = 271.92, df = 3, p-value < 2.2e-16
#> 

## Asymptotic Stuart-Birch test
## Note: response as ordinal
mh_test(PreExSex, scores = list(response = seq_along(opinions)))
#> 
#> 	Asymptotic Marginal Homogeneity Test for Ordered Data
#> 
#> data:  response (ordered) by
#> 	 conditions (Premarital.Sex, Extramarital.Sex) 
#> 	 stratified by block
#> Z = 16.454, p-value < 2.2e-16
#> alternative hypothesis: two.sided
#> 


## Vote intention
## Madansky (1963, pp. 107-108)
vote <- array(
    c(120, 1,  8, 2,   2,  1, 2, 1,  7,
        6, 2,  1, 1, 103,  5, 1, 4,  8,
       20, 3, 31, 1,   6, 30, 2, 1, 81),
    dim = c(3, 3, 3),
    dimnames = list(
          "July" = c("Republican", "Democratic", "Uncertain"),
        "August" = c("Republican", "Democratic", "Uncertain"),
          "June" = c("Republican", "Democratic", "Uncertain")
    )
)
vote <- as.table(vote)

## Asymptotic Madansky test (Q = 70.77)
mh_test(vote)
#> 
#> 	Asymptotic Marginal Homogeneity Test
#> 
#> data:  response by
#> 	 conditions (July, August, June) 
#> 	 stratified by block
#> chi-squared = 70.763, df = 4, p-value = 1.565e-14
#> 


## Cross-over study
## http://www.nesug.org/proceedings/nesug00/st/st9005.pdf
dysmenorrhea <- array(
    c(6, 2, 1,  3, 1, 0,  1, 2, 1,
      4, 3, 0, 13, 3, 0,  8, 1, 1,
      5, 2, 2, 10, 1, 0, 14, 2, 0),
    dim = c(3, 3, 3),
    dimnames =  list(
          "Placebo" = c("None", "Moderate", "Complete"),
         "Low dose" = c("None", "Moderate", "Complete"),
        "High dose" = c("None", "Moderate", "Complete")
    )
)
dysmenorrhea <- as.table(dysmenorrhea)

## Asymptotic Madansky-Birch test (Q = 53.76)
## Note: response as ordinal
mh_test(dysmenorrhea, scores = list(response = 1:3))
#> 
#> 	Asymptotic Marginal Homogeneity Test for Ordered Data
#> 
#> data:  response (ordered) by
#> 	 conditions (Placebo, Low.dose, High.dose) 
#> 	 stratified by block
#> chi-squared = 53.762, df = 2, p-value = 2.117e-12
#> 

## Asymptotic Madansky-Birch test (Q = 47.29)
## Note: response and measurement conditions as ordinal
mh_test(dysmenorrhea, scores = list(response = 1:3,
                                    conditions = 1:3))
#> 
#> 	Asymptotic Marginal Homogeneity Test for Ordered Data
#> 
#> data:  response (ordered) by
#> 	 conditions (Placebo < Low.dose < High.dose) 
#> 	 stratified by block
#> Z = 6.8764, p-value = 6.138e-12
#> alternative hypothesis: two.sided
#>