Crossed effects and other sparse correlations

Defines a design object with multiple dimensions of correlation. Observations that share any of the id variables are treated as correlated; alternatively, you can supply an adjacency matrix (a base matrix or a Matrix object) specifying which pairs of observations are correlated. Supports crossed designs (e.g., multiple raters of multiple objects) and non-nested observational correlation (e.g., observations sharing a primary school or a secondary school). Methods are currently available for svymean, svytotal, and svyglm.

Usage

xdesign(id = NULL, strata = NULL, weights = NULL, data, fpc = NULL,
adjacency = NULL, overlap = c("unbiased", "positive"), allow.non.binary = FALSE)

Arguments

id: list of formulas specifying cluster identifiers for each clustering dimension (or NULL)
strata: Not implemented
weights: model formula specifying (sampling) weights
data: data frame containing all the variables
fpc: Not implemented
adjacency: Adjacency matrix or Matrix indicating which pairs of observations are correlated
overlap: See details below
allow.non.binary: If FALSE, check that adjacency is a binary (0/1 or TRUE/FALSE) matrix or Matrix.
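
As an illustration of what a binary adjacency matrix encodes, the following minimal sketch builds one by hand in base R for a small crossed design. The vectors rater and object are hypothetical toy data, not part of the package; the rule applied is the one described above, that two observations are correlated if they share any id variable. Per the Usage section, a matrix like this would be passed through the adjacency argument together with data, but that call is not run here.

```r
# Hypothetical toy data: 6 ratings, 3 raters crossed with 2 objects
rater  <- c("r1", "r1", "r2", "r2", "r3", "r3")
object <- c("o1", "o2", "o1", "o2", "o1", "o2")

# TRUE where a pair of observations shares the rater or shares the object
adj <- outer(rater, rater, "==") | outer(object, object, "==")

adj[1, 2]  # observations 1 and 2 share a rater
adj[1, 3]  # observations 1 and 3 share an object
adj[1, 4]  # observations 1 and 4 share neither
```

The result is a symmetric TRUE/FALSE matrix with TRUE on the diagonal, which is the binary form that the allow.non.binary = FALSE check expects.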

Details

Subsetting for these objects actually drops observations; it is not equivalent to just setting weights to zero as for survey designs. So, for example, a subset of a balanced design will not be a balanced design.

The overlap option controls double-counting of some variance terms. Suppose there are two clustering dimensions, ~a and ~b. If we compute variance matrices clustered on a and clustered on b and add them, observations that share both a and b are counted twice, giving a positively biased estimator (overlap="positive"). Subtracting a variance matrix clustered on combinations of a and b gives an unbiased variance estimator (overlap="unbiased"); however, this estimator is not guaranteed to be positive definite. In the references, Miglioretti and Heagerty use the overlap="positive" estimator and Cameron et al. use the overlap="unbiased" estimator.
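
The arithmetic behind the two overlap choices can be sketched in base R for the simplest case, the variance of an unweighted sum. This is an illustration under stated assumptions, not the package's implementation: the vectors a, b, and e are hypothetical toy data, clustered_var is a helper defined only for this sketch, and weights and small-sample scaling are ignored.

```r
# Hypothetical toy data: two clustering dimensions and residual-like scores
a <- c(1, 1, 1, 2, 2, 3)
b <- c(1, 1, 2, 2, 3, 3)
e <- c(0.5, -1.2, 0.3, 0.8, -0.4, 1.1)

# Helper for this sketch only: clustered variance of sum(e),
# i.e. the sum over clusters of squared within-cluster totals
clustered_var <- function(e, g) sum(tapply(e, g, sum)^2)

v_a  <- clustered_var(e, a)
v_b  <- clustered_var(e, b)
v_ab <- clustered_var(e, interaction(a, b, drop = TRUE))

v_positive <- v_a + v_b          # pairs sharing both a and b counted twice
v_unbiased <- v_a + v_b - v_ab   # double-counted term subtracted

# The subtracted version equals a quadratic form in the binary
# "share a or share b" adjacency matrix
adj_any <- outer(a, a, "==") | outer(b, b, "==")
v_quad  <- sum(outer(e, e) * adj_any)
all.equal(v_unbiased, v_quad)
```

In this sketch v_positive is never smaller than v_unbiased, because the subtracted term is a sum of squares. With several variables the same subtraction applies to variance matrices, which is where positive definiteness can be lost.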

Value

An object of class xdesign

References

Miglioretti, D., & Heagerty, P. J. (2007). Marginal modeling of nonnested multilevel data using standard software. American Journal of Epidemiology, 165(4), 453-463.

Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2011). Robust Inference With Multiway Clustering. Journal of Business & Economic Statistics, 29(2), 238-249.

https://notstatschat.rbind.io/2021/09/18/crossed-clustering-and-parallel-invention/

Examples



## With one clustering dimension, the estimator is close to the with-replacement
##   survey estimator, but not identical unless clusters are of equal size
data(api)
dclus1r<-svydesign(id=~dnum, weights=~pw, data=apiclus1)
xclus1<-xdesign(id=list(~dnum), weights=~pw, data=apiclus1)
#> Warning: only one clustering dimension?
xclus1
#> 1-way crossed design:
#> xdesign(id = list(~dnum), weights = ~pw, data = apiclus1)

svymean(~enroll,dclus1r)
#>          mean     SE
#> enroll 549.72 45.646
svymean(~enroll,xclus1)
#>          mean     SE
#> enroll 549.72 46.964

data(salamander)
xsalamander<-xdesign(id=list(~Male, ~Female), data=salamander,
    overlap="unbiased")
xsalamander
#> 2-way crossed design:
#> xdesign(id = list(~Male, ~Female), data = salamander, overlap = "unbiased")
degf(xsalamander)
#> [1] 32.72727