Crossed effects and other sparse correlations
Defines a design object with multiple dimensions of correlation:
observations that share any of the id variables are correlated,
or you can supply an adjacency matrix or Matrix to specify which are
correlated. Supports crossed designs (e.g. multiple raters of multiple
objects) and non-nested observational correlation (e.g. observations
sharing a primary school or a secondary school). Methods are currently
available for svymean, svytotal, and svyglm.
Usage
xdesign(id = NULL, strata = NULL, weights = NULL, data, fpc = NULL,
adjacency = NULL, overlap = c("unbiased", "positive"), allow.non.binary = FALSE)
Arguments
- id
list of formulas specifying cluster identifiers for each clustering dimension (or
NULL)
- strata
Not implemented
- weights
model formula specifying (sampling) weights
- data
data frame containing all the variables
- fpc
Not implemented
- adjacency
Adjacency matrix or Matrix indicating which pairs of observations are correlated
- overlap
See details below
- allow.non.binary
If FALSE, check that adjacency is a binary 0/1 or TRUE/FALSE matrix or Matrix.
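As a rough illustration of the kind of object the adjacency argument describes, the sketch below builds a binary adjacency matrix in base R. The `primary` and `secondary` vectors are hypothetical school identifiers invented for this example, and treating a pair as correlated when it shares either id is an assumption that mirrors the description above, not a statement about the package internals.

```r
## Hypothetical ids for six observations (illustration only)
primary   <- c("p1", "p1", "p2", "p2", "p3", "p3")
secondary <- c("s1", "s2", "s1", "s2", "s3", "s3")

## TRUE where a pair of observations shares the given id
same_primary   <- outer(primary, primary, "==")
same_secondary <- outer(secondary, secondary, "==")

## Binary adjacency: a pair is correlated if it shares either school
adj <- (same_primary | same_secondary) * 1

## Properties a 0/1 adjacency matrix of this kind should have
all(adj %in% c(0, 1))   # only zeros and ones
isSymmetric(adj)        # correlation is symmetric
all(diag(adj) == 1)     # each observation is correlated with itself
```

Here observations 1 and 4 share neither school, so `adj[1, 4]` is 0, while observations 1 and 3 share a secondary school, so `adj[1, 3]` is 1.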
Details
Subsetting for these objects actually drops observations; it is not equivalent to just setting weights to zero as for survey designs. So, for example, a subset of a balanced design will not be a balanced design.
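The remark about balance can be seen without any survey code. The base R sketch below uses a hypothetical fully crossed layout of raters and objects (all names are invented for illustration) and shows that once rows are dropped, the layout is no longer balanced.

```r
## Hypothetical balanced crossed layout: every rater scores every object once
ratings <- expand.grid(rater = paste0("r", 1:3), object = paste0("o", 1:4))
table(ratings$rater, ratings$object)   # every cell is 1

## Drop two observations: rater r1's scores for objects o1 and o2
kept <- subset(ratings, !(rater == "r1" & object %in% c("o1", "o2")))
table(kept$rater, kept$object)         # cells r1/o1 and r1/o2 are now 0
```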
The overlap option controls double-counting of some variance
terms. Suppose there are two clustering dimensions, ~a and
~b. If we compute variance matrices clustered on a and
clustered on b and add them, observations that share both
a and b will be counted twice, giving a positively
biased estimator. We can subtract off a variance matrix clustered
on combinations of a and b to give an unbiased
variance estimator. However, the unbiased estimator is not
guaranteed to be positive definite. In the references, Miglioretti
and Heagerty use the overlap="positive" estimator and Cameron
et al. use the overlap="unbiased" estimator.
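The double-counting argument can be checked numerically. The following is a minimal base R sketch, not the package's implementation: the ids `a` and `b` and the scores `u` are made up, and the variance of a total is stood in for by the quadratic form u'Au for a 0/1 adjacency matrix A.

```r
## Hypothetical ids: two clustering dimensions for six observations
a <- c(1, 1, 2, 2, 3, 3)
b <- c(1, 1, 1, 2, 2, 2)
u <- c(0.5, -1.2, 0.3, 0.8, -0.4, 1.1)   # made-up residual-like scores

A_a  <- outer(a, a, "==") * 1   # pairs sharing a
A_b  <- outer(b, b, "==") * 1   # pairs sharing b
A_ab <- A_a * A_b               # pairs sharing both a and b

## Adding the two one-way matrices counts pairs sharing both ids twice
max(A_a + A_b)

## Subtracting the intersection leaves each correlated pair counted once
A_unbiased <- A_a + A_b - A_ab
max(A_unbiased)

## Quadratic forms for the plain sum and for the corrected matrix
v_sum      <- drop(t(u) %*% (A_a + A_b) %*% u)
v_unbiased <- drop(t(u) %*% A_unbiased %*% u)
v_sum - v_unbiased              # the doubly counted part, u' A_ab u

## The corrected matrix need not be positive semi-definite
min(eigen(A_unbiased, symmetric = TRUE)$values)
```

In this toy layout the smallest eigenvalue of `A_unbiased` is negative, so some choices of `u` would give a negative quadratic form. That is the price of removing the double counting.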
References
Miglioretti D, Heagerty PJ (2007) Marginal modeling of nonnested multilevel data using standard software. Am J Epidemiol 165(4):453-463
Cameron AC, Gelbach JB, Miller DL (2011) Robust inference with multiway clustering. Journal of Business & Economic Statistics 29(2):238-249
https://notstatschat.rbind.io/2021/09/18/crossed-clustering-and-parallel-invention/
Examples
library(survey)
## With one clustering dimension, the standard error is close to that of the
## with-replacement survey estimator, but not identical unless clusters are equal size
data(api)
dclus1r<-svydesign(id=~dnum, weights=~pw, data=apiclus1)
xclus1<-xdesign(id=list(~dnum), weights=~pw, data=apiclus1)
#> Warning: only one clustering dimension?
xclus1
#> 1-way crossed design:
#> xdesign(id = list(~dnum), weights = ~pw, data = apiclus1)
svymean(~enroll,dclus1r)
#> mean SE
#> enroll 549.72 45.646
svymean(~enroll,xclus1)
#> mean SE
#> enroll 549.72 46.964
data(salamander)
xsalamander<-xdesign(id=list(~Male, ~Female), data=salamander,
    overlap="unbiased")
xsalamander
#> 2-way crossed design:
#> xdesign(id = list(~Male, ~Female), data = salamander, overlap = "unbiased")
degf(xsalamander)
#> [1] 32.72727