Exploratory Factor Analysis
Fit one or more Exploratory Factor Analysis (EFA) model(s).
Usage
efa(data = NULL, nfactors = 1L, sample.cov = NULL, sample.nobs = NULL,
    rotation = "geomin", rotation.args = list(), ov.names = NULL,
    bounds = "pos.var", ..., output = "efa")
Arguments
- data
A data frame containing the observed variables we need for the EFA. If only a subset of the observed variables is needed, use the ov.names argument.
- nfactors
Integer or integer vector. The desired number of factors to extract. Can be a single number or a vector of numbers (e.g., nfactors = 1:4). A separate model is fitted for each number of factors.
- sample.cov
Numeric matrix. A sample variance-covariance matrix. The rownames and/or colnames must contain the observed variable names. Unlike in sem() and cfa(), the matrix may be a correlation matrix.
- sample.nobs
Number of observations if the full data frame is missing and only the sample variance-covariance matrix is given.
- rotation
Character. The rotation method to be used. Possible options are varimax, quartimax, orthomax, oblimin, quartimin, geomin, promax, entropy, mccammon, infomax, tandem1, tandem2, oblimax, bentler, simplimax, target.strict, target (alias for pst), pst (=partially specified target), cf, crawford-ferguson, cf-quartimax, cf-varimax, cf-equamax, cf-parsimax, cf-facparsim, biquartimin, bigeomin. The latter two are for bifactor rotation only. The rotation algorithms (except promax and target) are similar to those from the GPArotation package, but have been reimplemented for better control. The promax method is taken from the stats package. The target.strict method is equal to the target method in the GPArotation package. The target method is in fact the pst method where all non-zero elements (in the target matrix) are ignored.
- rotation.args
List. Options related to the rotation algorithm. The default options (and their alternatives) are orthogonal = FALSE, row.weights = "default" (or "kaiser", "cureton.mulaik" or "none"), std.ov = TRUE, algorithm = "gpa" (or "pairwise"), rstarts = 30, gpa.tol = 1e-05, tol = 1e-08, max.iter = 10000L, warn = FALSE, verbose = FALSE, reflect = TRUE, order.lv.by = "index" (or "sumofsquares" or "none"). Other options are specific to a particular rotation criterion: geomin.epsilon = 0.001, orthomax.gamma = 1, promax.kappa = 4, cf.gamma = 0, and oblimin.gamma = 0.
- ov.names
Character vector. The names of the variables needed for the EFA. Should be a subset of the variable names in the data frame. By default (if NULL), all variables in the data are used.
- bounds
By default, bounds = "pos.var" forces all variances of both observed and latent variables to be strictly nonnegative. See the entry in lavOptions for more options.
- ...
Additional options to be passed to lavaan, using 'name = value'. See lavOptions for a complete list.
- output
Character. If "efa" (the default), the output mimics the typical output of an EFA. If "lavaan", a lavaan object is returned. The latter is only possible if nfactors contains a single (integer) number.
Details
The efa function is essentially a wrapper around the
lavaan function. It generates the model syntax (for a given number
of factors) and then calls lavaan() treating the factors as
a single block that should be rotated. The function only supports
a single group. Categorical data is handled as usual by first computing
an appropriate (e.g., tetrachoric or polychoric) correlation matrix,
which is then used as input for the EFA.
There is also (limited) support for
twolevel data. The same number of factors is then extracted at the
within and the between level.
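To make the "wrapper" idea above concrete, the following is a minimal sketch of what a hand-written model with a single rotated block of factors might look like. It assumes the efa("...")* block modifier from lavaan's model syntax and the sem() fitting function; the block label "block1" and the object names model and fit_block are illustrative only, and the syntax that efa() generates internally may differ in detail.

```r
library(lavaan)

# Assumed hand-written counterpart of a 3-factor EFA: every factor carries the
# same efa("block1") modifier, so the three factors are rotated as one block.
model <- '
  efa("block1")*f1 +
  efa("block1")*f2 +
  efa("block1")*f3 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
'

# Fit the model and request the same rotation settings as in the Examples.
fit_block <- sem(model, data = HolzingerSwineford1939,
                 rotation = "geomin",
                 rotation.args = list(geomin.epsilon = 0.01, rstarts = 1))

summary(fit_block, standardized = TRUE)
```

Writing the block by hand is mostly useful when the EFA block has to be combined with other model parts; for a plain EFA, calling efa() directly is shorter.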
The promax rotation method (taken from the stats package) is only
provided for convenience. Because promax is a two-step algorithm (first
varimax, then oblique rotation to get simple structure), it does not
use the gpa or pairwise rotation algorithms, and as a result, no
standard errors are provided.
Value
If output = "lavaan", an object of class
lavaan. If output = "efa",
a list of class efaList for which a print(),
summary() and fitMeasures() method are available. Because
we added the (standardized) loadings as an extra element, the loadings
function (which is not a generic function) from the stats package will
also work on efaList objects.
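The two return types described above can be compared with a short sketch. It assumes the HolzingerSwineford1939 data shipped with lavaan; the object names fit_lav and fit_efa are illustrative.

```r
library(lavaan)

ov <- paste0("x", 1:9)

# output = "lavaan" needs a single number of factors and returns a lavaan object
fit_lav <- efa(data = HolzingerSwineford1939, ov.names = ov,
               nfactors = 3, output = "lavaan")
class(fit_lav)

# the default output = "efa" returns an object of class efaList
fit_efa <- efa(data = HolzingerSwineford1939, ov.names = ov, nfactors = 3)
class(fit_efa)

# stats::loadings() extracts the stored (standardized) loadings element
loadings(fit_efa)
```

A lavaan object is the better choice when the fit should be passed on to other lavaan functions, while the efaList output is geared towards the EFA-style summary.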
See also
lav_efalist_summary for a summary method if the output is
of class efaList.
Examples
## The famous Holzinger and Swineford (1939) example
fit <- efa(data = HolzingerSwineford1939,
           ov.names = paste("x", 1:9, sep = ""),
           nfactors = 1:3,
           rotation = "geomin",
           rotation.args = list(geomin.epsilon = 0.01, rstarts = 1))
summary(fit, nd = 3L, cutoff = 0.2, dot.cutoff = 0.05)
#> This is lavaan 0.6-21 -- running exploratory factor analysis
#>
#> Estimator ML
#> Rotation method GEOMIN OBLIQUE
#> Geomin epsilon 0.01
#> Rotation algorithm (rstarts) GPA (1)
#> Standardized metric TRUE
#> Row weights None
#>
#> Number of observations 301
#>
#> Overview models:
#> aic bic sabic chisq df pvalue cfi rmsea
#> nfactors = 1 7738.448 7805.176 7748.091 312.264 27 0.000 0.677 0.187
#> nfactors = 2 7572.491 7668.876 7586.418 130.306 19 0.000 0.874 0.140
#> nfactors = 3 7479.081 7601.416 7496.758 22.897 12 0.029 0.988 0.055
#>
#> Eigenvalues correlation matrix:
#>
#> ev1 ev2 ev3 ev4 ev5 ev6 ev7 ev8 ev9
#> 3.216 1.639 1.365 0.699 0.584 0.500 0.473 0.286 0.238
#>
#> Number of factors: 1
#>
#> Standardized loadings: (* = significant at 1% level)
#>
#> f1 unique.var communalities
#> x1 0.438* 0.808 0.192
#> x2 0.220* 0.951 0.049
#> x3 0.223* 0.950 0.050
#> x4 0.848* 0.281 0.719
#> x5 0.841* 0.293 0.707
#> x6 0.838* 0.298 0.702
#> x7 .* 0.967 0.033
#> x8 0.201* 0.960 0.040
#> x9 0.307* 0.906 0.094
#>
#> f1
#> Sum of squared loadings 2.586
#> Proportion of total 1.000
#> Proportion var 0.287
#> Cumulative var 0.287
#>
#> Number of factors: 2
#>
#> Standardized loadings: (* = significant at 1% level)
#>
#> f1 f2 unique.var communalities
#> x1 0.261* 0.430* 0.673 0.327
#> x2 . 0.251* 0.906 0.094
#> x3 0.455* 0.783 0.217
#> x4 0.850* 0.274 0.726
#> x5 0.867* 0.264 0.736
#> x6 0.824* 0.302 0.698
#> x7 0.447* 0.802 0.198
#> x8 . 0.626* 0.630 0.370
#> x9 0.732* 0.458 0.542
#>
#> f1 f2 total
#> Sum of sq (obliq) loadings 2.281 1.628 3.909
#> Proportion of total 0.584 0.416 1.000
#> Proportion var 0.253 0.181 0.434
#> Cumulative var 0.253 0.434 0.434
#>
#> Factor correlations: (* = significant at 1% level)
#>
#> f1 f2
#> f1 1.000
#> f2 0.331* 1.000
#>
#> Number of factors: 3
#>
#> Standardized loadings: (* = significant at 1% level)
#>
#> f1 f2 f3 unique.var communalities
#> x1 0.604* .* 0.513 0.487
#> x2 0.507* . 0.749 0.251
#> x3 0.691* . 0.543 0.457
#> x4 0.839* 0.279 0.721
#> x5 . 0.887* 0.243 0.757
#> x6 . 0.806* 0.305 0.695
#> x7 . 0.726* 0.502 0.498
#> x8 . 0.703* 0.469 0.531
#> x9 0.368* 0.463* 0.543 0.457
#>
#> f2 f1 f3 total
#> Sum of sq (obliq) loadings 2.226 1.345 1.284 4.855
#> Proportion of total 0.458 0.277 0.264 1.000
#> Proportion var 0.247 0.149 0.143 0.539
#> Cumulative var 0.247 0.397 0.539 0.539
#>
#> Factor correlations: (* = significant at 1% level)
#>
#> f1 f2 f3
#> f1 1.000
#> f2 0.327* 1.000
#> f3 0.278 0.230* 1.000
#>
fitMeasures(fit, fit.measures = "all")
#> nfct=1 nfct=2 nfct=3
#> npar 17.000 26.000 33.000
#> fmin 0.519 0.216 0.038
#> chisq 312.264 130.306 22.897
#> df 27.000 19.000 12.000
#> pvalue 0.000 0.000 0.029
#> baseline.chisq 918.852 918.852 918.852
#> baseline.df 36.000 36.000 36.000
#> baseline.pvalue 0.000 0.000 0.000
#> cfi 0.677 0.874 0.988
#> tli 0.569 0.761 0.963
#> nnfi 0.569 0.761 0.963
#> rfi 0.547 0.731 0.925
#> nfi 0.660 0.858 0.975
#> pnfi 0.495 0.453 0.325
#> ifi 0.680 0.876 0.988
#> rni 0.677 0.874 0.988
#> logl -3851.224 -3760.245 -3706.541
#> unrestricted.logl -3695.092 -3695.092 -3695.092
#> aic 7738.448 7572.491 7479.081
#> bic 7805.176 7668.876 7601.416
#> ntotal 301.000 301.000 301.000
#> bic2 7748.091 7586.418 7496.758
#> rmsea 0.187 0.140 0.055
#> rmsea.ci.lower 0.169 0.117 0.017
#> rmsea.ci.upper 0.206 0.163 0.089
#> rmsea.ci.level 0.900 0.900 0.900
#> rmsea.pvalue 0.000 0.000 0.365
#> rmsea.close.h0 0.050 0.050 0.050
#> rmsea.notclose.pvalue 1.000 1.000 0.120
#> rmsea.notclose.h0 0.080 0.080 0.080
#> rmr 0.169 0.096 0.022
#> rmr_nomean 0.169 0.096 0.022
#> srmr 0.143 0.076 0.017
#> srmr_bentler 0.143 0.076 0.017
#> srmr_bentler_nomean 0.143 0.076 0.017
#> crmr 0.160 0.085 0.019
#> crmr_nomean 0.160 0.085 0.019
#> srmr_mplus 0.143 0.076 0.017
#> srmr_mplus_nomean 0.143 0.076 0.017
#> cn_05 39.666 70.630 277.408
#> cn_01 46.269 84.599 345.648
#> gfi 0.792 0.900 0.983
#> agfi 0.653 0.763 0.938
#> pgfi 0.475 0.380 0.262
#> mfi 0.623 0.831 0.982
#> ecvi 1.150 0.606 0.295
# target rotation
target <- matrix(0, 9, 3)
target[1:3, 1] <- 1
target[4:6, 2] <- 1
target[7:9, 3] <- 1
fit2 <- efa(data = HolzingerSwineford1939,
            ov.names = paste("x", 1:9, sep = ""),
            nfactors = 3,
            rotation = "target",
            rotation.args = list(target = target))
summary(fit2)
#> This is lavaan 0.6-21 -- running exploratory factor analysis
#>
#> Estimator ML
#> Rotation method PST OBLIQUE
#> Rotation algorithm (rstarts) GPA (30)
#> Standardized metric TRUE
#> Row weights None
#>
#> Number of observations 301
#>
#> Fit measures:
#> aic bic sabic chisq df pvalue cfi rmsea
#> nfactors = 3 7479.081 7601.416 7496.758 22.897 12 0.029 0.988 0.055
#>
#> Eigenvalues correlation matrix:
#>
#> ev1 ev2 ev3 ev4 ev5 ev6 ev7 ev8 ev9
#> 3.216 1.639 1.365 0.699 0.584 0.500 0.473 0.286 0.238
#>
#> Standardized loadings: (* = significant at 1% level)
#>
#> f1 f2 f3 unique.var communalities
#> x1 0.589* .* 0.513 0.487
#> x2 0.510* 0.749 0.251
#> x3 0.676* 0.543 0.457
#> x4 0.840* 0.279 0.721
#> x5 0.890* 0.243 0.757
#> x6 0.805* 0.305 0.695
#> x7 .* 0.736* 0.502 0.498
#> x8 0.732* 0.469 0.531
#> x9 0.309* 0.505* 0.543 0.457
#>
#> f2 f3 f1 total
#> Sum of sq (obliq) loadings 2.216 1.371 1.267 4.855
#> Proportion of total 0.457 0.282 0.261 1.000
#> Proportion var 0.246 0.152 0.141 0.539
#> Cumulative var 0.246 0.399 0.539 0.539
#>
#> Factor correlations: (* = significant at 1% level)
#>
#> f1 f2 f3
#> f1 1.000
#> f2 0.340* 1.000
#> f3 0.314* 0.260* 1.000
#>
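The examples above all start from a data frame. As a further sketch, the sample.cov and sample.nobs arguments can be used when only summary statistics are available. Here the covariance matrix is computed from the same HolzingerSwineford1939 data purely for illustration; the object names ov, S, n and fit_cov are not part of the package.

```r
library(lavaan)

# Build the summary statistics that would normally come from a publication
ov <- paste0("x", 1:9)
S  <- cov(HolzingerSwineford1939[, ov])  # dimnames carry the variable names
n  <- nrow(HolzingerSwineford1939)       # number of observations

# Fit 1-, 2- and 3-factor models from the covariance matrix alone
fit_cov <- efa(sample.cov = S, sample.nobs = n, nfactors = 1:3)
fit_cov
```

Because no raw data are supplied, sample.nobs is required here.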