Fit Structural Equation Models

Fit a Structural Equation Model (SEM).

Usage

sem(model = NULL, data = NULL, ordered = NULL, sampling.weights = NULL,
    sample.cov = NULL, sample.mean = NULL, sample.th = NULL,
    sample.nobs = NULL, group = NULL, cluster = NULL, 
    constraints = "", WLS.V = NULL, NACOV = NULL, ov.order = "model",
    ...)

Arguments

model: A description of the user-specified model. Typically, the model is described using the lavaan model syntax. See model.syntax for more information. Alternatively, a parameter table (e.g., the output of the lavParTable() function) is also accepted.
data: An optional data frame containing the observed variables used in the model. If some variables are declared as ordered factors, lavaan will treat them as ordinal variables.
ordered: Character vector. Only used if the data is in a data.frame. Treat these variables as ordered (ordinal) variables if they are endogenous in the model. Importantly, all other variables will be treated as numeric (unless they are declared as ordered in the data.frame). Since 0.6-4, ordered can also be logical: if TRUE, all observed endogenous variables are treated as ordered (ordinal); if FALSE, all observed endogenous variables are treated as numeric (again, unless they are declared as ordered in the data.frame).
sampling.weights: A variable name in the data frame containing sampling weight information. Currently only available for non-clustered data. Depending on the sampling.weights.normalization option, these weights may be rescaled (or not) so that their sum equals the number of observations (total or per group).
sample.cov: Numeric matrix. A sample variance-covariance matrix. The rownames and/or colnames must contain the observed variable names. For a multiple group analysis, a list with a variance-covariance matrix for each group.
sample.mean: A sample mean vector. For a multiple group analysis, a list with a mean vector for each group.
sample.th: Vector of sample-based thresholds. For a multiple group analysis, a list with a vector of thresholds for each group.
sample.nobs: Number of observations if the full data frame is missing and only sample moments are given. For a multiple group analysis, a list or a vector with the number of observations for each group.
group: Character. A variable name in the data frame defining the groups in a multiple group analysis.
cluster: Character. A (single) variable name in the data frame defining the clusters in a two-level dataset.
constraints: Additional (in)equality constraints not yet included in the model syntax. See model.syntax for more information.
WLS.V: A user-provided weight matrix to be used by estimator "WLS"; if the estimator is "DWLS", only the diagonal of this matrix will be used. For a multiple group analysis, a list with a weight matrix for each group. The elements of the weight matrix should be in the following order (if all data is continuous): first the means (if a meanstructure is involved), then the lower triangular elements of the covariance matrix including the diagonal, ordered column by column. In the categorical case: first the thresholds (including the means for continuous variables), then the slopes (if any), the variances of continuous variables (if any), and finally the lower triangular elements of the correlation/covariance matrix excluding the diagonal, ordered column by column.
NACOV: A user-provided matrix containing the elements of (N times) the asymptotic variance-covariance matrix of the sample statistics. For a multiple group analysis, a list with an asymptotic variance-covariance matrix for each group. See the WLS.V argument for information about the order of the elements.
ov.order: Character. If "model" (the default), the order of the observed variable names (as reflected for example in the output of lav_object_vnames()) is determined by the model syntax. If "data", the order is determined by the data (either the full data.frame or the sample (co)variance matrix). If the WLS.V and/or NACOV matrices are provided, this argument is currently set to "data".
...: Many additional options can be set using 'name = value'. See lavOptions for a complete list.
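
When only summary statistics are available, the sample.cov and sample.nobs arguments take the place of data. The following is a minimal sketch rather than an official example: it computes the covariance matrix from the PoliticalDemocracy data shipped with lavaan purely for illustration, and the one-factor model and the object names (vars, S, n, fit_cov, fit_raw) are assumptions chosen for brevity.

```r
library(lavaan)

# Illustrative one-factor model (a subset of the Bollen example)
model <- ' ind60 =~ x1 + x2 + x3 '

# Summary statistics: covariance matrix with variable names, plus sample size
vars <- c("x1", "x2", "x3")
S <- cov(PoliticalDemocracy[, vars])   # dimnames carry the variable names
n <- nrow(PoliticalDemocracy)

# Fit from sample moments only (no raw data)
fit_cov <- sem(model, sample.cov = S, sample.nobs = n)

# Fit from the raw data, for comparison
fit_raw <- sem(model, data = PoliticalDemocracy)

# Compare the free parameter estimates of the two fits
all.equal(coef(fit_cov), coef(fit_raw), tolerance = 1e-4)
```

Under the default ML settings, the two sets of estimates are expected to agree up to numerical tolerance, because both fits are based on the same sample covariance matrix.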

Details

The sem function is a wrapper for the more general lavaan function, with the following default options: int.ov.free = TRUE, int.lv.free = FALSE, auto.fix.first = TRUE (unless std.lv = TRUE), auto.fix.single = TRUE, auto.var = TRUE, auto.cov.lv.x = TRUE, auto.efa = TRUE, auto.th = TRUE, auto.delta = TRUE, and auto.cov.y = TRUE.
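
To make the wrapper relationship concrete, the sketch below passes the options listed above explicitly to lavaan() and compares the result with sem(). The small one-factor model and the object names (fit_sem, fit_lav) are illustrative assumptions; all other lavaan() options are left at their defaults.

```r
library(lavaan)

# Illustrative one-factor model on the PoliticalDemocracy data
model <- ' ind60 =~ x1 + x2 + x3 '

# The convenience wrapper
fit_sem <- sem(model, data = PoliticalDemocracy)

# The general function, with the sem() defaults spelled out
fit_lav <- lavaan(model, data = PoliticalDemocracy,
                  int.ov.free     = TRUE,
                  int.lv.free     = FALSE,
                  auto.fix.first  = TRUE,
                  auto.fix.single = TRUE,
                  auto.var        = TRUE,
                  auto.cov.lv.x   = TRUE,
                  auto.efa        = TRUE,
                  auto.th         = TRUE,
                  auto.delta      = TRUE,
                  auto.cov.y      = TRUE)

# Compare the free parameter estimates of the two fits
all.equal(coef(fit_sem), coef(fit_lav))
```

Calling lavaan() directly is mainly useful when some of these defaults should be switched off, for example to specify every variance and covariance by hand in the model syntax.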

Value

An object of class lavaan, for which several methods are available, including a summary method.
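
Besides summary(), a fitted object can be queried with extractor functions. The sketch below shows a few commonly used ones (coef(), fitMeasures(), and parameterEstimates()); the model and the object names (fit, est, fm, pe) are assumptions made for illustration, and the selection is not exhaustive.

```r
library(lavaan)

# Illustrative two-factor model on the PoliticalDemocracy data
model <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem60 ~ ind60
'
fit <- sem(model, data = PoliticalDemocracy)

# Named numeric vector of free parameter estimates
est <- coef(fit)

# Selected fit indices, returned as a named numeric vector
fm <- fitMeasures(fit, c("chisq", "df", "cfi", "rmsea"))

# Data frame with estimates, standard errors, and confidence intervals
pe <- parameterEstimates(fit)
head(pe)
```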

References

Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. doi:10.18637/jss.v048.i02

Examples

## The Industrialization and Political Democracy Example
## Bollen (1989), page 332
model <- ' 
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a*y2 + b*y3 + c*y4
     dem65 =~ y5 + a*y6 + b*y7 + c*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

fit <- sem(model, data = PoliticalDemocracy)
summary(fit, fit.measures = TRUE)
#> lavaan 0.6-21 ended normally after 66 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        31
#>   Number of equality constraints                     3
#> 
#>   Number of observations                            75
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                40.179
#>   Degrees of freedom                                38
#>   P-value (Chi-square)                           0.374
#> 
#> Model Test Baseline Model:
#> 
#>   Test statistic                               730.654
#>   Degrees of freedom                                55
#>   P-value                                        0.000
#> 
#> User Model versus Baseline Model:
#> 
#>   Comparative Fit Index (CFI)                    0.997
#>   Tucker-Lewis Index (TLI)                       0.995
#> 
#> Loglikelihood and Information Criteria:
#> 
#>   Loglikelihood user model (H0)              -1548.818
#>   Loglikelihood unrestricted model (H1)      -1528.728
#>                                                       
#>   Akaike (AIC)                                3153.636
#>   Bayesian (BIC)                              3218.526
#>   Sample-size adjusted Bayesian (SABIC)       3130.277
#> 
#> Root Mean Square Error of Approximation:
#> 
#>   RMSEA                                          0.028
#>   90 Percent confidence interval - lower         0.000
#>   90 Percent confidence interval - upper         0.087
#>   P-value H_0: RMSEA <= 0.050                    0.665
#>   P-value H_0: RMSEA >= 0.080                    0.083
#> 
#> Standardized Root Mean Square Residual:
#> 
#>   SRMR                                           0.056
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   ind60 =~                                            
#>     x1                1.000                           
#>     x2                2.180    0.138   15.751    0.000
#>     x3                1.818    0.152   11.971    0.000
#>   dem60 =~                                            
#>     y1                1.000                           
#>     y2         (a)    1.191    0.139    8.551    0.000
#>     y3         (b)    1.175    0.120    9.755    0.000
#>     y4         (c)    1.251    0.117   10.712    0.000
#>   dem65 =~                                            
#>     y5                1.000                           
#>     y6         (a)    1.191    0.139    8.551    0.000
#>     y7         (b)    1.175    0.120    9.755    0.000
#>     y8         (c)    1.251    0.117   10.712    0.000
#> 
#> Regressions:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   dem60 ~                                             
#>     ind60             1.471    0.392    3.750    0.000
#>   dem65 ~                                             
#>     ind60             0.600    0.226    2.661    0.008
#>     dem60             0.865    0.075   11.554    0.000
#> 
#> Covariances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>  .y1 ~~                                               
#>    .y5                0.583    0.356    1.637    0.102
#>  .y2 ~~                                               
#>    .y4                1.440    0.689    2.092    0.036
#>    .y6                2.183    0.737    2.960    0.003
#>  .y3 ~~                                               
#>    .y7                0.712    0.611    1.165    0.244
#>  .y4 ~~                                               
#>    .y8                0.363    0.444    0.817    0.414
#>  .y6 ~~                                               
#>    .y8                1.372    0.577    2.378    0.017
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .x1                0.081    0.019    4.182    0.000
#>    .x2                0.120    0.070    1.729    0.084
#>    .x3                0.467    0.090    5.177    0.000
#>    .y1                1.855    0.433    4.279    0.000
#>    .y2                7.581    1.366    5.549    0.000
#>    .y3                4.956    0.956    5.182    0.000
#>    .y4                3.225    0.723    4.458    0.000
#>    .y5                2.313    0.479    4.831    0.000
#>    .y6                4.968    0.921    5.393    0.000
#>    .y7                3.560    0.710    5.018    0.000
#>    .y8                3.308    0.704    4.701    0.000
#>     ind60             0.449    0.087    5.175    0.000
#>    .dem60             3.875    0.866    4.477    0.000
#>    .dem65             0.164    0.227    0.725    0.469
#>