Instrumental-Variable Regression
Fit instrumental-variable regression by two-stage least squares. This is equivalent to direct instrumental-variables estimation when the number of instruments is equal to the number of predictors.
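The equivalence stated above can be checked numerically. The following is a minimal sketch in base R on simulated data, not the package's internal code; all variable names (z, u, x, y, X, Z) are illustrative assumptions. It computes the two-stage least squares estimate and the direct instrumental-variables estimate for an exactly identified model (one endogenous regressor, one instrument, plus an intercept) and compares them.

```r
## Simulated data; every name below is illustrative, not part of the package.
set.seed(1)
n <- 500
z <- rnorm(n)                        # instrument
u <- rnorm(n)                        # structural error
x <- 0.8 * z + 0.5 * u + rnorm(n)    # endogenous regressor (correlated with u)
y <- 1 + 2 * x + u

X <- cbind(1, x)                     # regressors, including the intercept
Z <- cbind(1, z)                     # instruments; the intercept instruments itself

## Two-stage least squares: project X on Z, then regress y on the projection.
Xhat      <- Z %*% solve(crossprod(Z), crossprod(Z, X))
beta_2sls <- solve(crossprod(Xhat), crossprod(Xhat, y))

## Direct IV estimator, defined when Z and X have the same number of columns.
beta_iv <- solve(crossprod(Z, X), crossprod(Z, y))

## Compare the two estimates as plain numeric vectors.
all.equal(c(beta_2sls), c(beta_iv))
```

With more instruments than regressors, Z'X is no longer square, so only the two-stage computation applies.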
Usage
ivreg(formula, instruments, data, subset, na.action, weights, offset,
  contrasts = NULL, model = TRUE, y = TRUE, x = FALSE, ...)
Arguments
- formula, instruments
formula specification(s) of the regression relationship and the instruments. Either instruments is missing and formula has three parts as in y ~ x1 + x2 | z1 + z2 + z3 (recommended) or formula is y ~ x1 + x2 and instruments is a one-sided formula ~ z1 + z2 + z3 (only for backward compatibility).
- data
an optional data frame containing the variables in the model. By default the variables are taken from the environment of the formula.
- subset
an optional vector specifying a subset of observations to be used in fitting the model.
- na.action
a function that indicates what should happen when the data contain NAs. The default is set by the na.action option.
- weights
an optional vector of weights to be used in the fitting process.
- offset
an optional offset that can be used to specify an a priori known component to be included during fitting.
- contrasts
an optional list. See the contrasts.arg of model.matrix.default.
- model, x, y
logicals. If TRUE the corresponding components of the fit (the model frame, the model matrices, the response) are returned.
- ...
further arguments passed to
ivreg.fit.
Details
ivreg is the high-level interface to the work-horse function ivreg.fit.
A set of standard methods (including print, summary, vcov, anova,
hatvalues, predict, terms, model.matrix, bread,
estfun) is available and described on summary.ivreg.
Regressors and instruments for ivreg are most easily specified in a formula
with two parts on the right-hand side, e.g., y ~ x1 + x2 | z1 + z2 + z3,
where x1 and x2 are the regressors and z1,
z2, and z3 are the instruments. Note that exogenous
regressors have to be included as instruments for themselves. For
example, if there is one exogenous regressor ex and one endogenous
regressor en with instrument in, the appropriate formula
would be y ~ ex + en | ex + in. Equivalently, this can be specified as
y ~ ex + en | . - en + in, i.e., by providing an update formula with a
. in the second part of the formula. The latter is typically more convenient
when there is a large number of exogenous regressors.
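The two ways of writing the instrument part can be compared on simulated data. This is a hedged sketch that assumes the AER package (which provides ivreg) is installed; the data and all variable names are made up for illustration. The instrument is called iv rather than in, because in is a reserved word in R and cannot be typed as a bare variable name in a formula.

```r
library(AER)  # assumed to be installed; provides ivreg()

## Simulated data with illustrative names.
set.seed(42)
n  <- 300
ex <- rnorm(n)                                   # exogenous regressor
iv <- rnorm(n)                                   # instrument for en
u  <- rnorm(n)                                   # structural error
en <- 0.7 * iv + 0.3 * ex + 0.5 * u + rnorm(n)   # endogenous regressor
d  <- data.frame(y = 1 + ex - 2 * en + u, ex, en, iv)

## Explicit instrument list: ex instruments itself, iv instruments en.
fm_full <- ivreg(y ~ ex + en | ex + iv, data = d)

## Update formula: start from all regressors, drop en, add iv.
fm_dot <- ivreg(y ~ ex + en | . - en + iv, data = d)

## Both specifications should yield the same coefficient estimates.
all.equal(coef(fm_full), coef(fm_dot))
```

With many exogenous regressors, the update form avoids repeating each of them after the bar.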
Value
ivreg returns an object of class "ivreg", with the following components:
- coefficients
parameter estimates.
- residuals
a vector of residuals.
- fitted.values
a vector of predicted means.
- weights
either the vector of weights used (if any) or NULL (if none).
- offset
either the offset used (if any) or NULL (if none).
- n
number of observations.
- nobs
number of observations with non-zero weights.
- rank
the numeric rank of the fitted linear model.
- df.residual
residual degrees of freedom for fitted model.
- cov.unscaled
unscaled covariance matrix for the coefficients.
- sigma
residual standard error.
- call
the original function call.
- formula
the model formula.
- terms
a list with elements "regressors" and "instruments" containing the terms objects for the respective components.
- levels
levels of the categorical regressors.
- contrasts
the contrasts used for categorical regressors.
- model
the full model frame (if model = TRUE).
- y
the response vector (if y = TRUE).
- x
a list with elements "regressors", "instruments", "projected", containing the model matrices from the respective components (if x = TRUE). "projected" is the matrix of regressors projected on the image of the instruments.
Examples
## data
data("CigarettesSW", package = "AER")
CigarettesSW <- transform(CigarettesSW,
  rprice = price/cpi,
  rincome = income/population/cpi,
  tdiff = (taxs - tax)/cpi
)
## model
fm <- ivreg(log(packs) ~ log(rprice) + log(rincome) | log(rincome) + tdiff + I(tax/cpi),
  data = CigarettesSW, subset = year == "1995")
summary(fm)
#>
#> Call:
#> ivreg(formula = log(packs) ~ log(rprice) + log(rincome) | log(rincome) +
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year ==
#>     "1995")
#>
#> Residuals:
#>        Min         1Q     Median         3Q        Max
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227
#>
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept)    9.8950     1.0586   9.348 4.12e-12 ***
#> log(rprice)   -1.2774     0.2632  -4.853 1.50e-05 ***
#> log(rincome)   0.2804     0.2386   1.175    0.246
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 0.1879 on 45 degrees of freedom
#> Multiple R-Squared: 0.4294, Adjusted R-squared: 0.4041
#> Wald test: 13.28 on 2 and 45 DF, p-value: 2.931e-05
#>
summary(fm, vcov = sandwich, df = Inf, diagnostics = TRUE)
#>
#> Call:
#> ivreg(formula = log(packs) ~ log(rprice) + log(rincome) | log(rincome) +
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year ==
#>     "1995")
#>
#> Residuals:
#>        Min         1Q     Median         3Q        Max
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227
#>
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)
#> (Intercept)    9.8950     0.9288  10.654  < 2e-16 ***
#> log(rprice)   -1.2774     0.2417  -5.286 1.25e-07 ***
#> log(rincome)   0.2804     0.2458   1.141    0.254
#>
#> Diagnostic tests:
#>                  df1 df2 statistic p-value
#> Weak instruments   2  44   228.738  <2e-16 ***
#> Wu-Hausman         1  44     3.823  0.0569 .
#> Sargan             1  NA     0.333  0.5641
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 0.1879 on Inf degrees of freedom
#> Multiple R-Squared: 0.4294, Adjusted R-squared: 0.4041
#> Wald test: 34.51 on 2 DF, p-value: 3.214e-08
#>
## ANOVA
fm2 <- ivreg(log(packs) ~ log(rprice) | tdiff, data = CigarettesSW, subset = year == "1995")
anova(fm, fm2)
#> Analysis of Variance Table
#>
#> Model 1: log(packs) ~ log(rprice) + log(rincome) | log(rincome) + tdiff +
#>     I(tax/cpi)
#> Model 2: log(packs) ~ log(rprice) | tdiff
#>   Res.Df    RSS Df Sum of Sq      F Pr(>F)
#> 1     45 1.5880
#> 2     46 1.6668 -1 -0.078748 1.3815  0.246
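
The components described under Value can be inspected directly on a fitted object. The sketch below assumes the AER package is installed and uses simulated data with made-up variable names, so that it runs independently of the example above. It fits a small model with x = TRUE so that the model matrices are stored as well.

```r
library(AER)  # assumed to be installed; provides ivreg()

## Simulated data with illustrative names.
set.seed(7)
n <- 200
z <- rnorm(n)                      # instrument
u <- rnorm(n)                      # structural error
x <- z + 0.5 * u + rnorm(n)        # endogenous regressor
d <- data.frame(y = 1 + 2 * x + u, x, z)

## x = TRUE additionally stores the model matrices in the fitted object.
fm_sim <- ivreg(y ~ x | z, data = d, x = TRUE)

class(fm_sim)
coef(fm_sim)                 # the coefficients component
fm_sim$sigma                 # residual standard error
fm_sim$df.residual           # residual degrees of freedom
names(fm_sim$terms)          # terms objects for regressors and instruments
names(fm_sim$x)              # model matrices, present because x = TRUE
dim(fm_sim$x$projected)      # regressors projected on the instruments
```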