
Cross-section data from the 1980 US Census on married women aged 21–35 with two or more children.

Usage

data("Fertility")
data("Fertility2")

Format

Data frames containing 254,654 observations (Fertility) and 30,000 observations (Fertility2) on 8 variables.

morekids

factor. Does the mother have more than 2 children?

gender1

factor indicating gender of first child.

gender2

factor indicating gender of second child.

age

age of mother at census.

afam

factor. Is the mother African-American?

hispanic

factor. Is the mother Hispanic?

other

factor. Is the mother's ethnicity neither African-American, nor Hispanic, nor Caucasian? (see below)

work

number of weeks in which the mother worked in 1979.

Details

Fertility2 is a random subset of Fertility with 30,000 observations.

There are conflicts in the ethnicity coding (see also examples). Hence, it was not possible to create a single factor and the original three indicator variables have been retained.
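
The conflicts can also be located row by row. The following is a minimal sketch rather than part of the original examples: count_ethnicity_flags() is a hypothetical helper, and toy is a made-up data frame that only mimics the yes/no coding of the three indicator factors. With the real data one would pass Fertility2 instead.

```r
## Hypothetical helper (not part of the package): count how many of the
## three ethnicity indicators are "yes" in each row.
count_ethnicity_flags <- function(d) {
  (d$afam == "yes") + (d$hispanic == "yes") + (d$other == "yes")
}

## Made-up toy data mimicking the no/yes coding of the indicator factors.
yn <- function(x) factor(x, levels = c("no", "yes"))
toy <- data.frame(
  afam     = yn(c("no", "yes", "no",  "yes", "no")),
  hispanic = yn(c("no", "no",  "yes", "yes", "yes")),
  other    = yn(c("no", "no",  "no",  "no",  "yes"))
)

n_flags  <- count_ethnicity_flags(toy)
conflict <- n_flags > 1   # more than one indicator set: conflicting coding

table(n_flags)            # distribution of the number of indicators set
toy[conflict, ]           # rows with conflicting ethnicity coding
```

Rows with a count of zero carry none of the three indicators. Rows with a count above one are the conflicting cases, such as the hispanic = yes, other = yes cell in the cross-tabulation shown in the examples.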

Not all variables from Angrist and Evans (1998) have been included.

Source

Online complements to Stock and Watson (2007).

References

Angrist, J.D. and Evans, W.N. (1998). Children and Their Parents' Labor Supply: Evidence from Exogenous Variation in Family Size. American Economic Review, 88, 450–477.

Stock, J.H. and Watson, M.W. (2007). Introduction to Econometrics, 2nd ed. Boston: Addison Wesley.

Examples

data("Fertility2")

## conflicts in ethnicity coding
ftable(xtabs(~ afam + hispanic + other, data = Fertility2))
#>               other    no   yes
#> afam hispanic                  
#> no   no             25389   811
#>      yes             1308   894
#> yes  no              1568     0
#>      yes               30     0

## create convenience variables
Fertility2$mkids <- with(Fertility2, as.numeric(morekids) - 1)
Fertility2$samegender <- with(Fertility2, factor(gender1 == gender2))
Fertility2$twoboys <- with(Fertility2, factor(gender1 == "male" & gender2 == "male"))
Fertility2$twogirls <- with(Fertility2, factor(gender1 == "female" & gender2 == "female"))

## similar to Angrist and Evans, p. 462
fm1 <- lm(mkids ~ samegender, data = Fertility2)
summary(fm1)
#> 
#> Call:
#> lm(formula = mkids ~ samegender, data = Fertility2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -0.4108 -0.4108 -0.3440  0.5892  0.6560 
#> 
#> Coefficients:
#>                Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    0.343979   0.003962   86.83   <2e-16 ***
#> samegenderTRUE 0.066820   0.005585   11.96   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.4837 on 29998 degrees of freedom
#> Multiple R-squared:  0.004749,	Adjusted R-squared:  0.004716 
#> F-statistic: 143.1 on 1 and 29998 DF,  p-value: < 2.2e-16
#> 

fm2 <- lm(mkids ~ gender1 + gender2 + samegender + age + afam + hispanic + other, data = Fertility2)
summary(fm2)
#> 
#> Call:
#> lm(formula = mkids ~ gender1 + gender2 + samegender + age + afam + 
#>     hispanic + other, data = Fertility2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -0.7078 -0.3823 -0.3000  0.5715  0.8317 
#> 
#> Coefficients:
#>                  Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    -0.1645094  0.0255081  -6.449 1.14e-10 ***
#> gender1male    -0.0056508  0.0055293  -1.022   0.3068    
#> gender2male    -0.0128023  0.0055300  -2.315   0.0206 *  
#> samegenderTRUE  0.0683138  0.0055294  12.355  < 2e-16 ***
#> age             0.0164593  0.0008182  20.116  < 2e-16 ***
#> afamyes         0.0962535  0.0123360   7.803 6.26e-15 ***
#> hispanicyes     0.1481327  0.0116248  12.743  < 2e-16 ***
#> otheryes        0.0240816  0.0131694   1.829   0.0675 .  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.4786 on 29992 degrees of freedom
#> Multiple R-squared:  0.02573,	Adjusted R-squared:  0.0255 
#> F-statistic: 113.1 on 7 and 29992 DF,  p-value: < 2.2e-16
#> 

fm3 <- lm(mkids ~ gender1 + twoboys + twogirls + age + afam + hispanic + other, data = Fertility2)
summary(fm3)
#> 
#> Call:
#> lm(formula = mkids ~ gender1 + twoboys + twogirls + age + afam + 
#>     hispanic + other, data = Fertility2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -0.7078 -0.3823 -0.3000  0.5715  0.8317 
#> 
#> Coefficients:
#>                Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  -0.1773117  0.0255629  -6.936 4.11e-12 ***
#> gender1male   0.0071515  0.0078416   0.912   0.3618    
#> twoboysTRUE   0.0555115  0.0077023   7.207 5.85e-13 ***
#> twogirlsTRUE  0.0811161  0.0079363  10.221  < 2e-16 ***
#> age           0.0164593  0.0008182  20.116  < 2e-16 ***
#> afamyes       0.0962535  0.0123360   7.803 6.26e-15 ***
#> hispanicyes   0.1481327  0.0116248  12.743  < 2e-16 ***
#> otheryes      0.0240816  0.0131694   1.829   0.0675 .  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.4786 on 29992 degrees of freedom
#> Multiple R-squared:  0.02573,	Adjusted R-squared:  0.0255 
#> F-statistic: 113.1 on 7 and 29992 DF,  p-value: < 2.2e-16
#>
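
The identical residual summaries and R-squared values of fm2 and fm3 are no coincidence. Once gender1 is in the model, gender2 plus samegender and twoboys plus twogirls describe the same four gender combinations, so the two fits are reparametrizations of each other. In the output above, the twoboys coefficient equals samegenderTRUE + gender2male (0.0683138 - 0.0128023 = 0.0555115), and the twogirls coefficient equals samegenderTRUE - gender2male (0.0811161). The sketch below checks this relationship on simulated data. The data frame sim and the model names a and b are made up for illustration and are not the census data.

```r
## Simulated stand-in for the census data (assumption: illustration only).
set.seed(1)
n <- 200
sim <- data.frame(
  gender1 = factor(sample(c("female", "male"), n, replace = TRUE)),
  gender2 = factor(sample(c("female", "male"), n, replace = TRUE)),
  age     = sample(21:35, n, replace = TRUE),
  mkids   = rbinom(n, 1, 0.4)
)

## Same convenience variables as in the examples above.
sim$samegender <- with(sim, factor(gender1 == gender2))
sim$twoboys    <- with(sim, factor(gender1 == "male" & gender2 == "male"))
sim$twogirls   <- with(sim, factor(gender1 == "female" & gender2 == "female"))

## Two parametrizations of the same gender effects.
a <- lm(mkids ~ gender1 + gender2 + samegender + age, data = sim)
b <- lm(mkids ~ gender1 + twoboys + twogirls + age, data = sim)

## Fitted values agree up to numerical tolerance.
all.equal(fitted(a), fitted(b))

## Coefficients of b expressed through those of a.
ca <- coef(a)
cb <- coef(b)
all.equal(unname(cb["twoboysTRUE"]),
          unname(ca["samegenderTRUE"] + ca["gender2male"]))
all.equal(unname(cb["twogirlsTRUE"]),
          unname(ca["samegenderTRUE"] - ca["gender2male"]))
```

The second parametrization is convenient when the question is whether two boys and two girls affect the probability of a further child differently, since each case gets its own coefficient.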