Australian Health Service Utilization Data
Cross-section data originating from the 1977–1978 Australian Health Survey.
Usage
data("DoctorVisits")
Format
A data frame containing 5,190 observations on 12 variables.
- visits
Number of doctor visits in past 2 weeks.
- gender
Factor indicating gender.
- age
Age in years divided by 100.
- income
Annual income in tens of thousands of dollars.
- illness
Number of illnesses in past 2 weeks.
- reduced
Number of days of reduced activity in past 2 weeks due to illness or injury.
- health
General health questionnaire score using Goldberg's method.
- private
Factor. Does the individual have private health insurance?
- freepoor
Factor. Does the individual have free government health insurance due to low income?
- freerepat
Factor. Does the individual have free government health insurance due to old age, disability or veteran status?
- nchronic
Factor. Is there a chronic condition not limiting activity?
- lchronic
Factor. Is there a chronic condition limiting activity?
References
Cameron, A.C. and Trivedi, P.K. (1986). Econometric Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics, 1, 29–53.
Cameron, A.C. and Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge: Cambridge University Press.
Mullahy, J. (1997). Heterogeneity, Excess Zeros, and the Structure of Count Data Models. Journal of Applied Econometrics, 12, 337–350.
Examples
library("AER")   ## also attaches lmtest (coeftest) and sandwich (vcovOPG, sandwich)
library("MASS")  ## glm.nb
data("DoctorVisits", package = "AER")
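## A quick look at the data before fitting anything (an illustrative sketch,
## not part of the replicated tables). Per the Format section, age is stored
## as years / 100 and income in tens of thousands of dollars, so both are
## rescaled here purely for readability.
dim(DoctorVisits)                      ## observations x variables
table(DoctorVisits$visits)             ## distribution of the count response
summary(100 * DoctorVisits$age)        ## age in years
summary(10000 * DoctorVisits$income)   ## income in dollars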
## Cameron and Trivedi (1986), Table III, col. (1)
dv_lm <- lm(visits ~ . + I(age^2), data = DoctorVisits)
summary(dv_lm)
#>
#> Call:
#> lm(formula = visits ~ . + I(age^2), data = DoctorVisits)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.1352 -0.2588 -0.1435 -0.0433 7.0327
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.027632 0.072220 0.383 0.70202
#> genderfemale 0.033811 0.021604 1.565 0.11764
#> age 0.203201 0.410016 0.496 0.62020
#> income -0.057323 0.033089 -1.732 0.08326 .
#> illness 0.059946 0.008357 7.173 8.39e-13 ***
#> reduced 0.103192 0.003657 28.216 < 2e-16 ***
#> health 0.016976 0.005190 3.271 0.00108 **
#> privateyes 0.035179 0.024882 1.414 0.15748
#> freepooryes -0.103314 0.052471 -1.969 0.04901 *
#> freerepatyes 0.033241 0.038157 0.871 0.38371
#> nchronicyes 0.004384 0.023740 0.185 0.85349
#> lchronicyes 0.041617 0.035863 1.160 0.24592
#> I(age^2) -0.062103 0.458716 -0.135 0.89231
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 0.7139 on 5177 degrees of freedom
#> Multiple R-squared: 0.2018, Adjusted R-squared: 0.2
#> F-statistic: 109.1 on 12 and 5177 DF, p-value: < 2.2e-16
#>
## Cameron and Trivedi (1998), Table 3.3
dv_pois <- glm(visits ~ . + I(age^2), data = DoctorVisits, family = poisson)
summary(dv_pois) ## MLH standard errors (maximum likelihood, Hessian-based)
#>
#> Call:
#> glm(formula = visits ~ . + I(age^2), family = poisson, data = DoctorVisits)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -2.223848 0.189816 -11.716 <2e-16 ***
#> genderfemale 0.156882 0.056137 2.795 0.0052 **
#> age 1.056299 1.000780 1.055 0.2912
#> income -0.205321 0.088379 -2.323 0.0202 *
#> illness 0.186948 0.018281 10.227 <2e-16 ***
#> reduced 0.126846 0.005034 25.198 <2e-16 ***
#> health 0.030081 0.010099 2.979 0.0029 **
#> privateyes 0.123185 0.071640 1.720 0.0855 .
#> freepooryes -0.440061 0.179811 -2.447 0.0144 *
#> freerepatyes 0.079798 0.092060 0.867 0.3860
#> nchronicyes 0.114085 0.066640 1.712 0.0869 .
#> lchronicyes 0.141158 0.083145 1.698 0.0896 .
#> I(age^2) -0.848704 1.077784 -0.787 0.4310
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for poisson family taken to be 1)
#>
#> Null deviance: 5634.8 on 5189 degrees of freedom
#> Residual deviance: 4379.5 on 5177 degrees of freedom
#> AIC: 6737.1
#>
#> Number of Fisher Scoring iterations: 6
#>
coeftest(dv_pois, vcov = vcovOPG) ## MLOP standard errors (maximum likelihood, outer product of gradients)
#>
#> z test of coefficients:
#>
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -2.2238482 0.1443306 -15.4080 < 2.2e-16 ***
#> genderfemale 0.1568820 0.0406153 3.8626 0.0001122 ***
#> age 1.0562990 0.7498654 1.4087 0.1589382
#> income -0.2053206 0.0619209 -3.3159 0.0009136 ***
#> illness 0.1869484 0.0141893 13.1753 < 2.2e-16 ***
#> reduced 0.1268465 0.0035073 36.1661 < 2.2e-16 ***
#> health 0.0300810 0.0073544 4.0902 4.31e-05 ***
#> privateyes 0.1231854 0.0560472 2.1979 0.0279571 *
#> freepooryes -0.4400609 0.1163511 -3.7822 0.0001555 ***
#> freerepatyes 0.0797984 0.0700594 1.1390 0.2546984
#> nchronicyes 0.1140853 0.0514849 2.2159 0.0266986 *
#> lchronicyes 0.1411583 0.0586310 2.4076 0.0160591 *
#> I(age^2) -0.8487036 0.8092146 -1.0488 0.2942705
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
logLik(dv_pois)
#> 'log Lik.' -3355.541 (df=13)
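## Informal overdispersion check (a sketch added for illustration; it is not
## taken from Cameron and Trivedi). Under a Poisson model the conditional
## variance equals the mean, so the Pearson chi-squared statistic divided by
## the residual degrees of freedom should be close to 1. Values well above 1
## suggest overdispersion, which would motivate the robust and negative
## binomial results that follow. dv_disp is a name introduced here.
c(mean = mean(DoctorVisits$visits), var = var(DoctorVisits$visits))
dv_disp <- sum(residuals(dv_pois, type = "pearson")^2) / df.residual(dv_pois)
dv_disp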
## standard errors denoted RS ("unspecified omega robust sandwich estimate")
coeftest(dv_pois, vcov = sandwich)
#>
#> z test of coefficients:
#>
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -2.2238482 0.2544322 -8.7404 < 2.2e-16 ***
#> genderfemale 0.1568820 0.0792133 1.9805 0.04765 *
#> age 1.0562990 1.3643427 0.7742 0.43880
#> income -0.2053206 0.1292447 -1.5886 0.11215
#> illness 0.1869484 0.0239364 7.8102 5.709e-15 ***
#> reduced 0.1268465 0.0077691 16.3271 < 2.2e-16 ***
#> health 0.0300810 0.0142345 2.1132 0.03458 *
#> privateyes 0.1231854 0.0951560 1.2946 0.19547
#> freepooryes -0.4400609 0.2899945 -1.5175 0.12915
#> freerepatyes 0.0797984 0.1257832 0.6344 0.52581
#> nchronicyes 0.1140853 0.0908453 1.2558 0.20918
#> lchronicyes 0.1411583 0.1227108 1.1503 0.25001
#> I(age^2) -0.8487036 1.4595426 -0.5815 0.56091
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
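## Side-by-side comparison of the three sets of Poisson standard errors shown
## above (illustrative sketch; dv_se is a name introduced here). The sandwich
## functions are called with an explicit sandwich:: prefix so these lines do
## not depend on which packages are attached.
dv_se <- cbind(
  MLH  = sqrt(diag(vcov(dv_pois))),
  MLOP = sqrt(diag(sandwich::vcovOPG(dv_pois))),
  RS   = sqrt(diag(sandwich::sandwich(dv_pois)))
)
round(dv_se, 4)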
## Cameron and Trivedi (1986), Table III, col. (4)
dv_nb <- glm.nb(visits ~ . + I(age^2), data = DoctorVisits)
summary(dv_nb)
#>
#> Call:
#> glm.nb(formula = visits ~ . + I(age^2), data = DoctorVisits,
#> init.theta = 0.9284725333, link = log)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -2.190007 0.233592 -9.375 < 2e-16 ***
#> genderfemale 0.216644 0.069697 3.108 0.00188 **
#> age -0.216159 1.266701 -0.171 0.86450
#> income -0.142202 0.108417 -1.312 0.18965
#> illness 0.214341 0.023579 9.090 < 2e-16 ***
#> reduced 0.143754 0.007311 19.662 < 2e-16 ***
#> health 0.038060 0.013654 2.788 0.00531 **
#> privateyes 0.118064 0.085806 1.376 0.16884
#> freepooryes -0.496611 0.210803 -2.356 0.01848 *
#> freerepatyes 0.144982 0.115970 1.250 0.21124
#> nchronicyes 0.099355 0.079303 1.253 0.21026
#> lchronicyes 0.190327 0.104357 1.824 0.06818 .
#> I(age^2) 0.609158 1.383245 0.440 0.65966
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for Negative Binomial(0.9285) family taken to be 1)
#>
#> Null deviance: 3928.7 on 5189 degrees of freedom
#> Residual deviance: 3028.3 on 5177 degrees of freedom
#> AIC: 6425.5
#>
#> Number of Fisher Scoring iterations: 1
#>
#>
#> Theta: 0.9285
#> Std. Err.: 0.0864
#>
#> 2 x log-likelihood: -6397.4880
logLik(dv_nb)
#> 'log Lik.' -3198.744 (df=14)
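## Likelihood ratio comparison of the Poisson and negative binomial fits (a
## sketch added for illustration, not from the cited tables). The Poisson
## model is the limiting case theta -> Inf, which lies on the boundary of the
## parameter space, so the usual chi-squared(1) p-value is halved here as a
## common approximation. dv_lr and dv_lr_p are names introduced here.
dv_lr <- 2 * (as.numeric(logLik(dv_nb)) - as.numeric(logLik(dv_pois)))
dv_lr_p <- pchisq(dv_lr, df = 1, lower.tail = FALSE) / 2
c(LR = dv_lr, p.value = dv_lr_p)
AIC(dv_pois, dv_nb)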