DoctorVisits: Australian Health Service Utilization Data (AER)

Cross-section data originating from the 1977–1978 Australian Health Survey.

Usage

data("DoctorVisits")

Format

A data frame containing 5,190 observations on 12 variables.

visits: Number of doctor visits in past 2 weeks.
gender: Factor indicating gender.
age: Age in years divided by 100.
income: Annual income in tens of thousands of dollars.
illness: Number of illnesses in past 2 weeks.
reduced: Number of days of reduced activity in past 2 weeks due to illness or injury.
health: General health questionnaire score using Goldberg's method.
private: Factor. Does the individual have private health insurance?
freepoor: Factor. Does the individual have free government health insurance due to low income?
freerepat: Factor. Does the individual have free government health insurance due to old age, disability or veteran status?
nchronic: Factor. Is there a chronic condition not limiting activity?
lchronic: Factor. Is there a chronic condition limiting activity?
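
Because age and income are stored on rescaled units, it can help to convert them back before reading summaries or plots. The following is a minimal sketch, assuming the AER package is installed; the column names age_years and income_dollars are illustrative and not part of the data set.

```r
## Load the data without attaching the package.
data("DoctorVisits", package = "AER")

## Undo the scaling described in the variable list:
## age is years / 100, income is in units of 10,000 dollars.
## The new column names are hypothetical, chosen for this sketch only.
dv <- transform(DoctorVisits,
  age_years      = age * 100,
  income_dollars = income * 10000
)

## Check dimensions and compare the scaled and rescaled columns.
dim(DoctorVisits)
str(dv[, c("age", "age_years", "income", "income_dollars")])

## Frequency table of the count response.
table(DoctorVisits$visits)
```

The rescaling only affects how coefficients read: a coefficient on age refers to a change of 100 years, so dividing it by 100 gives the per-year effect on the same linear-predictor scale.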

Source

Journal of Applied Econometrics Data Archive.

http://qed.econ.queensu.ca/jae/1997-v12.3/mullahy/

References

Cameron, A.C. and Trivedi, P.K. (1986). Econometric Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics, 1, 29–53.

Cameron, A.C. and Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge: Cambridge University Press.

Mullahy, J. (1997). Heterogeneity, Excess Zeros, and the Structure of Count Data Models. Journal of Applied Econometrics, 12, 337–350.

See also

CameronTrivedi1998

Examples

library("AER")   ## attaches lmtest (coeftest) and sandwich (vcovOPG, sandwich)
library("MASS")  ## glm.nb
data("DoctorVisits", package = "AER")

## Cameron and Trivedi (1986), Table III, col. (1)
dv_lm <- lm(visits ~ . + I(age^2), data = DoctorVisits)
summary(dv_lm)
#> 
#> Call:
#> lm(formula = visits ~ . + I(age^2), data = DoctorVisits)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.1352 -0.2588 -0.1435 -0.0433  7.0327 
#> 
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   0.027632   0.072220   0.383  0.70202    
#> genderfemale  0.033811   0.021604   1.565  0.11764    
#> age           0.203201   0.410016   0.496  0.62020    
#> income       -0.057323   0.033089  -1.732  0.08326 .  
#> illness       0.059946   0.008357   7.173 8.39e-13 ***
#> reduced       0.103192   0.003657  28.216  < 2e-16 ***
#> health        0.016976   0.005190   3.271  0.00108 ** 
#> privateyes    0.035179   0.024882   1.414  0.15748    
#> freepooryes  -0.103314   0.052471  -1.969  0.04901 *  
#> freerepatyes  0.033241   0.038157   0.871  0.38371    
#> nchronicyes   0.004384   0.023740   0.185  0.85349    
#> lchronicyes   0.041617   0.035863   1.160  0.24592    
#> I(age^2)     -0.062103   0.458716  -0.135  0.89231    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.7139 on 5177 degrees of freedom
#> Multiple R-squared:  0.2018,	Adjusted R-squared:    0.2 
#> F-statistic: 109.1 on 12 and 5177 DF,  p-value: < 2.2e-16
#> 

## Cameron and Trivedi (1998), Table 3.3 
dv_pois <- glm(visits ~ . + I(age^2), data = DoctorVisits, family = poisson)
summary(dv_pois)                  ## MLH (ML, Hessian-based) standard errors
#> 
#> Call:
#> glm(formula = visits ~ . + I(age^2), family = poisson, data = DoctorVisits)
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  -2.223848   0.189816 -11.716   <2e-16 ***
#> genderfemale  0.156882   0.056137   2.795   0.0052 ** 
#> age           1.056299   1.000780   1.055   0.2912    
#> income       -0.205321   0.088379  -2.323   0.0202 *  
#> illness       0.186948   0.018281  10.227   <2e-16 ***
#> reduced       0.126846   0.005034  25.198   <2e-16 ***
#> health        0.030081   0.010099   2.979   0.0029 ** 
#> privateyes    0.123185   0.071640   1.720   0.0855 .  
#> freepooryes  -0.440061   0.179811  -2.447   0.0144 *  
#> freerepatyes  0.079798   0.092060   0.867   0.3860    
#> nchronicyes   0.114085   0.066640   1.712   0.0869 .  
#> lchronicyes   0.141158   0.083145   1.698   0.0896 .  
#> I(age^2)     -0.848704   1.077784  -0.787   0.4310    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 5634.8  on 5189  degrees of freedom
#> Residual deviance: 4379.5  on 5177  degrees of freedom
#> AIC: 6737.1
#> 
#> Number of Fisher Scoring iterations: 6
#> 
coeftest(dv_pois, vcov = vcovOPG) ## MLOP (ML, outer product of gradients) standard errors
#> 
#> z test of coefficients:
#> 
#>                Estimate Std. Error  z value  Pr(>|z|)    
#> (Intercept)  -2.2238482  0.1443306 -15.4080 < 2.2e-16 ***
#> genderfemale  0.1568820  0.0406153   3.8626 0.0001122 ***
#> age           1.0562990  0.7498654   1.4087 0.1589382    
#> income       -0.2053206  0.0619209  -3.3159 0.0009136 ***
#> illness       0.1869484  0.0141893  13.1753 < 2.2e-16 ***
#> reduced       0.1268465  0.0035073  36.1661 < 2.2e-16 ***
#> health        0.0300810  0.0073544   4.0902  4.31e-05 ***
#> privateyes    0.1231854  0.0560472   2.1979 0.0279571 *  
#> freepooryes  -0.4400609  0.1163511  -3.7822 0.0001555 ***
#> freerepatyes  0.0797984  0.0700594   1.1390 0.2546984    
#> nchronicyes   0.1140853  0.0514849   2.2159 0.0266986 *  
#> lchronicyes   0.1411583  0.0586310   2.4076 0.0160591 *  
#> I(age^2)     -0.8487036  0.8092146  -1.0488 0.2942705    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
logLik(dv_pois)
#> 'log Lik.' -3355.541 (df=13)
## standard errors denoted RS ("unspecified omega robust sandwich estimate")
coeftest(dv_pois, vcov = sandwich)
#> 
#> z test of coefficients:
#> 
#>                Estimate Std. Error z value  Pr(>|z|)    
#> (Intercept)  -2.2238482  0.2544322 -8.7404 < 2.2e-16 ***
#> genderfemale  0.1568820  0.0792133  1.9805   0.04765 *  
#> age           1.0562990  1.3643427  0.7742   0.43880    
#> income       -0.2053206  0.1292447 -1.5886   0.11215    
#> illness       0.1869484  0.0239364  7.8102 5.709e-15 ***
#> reduced       0.1268465  0.0077691 16.3271 < 2.2e-16 ***
#> health        0.0300810  0.0142345  2.1132   0.03458 *  
#> privateyes    0.1231854  0.0951560  1.2946   0.19547    
#> freepooryes  -0.4400609  0.2899945 -1.5175   0.12915    
#> freerepatyes  0.0797984  0.1257832  0.6344   0.52581    
#> nchronicyes   0.1140853  0.0908453  1.2558   0.20918    
#> lchronicyes   0.1411583  0.1227108  1.1503   0.25001    
#> I(age^2)     -0.8487036  1.4595426 -0.5815   0.56091    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

## Cameron and Trivedi (1986), Table III, col. (4)
dv_nb <- glm.nb(visits ~ . + I(age^2), data = DoctorVisits)
summary(dv_nb)
#> 
#> Call:
#> glm.nb(formula = visits ~ . + I(age^2), data = DoctorVisits, 
#>     init.theta = 0.9284725333, link = log)
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  -2.190007   0.233592  -9.375  < 2e-16 ***
#> genderfemale  0.216644   0.069697   3.108  0.00188 ** 
#> age          -0.216159   1.266701  -0.171  0.86450    
#> income       -0.142202   0.108417  -1.312  0.18965    
#> illness       0.214341   0.023579   9.090  < 2e-16 ***
#> reduced       0.143754   0.007311  19.662  < 2e-16 ***
#> health        0.038060   0.013654   2.788  0.00531 ** 
#> privateyes    0.118064   0.085806   1.376  0.16884    
#> freepooryes  -0.496611   0.210803  -2.356  0.01848 *  
#> freerepatyes  0.144982   0.115970   1.250  0.21124    
#> nchronicyes   0.099355   0.079303   1.253  0.21026    
#> lchronicyes   0.190327   0.104357   1.824  0.06818 .  
#> I(age^2)      0.609158   1.383245   0.440  0.65966    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for Negative Binomial(0.9285) family taken to be 1)
#> 
#>     Null deviance: 3928.7  on 5189  degrees of freedom
#> Residual deviance: 3028.3  on 5177  degrees of freedom
#> AIC: 6425.5
#> 
#> Number of Fisher Scoring iterations: 1
#> 
#> 
#>               Theta:  0.9285 
#>           Std. Err.:  0.0864 
#> 
#>  2 x log-likelihood:  -6397.4880 
logLik(dv_nb)
#> 'log Lik.' -3198.744 (df=14)
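
The two logLik() results above can be used to compare the Poisson and negative binomial fits. The sketch below uses only base R and the values printed above; it recomputes the AICs and forms a likelihood ratio statistic. Halving the chi-squared(1) tail probability is a common convention when the restriction lies on the boundary of the parameter space (the Poisson model is the limiting case of the negative binomial), and is an assumption of this sketch rather than something stated in the references.

```r
## Log-likelihoods and parameter counts copied from the logLik() output above.
ll_pois <- -3355.541; k_pois <- 13
ll_nb   <- -3198.744; k_nb   <- 14

## AIC = -2 * logLik + 2 * (number of estimated parameters)
aic <- c(poisson = -2 * ll_pois + 2 * k_pois,
         negbin  = -2 * ll_nb   + 2 * k_nb)
round(aic, 1)

## Likelihood ratio statistic: Poisson (restricted) against negative binomial.
lr_stat <- 2 * (ll_nb - ll_pois)

## Boundary-adjusted p-value (assumed convention): half the chi-squared(1) tail.
p_value <- 0.5 * pchisq(lr_stat, df = 1, lower.tail = FALSE)
c(LR = lr_stat, p = p_value)
```

The recomputed AICs agree with the values reported by summary() (6737.1 and 6425.5). With both fitted objects in the workspace, the same statistic can be obtained directly as 2 * (logLik(dv_nb) - logLik(dv_pois)).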