Medicaid Utilization Data
Cross-section data originating from the 1986 Medicaid Consumer Survey. The data comprise two groups of Medicaid eligibles at two sites in California (Santa Barbara and Ventura counties): a group enrolled in a managed care demonstration program and a fee-for-service comparison group of non-enrollees.
Usage
data("Medicaid1986")
Format
A data frame containing 996 observations on 14 variables.
- visits
Number of doctor visits.
- exposure
Length of observation period for ambulatory care (days).
- children
Total number of children in the household.
- age
Age of the respondent.
- income
Annual household income (average of income range in million USD).
- health1
The first principal component (divided by 1000) of three health-status variables: functional limitations, acute conditions, and chronic conditions.
- health2
The second principal component (divided by 1000) of three health-status variables: functional limitations, acute conditions, and chronic conditions.
- access
Availability of health services (0 = low access, 1 = high access).
- married
Factor. Is the individual married?
- gender
Factor indicating gender.
- ethnicity
Factor indicating ethnicity ("cauc" or "other").
- school
Number of years completed in school.
- enroll
Factor. Is the individual enrolled in a demonstration program?
- program
Factor indicating the managed care demonstration program: Aid to Families with Dependent Children ("afdc") or non-institutionalized Supplemental Security Income ("ssi").
References
Gurmu, S. (1997). Semi-Parametric Estimation of Hurdle Regression Models with an Application to Medicaid Utilization. Journal of Applied Econometrics, 12, 225–242.
Examples
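## data and packages
## (AER provides the data and attaches lmtest and sandwich,
## which supply coeftest() and sandwich() used below)
library("AER")
data("Medicaid1986")
library("MASS")
library("pscl")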
## scale regressors
Medicaid1986$age2 <- Medicaid1986$age^2 / 100
Medicaid1986$school <- Medicaid1986$school / 10
Medicaid1986$income <- Medicaid1986$income / 10
## subsets
afdc <- subset(Medicaid1986, program == "afdc")[, c(1, 3:4, 15, 5:9, 11:13)]
ssi <- subset(Medicaid1986, program == "ssi")[, c(1, 3:4, 15, 5:13)]
## Gurmu (1997):
## Table VI., Poisson and negbin models
afdc_pois <- glm(visits ~ ., data = afdc, family = poisson)
summary(afdc_pois)
#>
#> Call:
#> glm(formula = visits ~ ., family = poisson, data = afdc)
#>
#> Coefficients:
#>                    Estimate Std. Error z value Pr(>|z|)
#> (Intercept)        -0.48189    0.48791  -0.988  0.32332
#> children           -0.20121    0.03648  -5.516 3.47e-08 ***
#> age                 0.05421    0.03056   1.774  0.07608 .
#> age2               -0.10411    0.04407  -2.363  0.01814 *
#> income              0.32013    0.12336   2.595  0.00946 **
#> health1             0.33107    0.02185  15.152  < 2e-16 ***
#> health2             0.04000    0.04108   0.974  0.33019
#> access              0.87690    0.20086   4.366 1.27e-05 ***
#> marriedyes         -0.20296    0.11827  -1.716  0.08616 .
#> ethnicitycaucasian -0.20377    0.07976  -2.555  0.01063 *
#> school              0.31488    0.13701   2.298  0.02155 *
#> enrollyes          -0.25880    0.07617  -3.397  0.00068 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for poisson family taken to be 1)
#>
#> Null deviance: 1766.2 on 484 degrees of freedom
#> Residual deviance: 1443.8 on 473 degrees of freedom
#> AIC: 2130.5
#>
#> Number of Fisher Scoring iterations: 6
#>
coeftest(afdc_pois, vcov = sandwich)
#>
#> z test of coefficients:
#>
#>                     Estimate Std. Error z value  Pr(>|z|)
#> (Intercept)        -0.481891   0.856037 -0.5629   0.57348
#> children           -0.201205   0.081674 -2.4635   0.01376 *
#> age                 0.054207   0.050390  1.0758   0.28203
#> age2               -0.104113   0.070500 -1.4768   0.13974
#> income              0.320133   0.304417  1.0516   0.29297
#> health1             0.331068   0.050414  6.5670 5.135e-11 ***
#> health2             0.039997   0.114934  0.3480   0.72784
#> access              0.876905   0.625975  1.4009   0.16126
#> marriedyes         -0.202960   0.208669 -0.9726   0.33073
#> ethnicitycaucasian -0.203767   0.263443 -0.7735   0.43924
#> school              0.314880   0.251846  1.2503   0.21119
#> enrollyes          -0.258796   0.215547 -1.2006   0.22989
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
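## Rough overdispersion check (a sketch, not part of Gurmu's tables):
## the Pearson chi-squared statistic divided by the residual degrees of
## freedom should be close to 1 for a well-specified Poisson model.
## A markedly larger ratio would be in line with the sandwich standard
## errors above exceeding the model-based ones.
sum(residuals(afdc_pois, type = "pearson")^2) / df.residual(afdc_pois)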
afdc_nb <- glm.nb(visits ~ ., data = afdc)
ssi_pois <- glm(visits ~ ., data = ssi, family = poisson)
ssi_nb <- glm.nb(visits ~ ., data = ssi)
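## Informal comparison of the Poisson and negbin fits (a sketch, not taken
## from the paper): a lower AIC favors the corresponding model, and theta is
## the estimated negbin dispersion parameter (smaller values indicate
## stronger overdispersion relative to the Poisson model).
AIC(afdc_pois, afdc_nb)
AIC(ssi_pois, ssi_nb)
c(afdc = afdc_nb$theta, ssi = ssi_nb$theta)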
## Table VII., Hurdle models (without semi-parametric effects)
afdc_hurdle <- hurdle(visits ~ . | . - access, data = afdc, dist = "negbin")
ssi_hurdle <- hurdle(visits ~ . | . - access, data = ssi, dist = "negbin")
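## Inspecting a fitted hurdle model (a sketch): summary() reports the count
## component (truncated negbin) and the zero hurdle component (binomial with
## logit link, the hurdle() default) separately. In the formulas above,
## "| . - access" drops access from the zero hurdle part only.
summary(afdc_hurdle)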
## Table VIII., Observed and expected frequencies
round(cbind(
  Observed = table(afdc$visits)[1:8],
  Poisson = sapply(0:7, function(x) sum(dpois(x, fitted(afdc_pois)))),
  Negbin = sapply(0:7, function(x) sum(dnbinom(x, mu = fitted(afdc_nb), size = afdc_nb$theta))),
  Hurdle = colSums(predict(afdc_hurdle, type = "prob")[,1:8])
)/nrow(afdc), digits = 3) * 100
#>   Observed Poisson Negbin Hurdle
#> 0     49.7    28.4   50.1   49.7
#> 1     19.8    30.2   19.3   20.3
#> 2     11.3    19.7   10.4   10.2
#> 3      6.2    10.5    6.2    6.0
#> 4      2.9     5.2    4.0    3.8
#> 5      1.6     2.6    2.6    2.6
#> 6      2.5     1.4    1.8    1.8
#> 7      1.4     0.8    1.3    1.3
round(cbind(
  Observed = table(ssi$visits)[1:8],
  Poisson = sapply(0:7, function(x) sum(dpois(x, fitted(ssi_pois)))),
  Negbin = sapply(0:7, function(x) sum(dnbinom(x, mu = fitted(ssi_nb), size = ssi_nb$theta))),
  Hurdle = colSums(predict(ssi_hurdle, type = "prob")[,1:8])
)/nrow(ssi), digits = 3) * 100
#>   Observed Poisson Negbin Hurdle
#> 0     33.1    16.4   32.7   33.1
#> 1     20.2    24.7   22.0   21.1
#> 2     13.9    22.2   14.3   14.2
#> 3     11.0    15.6    9.4    9.6
#> 4      8.4     9.6    6.3    6.5
#> 5      2.7     5.4    4.3    4.5
#> 6      3.5     2.9    3.0    3.1
#> 7      1.8     1.5    2.1    2.2