Cross-section data on the number of recreational boating trips to Lake Somerville, Texas, in 1980, based on a survey administered to 2,000 registered leisure boat owners in 23 counties in eastern Texas.

Usage

data("RecreationDemand")

Format

A data frame containing 659 observations on 8 variables.

trips

Number of recreational boating trips.

quality

Subjective quality ranking of the facility on a scale of 1 to 5, with 0 recorded for respondents who had not visited the lake (see Details).

ski

factor. Was the individual engaged in water-skiing at the lake?

income

Annual household income of the respondent (in 1,000 USD).

userfee

factor. Did the individual pay an annual user fee at Lake Somerville?

costC

Expenditure when visiting Lake Conroe (in USD).

costS

Expenditure when visiting Lake Somerville (in USD).

costH

Expenditure when visiting Lake Houston (in USD).

Details

According to the original source (Seller, Stoll and Chavas, 1985, p. 168), the quality rating is on a scale from 1 to 5, with 0 recorded for respondents who had not visited the lake. This explains the remarkably low mean of this variable, and it also suggests that its treatment in various more recent publications is far from ideal. For consistency with other sources, we treat the variable as numeric, zeros included.
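To see how much the zero code pulls the mean down, the following sketch compares the mean of the rating with and without the zeros. The helper `quality_summary()` is a hypothetical name introduced here for illustration and is not part of AER. The toy vector is made up, and the last step assumes the AER package is installed.

```r
## Hypothetical helper (not part of AER): summarise a 0-5 quality rating
## in which 0 codes "did not visit" rather than a genuine rating.
quality_summary <- function(quality) {
  visited <- quality > 0
  c(
    share_zero   = mean(!visited),          # proportion of zero codes
    mean_all     = mean(quality),           # mean with zeros included
    mean_visited = mean(quality[visited])   # mean over actual ratings only
  )
}

## Toy illustration with made-up ratings (not the survey data).
quality_summary(c(0, 0, 0, 4, 2))

## On the actual data, assuming the AER package is installed.
if (requireNamespace("AER", quietly = TRUE)) {
  data("RecreationDemand", package = "AER")
  print(quality_summary(RecreationDemand$quality))
}
```

A large `share_zero` would indicate that `mean_all` mostly reflects non-visitors rather than low ratings, which is the concern raised above.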

Source

Journal of Business & Economic Statistics Data Archive.

http://www.amstat.org/publications/jbes/upload/index.cfm?fuseaction=ViewArticles&pub=JBES&issue=96-4-OCT

References

Cameron, A.C. and Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge: Cambridge University Press.

Gurmu, S. and Trivedi, P.K. (1996). Excess Zeros in Count Models for Recreational Trips. Journal of Business & Economic Statistics, 14, 469–477.

Ozuna, T. and Gomez, I.A. (1995). Specification and Testing of Count Data Recreation Demand Functions. Empirical Economics, 20, 543–550.

Seller, C., Stoll, J.R. and Chavas, J.-P. (1985). Validation of Empirical Measures of Welfare Change: A Comparison of Nonmarket Techniques. Land Economics, 61, 156–175.

Examples

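## coeftest(), sandwich() and vcovOPG() used below come from lmtest and
## sandwich, both of which are attached along with AER
library("AER")
data("RecreationDemand")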

## Poisson model:
## Cameron and Trivedi (1998), Table 6.11
## Ozuna and Gomez (1995), Table 2, col. 3
fm_pois <- glm(trips ~ ., data = RecreationDemand, family = poisson)
summary(fm_pois)
#> 
#> Call:
#> glm(formula = trips ~ ., family = poisson, data = RecreationDemand)
#> 
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.264993   0.093722   2.827  0.00469 ** 
#> quality      0.471726   0.017091  27.602  < 2e-16 ***
#> skiyes       0.418214   0.057190   7.313 2.62e-13 ***
#> income      -0.111323   0.019588  -5.683 1.32e-08 ***
#> userfeeyes   0.898165   0.078985  11.371  < 2e-16 ***
#> costC       -0.003430   0.003118  -1.100  0.27131    
#> costS       -0.042536   0.001670 -25.467  < 2e-16 ***
#> costH        0.036134   0.002710  13.335  < 2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 4849.7  on 658  degrees of freedom
#> Residual deviance: 2305.8  on 651  degrees of freedom
#> AIC: 3074.9
#> 
#> Number of Fisher Scoring iterations: 7
#> 
logLik(fm_pois)
#> 'log Lik.' -1529.431 (df=8)
coeftest(fm_pois, vcov = sandwich)
#> 
#> z test of coefficients:
#> 
#>               Estimate Std. Error z value  Pr(>|z|)    
#> (Intercept)  0.2649934  0.4324810  0.6127 0.5400559    
#> quality      0.4717259  0.0488508  9.6565 < 2.2e-16 ***
#> skiyes       0.4182137  0.1938713  2.1572 0.0309922 *  
#> income      -0.1113232  0.0503083 -2.2128 0.0269101 *  
#> userfeeyes   0.8981653  0.2469086  3.6376 0.0002751 ***
#> costC       -0.0034297  0.0146973 -0.2334 0.8154852    
#> costS       -0.0425364  0.0117348 -3.6248 0.0002892 ***
#> costH        0.0361336  0.0093860  3.8497 0.0001183 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

## Negbin model:
## Cameron and Trivedi (1998), Table 6.11
## Ozuna and Gomez (1995), Table 2, col. 5
library("MASS")
fm_nb <- glm.nb(trips ~ ., data = RecreationDemand)
coeftest(fm_nb, vcov = vcovOPG)
#> 
#> z test of coefficients:
#> 
#>               Estimate Std. Error  z value  Pr(>|z|)    
#> (Intercept) -1.1219363  0.1909098  -5.8768 4.183e-09 ***
#> quality      0.7219990  0.0399627  18.0668 < 2.2e-16 ***
#> skiyes       0.6121388  0.1395255   4.3873 1.148e-05 ***
#> income      -0.0260588  0.0401183  -0.6495     0.516    
#> userfeeyes   0.6691676  0.4488554   1.4908     0.136    
#> costC        0.0480087  0.0103573   4.6353 3.565e-06 ***
#> costS       -0.0926910  0.0060193 -15.3990 < 2.2e-16 ***
#> costH        0.0388357  0.0087604   4.4331 9.288e-06 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

## ZIP model:
## Cameron and Trivedi (1998), Table 6.11
library("pscl")
fm_zip <- zeroinfl(trips ~  . | quality + income, data = RecreationDemand)
summary(fm_zip)
#> 
#> Call:
#> zeroinfl(formula = trips ~ . | quality + income, data = RecreationDemand)
#> 
#> Pearson residuals:
#>     Min      1Q  Median      3Q     Max 
#> -6.3255 -0.2714 -0.1809 -0.1646 13.3126 
#> 
#> Count model coefficients (poisson with log link):
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  2.099163   0.111397  18.844  < 2e-16 ***
#> quality      0.033833   0.023914   1.415    0.157    
#> skiyes       0.471691   0.058187   8.106 5.21e-16 ***
#> income      -0.099780   0.020779  -4.802 1.57e-06 ***
#> userfeeyes   0.610488   0.079435   7.685 1.53e-14 ***
#> costC        0.002369   0.003818   0.620    0.535    
#> costS       -0.037600   0.002038 -18.454  < 2e-16 ***
#> costH        0.025234   0.003355   7.522 5.40e-14 ***
#> 
#> Zero-inflation model coefficients (binomial with logit link):
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  3.29191    0.51608   6.379 1.79e-10 ***
#> quality     -1.91407    0.20619  -9.283  < 2e-16 ***
#> income      -0.04502    0.10797  -0.417    0.677    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
#> 
#> Number of iterations in BFGS optimization: 23 
#> Log-likelihood: -1181 on 11 Df

## Hurdle models
## Cameron and Trivedi (1998), Table 6.13
## poisson-poisson
fm_hp <- hurdle(trips ~ ., data = RecreationDemand, dist = "poisson", zero = "poisson")
## negbin-negbin
fm_hnb <- hurdle(trips ~ ., data = RecreationDemand, dist = "negbin", zero = "negbin")
## binomial-negbin (binomial logit is the default zero hurdle;
## equivalent to geometric-negbin)
fm_hgnb <- hurdle(trips ~ ., data = RecreationDemand, dist = "negbin")

## Note: quasi-complete separation in the zero hurdle:
## every respondent who paid the user fee reports at least one trip
with(RecreationDemand, table(trips > 0, userfee))
#>        userfee
#>          no yes
#>   FALSE 417   0
#>   TRUE  229  13
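
The practical consequence of the empty cell can be illustrated with a minimal sketch that rebuilds the 2 x 2 table printed above as individual rows and fits a logit model for having any trip. Only the four cell counts are taken from the table; the object names `sep` and `fm_sep` are illustrative assumptions, and the model deliberately ignores all other regressors, so it is not the zero hurdle part of any model fitted above.

```r
## Rebuild the 2 x 2 table as one row per respondent.
## Only the four cell counts are used; no other covariates are involved.
sep <- data.frame(
  userfee  = factor(rep(c("no", "no", "yes"), times = c(417, 229, 13)),
                    levels = c("no", "yes")),
  positive = rep(c(0, 1, 1), times = c(417, 229, 13))
)

## Logit model for "any trip" with userfee as the only regressor.
## Warnings are suppressed because separation can trigger them.
fm_sep <- suppressWarnings(glm(positive ~ userfee, data = sep, family = binomial))

## Estimates and standard errors; inspect the userfeeyes row.
coef(summary(fm_sep))[, c("Estimate", "Std. Error")]
```

Because no fee payer has zero trips, the likelihood keeps increasing as the `userfeeyes` coefficient grows, so no finite maximum likelihood estimate exists. The reported estimate is simply where the iterations stopped, and its standard error is correspondingly inflated.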