Ship Accidents — ShipAccidents • AER

Data on ship accidents.

Usage

data("ShipAccidents")

Format

A data frame containing 40 observations on 5 ship types in 4 vintages and 2 service periods.

type: factor with levels "A" to "E" for the different ship types,
construction: factor with levels "1960-64", "1965-69", "1970-74", "1975-79" for the periods of construction,
operation: factor with levels "1960-74", "1975-79" for the periods of operation,
service: aggregate months of service,
incidents: number of damage incidents.
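
The layout described above can be checked directly. The following is a minimal sketch, assuming the AER package is installed; the object name design is illustrative only and not part of the package.

```r
# Load the data without attaching the package
data("ShipAccidents", package = "AER")

# Variable types and the first few rows
str(ShipAccidents)
head(ShipAccidents)

# Cross-tabulate the three factors: with 40 observations on
# 5 types x 4 construction periods x 2 operation periods,
# every combination should occur exactly once
design <- xtabs(~ type + construction + operation, data = ShipAccidents)
all(design == 1)
```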

Details

The data are from McCullagh and Nelder (1989, p. 205, Table 6.2) and were also used by Greene (2003, Ch. 21); see the references below.

Five observations (7, 15, 23, 31, 39) have an operation period that precedes the construction period, hence the variables service and incidents are necessarily 0 for them. One further observation (34) has entries representing an accidentally empty cell (see McCullagh and Nelder, 1989, p. 205).

It is not entirely clear what "accidentally empty" means for observation 34. In any case, the models in the examples are fit only to the 34 observations with service > 0.
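
To see which rows are excluded, one can list the observations with zero service and count what remains. This is a minimal sketch, assuming the AER package is installed; zero_service is an illustrative name.

```r
data("ShipAccidents", package = "AER")

# Row indices of observations with no recorded service
zero_service <- which(ShipAccidents$service == 0)
zero_service

# Inspect those rows: construction vs. operation period
ShipAccidents[zero_service, ]

# Subset used for model fitting
sa <- subset(ShipAccidents, service > 0)
nrow(sa)
```

If the description above holds, the listed rows are the five structurally empty cells plus observation 34.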

Source

Online complements to Greene (2003).

https://pages.stern.nyu.edu/~wgreene/Text/tables/tablelist5.htm

References

Greene, W.H. (2003). Econometric Analysis, 5th edition. Upper Saddle River, NJ: Prentice Hall.

McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models, 2nd edition. London: Chapman & Hall.

Examples

## attach AER, which provides the data and dispersiontest()
library("AER")
data("ShipAccidents")
sa <- subset(ShipAccidents, service > 0)

## Greene (2003), Table 21.20
## (see also McCullagh and Nelder, 1989, Table 6.3)
sa_full <- glm(incidents ~ type + construction + operation, family = poisson,
  data = sa, offset = log(service))
summary(sa_full)
#> 
#> Call:
#> glm(formula = incidents ~ type + construction + operation, family = poisson, 
#>     data = sa, offset = log(service))
#> 
#> Coefficients:
#>                     Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)         -6.40288    0.21752 -29.435  < 2e-16 ***
#> typeB               -0.54471    0.17761  -3.067  0.00216 ** 
#> typeC               -0.68876    0.32903  -2.093  0.03632 *  
#> typeD               -0.07431    0.29056  -0.256  0.79815    
#> typeE                0.32053    0.23575   1.360  0.17396    
#> construction1965-69  0.69585    0.14966   4.650 3.33e-06 ***
#> construction1970-74  0.81746    0.16984   4.813 1.49e-06 ***
#> construction1975-79  0.44497    0.23324   1.908  0.05642 .  
#> operation1975-79     0.38386    0.11826   3.246  0.00117 ** 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 146.328  on 33  degrees of freedom
#> Residual deviance:  38.963  on 25  degrees of freedom
#> AIC: 154.83
#> 
#> Number of Fisher Scoring iterations: 5
#> 

sa_notype <- glm(incidents ~ construction + operation, family = poisson,
  data = sa, offset = log(service))
summary(sa_notype)
#> 
#> Call:
#> glm(formula = incidents ~ construction + operation, family = poisson, 
#>     data = sa, offset = log(service))
#> 
#> Coefficients:
#>                     Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)          -6.9470     0.1269 -54.725  < 2e-16 ***
#> construction1965-69   0.7536     0.1488   5.066 4.07e-07 ***
#> construction1970-74   1.0503     0.1576   6.666 2.63e-11 ***
#> construction1975-79   0.6999     0.2203   3.177  0.00149 ** 
#> operation1975-79      0.3872     0.1181   3.279  0.00104 ** 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 146.328  on 33  degrees of freedom
#> Residual deviance:  62.536  on 29  degrees of freedom
#> AIC: 170.4
#> 
#> Number of Fisher Scoring iterations: 4
#> 

sa_noperiod <- glm(incidents ~ type + operation, family = poisson,
  data = sa, offset = log(service))
summary(sa_noperiod)
#> 
#> Call:
#> glm(formula = incidents ~ type + operation, family = poisson, 
#>     data = sa, offset = log(service))
#> 
#> Coefficients:
#>                  Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)       -5.8000     0.1784 -32.508  < 2e-16 ***
#> typeB             -0.7437     0.1691  -4.397 1.10e-05 ***
#> typeC             -0.7549     0.3276  -2.304   0.0212 *  
#> typeD             -0.1843     0.2876  -0.641   0.5215    
#> typeE              0.3842     0.2348   1.636   0.1018    
#> operation1975-79   0.5001     0.1116   4.483 7.37e-06 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 146.328  on 33  degrees of freedom
#> Residual deviance:  70.364  on 28  degrees of freedom
#> AIC: 180.23
#> 
#> Number of Fisher Scoring iterations: 5
#> 

## model comparison
anova(sa_full, sa_notype, test = "Chisq")
#> Analysis of Deviance Table
#> 
#> Model 1: incidents ~ type + construction + operation
#> Model 2: incidents ~ construction + operation
#>   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
#> 1        25     38.963                          
#> 2        29     62.536 -4  -23.573 9.725e-05 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(sa_full, sa_noperiod, test = "Chisq")
#> Analysis of Deviance Table
#> 
#> Model 1: incidents ~ type + construction + operation
#> Model 2: incidents ~ type + operation
#>   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
#> 1        25     38.963                          
#> 2        28     70.364 -3  -31.401 6.998e-07 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

## test for overdispersion
dispersiontest(sa_full)
#> 
#> 	Overdispersion test
#> 
#> data:  sa_full
#> z = 0.93429, p-value = 0.1751
#> alternative hypothesis: true dispersion is greater than 1
#> sample estimates:
#> dispersion 
#>   1.317918 
#> 
dispersiontest(sa_full, trafo = 2)
#> 
#> 	Overdispersion test
#> 
#> data:  sa_full
#> z = -0.6129, p-value = 0.73
#> alternative hypothesis: true alpha is greater than 0
#> sample estimates:
#>      alpha 
#> -0.0111868 
#>
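
Because the models use offset = log(service), the coefficients describe the log of the incident rate per month of service, so exponentiating them gives rate ratios relative to the reference levels (type "A", construction "1960-64", operation "1960-74"). The following sketch illustrates this reading and is not part of the original examples; it assumes the AER package is installed, and rate_ratios and sa_full2 are illustrative names.

```r
data("ShipAccidents", package = "AER")
sa <- subset(ShipAccidents, service > 0)

# Full model as in the examples above
sa_full <- glm(incidents ~ type + construction + operation, family = poisson,
  data = sa, offset = log(service))

# Rate ratios: multiplicative change in incidents per service month
rate_ratios <- exp(coef(sa_full))
round(rate_ratios, 3)

# Equivalent specification with the offset inside the formula
sa_full2 <- glm(incidents ~ type + construction + operation + offset(log(service)),
  family = poisson, data = sa)
all.equal(coef(sa_full), coef(sa_full2))
```

Writing the offset in the formula has the practical advantage that predict() with newdata picks it up automatically.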