Positive Binomial Distribution Family Function
Fits a positive binomial distribution.
Usage
posbinomial(link = "logitlink", multiple.responses = FALSE,
            parallel = FALSE, omit.constant = FALSE, p.small = 1e-4,
            no.warning = FALSE, zero = NULL)
Arguments
- link, multiple.responses, parallel, zero
Details at CommonVGAMffArguments.
- omit.constant
Logical. If TRUE then the constant lchoose(size, size * yprop) is omitted from the log-likelihood calculation. If the model is to be compared using AIC() or BIC() (see AICvlm or BICvlm) to the likes of posbernoulli.tb etc. then it is important to set omit.constant = TRUE, because then no model retains a normalizing constant in its likelihood function and so the models become comparable. This is because the \(M_0\) model of Otis et al. (1978) coincides with posbinomial(). See below for an example. Also see posbernoulli.t regarding estimating the population size (N.hat and SE.N.hat) when the number of trials is the same for all observations.
- p.small, no.warning
See posbernoulli.t.
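For concreteness, the constant dropped when omit.constant = TRUE is the sum of lchoose(size, y) over the observations. A minimal base-R sketch using the albinos counts from the Examples below (size is 5 there, the family size):

```r
# Constant omitted from the log-likelihood when omit.constant = TRUE:
# the sum of lchoose(size, y) over all observations.
y <- c(rep(1, 25), rep(2, 23), rep(3, 10), 4, 5)  # albinos counts (Patil, 1962)
sum(lchoose(5, y))  # total normalizing constant over the 60 families, about 117.83
```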
Details
The positive binomial distribution is the ordinary binomial distribution with the probability of zero being zero; the remaining probabilities are scaled up (i.e., divided by \(1-P(Y=0)\)). The fitted values are the means of the positive binomial distribution expressed as proportions, i.e., \(\mu/(1-(1-\mu)^{\tau})\), where \(\mu\) is the ordinary binomial probability of success and \(\tau\) is the number of trials.
In the capture–recapture literature this model is called the \(M_0\) model if it is intercept-only, and the \(M_h\) model when there are covariates. It arises as the sum of \(\tau\) Bernoulli random variates conditioned on at least one success (capture). Each animal has the same probability of capture or recapture across all \(\tau\) sampling occasions. Independence between animals and between sampling occasions is assumed.
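The truncation can be sketched in a few lines of base R (illustrative parameter values only; dposbinom here is a hypothetical helper, not a VGAM function):

```r
# Positive binomial: ordinary binomial probabilities divided by 1 - P(Y = 0)
dposbinom <- function(y, size, prob)
  dbinom(y, size, prob) / (1 - dbinom(0, size, prob))
tau <- 5; p <- 0.3            # illustrative values
dposbinom(1:tau, tau, p)      # probabilities over y = 1, ..., tau; they sum to 1
```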
Value
An object of class "vglmff"
(see vglmff-class).
The object is used by modelling functions
such as vglm,
and vgam.
References
Otis, D. L. et al. (1978). Statistical inference from capture data on closed animal populations, Wildlife Monographs, 62, 3–135.
Patil, G. P. (1962). Maximum likelihood estimation for generalised power series distributions and its application to a truncated binomial distribution. Biometrika, 49, 227–237.
Pearson, K. (1913). A Monograph on Albinism in Man. Drapers Company Research Memoirs.
Note
The input for this family function is the same as
binomialff.
If multiple.responses = TRUE then each column of the
matrix response should be a count (the number of successes),
and the weights argument should be a matrix of the same
dimension as the response containing the number of trials.
If multiple.responses = FALSE then the response input
should be the same as binomialff.
Yet to be done: a quasi.posbinomial() which estimates a
dispersion parameter.
Examples
# Albinotic children in families with 5 kids (from Patil, 1962) ,,,,
albinos <- data.frame(y = c(rep(1, 25), rep(2, 23), rep(3, 10), 4, 5),
n = rep(5, 60))
fit1 <- vglm(cbind(y, n-y) ~ 1, posbinomial, albinos, trace = TRUE)
#> Iteration 1: loglikelihood = -71.320528
#> Iteration 2: loglikelihood = -71.296042
#> Iteration 3: loglikelihood = -71.296039
#> Iteration 4: loglikelihood = -71.296039
summary(fit1)
#>
#> Call:
#> vglm(formula = cbind(y, n - y) ~ 1, family = posbinomial, data = albinos,
#> trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.8056 0.1504 -5.357 8.46e-08 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: logitlink(prob)
#>
#> Log-likelihood: -71.296 on 59 degrees of freedom
#>
#> Number of Fisher scoring iterations: 4
#>
Coef(fit1) # = MLE of p = 0.3088
#> prob
#> 0.3088317
head(fitted(fit1))
#> [,1]
#> 1 0.3666667
#> 2 0.3666667
#> 3 0.3666667
#> 4 0.3666667
#> 5 0.3666667
#> 6 0.3666667
sqrt(vcov(fit1, untransform = TRUE)) # SE = 0.0322
#> prob
#> prob 0.03209962
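The constant fitted value 0.3666667 above can be checked by hand: it is the positive-binomial mean proportion \(\hat{p}/(1-(1-\hat{p})^{\tau})\) with \(\tau = 5\). A base-R check using the estimate printed above:

```r
p.hat <- 0.3088317             # Coef(fit1) from above
tau <- 5                       # five children per family
p.hat / (1 - (1 - p.hat)^tau)  # approx 0.3666667, matching fitted(fit1)
```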
# Fit a M_0 model (Otis et al. 1978) to the deermice data ,,,,,,,,,,
M.0 <- vglm(cbind( y1 + y2 + y3 + y4 + y5 + y6,
6 - y1 - y2 - y3 - y4 - y5 - y6) ~ 1, trace = TRUE,
posbinomial(omit.constant = TRUE), data = deermice)
#> Iteration 1: loglikelihood = -157.34227
#> Iteration 2: loglikelihood = -157.27221
#> Iteration 3: loglikelihood = -157.27221
coef(M.0, matrix = TRUE)
#> logitlink(prob)
#> (Intercept) 0.0795139
Coef(M.0)
#> prob
#> 0.519868
constraints(M.0, matrix = TRUE)
#> (Intercept)
#> logitlink(prob) 1
summary(M.0)
#>
#> Call:
#> vglm(formula = cbind(y1 + y2 + y3 + y4 + y5 + y6, 6 - y1 - y2 -
#> y3 - y4 - y5 - y6) ~ 1, family = posbinomial(omit.constant = TRUE),
#> data = deermice, trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.07951 0.13739 0.579 0.563
#>
#> Name of linear predictor: logitlink(prob)
#>
#> Log-likelihood: -157.2722 on 37 degrees of freedom
#>
#> Number of Fisher scoring iterations: 3
#>
c( N.hat = M.0@extra$N.hat, # As tau = 6, i.e., 6 Bernoulli trials
SE.N.hat = M.0@extra$SE.N.hat) # per obsn is the same for each obsn
#> N.hat SE.N.hat
#> 38.4713036 0.7203915
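N.hat has a closed form here: each animal is detected with probability \(1-(1-\hat{p})^{\tau}\), so the Horvitz–Thompson estimate divides the number of captured animals by that quantity. A base-R check (taking n = 38 captured deermice, consistent with the 37 residual degrees of freedom reported above):

```r
p.hat <- 0.519868          # Coef(M.0) from above
tau <- 6                   # six sampling occasions
n <- 38                    # captured deermice, i.e., nrow(deermice)
n / (1 - (1 - p.hat)^tau)  # approx 38.4713, agreeing with N.hat
```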
# Compare it to the M_b using AIC and BIC
M.b <- vglm(cbind(y1, y2, y3, y4, y5, y6) ~ 1, trace = TRUE,
posbernoulli.b, data = deermice)
#> Iteration 1: loglikelihood = -151.3106
#> Iteration 2: loglikelihood = -150.4366
#> Iteration 3: loglikelihood = -150.43416
#> Iteration 4: loglikelihood = -150.43416
sort(c(M.0 = AIC(M.0), M.b = AIC(M.b))) # Ok since omit.constant=TRUE
#> M.b M.0
#> 304.8683 316.5444
sort(c(M.0 = BIC(M.0), M.b = BIC(M.b))) # Ok since omit.constant=TRUE
#> M.b M.0
#> 308.1435 318.1820
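As an arithmetic sanity check (no VGAM needed), the AIC values above follow from AIC = -2 * loglik + 2 * npar, with one parameter for the \(M_0\) model and two (capture and recapture intercepts) for the \(M_b\) model:

```r
ll.M0 <- -157.2722; ll.Mb <- -150.43416  # final log-likelihoods printed above
c(M.0 = -2 * ll.M0 + 2 * 1,              # one intercept
  M.b = -2 * ll.Mb + 2 * 2)              # capture + recapture intercepts
# reproduces 316.5444 and 304.8683
```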