Zero-Inflated Binomial Distribution Family Function
zibinomial.RdFits a zero-inflated binomial distribution by maximum likelihood estimation.
Usage
zibinomial(lpstr0 = "logitlink", lprob = "logitlink",
type.fitted = c("mean", "prob", "pobs0", "pstr0", "onempstr0"),
ipstr0 = NULL, zero = NULL, multiple.responses = FALSE,
imethod = 1)
zibinomialff(lprob = "logitlink", lonempstr0 = "logitlink",
type.fitted = c("mean", "prob", "pobs0", "pstr0", "onempstr0"),
ionempstr0 = NULL, zero = "onempstr0",
multiple.responses = FALSE, imethod = 1)Arguments
- lpstr0, lprob
Link functions for the parameter \(\phi\) and the usual binomial probability \(\mu\) parameter. See
Linksfor more choices. For the zero-deflated model see below.
- type.fitted
See
CommonVGAMffArgumentsandfittedvlm.- ipstr0
Optional initial values for \(\phi\), whose values must lie between 0 and 1. The default is to compute an initial value internally. If a vector then recyling is used.
- lonempstr0, ionempstr0
Corresponding arguments for the other parameterization. See details below.
- multiple.responses
Logical. Currently it must be
FALSEto mean the function does not handle multiple responses. This is to remain compatible with the same argument inbinomialff.- zero, imethod
See
CommonVGAMffArgumentsfor information. Argumentzerochanged its default value for version 0.9-2.
Details
These functions are based on
$$P(Y=0) = \phi + (1-\phi) (1-\mu)^N,$$
for \(y=0\), and
$$P(Y=y) = (1-\phi) {N \choose Ny} \mu^{Ny} (1-\mu)^{N(1-y)}.$$
for \(y=1/N,2/N,\ldots,1\). That is, the response is a sample
proportion out of \(N\) trials, and the argument size in
rzibinom is \(N\) here.
The parameter \(\phi\) is the probability of a structural zero,
and it satisfies \(0 < \phi < 1\).
The mean of \(Y\) is \(E(Y)=(1-\phi) \mu\)
and these are returned as the fitted values
by default.
By default, the two linear/additive predictors
for zibinomial()
are \((logit(\phi), logit(\mu))^T\).
The VGAM family function zibinomialff() has a few
changes compared to zibinomial().
These are:
(i) the order of the linear/additive predictors is switched so the
binomial probability comes first;
(ii) argument onempstr0 is now 1 minus
the probability of a structural zero, i.e.,
the probability of the parent (binomial) component,
i.e., onempstr0 is 1-pstr0;
(iii) argument zero has a new default so that the onempstr0
is intercept-only by default.
Now zibinomialff() is generally recommended over
zibinomial().
Both functions implement Fisher scoring.
Value
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
References
Welsh, A. H., Lindenmayer, D. B. and Donnelly, C. F. (2013). Fitting and interpreting occupancy models. PLOS One, 8, 1–21.
Note
The response variable must have one of the formats described by
binomialff, e.g., a factor or two column matrix or a
vector of sample proportions with the weights argument
specifying the values of \(N\).
To work well, one needs large values of \(N\) and \(\mu>0\), i.e., the larger \(N\) and \(\mu\) are, the better. If \(N = 1\) then the model is unidentifiable since the number of parameters is excessive.
Setting stepsize = 0.5, say, may aid convergence.
Estimated probabilities of a structural zero and an
observed zero are returned, as in zipoisson.
The zero-deflated binomial distribution might
be fitted by setting lpstr0 = identitylink, albeit,
not entirely reliably. See zipoisson
for information that can be applied here. Else
try the zero-altered binomial distribution (see
zabinomial).
Warning
Numerical problems can occur.
Half-stepping is not uncommon.
If failure to converge occurs, make use of the argument ipstr0
or ionempstr0,
or imethod.
Examples
size <- 10 # Number of trials; N in the notation above
nn <- 200
zdata <- data.frame(pstr0 = logitlink( 0, inverse = TRUE), # 0.50
mubin = logitlink(-1, inverse = TRUE), # Mean of usual binomial
sv = rep(size, length = nn))
zdata <- transform(zdata,
y = rzibinom(nn, size = sv, prob = mubin, pstr0 = pstr0))
with(zdata, table(y))
#> y
#> 0 1 2 3 4 5 6
#> 107 17 27 24 18 4 3
fit <- vglm(cbind(y, sv - y) ~ 1, zibinomialff, data = zdata, trace = TRUE)
#> Iteration 1: loglikelihood = -341.63236
#> Iteration 2: loglikelihood = -415.62537
#> Taking a modified step.
#> Iteration 2 : loglikelihood = -302.47413
#> Iteration 3: loglikelihood = -290.68214
#> Iteration 4: loglikelihood = -286.87521
#> Iteration 5: loglikelihood = -286.86461
#> Iteration 6: loglikelihood = -286.86461
fit <- vglm(cbind(y, sv - y) ~ 1, zibinomialff, data = zdata, trace = TRUE,
stepsize = 0.5)
#> Taking a modified step.
#> Iteration 2 : loglikelihood = -302.47413
#> Taking a modified step.
#> Iteration 3 : loglikelihood = -288.83102
#> Taking a modified step.
#> Iteration 4 : loglikelihood = -287.30599
#> Taking a modified step.
#> Iteration 5 : loglikelihood = -286.97092
#> Taking a modified step.
#> Iteration 6 : loglikelihood = -286.89078
#> Taking a modified step.
#> Iteration 7 : loglikelihood = -286.87111
#> Taking a modified step.
#> Iteration 8 : loglikelihood = -286.86623
#> Taking a modified step.
#> Iteration 9 : loglikelihood = -286.86502
#> Taking a modified step.
#> Iteration 10 : loglikelihood = -286.86471
#> Taking a modified step.
#> Iteration 11 : loglikelihood = -286.86464
#> Taking a modified step.
#> Iteration 12 : loglikelihood = -286.86462
#> Taking a modified step.
#> Iteration 13 : loglikelihood = -286.86462
#> Taking a modified step.
#> Iteration 14 : loglikelihood = -286.86461
coef(fit, matrix = TRUE)
#> logitlink(prob) logitlink(onempstr0)
#> (Intercept) -1.054615 -0.04149962
Coef(fit) # Useful for intercept-only models
#> prob onempstr0
#> 0.2583398 0.4896266
head(fitted(fit, type = "pobs0")) # Estimate of P(Y = 0)
#> [,1]
#> 1 0.535029
#> 2 0.535029
#> 3 0.535029
#> 4 0.535029
#> 5 0.535029
#> 6 0.535029
head(fitted(fit))
#> [,1]
#> 1 0.12649
#> 2 0.12649
#> 3 0.12649
#> 4 0.12649
#> 5 0.12649
#> 6 0.12649
with(zdata, mean(y)) # Compare this with fitted(fit)
#> [1] 1.265
summary(fit)
#>
#> Call:
#> vglm(formula = cbind(y, sv - y) ~ 1, family = zibinomialff, data = zdata,
#> trace = TRUE, stepsize = 0.5)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept):1 -1.05462 0.08086 -13.043 <2e-16 ***
#> (Intercept):2 -0.04150 0.15019 -0.276 0.782
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Names of linear predictors: logitlink(prob), logitlink(onempstr0)
#>
#> Log-likelihood: -286.8646 on 398 degrees of freedom
#>
#> Number of Fisher scoring iterations: 14
#>