The Two-stage Sequential Binomial Distribution Family Function
seq2binomial.RdEstimation of the probabilities of a two-stage binomial distribution.
Usage
seq2binomial(lprob1 = "logitlink", lprob2 = "logitlink",
iprob1 = NULL, iprob2 = NULL,
parallel = FALSE, zero = NULL)Arguments
- lprob1, lprob2
Parameter link functions applied to the two probabilities, called \(p\) and \(q\) below. See
Linksfor more choices.- iprob1, iprob2
Optional initial value for the first and second probabilities respectively. A
NULLmeans a value is obtained in theinitializeslot.- parallel, zero
Details at
Links. Ifparallel = TRUEthen the constraint also applies to the intercept. SeeCommonVGAMffArgumentsfor details.
Details
This VGAM family function fits the model described by
Crowder and Sweeting (1989) which is described as follows.
Each of \(m\) spores has a probability \(p\) of
germinating. Of the \(y_1\) spores that germinate,
each has a probability \(q\) of bending in a particular
direction. Let \(y_2\) be the number that bend in the
specified direction. The probability model for this data is
\(P(y_1,y_2) =\)
$$
{m \choose y_1} p^{y_1} (1-p)^{m-y_1}
{y_1 \choose y_2} q^{y_2} (1-q)^{y_1-y_2}$$
for \(0 < p < 1\), \(0 < q < 1\),
\(y_1=1,\ldots,m\)
and
\(y_2=1,\ldots,y_1\).
Here, \(p\) is prob1,
\(q\) is prob2.
Although the Authors refer to this as the bivariate binomial model, I have named it the (two-stage) sequential binomial model. Fisher scoring is used.
Value
An object of class "vglmff" (see
vglmff-class). The object is used by modelling
functions such as vglm and vgam.
References
Crowder, M. and Sweeting, T. (1989). Bayesian inference for a bivariate binomial distribution. Biometrika, 76, 599–603.
Note
The response must be a two-column matrix of sample proportions
corresponding to \(y_1\) and \(y_2\).
The \(m\) values should be inputted with the weights
argument of vglm
and vgam.
The fitted value is a two-column matrix of estimated
probabilities \(p\) and \(q\).
A common form of error is when there are no trials
for \(y_1\),
e.g., if mvector below has some values which are zero.
Examples
sdata <- data.frame(mvector = round(rnorm(nn <- 100, m = 10, sd = 2)),
x2 = runif(nn))
sdata <- transform(sdata, prob1 = logitlink(+2 - x2, inverse = TRUE),
prob2 = logitlink(-2 + x2, inverse = TRUE))
sdata <- transform(sdata, successes1 = rbinom(nn, size = mvector, prob = prob1))
sdata <- transform(sdata, successes2 = rbinom(nn, size = successes1, prob = prob2))
sdata <- transform(sdata, y1 = successes1 / mvector)
sdata <- transform(sdata, y2 = successes2 / successes1)
fit <- vglm(cbind(y1, y2) ~ x2, seq2binomial, weight = mvector,
data = sdata, trace = TRUE)
#> Iteration 1: loglikelihood = -297.55139
#> Iteration 2: loglikelihood = -297.41156
#> Iteration 3: loglikelihood = -297.41154
#> Iteration 4: loglikelihood = -297.41154
coef(fit)
#> (Intercept):1 (Intercept):2 x2:1 x2:2
#> 1.9763016 -1.8181994 -1.2847068 0.4717196
coef(fit, matrix = TRUE)
#> logitlink(prob1) logitlink(prob2)
#> (Intercept) 1.976302 -1.8181994
#> x2 -1.284707 0.4717196
head(fitted(fit))
#> prob1 prob2
#> 1 0.8374123 0.1552000
#> 2 0.8536777 0.1492937
#> 3 0.8266989 0.1589279
#> 4 0.7893381 0.1711409
#> 5 0.6876752 0.2006326
#> 6 0.7891598 0.1711967
head(depvar(fit))
#> y1 y2
#> 1 0.8461538 0.09090909
#> 2 0.9090909 0.10000000
#> 3 0.8888889 0.25000000
#> 4 0.3333333 0.00000000
#> 5 0.5833333 0.14285714
#> 6 0.6000000 0.16666667
head(weights(fit, type = "prior")) # Same as with(sdata, mvector)
#> [,1]
#> [1,] 13
#> [2,] 11
#> [3,] 9
#> [4,] 6
#> [5,] 12
#> [6,] 10
# Number of first successes:
head(depvar(fit)[, 1] * c(weights(fit, type = "prior")))
#> 1 2 3 4 5 6
#> 11 10 8 2 7 6
# Number of second successes:
head(depvar(fit)[, 2] * c(weights(fit, type = "prior")) *
depvar(fit)[, 1])
#> 1 2 3 4 5 6
#> 1 1 2 0 1 1