Beta-binomial Distribution Family Function
betabinomialff.Rd

Fits a beta-binomial distribution by maximum likelihood estimation. The two parameters here are the shape parameters of the underlying beta distribution.
Usage
betabinomialff(lshape1 = "loglink", lshape2 = "loglink",
ishape1 = 1, ishape2 = NULL, imethod = 1, ishrinkage = 0.95,
nsimEIM = NULL, zero = NULL)

Arguments
- lshape1, lshape2
Link functions for the two (positive) shape parameters of the beta distribution. See Links for more choices.
- ishape1, ishape2
Initial values for the shape parameters. The first must be positive, and is recycled to the necessary length. The second is optional. If a failure to converge occurs, try assigning a different value to ishape1 and/or using ishape2.
- zero
Can be an integer specifying which linear/additive predictor is to be modelled as an intercept only. If assigned, the single value should be either 1 or 2. The default is to model both shape parameters as functions of the covariates. If a failure to converge occurs, try zero = 2. See CommonVGAMffArguments for more information.
- ishrinkage, nsimEIM, imethod
See CommonVGAMffArguments for more information. The argument ishrinkage is used only if imethod = 2. Using the argument nsimEIM may offer large advantages for large values of \(N\) and/or large data sets.
Details
There are several parameterizations of the beta-binomial
distribution. This family function directly models the two
shape parameters of the associated beta distribution rather than
the probability of success (however, see Note below).
The model can be written
\(T|P=p \sim Binomial(N,p)\)
where \(P\) has a beta distribution with shape parameters
\(\alpha\) and \(\beta\). Here,
\(N\) is the number of trials (e.g., litter size),
\(T=NY\) is the number of successes, and
\(p\) is the probability of a success (e.g., a malformation).
That is, \(Y\) is the proportion of successes. Like
binomialff, the fitted values are the
estimated probability
of success (i.e., \(E[Y]\) and not \(E[T]\))
and the prior weights \(N\) are attached separately on the
object in a slot.
The probability function is $$P(T=t) = {N \choose t} \frac{B(\alpha+t, \beta+N-t)} {B(\alpha, \beta)}$$ where \(t=0,1,\ldots,N\), and \(B\) is the beta function with shape parameters \(\alpha\) and \(\beta\). Recall \(Y = T/N\) is the real response being modelled.
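As a quick sanity check (not part of the package's own examples), the probability function above can be evaluated directly with base R's choose() and beta(); the shape and trial values below are arbitrary illustrative assumptions:

```r
# Evaluate the beta-binomial pmf directly from the formula above.
# N, shape1 (alpha) and shape2 (beta) are arbitrary illustrative values.
N <- 10; shape1 <- 2; shape2 <- 6
tt <- 0:N
pmf <- choose(N, tt) * beta(shape1 + tt, shape2 + N - tt) / beta(shape1, shape2)
sum(pmf)  # the probabilities sum to 1
```

VGAM also supplies dbetabinom.ab() for this purpose; the direct computation is shown only to connect the code to the formula.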
The default model is \(\eta_1 = \log(\alpha)\) and \(\eta_2 = \log(\beta)\) because both parameters are positive. The mean (of \(Y\)) is \(p=\mu=\alpha/(\alpha+\beta)\) and the variance (of \(Y\)) is \(\mu(1-\mu)(1+(N-1)\rho)/N\). Here, the correlation \(\rho\) is given by \(1/(1 + \alpha + \beta)\) and is the correlation between the \(N\) individuals within a litter. A litter effect is typically reflected by a positive value of \(\rho\), which is also known as the over-dispersion parameter.
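To make the moment formulas concrete, here is a small sketch (with arbitrary illustrative shape values, not taken from the examples) computing \(\mu\), \(\rho\) and the variance of \(Y\):

```r
# Arbitrary illustrative values of the shape parameters and litter size.
shape1 <- 2; shape2 <- 6; N <- 10
mu   <- shape1 / (shape1 + shape2)               # E[Y] = alpha / (alpha + beta)
rho  <- 1 / (1 + shape1 + shape2)                # within-litter correlation
varY <- mu * (1 - mu) * (1 + (N - 1) * rho) / N  # Var[Y]
c(mu = mu, rho = rho, varY = varY)  # 0.25, 1/9, 0.0375
```

Note that as \(\rho \to 0\) the variance reduces to the ordinary binomial value \(\mu(1-\mu)/N\).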
This family function uses Fisher scoring. The two diagonal elements of the second-order expected derivatives with respect to \(\alpha\) and \(\beta\) are computed numerically, which may fail for large \(\alpha\), \(\beta\) or \(N\), or else take a long time.
Value
An object of class "vglmff"
(see vglmff-class).
The object is used by modelling functions
such as vglm.
Suppose fit is a fitted beta-binomial model. Then
fit@y (better: depvar(fit)) contains the sample
proportions \(y\), fitted(fit) returns estimates of
\(E(Y)\), and weights(fit, type = "prior") returns
the number of trials \(N\).
References
Moore, D. F. and Tsiatis, A. (1991). Robust estimation of the variance in moment methods for extra-binomial and extra-Poisson variation. Biometrics, 47, 383–401.
Prentice, R. L. (1986). Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81, 321–327.
Note
This function processes the input in the same way
as binomialff. But it does not handle
the case \(N=1\) very well because there are two
parameters to estimate, not one, for each row of the input.
Cases where \(N=1\) can be omitted via the
subset argument of vglm.
Although the two linear/additive predictors given above are in terms of \(\alpha\) and \(\beta\), basic algebra shows that the default amounts to fitting a logit link to the probability of success; subtracting the second linear/additive predictor from the first gives that logistic regression linear/additive predictor. That is, \(logit(p) = \eta_1 - \eta_2\). This is illustrated in one of the examples below.
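The algebra behind this identity can be checked numerically in a few lines (the shape values below are arbitrary illustrative assumptions): since \(p = \alpha/(\alpha+\beta)\), \(logit(p) = \log(\alpha/\beta) = \log(\alpha) - \log(\beta)\).

```r
# Arbitrary illustrative shape values.
shape1 <- 2; shape2 <- 6
p <- shape1 / (shape1 + shape2)
log(p / (1 - p))           # logit(p)
log(shape1) - log(shape2)  # eta1 - eta2; identical to logit(p)
```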
The extended beta-binomial distribution
of Prentice (1986)
implemented by extbetabinomial
is the preferred VGAM
family function for beta-binomial regression.
Warning
This family function is prone to numerical difficulties due to
the expected information matrices not being positive-definite
or ill-conditioned over some regions of the parameter space.
If problems occur try setting ishape1 to be some other
positive value, using ishape2 and/or setting zero
= 2.
This family function may be renamed in the future.
See the warnings in betabinomial.
Examples
# Example 1
N <- 10; s1 <- exp(1); s2 <- exp(2)
y <- rbetabinom.ab(n = 100, size = N, shape1 = s1, shape2 = s2)
fit <- vglm(cbind(y, N-y) ~ 1, betabinomialff, trace = TRUE)
#> Iteration 1: loglikelihood = -199.23938
#> Iteration 2: loglikelihood = -198.19571
#> Iteration 3: loglikelihood = -198.1782
#> Iteration 4: loglikelihood = -198.1782
#> Iteration 5: loglikelihood = -198.1782
coef(fit, matrix = TRUE)
#> loglink(shape1) loglink(shape2)
#> (Intercept) 1.370785 2.240413
Coef(fit)
#> shape1 shape2
#> 3.938441 9.397214
head(fit@misc$rho) # The correlation parameter
#> [1] 0.06975614 0.06975614 0.06975614 0.06975614 0.06975614 0.06975614
head(cbind(depvar(fit), weights(fit, type = "prior")))
#> [,1] [,2]
#> 1 0.4 10
#> 2 0.4 10
#> 3 0.0 10
#> 4 0.2 10
#> 5 0.4 10
#> 6 0.3 10
# Example 2
fit <- vglm(cbind(R, N-R) ~ 1, betabinomialff, data = lirat,
trace = TRUE, subset = N > 1)
#> Iteration 1: loglikelihood = -123.51975
#> Iteration 2: loglikelihood = -122.69831
#> Iteration 3: loglikelihood = -122.69531
#> Iteration 4: loglikelihood = -122.69531
#> Iteration 5: loglikelihood = -122.69531
coef(fit, matrix = TRUE)
#> loglink(shape1) loglink(shape2)
#> (Intercept) -1.161384 -1.042375
Coef(fit)
#> shape1 shape2
#> 0.3130526 0.3526163
fit@misc$rho # The correlation parameter
#> [1] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [8] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [15] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [22] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [29] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [36] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [43] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [50] 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594 0.6003594
#> [57] 0.6003594
t(fitted(fit))
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,22] [,23] [,24] [,25] [,26] [,27] [,28]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,29] [,30] [,31] [,32] [,33] [,34] [,35]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,36] [,37] [,38] [,39] [,40] [,41] [,42]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,43] [,44] [,45] [,46] [,47] [,48] [,49]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,50] [,51] [,52] [,53] [,54] [,55] [,56]
#> [1,] 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828 0.4702828
#> [,57]
#> [1,] 0.4702828
t(depvar(fit))
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#> [1,] 0.1 0.3636364 0.75 1 1 0.8181818 1 1 1 0.7 1 0.9 1 0.8181818 0.6666667
#> 16 17 18 19 20 21 22 23
#> [1,] 0.7777778 1 0.5833333 0.8181818 0.6153846 0.3571429 1 0.8333333
#> 24 25 26 27 28 29 30 31 32 33 34 35
#> [1,] 0.6153846 1 0.2142857 1 0.75 1 0.3846154 1 0.1 0.3333333 0.07692308 0
#> 36 37 38 39 40 41 43 44 45 46 47
#> [1,] 0.2857143 0.2222222 0.1538462 0.0625 0 0 0 0 0.09090909 0 0.07142857
#> 48 49 50 51 52 53 54 55 56 57 58
#> [1,] 0 0 0 0.2222222 0.1176471 0 0 0.07142857 0 0 0
t(weights(fit, type = "prior"))
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [1,] 10 11 12 4 10 11 9 11 10 10 12 10 8 11 6 9 14 12 11 13 14 10 12 13 10
#> 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 43 44 45 46 47 48 49 50 51
#> [1,] 14 13 4 8 13 12 10 3 13 12 14 9 13 16 11 4 12 8 11 14 14 11 3 13 9
#> 52 53 54 55 56 57 58
#> [1,] 17 15 2 14 8 6 17
# A "loglink" link for the 2 shape params is a logistic regression:
all.equal(c(fitted(fit)),
as.vector(logitlink(predict(fit)[, 1] -
predict(fit)[, 2], inverse = TRUE)))
#> [1] TRUE
# Example 3, which is more complicated
lirat <- transform(lirat, fgrp = factor(grp))
summary(lirat) # Only 5 litters in group 3
#> N R hb grp fgrp
#> Min. : 1.00 Min. : 0.000 Min. : 3.100 Min. :1.000 1:31
#> 1st Qu.: 9.00 1st Qu.: 0.250 1st Qu.: 4.325 1st Qu.:1.000 2:12
#> Median :11.00 Median : 3.500 Median : 6.500 Median :1.000 3: 5
#> Mean :10.47 Mean : 4.603 Mean : 7.798 Mean :1.897 4:10
#> 3rd Qu.:13.00 3rd Qu.: 9.000 3rd Qu.:10.775 3rd Qu.:2.750
#> Max. :17.00 Max. :14.000 Max. :16.600 Max. :4.000
fit2 <- vglm(cbind(R, N-R) ~ fgrp + hb, betabinomialff(zero = 2),
data = lirat, trace = TRUE, subset = N > 1)
#> Iteration 1: loglikelihood = -107.09354
#> Iteration 2: loglikelihood = -99.955137
#> Iteration 3: loglikelihood = -98.976074
#> Iteration 4: loglikelihood = -98.873897
#> Iteration 5: loglikelihood = -98.863768
#> Iteration 6: loglikelihood = -98.862731
#> Iteration 7: loglikelihood = -98.862626
#> Iteration 8: loglikelihood = -98.862615
#> Iteration 9: loglikelihood = -98.862614
coef(fit2, matrix = TRUE)
#> loglink(shape1) loglink(shape2)
#> (Intercept) 1.532068 -0.130058
#> fgrp2 -1.955610 0.000000
#> fgrp3 -2.412000 0.000000
#> fgrp4 -2.176017 0.000000
#> hb -0.105682 0.000000
coef(fit2, matrix = TRUE)[, 1] -
coef(fit2, matrix = TRUE)[, 2] # logitlink(p)
#> (Intercept) fgrp2 fgrp3 fgrp4 hb
#> 1.662126 -1.955610 -2.412000 -2.176017 -0.105682
if (FALSE) with(lirat, plot(hb[N > 1], fit2@misc$rho,
xlab = "Hemoglobin", ylab = "Estimated rho",
pch = as.character(grp[N > 1]), col = grp[N > 1])) # \dontrun{}
if (FALSE) { # \dontrun{}
# cf. Figure 3 of Moore and Tsiatis (1991)
with(lirat, plot(hb, R / N, pch = as.character(grp), col = grp,
     xlab = "Hemoglobin level", ylab = "Proportion Dead", las = 1,
     main = "Fitted values (lines)"))
smalldf <- with(lirat, lirat[N > 1, ])
for (gp in 1:4) {
  xx <- with(smalldf, hb[grp == gp])
  yy <- with(smalldf, fitted(fit2)[grp == gp])
  ooo <- order(xx)
  lines(xx[ooo], yy[ooo], col = gp, lwd = 2)
}
} # \dontrun{}