Extended Beta-binomial Distribution Family Function

Fits an extended beta-binomial distribution by maximum likelihood estimation. The two parameters here are the mean and correlation coefficient.

Usage

extbetabinomial(lmu = "logitlink", lrho = "cloglink", 
      zero = "rho", irho = 0, grho = c(0, 0.05, 0.1, 0.2),
      vfl = FALSE, Form2 = NULL,
      imethod = 1, ishrinkage = 0.95)

Arguments

lmu, lrho

Link functions applied to the two parameters. See Links for more choices. The first default ensure the mean remain in \((0, 1)\), while the second allows for a slightly negative correlation parameter: you could say it lies in \((\max(-\mu/(N-\mu-1), -(1 - \mu)/(N-(1-\mu)-1)), 1)\) where \(\mu\) is the mean (probability) and \(N\) is size. See below for details. For lrho, cloglink is a good choice because it handles parameter values from 1 downwards. Other good choices include logofflink(offset = 1) and rhobitlink.

irho, grho

The first is similar to betabinomial and it is a good idea to use this argument because to conduct a grid search based on grho is expensive. The default is effectively a binomial distribution. Set irho = NULL to perform a grid search which is more reliable but slow.

imethod

Similar to betabinomial.

zero

Similar to betabinomial. Also, see CommonVGAMffArguments for more information. Modelling rho with covariates requires large samples.

ishrinkage

See betabinomial and CommonVGAMffArguments for information.

vfl, Form2

See CommonVGAMffArguments. If vfl = TRUE then Form2 should be a formula specifying the terms for \(\eta_2\) and all others are used for \(\mu\). It is similar to uninormal. If these arguments are used then cbind(0, log(size1 / (size1 - 1))) should be used as an offset, and set zero = NULL too.

Details

The extended beta-binomial distribution (EBBD) proposed by Prentice (1986) allows for a slightly negative correlation parameter whereas the ordinary BBD betabinomial only allows values in \((0, 1)\) so it handles overdispersion only. When negative, the data is underdispersed relative to an ordinary binomial distribution.

Argument rho is used here for the \(\delta\) used in Prentice (1986) because it is the correlation between the (almost) Bernoulli trials. (They are actually simple binary variates.) We use here \(N\) for the number of trials (e.g., litter size), \(T=NY\) is the number of successes, and \(p\) (or \(\mu\)) is the probability of a success (e.g., a malformation). That is, \(Y\) is the proportion of successes. Like binomialff, the fitted values are the estimated probability of success (i.e., \(E[Y]\) and not \(E[T]\)) and the prior weights \(N\) are attached separately on the object in a slot.

The probability function is difficult to write but it involves three series of products. Recall \(Y = T/N\) is the real response being modelled, where \(T\) is the (total) sum of \(N\) correlated (almost) Bernoulli trials.

The default model is \(\eta_1 = logit(\mu)\) and \(\eta_2 = clog(\rho)\) because the first parameter lies between 0 and 1. The second link is cloglink. The mean of \(Y\) is \(p=\mu\) and the variance of \(Y\) is \(\mu(1-\mu)(1+(N-1)\rho)/N\). Here, the correlation \(\rho\) may be slightly negative and is the correlation between the \(N\) individuals within a litter. A litter effect is typically reflected by a positive value of \(\rho\) and corresponds to overdispersion with respect to the binomial distribution. Thus an exchangeable error structure is assumed between units within a litter for the EBBD.

This family function uses Fisher scoring. Elements of the second-order expected derivatives are computed numerically, which may fail for models very near the boundary of the parameter space. Usually, the computations are expensive for large \(N\) because of a for loop, so it may take a long time.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm.

Suppose fit is a fitted EBB model. Then depvar(fit) are the sample proportions \(y\), fitted(fit) returns estimates of \(E(Y)\), and weights(fit, type = "prior") returns the number of trials \(N\).

References

Prentice, R. L. (1986). Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81, 321–327.

Author

T. W. Yee

Note

This function is recommended over betabinomial and betabinomialff. It processes the input in the same way as binomialff. But it does not handle the case \(N \leq 2\) very well because there are two parameters to estimate, not one, for each row of the input. Cases where \(N > 2\) can be selected via the subset argument of vglm.

Warning

Modelling rho using covariates well requires much data so it is usually best to leave zero alone. It is good to set trace = TRUE and play around with irho if there are problems achieving convergence. Convergence problems will occur when the estimated rho is close to the lower bound, i.e., the underdispersion is almost too severe for the EBB to cope.

Examples

if (FALSE) { # \dontrun{
# Example 1
edata <- data.frame(N = 10, mu = logitlink(1, inverse = TRUE),
                    rho = cloglink(0.15, inverse = TRUE))
edata <- transform(edata,
      y = rextbetabinom(100, N, mu, rho = rho))
with(edata, plot(table(y)))
fit1 <- vglm(cbind(y, N-y) ~ 1, extbetabinomial, edata, trace = TRUE)
coef(fit1, matrix = TRUE)
Coef(fit1)
head(cbind(depvar(fit1), weights(fit1, type = "prior")))

# Example 2: VFL model
N <- size1 <- 10; nn <- 2000; set.seed(1)
edata <-  # Generate the data set. Expensive.
    data.frame(x2 = runif(nn),
               ooo =  log(size1 / (size1 - 1)))
edata <- transform(edata, x1copy = 1, x2copy = x2,
  y2 = rextbetabinom(nn, size1,  # Expensive
         logitlink(1 + x2, inverse = TRUE),
         cloglink(ooo + 1 - 0.5 * x2, inv = TRUE)))
fit2 <- vglm(data = edata,
        cbind(y2, N - y2) ~ x2 + x1copy + x2copy,
        extbetabinomial(zero = NULL, vfl = TRUE,
                 Form2 = ~ x1copy + x2copy - 1),
        offset = cbind(0, ooo), trace = TRUE)
coef(fit2, matrix = TRUE)
wald.stat(fit2, values0 = c(1, 1, -0.5))
} # }