The Two-parameter Beta Distribution Family Function

Estimation of the mean and precision parameters of the beta distribution.

Usage

betaff(A = 0, B = 1, lmu = "logitlink", lphi = "loglink",
       imu = NULL, iphi = NULL,
       gprobs.y = ppoints(8), gphi  = exp(-3:5)/4, zero = NULL)

Arguments

A, B

Lower and upper limits of the distribution. The defaults correspond to the standard beta distribution where the response lies between 0 and 1.

lmu, lphi

Link functions for the mean and precision parameters, respectively. The values $A$ and $B$ are extracted from the min and max arguments of extlogitlink. Consequently, only extlogitlink is allowed.

imu, iphi

Optional initial values for the mean and precision parameters, respectively. A NULL value means an initial value is computed in the initialize slot.

gprobs.y, gphi, zero

See CommonVGAMffArguments for more information.

Details

The two-parameter beta distribution can be written $$f(y) = \frac{(y-A)^{\mu_1 \phi-1} \, (B-y)^{(1-\mu_1) \phi-1}}{\mathrm{beta}(\mu_1 \phi,(1-\mu_1) \phi) \, (B-A)^{\phi-1}}$$ for $A < y < B$, where $\mathrm{beta}(.,.)$ is the beta function (see beta). The parameter $\mu_1$ satisfies $\mu_1 = (\mu - A) / (B-A)$, where $\mu$ is the mean of $Y$. That is, $\mu_1$ is the mean of a standard beta distribution: $E(Y) = A + (B-A) \times \mu_1$, and these are the fitted values of the object. Also, $\phi$ is positive and $A < \mu < B$. The limits $A$ and $B$ are assumed known.
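As a numerical check of the density above, the following base-R sketch evaluates the formula directly and compares it with a rescaled call to dbeta. The helper name dbeta_mu_phi and the parameter values are assumptions made for illustration; the helper is not part of VGAM.

```r
# Hypothetical helper (not part of VGAM): density of the beta distribution
# on (A, B) in the mean/precision parameterization described above.
dbeta_mu_phi <- function(y, mu, phi, A = 0, B = 1) {
  mu1 <- (mu - A) / (B - A)   # mean rescaled to (0, 1)
  s1 <- mu1 * phi             # first shape parameter
  s2 <- (1 - mu1) * phi       # second shape parameter
  (y - A)^(s1 - 1) * (B - y)^(s2 - 1) /
    (beta(s1, s2) * (B - A)^(phi - 1))
}

# Illustrative values only
A <- 5; B <- 13; mu <- 8; phi <- 6
y <- seq(5.5, 12.5, by = 0.5)
mu1 <- (mu - A) / (B - A)

# Same density obtained by a change of variables from the standard beta
ref <- dbeta((y - A) / (B - A), mu1 * phi, (1 - mu1) * phi) / (B - A)
all.equal(dbeta_mu_phi(y, mu, phi, A, B), ref)

# The density should integrate to 1 over (A, B) and have mean mu
area <- integrate(function(t) dbeta_mu_phi(t, mu, phi, A, B), A, B)$value
mean_y <- integrate(function(t) t * dbeta_mu_phi(t, mu, phi, A, B), A, B)$value
c(area = area, mean = mean_y)
```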

Another parameterization of the beta distribution involving the raw shape parameters is implemented in betaR.

For general $A$ and $B$, the variance of $Y$ is $(B-A)^2 \times \mu_1 \times (1-\mu_1) / (1+\phi)$. Then $\phi$ can be interpreted as a precision parameter in the sense that, for fixed $\mu$, the larger the value of $\phi$, the smaller the variance of $Y$. In terms of the raw shape parameters, $\mu_1 = \mathit{shape1}/(\mathit{shape1}+\mathit{shape2})$ and $\phi = \mathit{shape1}+\mathit{shape2}$. Fisher scoring is implemented.
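The relationships in this paragraph can be checked with a few lines of base R. The helper names shapes_to_mu_phi and var_beta_mu_phi are hypothetical and not part of VGAM; the shape values are those used to simulate the first data set in the Examples section.

```r
# Hypothetical helpers (not part of VGAM) for moving between the raw shape
# parameters and the mean/precision parameterization.
shapes_to_mu_phi <- function(shape1, shape2, A = 0, B = 1) {
  mu1 <- shape1 / (shape1 + shape2)
  list(mu = A + (B - A) * mu1, phi = shape1 + shape2)
}

var_beta_mu_phi <- function(mu, phi, A = 0, B = 1) {
  mu1 <- (mu - A) / (B - A)
  (B - A)^2 * mu1 * (1 - mu1) / (1 + phi)
}

# Shapes used to simulate the first data set in the Examples section
s1 <- exp(0); s2 <- exp(1)
p <- shapes_to_mu_phi(s1, s2)
p$mu    # 1 / (1 + e), about 0.2689
p$phi   # 1 + e, about 3.7183

# Agreement with the usual shape-based variance of a standard beta
v_ref <- s1 * s2 / ((s1 + s2)^2 * (s1 + s2 + 1))
all.equal(var_beta_mu_phi(p$mu, p$phi), v_ref)

# For a fixed mean, a larger phi gives a smaller variance
v_seq <- var_beta_mu_phi(0.3, phi = c(1, 10, 100))
v_seq
```

These population values are what the estimates printed by Coef(fit1) in the Examples (about 0.269 and 3.50) are estimating.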

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

References

Ferrari, S. L. P. and Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31, 799–815.

Author

Thomas W. Yee

Note

The response must have values in the interval ($A$, $B$). The user currently needs to manually choose lmu to match the input of arguments A and B, e.g., with extlogitlink; see the example below.
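Because the response must lie strictly inside ($A$, $B$), it can be useful to screen the data before fitting. The following is a minimal base-R sketch; the helper check_response_range is hypothetical and not part of VGAM, and the data values are made up for illustration.

```r
# Hypothetical helper (not part of VGAM): report which response values
# fall outside the open interval (A, B).
check_response_range <- function(y, A = 0, B = 1) {
  stopifnot(is.numeric(y), A < B)
  inside <- y > A & y < B
  list(ok = all(inside),
       n_outside = sum(!inside),
       which_outside = which(!inside))
}

# Made-up responses; the boundary values 5 and 13 are not allowed
y_demo <- c(5.2, 9.7, 13, 12.9, 5)
res <- check_response_range(y_demo, A = 5, B = 13)
res$ok             # FALSE
res$which_outside  # positions 3 and 5
```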

Examples

bdata <- data.frame(y = rbeta(nn <- 1000, shape1 = exp(0),
                              shape2 = exp(1)))
fit1 <- vglm(y ~ 1, betaff, data = bdata, trace = TRUE)
#> Iteration 1: loglikelihood = 359.12129
#> Iteration 2: loglikelihood = 361.45828
#> Iteration 3: loglikelihood = 361.46405
#> Iteration 4: loglikelihood = 361.46405
coef(fit1, matrix = TRUE)
#>             logitlink(mu) loglink(phi)
#> (Intercept)    -0.9979877     1.251758
Coef(fit1)  # Useful for intercept-only models
#>        mu       phi 
#> 0.2693373 3.4964828 

# General A and B, and with a covariate
bdata <- transform(bdata, x2 = runif(nn))
bdata <- transform(bdata, mu = logitlink(0.5 - x2, inverse = TRUE),
                          prec = exp(3.0 + x2))  # prec == phi
bdata <- transform(bdata, shape2 = prec * (1 - mu),
                          shape1 = mu * prec)
bdata <- transform(bdata,
                   y = rbeta(nn, shape1 = shape1, shape2 = shape2))
bdata <- transform(bdata, Y = 5 + 8 * y)  # From 5--13, not 0--1
fit <- vglm(Y ~ x2, data = bdata, trace = TRUE,
   betaff(A = 5, B = 13, lmu = "extlogitlink(min = 5, max = 13)"))
#> Iteration 1: loglikelihood = -1143.1521
#> Iteration 2: loglikelihood = -1065.7659
#> Iteration 3: loglikelihood = -1055.0457
#> Iteration 4: loglikelihood = -1054.8671
#> Iteration 5: loglikelihood = -1054.867
#> Iteration 6: loglikelihood = -1054.867
coef(fit, matrix = TRUE)
#>             extlogitlink(mu, min = 5, max = 13) loglink(phi)
#> (Intercept)                           0.4789215     2.877085
#> x2                                   -0.9607508     1.148016
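
The coefficients printed above are on the link scales. As a sketch of how they map back to $\mu$ and $\phi$, the following base-R code copies the estimates from that output by hand and applies the inverse links, assuming the extended logit is $\log((\mu - A)/(B - \mu))$ so that its inverse is $A + (B-A) \times$ plogis(eta). The values are compared with those used to simulate the data.

```r
# Estimates copied by hand from the coef(fit, matrix = TRUE) output above
b_mu  <- c(0.4789215, -0.9607508)  # intercept and x2, extlogitlink scale
b_phi <- c(2.877085, 1.148016)     # intercept and x2, loglink scale
A <- 5; B <- 13
x2 <- c(0, 0.5, 1)

# Linear predictors
eta_mu  <- b_mu[1] + b_mu[2] * x2
eta_phi <- b_phi[1] + b_phi[2] * x2

# Inverse links: extended logit maps onto (A, B), log link onto (0, Inf)
mu_hat  <- A + (B - A) * plogis(eta_mu)
phi_hat <- exp(eta_phi)

# Values used in the simulation: mu = 5 + 8 * plogis(0.5 - x2),
# phi = exp(3 + x2)
mu_true  <- A + (B - A) * plogis(0.5 - x2)
phi_true <- exp(3 + x2)

cbind(x2, mu_hat, mu_true, phi_hat, phi_true)
```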