Simplex Distribution Family Function

The two parameters of the univariate standard simplex distribution are estimated by full maximum likelihood.

Usage

simplex(lmu = "logitlink", lsigma = "loglink", imu = NULL, isigma = NULL,
        imethod = 1, ishrinkage = 0.95, zero = "sigma")

Arguments

lmu, lsigma

Link functions for mu and sigma. See Links for more choices.

imu, isigma

Optional initial values for mu and sigma. A NULL means a value is obtained internally.

imethod, ishrinkage, zero

See CommonVGAMffArguments for information.

Details

The probability density function can be written $$f(y; \mu, \sigma) = [2 \pi \sigma^2 (y (1-y))^3]^{-0.5} \exp[-0.5 (y-\mu)^2 / (\sigma^2 y (1-y) \mu^2 (1-\mu)^2)] $$ for $0 < y < 1$, $0 < \mu < 1$, and $\sigma > 0$. The mean of $Y$ is $\mu$ (called mu, and returned as the fitted values).
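As a rough check of the formula above, the density can be coded directly in base R and integrated numerically. This is only a sketch: the name simplex_density is hypothetical and not part of VGAM, and the parameter values are arbitrary. In practice the package's own dsimplex is the function to use.

```r
# Hypothetical helper (not part of VGAM): evaluates the density formula
# from the Details section for 0 < y < 1, 0 < mu < 1, sigma > 0.
simplex_density <- function(y, mu, sigma) {
  # Unit deviance d(y; mu)
  d <- (y - mu)^2 / (y * (1 - y) * mu^2 * (1 - mu)^2)
  exp(-0.5 * d / sigma^2) / sqrt(2 * pi * sigma^2 * (y * (1 - y))^3)
}

mu <- 0.7     # arbitrary illustrative values
sigma <- 1

# The density should integrate to (approximately) 1 over (0, 1) ...
total <- integrate(simplex_density, 0, 1, mu = mu, sigma = sigma)$value

# ... and its mean should be (approximately) mu.
mean_y <- integrate(function(y) y * simplex_density(y, mu, sigma), 0, 1)$value

print(c(total = total, mean = mean_y))
```

Both quantities are computed by numerical quadrature, so they agree with 1 and mu only up to the integration tolerance.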

The second parameter, sigma, of this standard simplex distribution is known as the dispersion parameter. The unit variance function is $V(\mu) = \mu^3 (1-\mu)^3$. Fisher scoring is applied to both parameters.
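To illustrate how the dispersion parameter and the unit variance function interact, the sketch below compares the numerically computed variance of Y with sigma^2 * V(mu). It assumes the usual small-dispersion approximation from dispersion model theory, so the two are expected to be close only when sigma is small. The helper names simplex_density and unit_variance are hypothetical and not part of VGAM.

```r
# Hypothetical helpers (not part of VGAM).
simplex_density <- function(y, mu, sigma) {
  d <- (y - mu)^2 / (y * (1 - y) * mu^2 * (1 - mu)^2)
  exp(-0.5 * d / sigma^2) / sqrt(2 * pi * sigma^2 * (y * (1 - y))^3)
}
unit_variance <- function(mu) mu^3 * (1 - mu)^3

mu <- 0.3      # arbitrary illustrative values
sigma <- 0.3

# E[(Y - mu)^2], integrated in two pieces that meet at the mode region
# so the adaptive quadrature does not miss the peak near mu.
integrand <- function(y) (y - mu)^2 * simplex_density(y, mu, sigma)
var_numeric <- integrate(integrand, 0, mu, rel.tol = 1e-8)$value +
               integrate(integrand, mu, 1, rel.tol = 1e-8)$value

# Small-dispersion approximation: Var(Y) is roughly sigma^2 * V(mu).
var_approx <- sigma^2 * unit_variance(mu)

print(c(numeric = var_numeric, approx = var_approx,
        ratio = var_numeric / var_approx))
```

The ratio drifts away from 1 as sigma grows, which is why sigma^2 * V(mu) should be read as an approximation rather than the exact variance.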

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.
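A minimal sketch of constructing the family object on its own, assuming the VGAM package is installed. The choice of probitlink for mu is only an illustration of a non-default link; any link suitable for a parameter in (0, 1) could be substituted.

```r
library(VGAM)

# Build the family object without fitting a model.
# zero = NULL lets both linear predictors depend on covariates.
fam <- simplex(lmu = "probitlink", lsigma = "loglink", zero = NULL)

# The returned object has class "vglmff" and can be passed as the
# family argument of vglm() or vgam().
class(fam)
```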

References

Jørgensen, B. (1997). The Theory of Dispersion Models. London: Chapman & Hall.

Song, P. X.-K. (2007). Correlated Data Analysis: Modeling, Analytics, and Applications. Springer.

Author

T. W. Yee

Note

This distribution is potentially useful for dispersion modelling. Numerical problems may occur when mu is very close to 0 or 1.

Examples

sdata <- data.frame(x2 = runif(nn <- 1000))
sdata <- transform(sdata, eta1 = 1 + 2 * x2,
                          eta2 = 1 - 2 * x2)
sdata <- transform(sdata, y = rsimplex(nn, mu = logitlink(eta1, inverse = TRUE),
                                       dispersion = exp(eta2)))
(fit <- vglm(y ~ x2, simplex(zero = NULL), data = sdata, trace = TRUE))
#> Iteration 1: loglikelihood = 1316.3575
#> Iteration 2: loglikelihood = 1682.8941
#> Iteration 3: loglikelihood = 1936.7601
#> Iteration 4: loglikelihood = 2056.5911
#> Iteration 5: loglikelihood = 2085.3816
#> Iteration 6: loglikelihood = 2087.2666
#> Iteration 7: loglikelihood = 2087.2804
#> Iteration 8: loglikelihood = 2087.2804
#> Iteration 9: loglikelihood = 2087.2804
#> 
#> Call:
#> vglm(formula = y ~ x2, family = simplex(zero = NULL), data = sdata, 
#>     trace = TRUE)
#> 
#> 
#> Coefficients:
#> (Intercept):1 (Intercept):2          x2:1          x2:2 
#>     0.9786911     1.0332532     2.0166739    -2.0563965 
#> 
#> Degrees of Freedom: 2000 Total; 1996 Residual
#> Log-likelihood: 2087.28 
coef(fit, matrix = TRUE)
#>             logitlink(mu) loglink(sigma)
#> (Intercept)     0.9786911       1.033253
#> x2              2.0166739      -2.056397
summary(fit)
#> 
#> Call:
#> vglm(formula = y ~ x2, family = simplex(zero = NULL), data = sdata, 
#>     trace = TRUE)
#> 
#> Coefficients: 
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1  0.97869    0.02836   34.51   <2e-16 ***
#> (Intercept):2  1.03325    0.04354   23.73   <2e-16 ***
#> x2:1           2.01667    0.03357   60.07   <2e-16 ***
#> x2:2          -2.05640    0.07603  -27.05   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: logitlink(mu), loglink(sigma)
#> 
#> Log-likelihood: 2087.28 on 1996 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 9 
#>