Simplex Distribution Family Function
The two parameters of the univariate standard simplex distribution are estimated by full maximum likelihood estimation.
Usage
simplex(lmu = "logitlink", lsigma = "loglink", imu = NULL, isigma = NULL,
        imethod = 1, ishrinkage = 0.95, zero = "sigma")
Arguments
- lmu, lsigma
Link function for mu and sigma. See Links for more choices.
- imu, isigma
Optional initial values for mu and sigma. A NULL means a value is obtained internally.
- imethod, ishrinkage, zero
See CommonVGAMffArguments for information.
Details
The probability density function can be written
$$f(y; \mu, \sigma) = [2 \pi \sigma^2 (y (1-y))^3]^{-0.5}
\exp[-0.5 (y-\mu)^2 / (\sigma^2 y (1-y) \mu^2 (1-\mu)^2)]
$$
for \(0 < y < 1\),
\(0 < \mu < 1\),
and \(\sigma > 0\).
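The density above can be checked numerically. The following is a minimal base-R sketch that assumes nothing beyond the formula as written; `dsimplex_sketch` is a hypothetical helper name introduced here for illustration and is not a function of the package. It integrates the density over (0, 1) and computes the mean for one arbitrary choice of parameter values.

```r
# Hypothetical helper (not a VGAM function): the density written
# term by term from the formula above.
dsimplex_sketch <- function(y, mu, sigma) {
  # Exponent term: (y - mu)^2 / (sigma^2 * y * (1 - y) * mu^2 * (1 - mu)^2)
  expo <- (y - mu)^2 / (sigma^2 * y * (1 - y) * mu^2 * (1 - mu)^2)
  (2 * pi * sigma^2 * (y * (1 - y))^3)^(-0.5) * exp(-0.5 * expo)
}

mu0    <- 0.3  # arbitrary illustrative value
sigma0 <- 1    # arbitrary illustrative value

# Total probability mass over (0, 1); should be close to 1.
total_mass <- integrate(function(y) dsimplex_sketch(y, mu0, sigma0),
                        lower = 0, upper = 1)$value

# Numerical mean; should be close to mu0.
mean_y <- integrate(function(y) y * dsimplex_sketch(y, mu0, sigma0),
                    lower = 0, upper = 1)$value

print(c(total_mass = total_mass, mean_y = mean_y))
```

Other values of `mu0` and `sigma0` can be substituted, although large dispersion values push mass towards the endpoints and may need tighter integration tolerances.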
The mean of \(Y\) is \(\mu\) (called mu, and
returned as the fitted values).
The second parameter, sigma, of this standard simplex
distribution is known as the dispersion parameter.
The unit variance function is
\(V(\mu) = \mu^3 (1-\mu)^3\).
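As a sketch of how the unit variance function enters the density given above, write \(d(y; \mu)\) for the exponent term without the dispersion (a symbol introduced here for illustration only). The density can then be rewritten as
$$f(y; \mu, \sigma) = [2 \pi \sigma^2 V(y)]^{-0.5}
\exp[-0.5 \, d(y; \mu) / \sigma^2],
\qquad
d(y; \mu) = \frac{(y-\mu)^2}{y (1-y) \mu^2 (1-\mu)^2},
$$
where \(V(y) = y^3 (1-y)^3\) is the unit variance function evaluated at \(y\), which is the factor \((y (1-y))^3\) in the original expression.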
Fisher scoring is applied to both parameters.
Value
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
References
Jorgensen, B. (1997). The Theory of Dispersion Models. London: Chapman & Hall.
Song, P. X.-K. (2007). Correlated Data Analysis: Modeling, Analytics, and Applications. Springer.
Note
This distribution is potentially useful for dispersion modelling.
Numerical problems may occur when mu is very close to 0 or 1.
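One way such problems can arise is sketched below in base R. This is a hedged illustration, not a description of the package internals. With a logit link, a large linear predictor makes the inverse link round to exactly 1 in double precision, and the factor mu^2 (1 - mu)^2 in the exponent of the density then becomes zero. The names `eta_big`, `mu_hat`, `expo_denominator` and `expo` are illustrative.

```r
eta_big <- 40                # a large linear predictor (illustrative value)
mu_hat  <- plogis(eta_big)   # inverse logit, 1 / (1 + exp(-eta))

# In double precision the result is indistinguishable from 1.
print(mu_hat == 1)

# The denominator y * (1 - y) * mu^2 * (1 - mu)^2 from the exponent of the
# density collapses to zero, so the exponent is no longer finite for y != mu.
y <- 0.9
expo_denominator <- y * (1 - y) * mu_hat^2 * (1 - mu_hat)^2
expo <- (y - mu_hat)^2 / expo_denominator
print(is.finite(expo))
```

Keeping covariates on a moderate scale, or supplying initial values through imu, are possible ways to keep fitted values away from the boundary.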
Examples
library(VGAM)  # provides simplex(), rsimplex(), logitlink() and vglm()
sdata <- data.frame(x2 = runif(nn <- 1000))
sdata <- transform(sdata, eta1 = 1 + 2 * x2,
                          eta2 = 1 - 2 * x2)
sdata <- transform(sdata, y = rsimplex(nn, mu = logitlink(eta1, inverse = TRUE),
                                       dispersion = exp(eta2)))
(fit <- vglm(y ~ x2, simplex(zero = NULL), data = sdata, trace = TRUE))
#> Iteration 1: loglikelihood = 1316.3575
#> Iteration 2: loglikelihood = 1682.8941
#> Iteration 3: loglikelihood = 1936.7601
#> Iteration 4: loglikelihood = 2056.5911
#> Iteration 5: loglikelihood = 2085.3816
#> Iteration 6: loglikelihood = 2087.2666
#> Iteration 7: loglikelihood = 2087.2804
#> Iteration 8: loglikelihood = 2087.2804
#> Iteration 9: loglikelihood = 2087.2804
#>
#> Call:
#> vglm(formula = y ~ x2, family = simplex(zero = NULL), data = sdata,
#> trace = TRUE)
#>
#>
#> Coefficients:
#> (Intercept):1 (Intercept):2          x2:1          x2:2
#>     0.9786911     1.0332532     2.0166739    -2.0563965
#>
#> Degrees of Freedom: 2000 Total; 1996 Residual
#> Log-likelihood: 2087.28
coef(fit, matrix = TRUE)
#>             logitlink(mu) loglink(sigma)
#> (Intercept)     0.9786911       1.033253
#> x2              2.0166739      -2.056397
summary(fit)
#>
#> Call:
#> vglm(formula = y ~ x2, family = simplex(zero = NULL), data = sdata,
#> trace = TRUE)
#>
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)
#> (Intercept):1  0.97869    0.02836   34.51   <2e-16 ***
#> (Intercept):2  1.03325    0.04354   23.73   <2e-16 ***
#> x2:1           2.01667    0.03357   60.07   <2e-16 ***
#> x2:2          -2.05640    0.07603  -27.05   <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Names of linear predictors: logitlink(mu), loglink(sigma)
#>
#> Log-likelihood: 2087.28 on 1996 degrees of freedom
#>
#> Number of Fisher scoring iterations: 9
#>