Linear Model and Poisson Mixed Data Type Family Function
N1poisson.RdEstimate the four parameters of the (bivariate) \(N_1\)–Poisson copula mixed data type model by maximum likelihood estimation.
Usage
N1poisson(lmean = "identitylink", lsd = "loglink",
lvar = "loglink", llambda = "loglink", lapar = "rhobitlink",
zero = c(if (var.arg) "var" else "sd", "apar"),
doff = 5, nnodes = 20, copula = "gaussian",
var.arg = FALSE, imethod = 1, isd = NULL,
ilambda = NULL, iapar = NULL)Arguments
- lmean, lsd, lvar, llambda, lapar
Details at
CommonVGAMffArguments. SeeLinksfor more link function choices. The second response is primarily controlled by the parameter \(\lambda_2\).- imethod, isd, ilambda, iapar
Initial values. Details at
CommonVGAMffArguments.- zero
Details at
CommonVGAMffArguments.- doff
Numeric of unit length, the denominator offset \(\delta>0\). A monotonic transformation \(\Delta^* = \lambda_2^{2/3} / (|\delta| + \lambda_2^{2/3})\) is taken to map the Poisson mean onto the unit interval. This argument is \(\delta\). The default reflects the property that the normal approximation to the Poisson work wells for \(\lambda_2 \geq 10\) or thereabouts, hence that value is mapped to the origin by
qnorm. That's because10**(2/3)is approximately 5. It is known that the \(\lambda_2\) rate parameter raised to the power of \(2/3\) is a transformation that approximates the normal density more closely.Alternatively,
deltamay be assigned a single negative value. If so, then \(\Delta^* = \log(1 + \lambda_2) / [|\delta| + \log(1 + \lambda_2)]\) is used. For this,doff = -log1p(10)is suggested.- nnodes, copula
Details at
N1binomial.- var.arg
See
uninormal.
Details
The bivariate response comprises
\(Y_1\) from a linear model
having parameters
mean and sd for
\(\mu_1\) and \(\sigma_1\),
and the Poisson count
\(Y_2\) having parameter
lambda for its mean \(\lambda_2\).
The
joint probability density/mass function is
\(P(y_1, Y_2 = y_2) = \phi_1(y_1; \mu_1, \sigma_1)
\exp(-h^{-1}(\Delta))
[h^{-1}(\Delta)]^{y_2} / y_2!\)
where \(\Delta\) adjusts \(\lambda_2\)
according to the association parameter
\(\alpha\).
The quantity \(\Delta\) is
\(\Phi((\Phi^{-1}(h(\lambda_2)) -
\alpha Z_1) / \sqrt{1 - \alpha^2})\)
where \(h\) maps
\(\lambda_2\) onto the unit interval.
The quantity \(Z_1\) is \((Y_1-\mu_1) / \sigma_1\).
Thus there is an underlying bivariate normal
distribution, and a copula is used to bring the
two marginal distributions together.
Here,
\(-1 < \alpha < 1\), and
\(\Phi\) is the
cumulative distribution function
pnorm
of a standard univariate normal.
The first marginal
distribution is a normal distribution
for the linear model.
The second column of the response must
have nonnegative integer values.
When \(\alpha = 0\)
then \(\Delta=\Delta^*\).
Together, this family function combines
uninormal and
poissonff.
If the response are correlated then
a more efficient joint analysis
should result.
The second marginal distribution allows for overdispersion relative to an ordinary Poisson distribution—a property due to \(\alpha\).
This VGAM family function cannot handle multiple responses. Only a two-column matrix is allowed. The two-column fitted value matrix has columns \(\mu_1\) and \(\lambda_2\).
Value
An object of class "vglmff"
(see vglmff-class).
The object is used by modelling functions
such as vglm
and vgam.
Note
This VGAM family function is based on
N1binomial and shares many
properties with it.
It pays to set trace = TRUE to
monitor convergence, especially when
abs(apar) is high.
Examples
apar <- rhobitlink(0.3, inverse = TRUE)
nn <- 1000; mymu <- 1; sdev <- exp(1)
lambda <- loglink(1, inverse = TRUE)
mat <- rN1pois(nn, mymu, sdev, lambda, apar)
npdata <- data.frame(y1 = mat[, 1], y2 = mat[, 2])
with(npdata, var(y2) / mean(y2)) # Overdispersion
#> [1] 1.432781
fit1 <- vglm(cbind(y1, y2) ~ 1, N1poisson,
npdata, trace = TRUE)
#> Iteration 1: loglikelihood = -8521.2793
#> Iteration 2: loglikelihood = -8499.6837
#> Iteration 3: loglikelihood = -8496.985
#> Iteration 4: loglikelihood = -8496.598
#> Iteration 5: loglikelihood = -8496.5331
#> Iteration 6: loglikelihood = -8496.521
#> Iteration 7: loglikelihood = -8496.5187
#> Iteration 8: loglikelihood = -8496.5182
#> Iteration 9: loglikelihood = -8496.5181
#> Iteration 10: loglikelihood = -8496.5181
coef(fit1, matrix = TRUE)
#> mean loglink(sd) loglink(lambda) rhobitlink(apar)
#> (Intercept) 0.9159481 0.9963348 1.008717 0.312851
Coef(fit1)
#> mean sd lambda apar
#> 0.9159481 2.7083370 2.7420807 0.1551620
head(fitted(fit1))
#> y1 y2
#> 1 0.9159481 2.742081
#> 2 0.9159481 2.742081
#> 3 0.9159481 2.742081
#> 4 0.9159481 2.742081
#> 5 0.9159481 2.742081
#> 6 0.9159481 2.742081
summary(fit1)
#>
#> Call:
#> vglm(formula = cbind(y1, y2) ~ 1, family = N1poisson, data = npdata,
#> trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept):1 0.91595 0.08565 10.70 <2e-16 ***
#> (Intercept):2 0.99633 0.02236 44.56 <2e-16 ***
#> (Intercept):3 1.00872 0.01990 50.68 <2e-16 ***
#> (Intercept):4 0.31285 0.01355 23.08 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Names of linear predictors: mean, loglink(sd), loglink(lambda),
#> rhobitlink(apar)
#>
#> Log-likelihood: -8496.518 on 3996 degrees of freedom
#>
#> Number of Fisher scoring iterations: 10
#>
confint(fit1)
#> 2.5 % 97.5 %
#> (Intercept):1 0.7480867 1.0838094
#> (Intercept):2 0.9525087 1.0401609
#> (Intercept):3 0.9697088 1.0477253
#> (Intercept):4 0.2862852 0.3394167