Ordinal Poisson Family Function
ordpoisson.RdFits a Poisson regression where the response is ordinal (the Poisson counts are grouped between known cutpoints).
Usage
ordpoisson(cutpoints, countdata = FALSE, NOS = NULL,
Levels = NULL, init.mu = NULL, parallel = FALSE,
zero = NULL, link = "loglink")Arguments
- cutpoints
Numeric. The cutpoints, \(K_l\). These must be non-negative integers.
Infvalues may be included. See below for further details.- countdata
Logical. Is the response (LHS of formula) in count-data format? If not then the response is a matrix or vector with values
1,2, ...,L, say, whereLis the number of levels. Such input can be generated withcutwith argumentlabels = FALSE. Ifcountdata = TRUEthen the response is expected to be in the same format asdepvar(fit)wherefitis a fitted model withordpoissonas the VGAM family function. That is, the response is matrix of counts withLcolumns (ifNOS = 1).- NOS
Integer. The number of species, or more generally, the number of response random variates. This argument must be specified when
countdata = TRUE. UsuallyNOS = 1.- Levels
Integer vector, recycled to length
NOSif necessary. The number of levels for each response random variate. This argument should agree withcutpoints. This argument must be specified whencountdata = TRUE.- init.mu
Numeric. Initial values for the means of the Poisson regressions. Recycled to length
NOSif necessary. Use this argument if the default initial values fail (the default is to compute an initial value internally).- parallel, zero, link
See
poissonff. SeeCommonVGAMffArgumentsfor information.
Details
This VGAM family function uses maximum likelihood estimation (Fisher scoring) to fit a Poisson regression to each column of a matrix response. The data, however, is ordinal, and is obtained from known integer cutpoints. Here, \(l=1,\ldots,L\) where \(L\) (\(L \geq 2\)) is the number of levels. In more detail, let \(Y^*=l\) if \(K_{l-1} < Y \leq K_{l}\) where the \(K_l\) are the cutpoints. We have \(K_0=-\infty\) and \(K_L=\infty\). The response for this family function corresponds to \(Y^*\) but we are really interested in the Poisson regression of \(Y\).
If NOS=1 then
the argument cutpoints is a vector \((K_1,K_2,\ldots,K_L)\)
where the last value (Inf) is optional. If NOS>1 then
the vector should have NOS-1 Inf values separating
the cutpoints. For example, if there are NOS=3 responses, then
something like
ordpoisson(cut = c(0, 5, 10, Inf, 20, 30, Inf, 0, 10, 40, Inf))
is valid.
Value
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
References
Yee, T. W. (2020). Ordinal ordination with normalizing link functions for count data, (in preparation).
Note
Sometimes there are no observations between two cutpoints. If so,
the arguments Levels and NOS need to be specified too.
See below for an example.
Warning
The input requires care as little to no checking is done.
If fit is the fitted object, have a look at fit@extra and
depvar(fit) to check.
Examples
set.seed(123) # Example 1
x2 <- runif(n <- 1000); x3 <- runif(n)
mymu <- exp(3 - 1 * x2 + 2 * x3)
y1 <- rpois(n, lambda = mymu)
cutpts <- c(-Inf, 20, 30, Inf)
fcutpts <- cutpts[is.finite(cutpts)] # finite cutpoints
ystar <- cut(y1, breaks = cutpts, labels = FALSE)
if (FALSE) { # \dontrun{
plot(x2, x3, col = ystar, pch = as.character(ystar))
} # }
table(ystar) / sum(table(ystar))
#> ystar
#> 1 2 3
#> 0.260 0.194 0.546
fit <- vglm(ystar ~ x2 + x3, fam = ordpoisson(cutpoi = fcutpts))
head(depvar(fit)) # This can be input if countdata = TRUE
#> [,1] [,2] [,3]
#> [1,] 1 0 0
#> [2,] 0 1 0
#> [3,] 1 0 0
#> [4,] 0 0 1
#> [5,] 0 0 1
#> [6,] 0 0 1
head(fitted(fit))
#> mu
#> 1 26.37788
#> 2 29.70400
#> 3 18.82698
#> 4 44.47479
#> 5 41.56519
#> 6 49.54660
head(predict(fit))
#> loglink(mu)
#> [1,] 3.272526
#> [2,] 3.391282
#> [3,] 2.935291
#> [4,] 3.794923
#> [5,] 3.727263
#> [6,] 3.902914
coef(fit, matrix = TRUE)
#> loglink(mu)
#> (Intercept) 3.0324949
#> x2 -0.9879523
#> x3 1.9155716
fit@extra
#> $NOS
#> [1] 1
#>
#> $Levels
#> [1] 3
#>
#> $y.integer
#> [1] TRUE
#>
#> $ncoly
#> [1] 3
#>
#> $countdata
#> [1] FALSE
#>
#> $cutpoints
#> [1] 20 30 Inf
#>
#> $n
#> [1] 1000
#>
# Example 2: multivariate and there are no obsns between some cutpoints
cutpts2 <- c(-Inf, 0, 9, 10, 20, 70, 200, 201, Inf)
fcutpts2 <- cutpts2[is.finite(cutpts2)] # finite cutpoints
y2 <- rpois(n, lambda = mymu) # Same model as y1
ystar2 <- cut(y2, breaks = cutpts2, labels = FALSE)
table(ystar2) / sum(table(ystar2))
#> ystar2
#> 2 3 4 5 6
#> 0.037 0.029 0.214 0.571 0.149
fit <- vglm(cbind(ystar,ystar2) ~ x2 + x3, fam =
ordpoisson(cutpoi = c(fcutpts,Inf,fcutpts2,Inf),
Levels = c(length(fcutpts)+1,length(fcutpts2)+1),
parallel = TRUE), trace = TRUE)
#> Iteration 1: loglikelihood = -3421.6299
#> Iteration 2: loglikelihood = -975.56347
#> Iteration 3: loglikelihood = -763.12349
#> Iteration 4: loglikelihood = -759.96202
#> Iteration 5: loglikelihood = -759.9608
#> Iteration 6: loglikelihood = -759.9608
coef(fit, matrix = TRUE)
#> loglink(mu1) loglink(mu2)
#> (Intercept) 2.993586 2.993586
#> x2 -1.017975 -1.017975
#> x3 2.032580 2.032580
fit@extra
#> $NOS
#> [1] 2
#>
#> $Levels
#> [1] 3 8
#>
#> $y.integer
#> [1] TRUE
#>
#> $ncoly
#> [1] 11
#>
#> $countdata
#> [1] FALSE
#>
#> $cutpoints
#> [1] 20 30 Inf 0 9 10 20 70 200 201 Inf
#>
#> $n
#> [1] 1000
#>
constraints(fit)
#> $`(Intercept)`
#> [,1]
#> [1,] 1
#> [2,] 1
#>
#> $x2
#> [,1]
#> [1,] 1
#> [2,] 1
#>
#> $x3
#> [,1]
#> [1,] 1
#> [2,] 1
#>
summary(depvar(fit)) # Some columns have all zeros
#> V1 V2 V3 V4 V5
#> Min. :0.00 Min. :0.000 Min. :0.000 Min. :0 Min. :0.000
#> 1st Qu.:0.00 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0 1st Qu.:0.000
#> Median :0.00 Median :0.000 Median :1.000 Median :0 Median :0.000
#> Mean :0.26 Mean :0.194 Mean :0.546 Mean :0 Mean :0.037
#> 3rd Qu.:1.00 3rd Qu.:0.000 3rd Qu.:1.000 3rd Qu.:0 3rd Qu.:0.000
#> Max. :1.00 Max. :1.000 Max. :1.000 Max. :0 Max. :1.000
#> V6 V7 V8 V9 V10
#> Min. :0.000 Min. :0.000 Min. :0.000 Min. :0.000 Min. :0
#> 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0
#> Median :0.000 Median :0.000 Median :1.000 Median :0.000 Median :0
#> Mean :0.029 Mean :0.214 Mean :0.571 Mean :0.149 Mean :0
#> 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:1.000 3rd Qu.:0.000 3rd Qu.:0
#> Max. :1.000 Max. :1.000 Max. :1.000 Max. :1.000 Max. :0
#> V11
#> Min. :0
#> 1st Qu.:0
#> Median :0
#> Mean :0
#> 3rd Qu.:0
#> Max. :0