GARMA (Generalized Autoregressive Moving-Average) Models
Fits GARMA models to time series data.
Arguments
- link
Link function applied to the mean response. The default is suitable for continuous responses. The link loglink should be chosen if the data are counts. The link reciprocal can be chosen if the data are counts and the variance assumed for this is \(\mu^2\). The links logitlink, probitlink, clogloglink, and cauchitlink are supported and suitable for binary responses.
Note that when the log or logit link is chosen, zero values can be replaced by bvalue. See loglink, logitlink, etc. for specific information about each link function.
- p.ar.lag
A positive integer, the lag for the autoregressive component. Called \(p\) below.
- q.ma.lag
A non-negative integer, the lag for the moving-average component. Called \(q\) below.
- coefstart
Starting values for the coefficients. Assigning this argument is highly recommended. For technical reasons, the argument coefstart in vglm cannot be used.
- step
Numeric. Step length, e.g., 0.5 means half-stepsizing.
Details
This function draws heavily on Benjamin et al. (1998).
See also Benjamin et al. (2003).
GARMA models extend the ARMA time series model to generalized
responses in the exponential family, e.g., Poisson counts,
binary responses. Currently, this function is rudimentary and
can handle only certain continuous, count and binary responses.
The user must choose an appropriate link for the link argument.
The GARMA(\(p, q\)) model is defined by firstly having a response belonging to the exponential family
$$f(y_t|D_t) = \exp \left\{ \frac{y_t \theta_t - b(\theta_t)}{\phi / A_t} + c(y_t, \phi / A_t) \right\}$$
where \(\theta_t\) and \(\phi\) are the canonical and scale parameters respectively, and \(A_t\) are known prior weights. The mean \(\mu_t=E(Y_t|D_t)=b'(\theta_t)\) is related to the linear predictor \(\eta_t\) by the link function \(g\). Here, \(D_t=\{x_t,\ldots,x_1,y_{t-1},\ldots,y_1,\mu_{t-1},\ldots,\mu_1\}\) is the previous information set.
Secondly, the GARMA(\(p, q\)) model is defined by
$$g(\mu_t) = \eta_t = x_t^T \beta + \sum_{k=1}^p \phi_k (g(y_{t-k}) - x_{t-k}^T \beta) + \sum_{k=1}^q \theta_k (g(y_{t-k}) - \eta_{t-k}).$$
Here \(\phi_k\) are the autoregressive coefficients and \(\theta_k\) are the moving-average coefficients; they are distinct from the scale parameter \(\phi\) and the canonical parameter \(\theta_t\) of the first equation. The parameter vectors \(\beta\), \(\phi=(\phi_1,\ldots,\phi_p)^T\) and \(\theta=(\theta_1,\ldots,\theta_q)^T\) are estimated by maximum likelihood.
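The recursion for \(\eta_t\) can be sketched in base R as follows. This is only an illustration of the defining equation, not the code used inside VGAM: the function garma_eta() and its argument names are hypothetical, the log link with a bvalue replacement for non-positive responses is assumed by default, and the first \(\max(p, q)\) linear predictors are assumed to be initialised at \(x_t^T \beta\).

```r
# Sketch of the GARMA(p, q) linear predictor recursion.
# garma_eta() is a hypothetical helper, not part of VGAM.
garma_eta <- function(y, X, beta, phi = numeric(0), theta = numeric(0),
                      g = log, bvalue = 0.1) {
  n <- length(y)
  p <- length(phi)    # autoregressive order
  q <- length(theta)  # moving-average order
  m <- max(p, q)
  stopifnot(nrow(X) == n, length(beta) == ncol(X), n > m)
  xb <- drop(X %*% beta)              # x_t^T beta for every t
  gy <- g(ifelse(y <= 0, bvalue, y))  # g(y_t), non-positive y moved to bvalue
  eta <- rep(NA_real_, n)
  # Assumption: start-up values of eta are set to x_t^T beta.
  if (m > 0) eta[seq_len(m)] <- xb[seq_len(m)]
  for (t in (m + 1):n) {
    ar <- if (p > 0) sum(phi * (gy[t - seq_len(p)] - xb[t - seq_len(p)])) else 0
    ma <- if (q > 0) sum(theta * (gy[t - seq_len(q)] - eta[t - seq_len(q)])) else 0
    eta[t] <- xb[t] + ar + ma
  }
  eta
}

# Intercept-only GARMA(2, 0) with a log link, using illustrative coefficients
y <- c(68, 41, 82, 66, 101)
X <- matrix(1, nrow = length(y), ncol = 1)
eta <- garma_eta(y, X, beta = 4, phi = c(0.3, 0.4))
mu <- exp(eta)  # conditional means on the response scale
```

With \(q = 0\) the moving-average sum vanishes and each \(\eta_t\) depends only on the previous \(p\) transformed responses, which is the situation in the intercept-only example on this page.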
Value
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm.
References
Benjamin, M. A., Rigby, R. A. and Stasinopoulos, M. D. (1998). Fitting Non-Gaussian Time Series Models. Pages 191–196 in: Proceedings in Computational Statistics COMPSTAT 1998, edited by Payne, R. and Green, P. J. Physica-Verlag.
Benjamin, M. A., Rigby, R. A. and Stasinopoulos, M. D. (2003). Generalized Autoregressive Moving Average Models. Journal of the American Statistical Association, 98: 214–223.
Zeger, S. L. and Qaqish, B. (1988). Markov regression models for time series: a quasi-likelihood approach. Biometrics, 44: 1019–1031.
Note
This function is unpolished and requires many improvements.
In particular, initialization is very poor.
Results appear very sensitive to quality of initial values.
Limited experience has shown that half-stepsizing is
often needed for convergence, so choosing crit = "coef"
is not recommended.
Overdispersion is not handled.
For binomial responses it is currently best to input a vector
of 1s and 0s rather than a cbind(successes, failures) matrix,
because the initialize slot is rudimentary.
Warning
This VGAM family function is 'non-standard' in that the model does need some coercing to get it into the VGLM framework. Special code is required to get it running. A consequence is that some methods functions may give wrong results when applied to the fitted object.
Examples
gdata <- data.frame(interspike = c(68, 41, 82, 66, 101, 66, 57, 41, 27, 78,
59, 73, 6, 44, 72, 66, 59, 60, 39, 52,
50, 29, 30, 56, 76, 55, 73, 104, 104, 52,
25, 33, 20, 60, 47, 6, 47, 22, 35, 30,
29, 58, 24, 34, 36, 34, 6, 19, 28, 16,
36, 33, 12, 26, 36, 39, 24, 14, 28, 13,
2, 30, 18, 17, 28, 9, 28, 20, 17, 12,
19, 18, 14, 23, 18, 22, 18, 19, 26, 27,
23, 24, 35, 22, 29, 28, 17, 30, 34, 17,
20, 49, 29, 35, 49, 25, 55, 42, 29, 16)) # See Zeger and Qaqish (1988)
gdata <- transform(gdata, spikenum = seq(interspike))
bvalue <- 0.1 # .Machine$double.xmin # Boundary value
fit <- eval(substitute(
  vglm(interspike ~ 1, trace = TRUE, data = gdata,
       garma(paste0("loglink(bvalue = ", .bvalue, ")"),
             p = 2, coefstart = c(4, 0.3, 0.4))),
  list(.bvalue = bvalue)))
#> Iteration 1: loglikelihood = -33517179
#> Taking a modified step.....
#> Iteration 1 : loglikelihood = 9063.765
#> Iteration 2: loglikelihood = 9039.4617
#> Taking a modified step.
#> Iteration 2 : loglikelihood = 9294.1272
#> Iteration 3: loglikelihood = 8776.2807
#> Taking a modified step..
#> Iteration 3 : loglikelihood = 9311.6036
#> Iteration 4: loglikelihood = 9236.8891
#> Taking a modified step..
#> Iteration 4 : loglikelihood = 9325.2546
#> Iteration 5: loglikelihood = 9303.4811
#> Taking a modified step..
#> Iteration 5 : loglikelihood = 9327.5058
#> Iteration 6: loglikelihood = 9323.9082
#> Taking a modified step.
#> Iteration 6 : loglikelihood = 9327.635
#> Iteration 7: loglikelihood = 9318.4243
#> Taking a modified step...
#> Iteration 7 : loglikelihood = 9328.5283
#> Iteration 8: loglikelihood = 9329.2394
#> Iteration 9: loglikelihood = 9316.0423
#> Taking a modified step..
#> Iteration 9 : loglikelihood = 9329.4185
#> Iteration 10: loglikelihood = 9329.8789
#> Iteration 11: loglikelihood = 9330.6516
#> Iteration 12: loglikelihood = 9330.6545
#> Iteration 13: loglikelihood = 9330.6617
#> Iteration 14: loglikelihood = 9330.6617
summary(fit)
#>
#> Call:
#> vglm(formula = interspike ~ 1, family = garma(paste0("loglink(bvalue = ",
#> 0.1, ")"), p = 2, coefstart = c(4, 0.3, 0.4)), data = gdata,
#> trace = TRUE)
#>
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)  3.71331    0.04117  90.191  < 2e-16 ***
#> (lag1)       0.33723    0.03076  10.962  < 2e-16 ***
#> (lag2)       0.24180    0.02988   8.094 5.79e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: loglink(mu)
#>
#> Log-likelihood: 9330.662 on 95 degrees of freedom
#>
#> Number of Fisher scoring iterations: 14
#>
coef(fit, matrix = TRUE)
#>             loglink(mu)
#> (Intercept)   3.7133065
#> (lag1)        0.3372322
#> (lag2)        0.2418008
Coef(fit)  # A bug here: the inverse link appears to be applied to the lag coefficients too
#>        mu      <NA>      <NA>
#> 40.989114  1.401064  1.273541
if (FALSE) {  # \dontrun{}: the plotting code is not executed
  with(gdata, plot(interspike, ylim = c(0, 120), las = 1,
       xlab = "Spike Number", ylab = "Inter-Spike Time (ms)", col = "blue"))
  with(gdata, lines(spikenum[-(1:fit@misc$plag)], fitted(fit), col = "orange"))
  abline(h = mean(with(gdata, interspike)), lty = "dashed", col = "gray")
}