Fitting Generalized Linear Models (using S4)

glm4, very similarly as standard R's glm() is used to fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution.

It is more general, as it fits linear, generalized linear, non-linear and generalized nonlinear models.

glm4(formula, family, data, weights, subset, na.action,
     start = NULL, etastart, mustart, offset,
     sparse = FALSE, drop.unused.levels = FALSE, doFit = TRUE,
     control = list(...),
     model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ...)

Arguments

formula: an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
family: a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which glm is called.
weights: an optional vector of ‘prior weights’ to be used in the fitting process. Should be NULL or a numeric vector.
subset: an optional vector specifying a subset of observations to be used in the fitting process.
na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
start, etastart, mustart: starting values for the parameters in the linear predictor, the predictor itself and for the vector of means.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.
sparse: logical indicating if the model matrix should be sparse or not.
drop.unused.levels: used only when sparse is TRUE: Should factors have unused levels dropped? (This used to be true, implicitly in the first versions up to July 2010; the default has been changed for compatibility with R's standard (dense) model.matrix().
doFit: logical indicating if the model should be fitted (or just returned unfitted).
control: a list with options on fitting; currently passed unchanged to (hidden) function IRLS().
model, x, y: currently ignored; here for back compatibility with glm.
contrasts: passed to model.Matrix(.., contrasts.arg = contrasts), see its documentation.
...: potentially arguments passed on to fitter functions; not used currently.

Value

an object of class glpModel.

Examples

### All the following is very experimental -- and probably will change: -------

data(CO2, package="datasets")
## dense linear model
str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE), 4)
#> Formal class 'glpModel' [package "MatrixModels"] with 4 slots
#>   ..@ resp    :Formal class 'respModule' [package "MatrixModels"] with 7 slots
#>   .. .. ..@ mu     : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. .. ..@ offset : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. .. ..@ sqrtXwt: num [1:84, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>   .. .. ..@ sqrtrwt: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
#>   .. .. ..@ weights: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
#>   .. .. ..@ wtres  : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. .. ..@ y      : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
#>   ..@ pred    :Formal class 'dPredModule' [package "MatrixModels"] with 4 slots
#>   .. .. ..@ X   :Formal class 'ddenseModelMatrix' [package "MatrixModels"] with 6 slots
#>   .. .. ..@ fac :Formal class 'Cholesky' [package "Matrix"] with 5 slots
#>   .. .. ..@ coef: num [1:4] 0 0 0 0
#>   .. .. ..@ Vtr : num [1:4] 0 0 0 0
#>   ..@ call    : language glm4(formula = uptake ~ 0 + Type * Treatment, data = CO2, doFit = FALSE)
#>   ..@ fitProps: list()
## sparse linear model
str(glm4(uptake ~ 0 + Type*Treatment, data=CO2, doFit = FALSE,
                  sparse = TRUE), 4)
#> Formal class 'glpModel' [package "MatrixModels"] with 4 slots
#>   ..@ resp    :Formal class 'respModule' [package "MatrixModels"] with 7 slots
#>   .. .. ..@ mu     : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. .. ..@ offset : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. .. ..@ sqrtXwt: num [1:84, 1] 1 1 1 1 1 1 1 1 1 1 ...
#>   .. .. ..@ sqrtrwt: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
#>   .. .. ..@ weights: num [1:84] 1 1 1 1 1 1 1 1 1 1 ...
#>   .. .. ..@ wtres  : num [1:84] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. .. ..@ y      : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
#>   ..@ pred    :Formal class 'sPredModule' [package "MatrixModels"] with 4 slots
#>   .. .. ..@ X   :Formal class 'dsparseModelMatrix' [package "MatrixModels"] with 8 slots
#>   .. .. ..@ fac :Formal class 'dCHMsimpl' [package "Matrix"] with 11 slots
#>   .. .. ..@ coef: num [1:4] 0 0 0 0
#>   .. .. ..@ Vtr : num [1:4] 0 0 0 0
#>   ..@ call    : language glm4(formula = uptake ~ 0 + Type * Treatment, data = CO2, sparse = TRUE,      doFit = FALSE)
#>   ..@ fitProps: list()

## From example(glm): -----------------

## Dobson (1990) Page 93: Randomized Controlled Trial :
str(trial <- data.frame(counts=c(18,17,15,20,10,20,25,13,12),
                        outcome=gl(3,1,9,labels=LETTERS[1:3]),
                        treatment=gl(3,3,labels=letters[1:3])))
#> 'data.frame':	9 obs. of  3 variables:
#>  $ counts   : num  18 17 15 20 10 20 25 13 12
#>  $ outcome  : Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3
#>  $ treatment: Factor w/ 3 levels "a","b","c": 1 1 1 2 2 2 3 3 3
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson, data=trial)
summary(glm.D93)
#> 
#> Call:
#> glm(formula = counts ~ outcome + treatment, family = poisson, 
#>     data = trial)
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
#> outcomeB    -4.543e-01  2.022e-01  -2.247   0.0246 *  
#> outcomeC    -2.930e-01  1.927e-01  -1.520   0.1285    
#> treatmentb  -2.414e-16  2.000e-01   0.000   1.0000    
#> treatmentc  -6.088e-16  2.000e-01   0.000   1.0000    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 10.5814  on 8  degrees of freedom
#> Residual deviance:  5.1291  on 4  degrees of freedom
#> AIC: 56.761
#> 
#> Number of Fisher Scoring iterations: 4
#> 
c.glm <- unname(coef(glm.D93))
glmM  <- glm4(counts ~ outcome + treatment, family = poisson, data=trial)
glmM2 <- update(glmM, quick = FALSE) # slightly more accurate
glmM3 <- update(glmM, quick = FALSE, finalUpdate = TRUE)
                 # finalUpdate has no effect on 'coef'
stopifnot( identical(glmM2@pred@coef, glmM3@pred@coef),
           all.equal(glmM @pred@coef, c.glm, tolerance=1e-7),
           all.equal(glmM2@pred@coef, c.glm, tolerance=1e-12))
#> Error : The following control arguments did not match any default's names:
#>    “fooBar”

## Watch the iterations --- and use no intercept --> more sparse X
## 1) dense generalized linear model
glmM <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
                      verbose = TRUE)
#> _1_ convergence criterion: 0.0990093
#> step = 1.00000, new wrss = 5.0726703, Delta(wrss)= 0.0499074, coef =
#> [1] 3.044512e+00 2.590484e+00 2.751707e+00 1.163981e-04 6.095497e-05
#> _2_ convergence criterion: 0.00107995
#> step = 1.00000, new wrss = 5.1721149, Delta(wrss)= 5.94911e-06, coef =
#> [1] 3.044522e+00 2.590267e+00 2.751535e+00 1.923493e-08 8.383392e-09
#> _3_ convergence criterion: 1.4393e-07
## 2) sparse generalized linear model
glmS <- glm4(counts ~ 0+outcome + treatment, poisson, trial,
                      verbose = TRUE, sparse = TRUE)
#> _1_ convergence criterion: 0.0990093
#> step = 1.00000, new wrss = 5.0726703, Delta(wrss)= 0.0499074, coef =
#> [1] 3.044512e+00 2.590484e+00 2.751707e+00 1.163981e-04 6.095497e-05
#> _2_ convergence criterion: 0.00107995
#> step = 1.00000, new wrss = 5.1721149, Delta(wrss)= 5.94911e-06, coef =
#> [1] 3.044522e+00 2.590267e+00 2.751535e+00 1.923493e-08 8.383392e-09
#> _3_ convergence criterion: 1.43931e-07
str(glmS, max.lev = 4)
#> Formal class 'glpModel' [package "MatrixModels"] with 4 slots
#>   ..@ resp    :Formal class 'glmRespMod' [package "MatrixModels"] with 10 slots
#>   .. .. ..@ family :List of 12
#>   .. .. .. ..- attr(*, "class")= chr "family"
#>   .. .. ..@ eta    : num [1:9] 3.04 2.59 2.75 3.04 2.59 ...
#>   .. .. ..@ n      : num [1:9] 1 1 1 1 1 1 1 1 1
#>   .. .. ..@ mu     : num [1:9] 21 13.3 15.7 21 13.3 ...
#>   .. .. ..@ offset : num [1:9] 0 0 0 0 0 0 0 0 0
#>   .. .. ..@ sqrtXwt: num [1:9, 1] 4.58 3.65 3.96 4.58 3.65 ...
#>   .. .. ..@ sqrtrwt: num [1:9] 0.218 0.274 0.253 0.218 0.274 ...
#>   .. .. ..@ weights: num [1:9] 1 1 1 1 1 1 1 1 1
#>   .. .. ..@ wtres  : num [1:9] -0.655 1.004 -0.168 -0.218 -0.913 ...
#>   .. .. ..@ y      : num [1:9] 18 17 15 20 10 20 25 13 12
#>   ..@ pred    :Formal class 'sPredModule' [package "MatrixModels"] with 4 slots
#>   .. .. ..@ X   :Formal class 'dsparseModelMatrix' [package "MatrixModels"] with 8 slots
#>   .. .. ..@ fac :Formal class 'dCHMsimpl' [package "Matrix"] with 11 slots
#>   .. .. ..@ coef: num [1:5] 3.04 2.59 2.75 1.92e-08 8.38e-09
#>   .. .. ..@ Vtr : num [1:5] -1.45e-07 -1.57e-06 -1.31e-06 -1.51e-06 -9.64e-07
#>   ..@ call    : language glm4(formula = counts ~ 0 + outcome + treatment, family = poisson, data = trial,      sparse = TRUE, verbose = TRUE)
#>   ..@ fitProps:List of 3
#>   .. ..$ convcrit : num 1.44e-07
#>   .. ..$ iter     : num 3
#>   .. ..$ nHalvings: num 0
stopifnot( all.equal(glmM@pred@coef, glmS@pred@coef),
           all.equal(glmM@pred@Vtr,  glmS@pred@Vtr) )


## A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
clotting <- data.frame(u = c(5,10,15,20,30,40,60,80,100),
                       lot1 = c(118,58,42,35,27,25,21,19,18),
                       lot2 = c(69,35,26,21,18,16,13,12,12))
str(gMN <- glm4(lot1 ~ log(u), data=clotting, family=Gamma, verbose=TRUE))
#> _1_ convergence criterion: 0.0455001
#> step = 1.00000, new wrss = 0.017051829, Delta(wrss)= 3.5175e-05, coef =
#> [1] -0.01655447  0.01534311
#> _2_ convergence criterion: 0.000110842
#> step = 1.00000, new wrss = 0.017122092, Delta(wrss)= 2.0547e-10, coef =
#> [1] -0.01655438  0.01534311
#> _3_ convergence criterion: 1.02e-09
#> Formal class 'glpModel' [package "MatrixModels"] with 4 slots
#>   ..@ resp    :Formal class 'glmRespMod' [package "MatrixModels"] with 10 slots
#>   .. .. ..@ family :List of 12
#>   .. .. .. ..$ family    : chr "Gamma"
#>   .. .. .. ..$ link      : chr "inverse"
#>   .. .. .. ..$ linkfun   :function (mu)  
#>   .. .. .. ..$ linkinv   :function (eta)  
#>   .. .. .. ..$ variance  :function (mu)  
#>   .. .. .. ..$ dev.resids:function (y, mu, wt)  
#>   .. .. .. ..$ aic       :function (y, n, mu, wt, dev)  
#>   .. .. .. ..$ mu.eta    :function (eta)  
#>   .. .. .. ..$ validmu   :function (mu)  
#>   .. .. .. ..$ valideta  :function (eta)  
#>   .. .. .. ..$ simulate  :function (object, nsim)  
#>   .. .. .. ..$ dispersion: num NA
#>   .. .. .. ..- attr(*, "class")= chr "family"
#>   .. .. ..@ eta    : num [1:9] 0.00814 0.01877 0.025 0.02941 0.03563 ...
#>   .. .. ..@ n      : num [1:9] 1 1 1 1 1 1 1 1 1
#>   .. .. ..@ mu     : num [1:9] 122.9 53.3 40 34 28.1 ...
#>   .. .. ..@ offset : num [1:9] 0 0 0 0 0 0 0 0 0
#>   .. .. ..@ sqrtXwt: num [1:9, 1] -122.9 -53.3 -40 -34 -28.1 ...
#>   .. .. ..@ sqrtrwt: num [1:9] 0.00814 0.01877 0.025 0.02941 0.03563 ...
#>   .. .. ..@ weights: num [1:9] 1 1 1 1 1 1 1 1 1
#>   .. .. ..@ wtres  : num [1:9] -0.0395 0.0889 0.0498 0.0293 -0.038 ...
#>   .. .. ..@ y      : num [1:9] 118 58 42 35 27 25 21 19 18
#>   ..@ pred    :Formal class 'dPredModule' [package "MatrixModels"] with 4 slots
#>   .. .. ..@ X   :Formal class 'ddenseModelMatrix' [package "MatrixModels"] with 6 slots
#>   .. .. .. .. ..@ Dim      : int [1:2] 9 2
#>   .. .. .. .. ..@ Dimnames :List of 2
#>   .. .. .. .. .. ..$ : chr [1:9] "1" "2" "3" "4" ...
#>   .. .. .. .. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
#>   .. .. .. .. ..@ x        : num [1:18] 1 1 1 1 1 ...
#>   .. .. .. .. ..@ factors  : list()
#>   .. .. .. .. ..@ assign   : int [1:2] 0 1
#>   .. .. .. .. ..@ contrasts: list()
#>   .. .. ..@ fac :Formal class 'Cholesky' [package "Matrix"] with 5 slots
#>   .. .. .. .. ..@ uplo    : chr "U"
#>   .. .. .. .. ..@ x       : num [1:4] 153 0 320 119
#>   .. .. .. .. ..@ perm    : int(0) 
#>   .. .. .. .. ..@ Dim     : int [1:2] 2 2
#>   .. .. .. .. ..@ Dimnames:List of 2
#>   .. .. .. .. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
#>   .. .. .. .. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
#>   .. .. ..@ coef: num [1:2] -0.0166 0.0153
#>   .. .. ..@ Vtr : num [1:2] 1.91e-08 3.43e-08
#>   ..@ call    : language glm4(formula = lot1 ~ log(u), family = Gamma, data = clotting, verbose = TRUE)
#>   ..@ fitProps:List of 3
#>   .. ..$ convcrit : num 1.02e-09
#>   .. ..$ iter     : num 3
#>   .. ..$ nHalvings: num 0
glm. <- glm(lot1 ~ log(u), data=clotting, family=Gamma)
stopifnot( all.equal(gMN@pred@coef, unname(coef(glm.)), tolerance=1e-7) )

Fitting Generalized Linear Models (using S4)

Arguments

Value

See also

Examples