Multinomial Logit Model
Fits a multinomial logit model (MLM) to a (preferably unordered) factor response.
Usage
multinomial(zero = NULL, parallel = FALSE, nointercept = NULL,
refLevel = "(Last)", ynames = FALSE,
imethod = 1, imu = NULL, byrow.arg = FALSE,
sumcon = FALSE, Thresh = NULL, Trev = FALSE,
Tref = if (Trev) "M" else 1, Intercept = NULL,
whitespace = FALSE)
Arguments
- zero
Can be an integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. Any values must be from the set {1,2,...,\(M\)}. The default value means none are modelled as intercept-only terms. See CommonVGAMffArguments for more information.
- parallel
A logical, or formula specifying which terms have equal/unequal coefficients.
- ynames
Logical. If
TRUEthen"mu[,1]"is replaced by the probability of the first named response category, etc. (e.g.,"P[normal]"), so that the output is more readable, albeit less compact. This is seen in output such aspredict(fit)andcoef(fit, matrix = TRUE). Of course,"mu"stands for the fitted probabilities, and it remains the default for upward compatibility and predictability.- nointercept, whitespace
See CommonVGAMffArguments for details.
- imu, byrow.arg
See CommonVGAMffArguments for details.
- sumcon
Logical. Use summation constraints? If TRUE then the baseline linear predictor is the negative of the sum of the other linear predictors. That is, instead of the baseline eta being 0, it is minus the sum of the others. This parameterization is symmetric and more suitable for techniques such as the LASSO, where regression coefficients are shrunk towards zero. If TRUE then the fitted values and log-likelihood are unchanged; however, coef and predict, etc. will return different answers, of course. The link multilogitlink has been modified to allow for sumcon.
- refLevel
Either (1) a single positive integer, (2) a value of the factor, or (3) a character string. If inputted as an integer then it specifies which column of the response matrix is the reference or baseline level. The default is the last one (the \((M+1)\)th one). If used, this argument will usually be assigned the value 1. If inputted as a value of a factor then beware of missing values of certain levels of the factor (drop.unused.levels = TRUE or drop.unused.levels = FALSE). See the example below. If inputted as a character string then this should be equal to (A) one of the levels of the factor response, else (B) one of the column names of the matrix response of counts; e.g., vglm(cbind(normal, mild, severe) ~ let, multinomial(refLevel = "severe"), data = pneumo) if it were applied (incorrectly, because the response is ordinal) to the pneumo data set. Another example is vglm(ethnicity ~ age, multinomial(refLevel = "European"), data = xs.nz) if applied to xs.nz.
- imethod
Choosing 2 will use the mean sample proportions of each column of the response matrix, which corresponds to the MLEs for intercept-only models. See CommonVGAMffArguments for more details.
- Thresh, Trev, Tref, Intercept
Same as
cumulative. Because these arguments concern the intercepts, they should not be confused with the stereotype model where they would be applied to the A matrix instead.
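The effect of sumcon can be illustrated with plain arithmetic: centring the corner-constrained predictors (where the baseline eta is 0) so that all \(M+1\) of them sum to zero leaves the implied probabilities unchanged, because the inverse multilogit transform is invariant to adding a constant to every eta. A minimal base-R sketch with made-up numbers (no VGAM needed):

```r
# Corner (default) parameterization: the baseline eta is 0 by definition.
eta.corner <- c(0.5, -0.3, 0)            # illustrative etas, M = 2

# Summation-constraint version: subtract the mean so the etas sum to 0;
# the baseline eta is then minus the sum of the other two.
eta.sumcon <- eta.corner - mean(eta.corner)
sum(eta.sumcon)                           # 0

# Both parameterizations give identical fitted probabilities:
softmax <- function(eta) exp(eta) / sum(exp(eta))
all.equal(softmax(eta.corner), softmax(eta.sumcon))  # TRUE
```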
Details
In this help file the response \(Y\) is assumed to be a factor with unordered values \(1,2,\dots,M+1\), so that \(M\) is the number of linear/additive predictors \(\eta_j\).
The default model can be written
$$\eta_j = \log(P[Y=j]/ P[Y=M+1])$$
where \(\eta_j\) is the \(j\)th
linear/additive predictor.
Here, \(j=1,\ldots,M\), and
\(\eta_{M+1}\)
is 0 by definition. That is, the last level
of the factor,
or last column of the response matrix, is
taken as the
reference level or baseline—this is for
identifiability
of the parameters. The reference or
baseline level can
be changed with the refLevel argument.
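To make the default parameterization concrete, here is a small base-R sketch (the probabilities are made up) showing the multilogit transform and its inverse, with the baseline eta fixed at 0:

```r
probs <- c(0.2, 0.3, 0.5)          # P[Y = j] for j = 1, ..., M+1, with M = 2
M <- length(probs) - 1

# Default model: eta_j = log(P[Y = j] / P[Y = M+1]), j = 1, ..., M
eta <- log(probs[1:M] / probs[M + 1])

# Inverse (softmax, appending the baseline eta of 0) recovers the probabilities
expeta <- exp(c(eta, 0))
all.equal(expeta / sum(expeta), probs)   # TRUE
```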
In almost all the literature, the
constraint matrices associated with this
family of models are known. For example,
setting parallel = TRUE will make
all constraint matrices (including the
intercept) equal to a vector of \(M\)
1's; to suppress the intercepts from
being parallel, set parallel =
FALSE ~ 1. If the constraint matrices
are unknown and to be estimated, then this
can be achieved by fitting the model as
a reduced-rank vector generalized linear
model (RR-VGLM; see rrvglm).
In particular, a multinomial logit model
with unknown constraint matrices is known as
a stereotype model (Anderson, 1984),
and can be fitted with rrvglm.
The above details correspond to the ordinary MLM where all the levels are altered (in the terminology of GAITD regression).
Value
An object of class "vglmff"
(see vglmff-class).
The object is used by modelling functions
such as vglm,
rrvglm and vgam.
References
Agresti, A. (2013). Categorical Data Analysis, 3rd ed. Hoboken, NJ, USA: Wiley.
Anderson, J. A. (1984). Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B, Methodological, 46, 1–30.
Hastie, T. J., Tibshirani, R. J. and Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed. New York, USA: Springer-Verlag.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Tutz, G. (2012). Regression for Categorical Data, Cambridge: Cambridge University Press.
Yee, T. W. and Hastie, T. J. (2003). Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
Yee, T. W. (2010). The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1–34. doi:10.18637/jss.v032.i10 .
Yee, T. W. and Ma, C. (2024). Generally altered, inflated, truncated and deflated regression. Statistical Science, 39, 568–588.
Note
The response should be either a matrix of counts
(with row sums that are all positive), or a
factor. In both cases, the y slot returned by
vglm/vgam/rrvglm
is the matrix of sample proportions.
The multinomial logit model is more appropriate for a nominal
(unordered) factor response than for an
ordinal (ordered) factor
response.
Models more suited for the latter include those based on
cumulative probabilities, e.g., cumulative.
multinomial is prone to numerical difficulties if
the groups are separable and/or the fitted probabilities
are close to 0 or 1. The fitted values returned
are estimates of the probabilities \(P[Y=j]\) for
\(j=1,\ldots,M+1\). See safeBinaryRegression
for the logistic regression case.
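For intuition on the separation problem in the binary (\(M=1\)) special case, the following base-R sketch (with toy data) fits a logistic regression to perfectly separated responses; the slope estimate diverges towards infinity and glm warns that fitted probabilities are numerically 0 or 1. The same pathology affects multinomial when the groups are separable.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)     # perfectly separated at x = 3.5
fit <- suppressWarnings(glm(y ~ x, family = binomial))
coef(fit)[["x"]]             # a huge slope: the finite MLE does not exist
```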
Here is an example of the usage of the parallel
argument. If there are covariates x2, x3
and x4, then parallel = TRUE ~ x2 + x3 -
1 and parallel = FALSE ~ x4 are equivalent. This
would constrain the regression coefficients for x2
and x3 to be equal; those of the intercepts and
x4 would be different.
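The constraint matrices implied by this specification can be written down directly. A base-R sketch for \(M=3\) of what constraints() on such a fit would be expected to return under the usual VGAM conventions (a column of 1's for a parallel term, an identity matrix for an unconstrained one):

```r
M <- 3
ones <- matrix(1, M, 1)   # "parallel": one coefficient shared across all etas
free <- diag(M)           # "non-parallel": a separate coefficient per eta

# parallel = TRUE ~ x2 + x3 - 1  (equivalently, parallel = FALSE ~ x4):
clist <- list("(Intercept)" = free, x2 = ones, x3 = ones, x4 = free)
sapply(clist, ncol)       # coefficients per term: 3, 1, 1, 3
```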
In Example 4 below, a conditional logit model is
fitted to an artificial data set that explores how
cost and travel time affect people's decision about
how to travel to work. Walking is the baseline group.
The variable Cost.car is the difference between
the cost of travel to work by car and walking, etc. The
variable Time.car is the difference between
the travel duration/time to work by car and walking,
etc. For other details about the xij argument see
vglm.control and fill1.
The multinom function in the
nnet package uses the first level of the factor as
baseline, whereas the last level of the factor is used
here. Consequently the estimated regression coefficients
differ.
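The two sets of coefficients are linearly related, not contradictory. For an intercept-only model the relationship can be checked by hand; a base-R sketch with made-up probabilities:

```r
probs <- c(0.1, 0.3, 0.6)                 # three response levels

# VGAM default: last level is the baseline
eta.last  <- log(probs[1:2] / probs[3])   # log(p1/p3), log(p2/p3)
# nnet::multinom convention: first level is the baseline
eta.first <- log(probs[2:3] / probs[1])   # log(p2/p1), log(p3/p1)

# Each convention is a linear transformation of the other:
all.equal(eta.first,
          c(eta.last[2] - eta.last[1], -eta.last[1]))  # TRUE
```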
Warning
No check is made to verify that the response is nominal.
Argument sumcon is a recent addition
and may have limited functionality in other
parts of VGAM.
It has not been tested extensively.
See CommonVGAMffArguments for more warnings.
See also
multilogitlink,
margeff,
cumulative,
acat,
cratio,
sratio,
CM.equid,
CommonVGAMffArguments,
dirichlet,
dirmultinomial,
copsd,
rrvglm,
fill1,
Multinomial,
gaitdpoisson,
Gaitdpois,
iris.
Examples
# Example 1: Regn spline VGAM: marital status versus age
data(marital.nz)
ooo <- with(marital.nz, order(age))
om.nz <- marital.nz[ooo, ]
fit1 <- vglm(mstatus ~ sm.bs(age), multinomial, om.nz)
coef(fit1, matrix = TRUE) # Mostly meaningless
#> log(mu[,1]/mu[,4]) log(mu[,2]/mu[,4]) log(mu[,3]/mu[,4])
#> (Intercept) 5.844418 9.502096 13.10287
#> sm.bs(age)1 -3.988086 -5.655932 -17.19862
#> sm.bs(age)2 -4.085133 -5.650005 -10.69507
#> sm.bs(age)3 -9.277378 -9.292940 -16.56821
if (FALSE) {  # \dontrun{}
  with(om.nz,
       matplot(age, fitted(fit1), type = "l", las = 1, lwd = 2))
  legend("topright", leg = colnames(fitted(fit1)),
         lty = 1:4, col = 1:4, lwd = 2)
}
# Example 2a: a simple example
set.seed(123)
ycounts <- t(rmultinom(10, size = 20, prob = c(0.1, 0.2, 0.8)))
fit2 <- fit <- vglm(ycounts ~ 1, multinomial)
head(fitted(fit2)) # Proportions
#> [,1] [,2] [,3]
#> 1 0.09 0.21 0.7
#> 2 0.09 0.21 0.7
#> 3 0.09 0.21 0.7
#> 4 0.09 0.21 0.7
#> 5 0.09 0.21 0.7
#> 6 0.09 0.21 0.7
fit2@prior.weights # NOT recommended for the prior weights
#> [,1]
#> 1 20
#> 2 20
#> 3 20
#> 4 20
#> 5 20
#> 6 20
#> 7 20
#> 8 20
#> 9 20
#> 10 20
weights(fit2, type = "prior", matrix = FALSE) # Better
#> [1] 20 20 20 20 20 20 20 20 20 20
depvar(fit2) # Sample proportions; same as fit2@y
#> [,1] [,2] [,3]
#> 1 0.05 0.25 0.70
#> 2 0.05 0.30 0.65
#> 3 0.20 0.05 0.75
#> 4 0.10 0.30 0.60
#> 5 0.10 0.15 0.75
#> 6 0.20 0.15 0.65
#> 7 0.10 0.20 0.70
#> 8 0.00 0.30 0.70
#> 9 0.05 0.05 0.90
#> 10 0.05 0.35 0.60
constraints(fit2) # Constraint matrices
#> $`(Intercept)`
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 0 1
#>
# Example 2b: Different reference level used as the baseline
Fit2 <- vglm(ycounts ~ 1, multinomial(refLevel = 2))
coef(Fit2, matrix = TRUE)
#> log(mu[,1]/mu[,2]) log(mu[,3]/mu[,2])
#> (Intercept) -0.8472979 1.203973
coef(fit2, matrix = TRUE) # Easy to reconcile this output with fit2
#> log(mu[,1]/mu[,3]) log(mu[,2]/mu[,3])
#> (Intercept) -2.051271 -1.203973
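The reconciliation mentioned in the comment is simple arithmetic: log(mu[,1]/mu[,2]) = log(mu[,1]/mu[,3]) - log(mu[,2]/mu[,3]), and log(mu[,3]/mu[,2]) = -log(mu[,2]/mu[,3]). Checking this against the printed intercepts:

```r
# Intercepts from fit2 (baseline = column 3):
b <- c(-2.051271, -1.203973)   # log(mu[,1]/mu[,3]), log(mu[,2]/mu[,3])
# Intercepts of Fit2 (baseline = column 2), reconstructed from fit2:
c(b[1] - b[2], -b[2])          # matches -0.8472979 and 1.203973 (to rounding)
```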
# Example 3: The response is a factor.
nn <- 10
dframe3 <- data.frame(yfac = gl(3, nn, labels = c("Ctrl",
"Trt1", "Trt2")),
x2 = runif(3 * nn))
myrefLevel <- with(dframe3, yfac[12])
fit3a <- vglm(yfac ~ x2, multinomial(refLevel = myrefLevel), dframe3)
fit3b <- vglm(yfac ~ x2, multinomial(refLevel = 2), dframe3)
coef(fit3a, matrix = TRUE) # "Trt1" is the reference level
#> log(mu[,1]/mu[,2]) log(mu[,3]/mu[,2])
#> (Intercept) -1.807237 0.436824
#> x2 3.441869 -1.175112
coef(fit3b, matrix = TRUE) # "Trt1" is the reference level
#> log(mu[,1]/mu[,2]) log(mu[,3]/mu[,2])
#> (Intercept) -1.807237 0.436824
#> x2 3.441869 -1.175112
margeff(fit3b)
#> , , 1
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4950576 0.2290373 0.2660203
#> x2 0.9788089 -0.4120195 -0.5667894
#>
#> , , 2
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.3426238 0.2095186 0.1331052
#> x2 0.6721213 -0.3951285 -0.2769927
#>
#> , , 3
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4963675 0.2337004 0.2626671
#> x2 0.9810607 -0.4222876 -0.5587731
#>
#> , , 4
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4940612 0.2458292 0.2482319
#> x2 0.9753363 -0.4498845 -0.5254518
#>
#> , , 5
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4671111 0.1890396 0.2780715
#> x2 0.9255765 -0.3263330 -0.5992434
#>
#> , , 6
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4857059 0.2118654 0.2738405
#> x2 0.9613295 -0.3748713 -0.5864582
#>
#> , , 7
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.2931742 0.04953142 0.2436428
#> x2 0.5842511 -0.03598836 -0.5482627
#>
#> , , 8
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.1943019 -0.01157052 0.2058725
#> x2 0.3884057 0.08946134 -0.4778670
#>
#> , , 9
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.3654223 0.2195678 0.1458545
#> x2 0.7173274 -0.4134287 -0.3038987
#>
#> , , 10
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4079155 0.2362363 0.1716792
#> x2 0.8018019 -0.4431046 -0.3586973
#>
#> , , 11
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4960243 0.2424638 0.2535605
#> x2 0.9796062 -0.4420170 -0.5375892
#>
#> , , 12
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4683427 0.2514489 0.2168937
#> x2 0.9227600 -0.4667361 -0.4560239
#>
#> , , 13
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.1286387 -0.04642839 0.1750671
#> x2 0.2578026 0.16014745 -0.4179501
#>
#> , , 14
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4311338 0.1540311 0.2771027
#> x2 0.8555703 -0.2527784 -0.6027919
#>
#> , , 15
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4823973 0.2512249 0.2311724
#> x2 0.9512401 -0.4638922 -0.4873479
#>
#> , , 16
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.2400012 0.01548118 0.2245200
#> x2 0.4790454 0.03406733 -0.5131128
#>
#> , , 17
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.3152900 0.06453318 0.2507568
#> x2 0.6279223 -0.06694222 -0.5609801
#>
#> , , 18
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.2507534 0.02214392 0.2286095
#> x2 0.5003420 0.02038381 -0.5207258
#>
#> , , 19
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.1916516 -0.01307447 0.2047261
#> x2 0.3831428 0.09253135 -0.4756742
#>
#> , , 20
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.3881626 0.1181436 0.2700191
#> x2 0.7713939 -0.1779839 -0.5934100
#>
#> , , 21
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.3875663 0.1176733 0.2698930
#> x2 0.7702230 -0.1770066 -0.5932164
#>
#> , , 22
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.3541030 0.09219713 0.2619059
#> x2 0.7044277 -0.12415505 -0.5802726
#>
#> , , 23
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.1976104 -0.009682748 0.2072932
#> x2 0.3949746 0.085606113 -0.4805807
#>
#> , , 24
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.1892192 -0.01444825 0.2036674
#> x2 0.3783120 0.09533454 -0.4736465
#>
#> , , 25
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.2517599 0.02277330 0.2289866
#> x2 0.5023349 0.01909055 -0.5214254
#>
#> , , 26
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4236145 0.1474158 0.2761988
#> x2 0.8408733 -0.2389546 -0.6019187
#>
#> , , 27
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.2757893 0.03809260 0.2376967
#> x2 0.5498855 -0.01242084 -0.5374646
#>
#> , , 28
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4360102 0.2451407 0.1908695
#> x2 0.8578655 -0.4581239 -0.3997415
#>
#> , , 29
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.1385825 -0.04150505 0.1800875
#> x2 0.2776102 0.15024667 -0.4278569
#>
#> , , 30
#>
#> Ctrl Trt1 Trt2
#> (Intercept) -0.4077311 0.1339539 0.2737772
#> x2 0.8097786 -0.2108783 -0.5989003
#>
# Example 4: Fit a rank-1 stereotype model
fit4 <- rrvglm(Country ~ Width + Height + HP, multinomial, car.all)
coef(fit4) # Contains the C matrix
#> (Intercept):1 (Intercept):2 (Intercept):3 (Intercept):4 (Intercept):5
#> 93.19720321 35.60514657 26.34618629 35.91677976 45.93673486
#> (Intercept):6 (Intercept):7 (Intercept):8 (Intercept):9 Width
#> 50.05605259 74.63263688 62.87879198 27.22815450 -1.65508571
#> Height HP
#> 0.20684556 0.02872421
constraints(fit4)$HP # The A matrix
#> [,1]
#> [1,] 1.0000000
#> [2,] 0.3941497
#> [3,] 0.2926141
#> [4,] 0.3737687
#> [5,] 0.4650100
#> [6,] 0.5202978
#> [7,] 0.7834292
#> [8,] 0.6651158
#> [9,] 0.2945323
coef(fit4, matrix = TRUE) # The B matrix
#> log(mu[,1]/mu[,10]) log(mu[,2]/mu[,10]) log(mu[,3]/mu[,10])
#> (Intercept) 93.19720321 35.60514657 26.346186286
#> Width -1.65508571 -0.65235148 -0.484301482
#> Height 0.20684556 0.08152811 0.060525935
#> HP 0.02872421 0.01132164 0.008405111
#> log(mu[,4]/mu[,10]) log(mu[,5]/mu[,10]) log(mu[,6]/mu[,10])
#> (Intercept) 35.91677976 45.93673486 50.05605259
#> Width -0.61861918 -0.76963145 -0.86113742
#> Height 0.07731239 0.09618526 0.10762129
#> HP 0.01073621 0.01335705 0.01494514
#> log(mu[,7]/mu[,10]) log(mu[,8]/mu[,10]) log(mu[,9]/mu[,10])
#> (Intercept) 74.63263688 62.87879198 27.228154503
#> Width -1.29664253 -1.10082370 -0.487476229
#> Height 0.16204886 0.13757625 0.060922702
#> HP 0.02250339 0.01910493 0.008460209
Coef(fit4)@C # The C matrix
#> latvar
#> Width -1.65508569
#> Height 0.20684555
#> HP 0.02872421
concoef(fit4) # Better to get the C matrix this way
#> latvar
#> Width -1.65508569
#> Height 0.20684555
#> HP 0.02872421
Coef(fit4)@A # The A matrix
#> latvar
#> log(mu[,1]/mu[,10]) 1.0000000
#> log(mu[,2]/mu[,10]) 0.3941497
#> log(mu[,3]/mu[,10]) 0.2926141
#> log(mu[,4]/mu[,10]) 0.3737687
#> log(mu[,5]/mu[,10]) 0.4650100
#> log(mu[,6]/mu[,10]) 0.5202978
#> log(mu[,7]/mu[,10]) 0.7834292
#> log(mu[,8]/mu[,10]) 0.6651158
#> log(mu[,9]/mu[,10]) 0.2945323
svd(coef(fit4, matrix = TRUE)[-1, ])$d # Has rank 1; = C %*% t(A)
#> [1] 2.894480e+00 3.300153e-17 1.671748e-18
# Classification (but watch out for NAs in some of the variables):
apply(fitted(fit4), 1, which.max) # Classification
#> Acura Integra Acura Legend Audi 100
#> 5 10 10
#> Audi 80 BMW 325i BMW 535i
#> 5 5 5
#> Buick Century Buick Electra Buick Le Sabre
#> 10 10 10
#> Buick Riviera Cadillac Brougham Cadillac De Ville
#> 10 10 10
#> Cadillac Eldorado Chevrolet Astro Chevrolet Beretta
#> 10 10 10
#> Chevrolet Camaro Chevrolet Caprice Chevrolet Cavalier
#> 10 10 5
#> Chevrolet Corvette Chevrolet Lumina Chevrolet Lumina APV
#> 10 10 10
#> Chrysler Imperial Chrysler Le Baron Chrysler Le Baron Coupe
#> 10 5 10
#> Dodge Caravan Dodge Colt Dodge Daytona
#> 10 5 10
#> Dodge Dynasty Dodge Grand Caravan Dodge Omni
#> 10 10 5
#> Dodge Shadow Eagle Premier Eagle Summit
#> 5 10 5
#> Ford Aerostar Ford Escort Ford Festiva
#> 10 5 5
#> Ford LTD Crown Victoria Ford Mustang Ford Probe
#> 10 10 10
#> Ford Taurus Ford Tempo Ford Thunderbird
#> 10 10 10
#> GEO Metro GEO Prizm GEO Storm
#> 5 5 5
#> Honda Accord Honda Civic Honda Civic CRX
#> 5 5 5
#> Honda Prelude Hyundai Excel Hyundai Sonata
#> 5 5 10
#> Infiniti Q45 Lexus LS400 Lincoln Continental
#> 10 10 10
#> Lincoln Mark VII Lincoln Town Car Mazda 626
#> 10 10 5
#> Mazda 929 Mazda MPV Mazda MX-5 Miata
#> 5 10 5
#> Mazda MX-6 Mazda Protege Mazda RX7
#> 5 5 5
#> Mercedes-Benz 190 Mercedes-Benz 300E Mercury Tracer
#> 5 5 5
#> Mitsubishi Galant Mitsubishi Precis Mitsubishi Sigma
#> 5 5 5
#> Mitsubishi Wagon Nissan 240SX Nissan 300ZX
#> 5 5 10
#> Nissan Axxess Nissan Maxima Nissan Pulsar NX
#> 5 10 5
#> Nissan Sentra Nissan Stanza Nissan Van
#> 5 5 5
#> Peugeot 405 Peugeot 505 Plymouth Laser
#> 5 5 5
#> Pontiac Grand Am Pontiac LeMans Porsche 944
#> 5 5 5
#> Saab 900 Saab 9000 Sterling 827
#> 5 10 5
#> Subaru Justy Subaru Legacy Subaru Loyale
#> 7 5 5
#> Subaru XT Toyota Camry Toyota Celica
#> 5 5 10
#> Toyota Corolla Toyota Cressida Toyota Supra
#> 5 5 10
#> Toyota Tercel Volkswagen Corrado Volkswagen Fox
#> 5 5 5
#> Volkswagen GTI Volkswagen Golf Volkswagen Jetta
#> 5 5 5
#> Volkswagen Vanagon Volvo 240 Volvo 740
#> 10 5 10
# Classification:
colnames(fitted(fit4))[apply(fitted(fit4), 1, which.max)]
#> [1] "Japan" "USA" "USA" "Japan" "Japan" "Japan" "USA" "USA" "USA"
#> [10] "USA" "USA" "USA" "USA" "USA" "USA" "USA" "USA" "Japan"
#> [19] "USA" "USA" "USA" "USA" "Japan" "USA" "USA" "Japan" "USA"
#> [28] "USA" "USA" "Japan" "Japan" "USA" "Japan" "USA" "Japan" "Japan"
#> [37] "USA" "USA" "USA" "USA" "USA" "USA" "Japan" "Japan" "Japan"
#> [46] "Japan" "Japan" "Japan" "Japan" "Japan" "USA" "USA" "USA" "USA"
#> [55] "USA" "USA" "Japan" "Japan" "USA" "Japan" "Japan" "Japan" "Japan"
#> [64] "Japan" "Japan" "Japan" "Japan" "Japan" "Japan" "Japan" "Japan" "USA"
#> [73] "Japan" "USA" "Japan" "Japan" "Japan" "Japan" "Japan" "Japan" "Japan"
#> [82] "Japan" "Japan" "Japan" "Japan" "USA" "Japan" "Korea" "Japan" "Japan"
#> [91] "Japan" "Japan" "USA" "Japan" "Japan" "USA" "Japan" "Japan" "Japan"
#> [100] "Japan" "Japan" "Japan" "USA" "Japan" "USA"
apply(predict(fit4, car.all, type = "response"),
1, which.max) # Ditto
#> Acura Integra Acura Legend Audi 100
#> 5 10 10
#> Audi 80 BMW 325i BMW 535i
#> 5 5 5
#> Buick Century Buick Electra Buick Le Sabre
#> 10 10 10
#> Buick Regal Buick Riviera Cadillac Brougham
#> 10 10 10
#> Cadillac De Ville Cadillac Eldorado Chevrolet Astro
#> 10 10 10
#> Chevrolet Beretta Chevrolet Camaro Chevrolet Caprice
#> 10 10 10
#> Chevrolet Cavalier Chevrolet Corvette Chevrolet Lumina
#> 5 10 10
#> Chevrolet Lumina APV Chrysler Imperial Chrysler Le Baron
#> 10 10 5
#> Chrysler Le Baron Coupe Dodge Caravan Dodge Colt
#> 10 10 5
#> Dodge Daytona Dodge Dynasty Dodge Grand Caravan
#> 10 10 10
#> Dodge Omni Dodge Shadow Dodge Spirit
#> 5 5 5
#> Eagle Premier Eagle Summit Ford Aerostar
#> 10 5 10
#> Ford Escort Ford Festiva Ford LTD Crown Victoria
#> 5 5 10
#> Ford Mustang Ford Probe Ford Taurus
#> 10 10 10
#> Ford Tempo Ford Thunderbird GEO Metro
#> 10 10 5
#> GEO Prizm GEO Storm Honda Accord
#> 5 5 5
#> Honda Civic Honda Civic CRX Honda Prelude
#> 5 5 5
#> Hyundai Excel Hyundai Sonata Infiniti Q45
#> 5 10 10
#> Lexus LS400 Lincoln Continental Lincoln Mark VII
#> 10 10 10
#> Lincoln Town Car Mazda 626 Mazda 929
#> 10 5 5
#> Mazda MPV Mazda MX-5 Miata Mazda MX-6
#> 10 5 5
#> Mazda Protege Mazda RX7 Mercedes-Benz 190
#> 5 5 5
#> Mercedes-Benz 300E Mercury Tracer Mitsubishi Galant
#> 5 5 5
#> Mitsubishi Precis Mitsubishi Sigma Mitsubishi Wagon
#> 5 5 5
#> Nissan 240SX Nissan 300ZX Nissan Axxess
#> 5 10 5
#> Nissan Maxima Nissan Pulsar NX Nissan Sentra
#> 10 5 5
#> Nissan Stanza Nissan Van Peugeot 405
#> 5 5 5
#> Peugeot 505 Plymouth Laser Pontiac Bonneville
#> 5 5 10
#> Pontiac Grand Am Pontiac LeMans Porsche 944
#> 5 5 5
#> Saab 900 Saab 9000 Sterling 827
#> 5 10 5
#> Subaru Justy Subaru Legacy Subaru Loyale
#> 7 5 5
#> Subaru XT Toyota Camry Toyota Celica
#> 5 5 10
#> Toyota Corolla Toyota Cressida Toyota Supra
#> 5 5 10
#> Toyota Tercel Volkswagen Corrado Volkswagen Fox
#> 5 5 5
#> Volkswagen GTI Volkswagen Golf Volkswagen Jetta
#> 5 5 5
#> Volkswagen Vanagon Volvo 240 Volvo 740
#> 10 5 10
# Example 5: Using the xij argument (aka conditional logit model)
set.seed(111)
nn <- 100 # Number of people who travel to work
M <- 3 # There are M+1 models of transport to go to work
ycounts <- matrix(0, nn, M+1)
ycounts[cbind(1:nn, sample(x = M+1, size = nn, replace = TRUE))] <- 1
dimnames(ycounts) <- list(NULL, c("bus","train","car","walk"))
gotowork <- data.frame(cost.bus = runif(nn), time.bus = runif(nn),
cost.train= runif(nn), time.train= runif(nn),
cost.car = runif(nn), time.car = runif(nn),
cost.walk = runif(nn), time.walk = runif(nn))
gotowork <- round(gotowork, digits = 2) # For convenience
gotowork <- transform(gotowork,
Cost.bus = cost.bus - cost.walk,
Cost.car = cost.car - cost.walk,
Cost.train = cost.train - cost.walk,
Cost = cost.train - cost.walk, # for labelling
Time.bus = time.bus - time.walk,
Time.car = time.car - time.walk,
Time.train = time.train - time.walk,
Time = time.train - time.walk) # for labelling
fit5 <- vglm(ycounts ~ Cost + Time,
multinomial(parall = TRUE ~ Cost + Time - 1),
xij = list(Cost ~ Cost.bus + Cost.train + Cost.car,
Time ~ Time.bus + Time.train + Time.car),
form2 = ~ Cost + Cost.bus + Cost.train + Cost.car +
Time + Time.bus + Time.train + Time.car,
data = gotowork, trace = TRUE)
#> Iteration 1: deviance = 269.35663
#> Iteration 2: deviance = 269.31367
#> Iteration 3: deviance = 269.31367
head(model.matrix(fit5, type = "lm")) # LM model matrix
#> (Intercept) Cost Time
#> 1 1 0.13 0.26
#> 2 1 0.27 -0.07
#> 3 1 0.46 0.17
#> 4 1 0.03 -0.24
#> 5 1 -0.01 -0.17
#> 6 1 -0.18 0.48
head(model.matrix(fit5, type = "vlm")) # Big VLM model matrix
#> (Intercept):1 (Intercept):2 (Intercept):3 Cost Time
#> 1:1 1 0 0 -0.37 0.27
#> 1:2 0 1 0 0.13 0.26
#> 1:3 0 0 1 -0.63 0.68
#> 2:1 1 0 0 0.42 0.24
#> 2:2 0 1 0 0.27 -0.07
#> 2:3 0 0 1 0.36 0.23
coef(fit5)
#> (Intercept):1 (Intercept):2 (Intercept):3 Cost Time
#> 0.4511598 0.1050291 -0.1548926 0.3946012 0.6257691
coef(fit5, matrix = TRUE)
#> log(mu[,1]/mu[,4]) log(mu[,2]/mu[,4]) log(mu[,3]/mu[,4])
#> (Intercept) 0.4511598 0.1050291 -0.1548926
#> Cost 0.3946012 0.3946012 0.3946012
#> Time 0.6257691 0.6257691 0.6257691
constraints(fit5)
#> $`(Intercept)`
#> [,1] [,2] [,3]
#> [1,] 1 0 0
#> [2,] 0 1 0
#> [3,] 0 0 1
#>
#> $Cost
#> [,1]
#> [1,] 1
#> [2,] 1
#> [3,] 1
#>
#> $Time
#> [,1]
#> [1,] 1
#> [2,] 1
#> [3,] 1
#>
summary(fit5)
#>
#> Call:
#> vglm(formula = ycounts ~ Cost + Time, family = multinomial(parall = TRUE ~
#> Cost + Time - 1), data = gotowork, form2 = ~Cost + Cost.bus +
#> Cost.train + Cost.car + Time + Time.bus + Time.train + Time.car,
#> xij = list(Cost ~ Cost.bus + Cost.train + Cost.car, Time ~
#> Time.bus + Time.train + Time.car), trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept):1 0.4512 0.2758 1.636 0.102
#> (Intercept):2 0.1050 0.2941 0.357 0.721
#> (Intercept):3 -0.1549 0.3147 -0.492 0.623
#> Cost 0.3946 0.4020 0.982 0.326
#> Time 0.6258 0.4305 1.454 0.146
#>
#> Names of linear predictors: log(mu[,1]/mu[,4]), log(mu[,2]/mu[,4]),
#> log(mu[,3]/mu[,4])
#>
#> Residual deviance: 269.3137 on 295 degrees of freedom
#>
#> Log-likelihood: -134.6568 on 295 degrees of freedom
#>
#> Number of Fisher scoring iterations: 3
#>
#>
#> Reference group is level 4 of the response
max(abs(predict(fit5) - predict(fit5, new = gotowork))) # 0?
#> [1] 4.312523e-15