
Consistent API for 'emmeans' and 'marginaleffects'
Source:R/get_emcontrasts.R, R/get_emmeans.R, R/get_emtrends.R, and 3 more
get_emmeans.RdThese functions are convenient wrappers around the emmeans and the
marginaleffects packages. They are mostly available for developers who want
to leverage a unified API for getting model-based estimates, and regular users
should use the estimate_* set of functions.
The get_emmeans(), get_emcontrasts() and get_emtrends() functions are
wrappers around emmeans::emmeans() and emmeans::emtrends().
Usage
get_emcontrasts(
model,
contrast = NULL,
by = NULL,
predict = NULL,
comparison = "pairwise",
keep_iterations = FALSE,
verbose = TRUE,
...
)
get_emmeans(
model,
by = "auto",
predict = NULL,
keep_iterations = FALSE,
verbose = TRUE,
...
)
get_emtrends(
model,
trend = NULL,
by = NULL,
predict = NULL,
keep_iterations = FALSE,
verbose = TRUE,
...
)
get_marginalcontrasts(
model,
contrast = NULL,
by = NULL,
predict = NULL,
ci = 0.95,
comparison = "pairwise",
estimate = NULL,
transform = NULL,
p_adjust = "none",
keep_iterations = FALSE,
verbose = TRUE,
...
)
get_marginalmeans(
model,
by = "auto",
predict = NULL,
ci = 0.95,
estimate = NULL,
transform = NULL,
keep_iterations = FALSE,
verbose = TRUE,
...
)
get_marginaltrends(
model,
trend = NULL,
by = NULL,
predict = NULL,
ci = 0.95,
estimate = NULL,
transform = NULL,
p_adjust = "none",
keep_iterations = FALSE,
verbose = TRUE,
...
)Arguments
- model
A statistical model.
- contrast
A character vector indicating the name of the variable(s) for which to compute the contrasts, optionally including representative values or levels at which contrasts are evaluated (e.g.,
contrast="x=c('a','b')").- by
The (focal) predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them).
bycan be a character (vector) naming the focal predictors, optionally including representative values or levels at which focal predictors are evaluated (e.g.,by = "x = c(1, 2)"). Whenestimateis not"average", thebyargument is used to create a "reference grid" or "data grid" with representative values for the focal predictors. In this case,bycan also be list of named elements. See details ininsight::get_datagrid()to learn more about how to create data grids for predictors of interest.- predict
Is passed to the
typeargument inemmeans::emmeans()(whenbackend = "emmeans") or inmarginaleffects::avg_predictions()(whenbackend = "marginaleffects"). Valid options forpredictare:backend = "marginaleffects":predictcan be"response","link","inverse_link"or any validtypeoption supported by model's classpredict()method (e.g., for zero-inflation models from package glmmTMB, you can choosepredict = "zprob"orpredict = "conditional"etc., see glmmTMB::predict.glmmTMB). By default, whenpredict = NULL, the most appropriate transformation is selected, which usually returns predictions or contrasts on the response-scale. The"inverse_link"is a special option, comparable to marginaleffects'invlink(link)option. It will calculate predictions on the link scale and then back-transform to the response scale.backend = "emmeans":predictcan be"response","link","mu","unlink", or"log". Ifpredict = NULL(default), the most appropriate transformation is selected (which usually is"response"). See also this vignette.
See also section Predictions on different scales.
- comparison
Specify the type of contrasts or tests that should be carried out.
When
backend = "emmeans", can be one of"pairwise","poly","consec","eff","del.eff","mean_chg","trt.vs.ctrl","dunnett","wtcon"and some more. To test multiple hypotheses jointly (usually used for factorial designs),comparisoncan also be"joint". See alsomethodargument in emmeans::contrast and the?emmeans::emmc-functions.For
backend = "marginaleffects", can be a numeric value, vector, or matrix, a string equation specifying the hypothesis to test, a string naming the comparison method, a formula, or a function. For options not described below, see documentation of marginaleffects::comparisons, this website and section Comparison options below.String: One of
"pairwise","reference","sequential","meandev""meanotherdev","poly","helmert", or"trt_vs_ctrl". To test multiple hypotheses jointly (usually used for factorial designs),comparisoncan also be"joint". In this case, use thetestargument to specify which test should be conducted:"F"(default) or"Chi2".String: Special string options are
"inequality","inequality_ratio", and"inequality_pairwise".comparison = "inequality"computes the marginal effect inequality summary of categorical predictors' overall effects, respectively, the comprehensive effect of an independent variable across all outcome categories of a nominal or ordinal dependent variable (also called absolute inequality, or total marginal effect, see Mize and Han, 2025)."inequality_ratio"computes the ratio of marginal effect inequality measures, also known as relative inequality. This is useful to compare the relative effects of different predictors on the dependent variable. It provides a measure of how much more or less inequality one predictor has compared to another.comparison = "inequality_pairwise"computes pairwise differences of absolute inequality measures, while"inequality_ratio_pairwise"computes pairwise differences of relative inequality measures (ratios). See an overview of applications in the related case study in the vignettes.String equation: To identify parameters from the output, either specify the term name, or
"b1","b2"etc. to indicate rows, e.g.:"hp = drat","b1 = b2", or"b1 + b2 + b3 = 0".Formula: A formula like
comparison ~ pairs | group, where the left-hand side indicates the type of comparison (differenceorratio), the right-hand side determines the pairs of estimates to compare (reference,sequential,meandev, etc., see string-options). Optionally, comparisons can be carried out within subsets by indicating the grouping variable after a vertical bar (|).A custom function, e.g.
comparison = myfun, orcomparison ~ I(my_fun(x)) | groups.If contrasts should be calculated (or grouped by) factors,
comparisoncan also be a matrix that specifies factor contrasts (see 'Examples').
- keep_iterations
If
TRUE, will keep all iterations (draws) of bootstrapped or Bayesian models. They will be added as additional columns namediter_1,iter_2, and so on. Ifkeep_iterationsis a positive number, only as many columns as indicated inkeep_iterationswill be added to the output. You can reshape them to a long format by runningbayestestR::reshape_iterations().- verbose
Use
FALSEto silence messages and warnings.- ...
Other arguments passed, for instance, to
insight::get_datagrid(), to functions from the emmeans or marginaleffects package, or to process Bayesian models viabayestestR::describe_posterior(). Examples:insight::get_datagrid(): Argument such aslength,digitsorrangecan be used to control the (number of) representative values. For integer variables,protect_integersmodulates whether these should also be treated as numerics, i.e. values can have fractions or not.marginaleffects: Internally used functions are
avg_predictions()for means and contrasts, andavg_slope()for slopes. Therefore, arguments for instance likevcov,equivalence,df,slope,hypothesisor evennewdatacan be passed to those functions. Aweightsargument is passed to thewtsargument inavg_predictions()oravg_slopes(), however, weights can only be applied whenestimateis"average"or"population"(i.e. for those marginalization options that do not use data grids). Other arguments, such asre.formorallow.new.levels, may be passed topredict()(which is internally used by marginaleffects) if supported by that model class.emmeans: Internally used functions are
emmeans()andemtrends(). Additional arguments can be passed to these functions.Bayesian models: For Bayesian models, parameters are cleaned using
describe_posterior(), thus, arguments like, for example,centrality,rope_range, ortestare passed to that function.Especially for
estimate_contrasts()with integer focal predictors, for which contrasts should be calculated, use argumentinteger_as_continuousto set the maximum number of unique values in an integer predictor to treat that predictor as "discrete integer" or as numeric. For the first case, contrasts are calculated between values of the predictor, for the latter, contrasts of slopes are calculated. If the integer has more thaninteger_as_continuousunique values, it is treated as numeric. Defaults to5. Set toTRUEto always treat integer predictors as continuous.For count regression models that use an offset term, use
offset = <value>to fix the offset at a specific value. Or useestimate = "average", to average predictions over the distribution of the offset (if appropriate).
- trend
A character indicating the name of the variable for which to compute the slopes. To get marginal effects at specific values, use
trend="<variable>"along with thebyargument, e.g.by="<variable> = c(1, 3, 5)", or a combination ofbyandlength, for instance,by="<variable>", length=30. To calculate average marginal effects over a range of values, usetrend="<variable> = seq(1, 3, 0.1)"(or similar) and omit the variable provided intrendfrom thebyargument.- ci
Confidence Interval (CI) level. Default to
0.95(95%).- estimate
The
estimateargument determines how predictions are averaged ("marginalized") over variables not specified inbyorcontrast(non-focal predictors). It controls whether predictions represent a "typical" individual, an "average" individual from the sample, or an "average" individual from a broader population."typical"(Default): Calculates predictions for a balanced data grid representing all combinations of focal predictor levels (specified inby). For non-focal numeric predictors, it uses the mean; for non-focal categorical predictors, it marginalizes (averages) over the levels. This represents a "typical" observation based on the data grid and is useful for comparing groups. It answers: "What would the average outcome be for a 'typical' observation?". This is the default approach when estimating marginal means using the emmeans package."average": Calculates predictions for each observation in the sample and then averages these predictions within each group defined by the focal predictors. This reflects the sample's actual distribution of non-focal predictors, not a balanced grid. It answers: "What is the predicted value for an average observation in my data?""population": "Clones" each observation, creating copies with all possible combinations of focal predictor levels. It then averages the predictions across these "counterfactual" observations (non-observed permutations) within each group. This extrapolates to a hypothetical broader population, considering "what if" scenarios. It answers: "What is the predicted response for the 'average' observation in a broader possible target population?" This approach entails more assumptions about the likelihood of different combinations, but can be more apt to generalize. This is also the option that should be used for G-computation (causal inference, see Chatton and Rohrer 2024)."counterfactual"is an alias for"population".
You can set a default option for the
estimateargument viaoptions(), e.g.options(modelbased_estimate = "average").Note following limitations:
When you set
estimateto"average", it calculates the average based only on the data points that actually exist. This is in particular important for two or more focal predictors, because it doesn't generate a complete grid of all theoretical combinations of predictor values. Consequently, the output may not include all the values.Filtering the output at values of continuous predictors, e.g.
by = "x=1:5", in combination withestimate = "average"may result in returning an empty data frame because of what was described above. In such case, you can useestimate = "typical"or use thenewdataargument to provide a data grid of predictor values at which to evaluate predictions.estimate = "population"is not available forestimate_slopes().
- transform
A function applied to predictions and confidence intervals to (back-) transform results, which can be useful in case the regression model has a transformed response variable (e.g.,
lm(log(y) ~ x)). For Bayesian models, this function is applied to individual draws from the posterior distribution, before computing summaries. Can also beTRUE, in which caseinsight::get_transformation()is called to determine the appropriate transformation-function. Note that no standard errors are returned when transformations are applied.- p_adjust
The p-values adjustment method for frequentist multiple comparisons. For
estimate_slopes(), multiple comparison only occurs for Johnson-Neyman intervals, i.e. in case of interactions with two numeric predictors (one specified inslope, one inby). In this case, the"esarey"or"sup-t"options are recommended, butp_adjustcan also be one of"none"(default),"hochberg","hommel","bonferroni","BH","BY","fdr","tukey","sidak", or"holm"."sup-t"computes simultaneous confidence bands, also called sup-t confidence band (Montiel Olea & Plagborg-Møller, 2019).
Examples
# Basic usage
model <- lm(Sepal.Width ~ Species, data = iris)
get_emcontrasts(model)
#> No variable was specified for contrast estimation. Selecting `contrast =
#> "Species"`.
#> contrast estimate SE df t.ratio p.value
#> setosa - versicolor 0.658 0.0679 147 9.685 <0.0001
#> setosa - virginica 0.454 0.0679 147 6.683 <0.0001
#> versicolor - virginica -0.204 0.0679 147 -3.003 0.0088
#>
#> P value adjustment: tukey method for comparing a family of 3 estimates
if (FALSE) { # \dontrun{
# Dealing with interactions
model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris)
# By default: selects first factor
get_emcontrasts(model)
# Or both
get_emcontrasts(model, contrast = c("Species", "Petal.Width"), length = 2)
# Or with custom specifications
get_emcontrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)"))
# Or modulate it
get_emcontrasts(model, by = "Petal.Width", length = 4)
} # }
model <- lm(Sepal.Length ~ Species + Petal.Width, data = iris)
# By default, 'by' is set to "Species"
get_emmeans(model)
#> We selected `by = c("Species")`.
#> Species emmean SE df lower.CL upper.CL
#> setosa 5.88 0.1970 146 5.49 6.27
#> versicolor 5.82 0.0723 146 5.68 5.96
#> virginica 5.83 0.1740 146 5.49 6.17
#>
#> Confidence level used: 0.95
if (FALSE) { # \dontrun{
# Overall mean (close to 'mean(iris$Sepal.Length)')
get_emmeans(model, by = NULL)
# One can estimate marginal means at several values of a 'modulate' variable
get_emmeans(model, by = "Petal.Width", length = 3)
# Interactions
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)
get_emmeans(model)
get_emmeans(model, by = c("Species", "Petal.Length"), length = 2)
get_emmeans(model, by = c("Species", "Petal.Length = c(1, 3, 5)"), length = 2)
} # }
if (FALSE) { # \dontrun{
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)
get_emtrends(model)
get_emtrends(model, by = "Species")
get_emtrends(model, by = "Petal.Length")
get_emtrends(model, by = c("Species", "Petal.Length"))
} # }
model <- lm(Petal.Length ~ poly(Sepal.Width, 4), data = iris)
get_emtrends(model)
#> No numeric variable was specified for slope estimation. Selecting `trend
#> = "Sepal.Width"`.
#> 1 Sepal.Width.trend SE df lower.CL upper.CL
#> overall -2.67 0.548 145 -3.75 -1.58
#>
#> Confidence level used: 0.95
get_emtrends(model, by = "Sepal.Width")
#> No numeric variable was specified for slope estimation. Selecting `trend
#> = "Sepal.Width"`.
#> Sepal.Width Sepal.Width.trend SE df lower.CL upper.CL
#> 2.00 7.484 5.420 145 -3.225 18.192
#> 2.27 3.775 2.090 145 -0.357 7.906
#> 2.53 0.834 0.765 145 -0.678 2.346
#> 2.80 -1.337 0.706 145 -2.732 0.058
#> 3.07 -2.700 0.543 145 -3.773 -1.628
#> 3.33 -3.231 0.606 145 -4.430 -2.033
#> 3.60 -2.909 0.838 145 -4.564 -1.254
#> 3.87 -1.705 1.010 145 -3.701 0.290
#> 4.13 0.394 2.390 145 -4.327 5.116
#> 4.40 3.431 5.800 145 -8.028 14.890
#>
#> Confidence level used: 0.95
model <- lm(Sepal.Length ~ Species + Petal.Width, data = iris)
# By default, 'by' is set to "Species"
get_marginalmeans(model)
#> We selected `by=c("Species")`.
#>
#> Species Estimate Std. Error t Pr(>|t|) S 2.5 % 97.5 % Df
#> setosa 5.88 0.1969 29.9 <0.001 210.3 5.49 6.27 146
#> versicolor 5.82 0.0723 80.5 <0.001 405.6 5.68 5.96 146
#> virginica 5.83 0.1741 33.5 <0.001 231.4 5.49 6.17 146
#>
#> Type: response
#>
# Overall mean (close to 'mean(iris$Sepal.Length)')
get_marginalmeans(model, by = NULL)
#>
#> Estimate Std. Error t Pr(>|t|) S 2.5 % 97.5 % Df
#> 5.84 0.0393 149 <0.001 533.4 5.77 5.92 146
#>
#> Type: response
#>
if (FALSE) { # \dontrun{
# One can estimate marginal means at several values of a 'modulate' variable
get_marginalmeans(model, by = "Petal.Width", length = 3)
# Interactions
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)
get_marginalmeans(model)
get_marginalmeans(model, by = c("Species", "Petal.Length"), length = 2)
get_marginalmeans(model, by = c("Species", "Petal.Length = c(1, 3, 5)"), length = 2)
} # }
model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)
get_marginaltrends(model, trend = "Petal.Length", by = "Species")
#>
#> Species Estimate Std. Error t Pr(>|t|) S 2.5 % 97.5 % Df
#> setosa 0.388 0.2602 1.49 0.13825 2.9 -0.1264 0.902 144
#> versicolor 0.374 0.0963 3.89 < 0.001 12.7 0.1839 0.565 144
#> virginica 0.234 0.0818 2.86 0.00482 7.7 0.0726 0.396 144
#>
#> Term: Petal.Length
#> Type: response
#> Comparison: dY/dX
#>
get_marginaltrends(model, trend = "Petal.Length", by = "Petal.Length")
#>
#> Petal.Length Estimate Std. Error t Pr(>|t|) S 2.5 % 97.5 % Df
#> 1.00 0.332 0.0964 3.45 <0.001 10.4 0.142 0.523 144
#> 1.66 0.332 0.0963 3.45 <0.001 10.4 0.142 0.523 144
#> 2.31 0.332 0.0966 3.44 <0.001 10.4 0.141 0.523 144
#> 2.97 0.332 0.0966 3.44 <0.001 10.4 0.141 0.523 144
#> 3.62 0.332 0.0963 3.45 <0.001 10.4 0.142 0.523 144
#> 4.28 0.332 0.0963 3.45 <0.001 10.4 0.142 0.523 144
#> 4.93 0.332 0.0965 3.44 <0.001 10.4 0.141 0.523 144
#> 5.59 0.332 0.0963 3.45 <0.001 10.4 0.142 0.523 144
#> 6.24 0.332 0.0963 3.45 <0.001 10.4 0.142 0.522 144
#> 6.90 0.332 0.0966 3.44 <0.001 10.4 0.141 0.523 144
#>
#> Term: Petal.Length
#> Type: response
#> Comparison: dY/dX
#>
get_marginaltrends(model, trend = "Petal.Length", by = c("Species", "Petal.Length"))
#>
#> Species Petal.Length Estimate Std. Error t Pr(>|t|) S 2.5 % 97.5 %
#> setosa 1.00 0.388 0.2602 1.49 0.13820 2.9 -0.1264 0.902
#> setosa 1.66 0.388 0.2601 1.49 0.13815 2.9 -0.1263 0.902
#> versicolor 3.62 0.374 0.0963 3.89 < 0.001 12.7 0.1839 0.565
#> versicolor 4.28 0.374 0.0963 3.89 < 0.001 12.7 0.1840 0.565
#> versicolor 4.93 0.374 0.0960 3.90 < 0.001 12.7 0.1846 0.564
#> virginica 4.93 0.234 0.0819 2.86 0.00482 7.7 0.0726 0.396
#> virginica 5.59 0.234 0.0819 2.86 0.00484 7.7 0.0725 0.396
#> virginica 6.24 0.234 0.0819 2.86 0.00482 7.7 0.0726 0.396
#> virginica 6.90 0.234 0.0818 2.86 0.00482 7.7 0.0726 0.396
#> Df
#> 144
#> 144
#> 144
#> 144
#> 144
#> 144
#> 144
#> 144
#> 144
#>
#> Term: Petal.Length
#> Type: response
#> Comparison: dY/dX
#>