Cross-Tabulation

Cross-tabulates a pair of categorical variables, showing row, column, or total proportions along with marginal sums. Works with numeric, character, and factor variables.

Usage

ctable(
  x,
  y,
  prop = st_options("ctable.prop"),
  useNA = "ifany",
  totals = st_options("ctable.totals"),
  style = st_options("style"),
  round.digits = st_options("ctable.round.digits"),
  justify = "right",
  plain.ascii = st_options("plain.ascii"),
  headings = st_options("headings"),
  display.labels = st_options("display.labels"),
  split.tables = Inf,
  na.val = st_options("na.val"),
  rev = "none",
  dnn = c(substitute(x), substitute(y)),
  chisq = FALSE,
  OR = FALSE,
  RR = FALSE,
  weights = NA,
  rescale.weights = FALSE,
  ...
)

Arguments

x: First categorical variable - values will appear as row names.
y: Second categorical variable - values will appear as column names.
prop: Character. Indicates which proportions to show: “r” (rows, default), “c” (columns), “t” (total), or “n” (none). Default value can be changed using st_options, option ctable.prop.
useNA: Character. One of “ifany” (default), “no”, or “always”. This argument is passed on ‘as is’ to table, or adapted for xtabs when weights are used.
totals: Logical. Show row and column totals. Defaults to TRUE but can be set globally with st_options, option ctable.totals.
style: Character. Style to be used by pander. One of “simple” (default), “grid”, “rmarkdown”, or “jira”. Can be set globally with st_options.
round.digits: Numeric. Number of digits to keep after the decimal point when displaying proportions. Defaults to 1. To change this default value, use st_options, option ctable.round.digits.
justify: Character. Horizontal alignment; one of “l” (left), “c” (center), or “r” (right, default).
plain.ascii: Logical. Used by pander; when TRUE, no markup characters are generated (useful when printing to console). Defaults to TRUE unless style = 'rmarkdown', in which case it is set to FALSE automatically. To change the default value globally, use st_options.
headings: Logical. Show heading section. TRUE by default; can be set globally with st_options.
display.labels: Logical. Display data frame label in the heading section. TRUE by default; can be changed globally with st_options.
split.tables: Numeric. pander argument that specifies how many characters wide a table can be. Inf by default.
na.val: Character. For factors and character vectors, consider this value as NA. Ignored if there are actual NA values or if it matches no value / factor level in the data. NULL by default.
rev: Character. Dimension(s) to reverse for calculation of risk/odds ratios. One of “rows” / “r”, “columns” / “c”, “both” / “b”, or “none” / “n” (default). See details.
dnn: Character vector. Variable names to be used in output table. In most cases, setting this parameter is not required as the names are automatically generated.
chisq: Logical. Display chi-square statistic along with p-value.
OR: Logical or numeric. Set to TRUE to show the odds ratio with a 95% confidence interval, or specify the confidence level explicitly (e.g., .90). CIs are calculated using Wald's method of normal approximation.
RR: Logical or numeric. Set to TRUE to show the risk ratio (also called relative risk) with a 95% confidence interval, or specify the confidence level explicitly (e.g., .90). CIs are calculated using Wald's method of normal approximation.
weights: Numeric. Vector of weights; must have the same length as x.
rescale.weights: Logical. When TRUE, a global constant is applied so that the sum of counts equals nrow(x). FALSE by default.
...: Additional arguments passed to pander or format.
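
As a rough illustration of how the global defaults and the weights argument fit together, the sketch below changes two ctable options with st_options, builds a weighted table, and then restores the documented defaults. It assumes the tobacco data set shipped with the package includes a sampling-weight column named samp.wgts; if your version of the data differs, substitute any numeric vector of the same length as x.

```r
library(summarytools)
data("tobacco")

# Change two session-wide defaults used by ctable()
st_options(ctable.prop = "c", ctable.totals = FALSE)

# Weighted cross-tabulation; `samp.wgts` is assumed to be a column of
# sampling weights in `tobacco` (one weight per observation)
ct_w <- ctable(tobacco$gender, tobacco$smoker,
               weights = tobacco$samp.wgts)
ct_w

# Put the documented defaults back so later calls behave as described above
st_options(ctable.prop = "r", ctable.totals = TRUE)
```

Arguments passed directly in a call, such as prop = "t", take precedence over the values stored by st_options for that call only.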

Value

A list containing two matrices, cross_table and proportions. The print method takes care of assembling figures from those matrices into a single table. The returned object has classes “summarytools” and “list”, unless stby is used, in which case we have an object of class “stby”.
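
A minimal sketch of inspecting the returned object directly, rather than relying on the print method. It uses only the element names and classes stated above; the exact layout of each matrix (for example, whether margins are included) may depend on the arguments used, so treat the comments as a guide rather than a guarantee.

```r
library(summarytools)
data("tobacco")

ct <- ctable(tobacco$gender, tobacco$smoker)

# Classes of the returned object
class(ct)

# The two components assembled by the print method
ct$cross_table   # counts
ct$proportions   # proportions matching the `prop` argument (rows by default)
```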

Details

For risk ratios and odds ratios, the expected structure of the contingency table is as follows (using “No” as reference):


             Outcome
 Exposure      Yes     No
  Yes          a       b
  No           c       d

The rev parameter allows for different structures; use either one of “rows”, “columns”, or “both” to indicate which dimension(s) to reverse in order to match that structure. This does not affect display.
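
To make the a / b / c / d layout concrete, the following base R sketch computes the odds ratio and risk ratio by hand from the smoker * diseased counts shown in the Examples section, using the standard textbook Wald formulas on the log scale. This is an illustration under the assumption that ctable applies the same formulas; it is not taken from the package source, although the results agree with the printed output in the Examples (odds ratio 4.40, risk ratio 2.97).

```r
# Counts copied from the smoker * diseased table in the Examples section
tab <- matrix(c(125, 173,
                 99, 603),
              nrow = 2, byrow = TRUE,
              dimnames = list(smoker   = c("Yes", "No"),
                              diseased = c("Yes", "No")))

a  <- tab["Yes", "Yes"]; b <- tab["Yes", "No"]   # exposed row
cc <- tab["No",  "Yes"]; d <- tab["No",  "No"]   # reference row ("No")

# Normal quantile for a 95% two-sided interval
z <- qnorm(1 - (1 - 0.95) / 2)

# Odds ratio with Wald interval computed on the log scale
odds_ratio <- (a * d) / (b * cc)
se_log_or  <- sqrt(1 / a + 1 / b + 1 / cc + 1 / d)
or_ci      <- exp(log(odds_ratio) + c(-1, 1) * z * se_log_or)

# Risk ratio with Wald interval computed on the log scale
risk_ratio <- (a / (a + b)) / (cc / (cc + d))
se_log_rr  <- sqrt(1 / a - 1 / (a + b) + 1 / cc - 1 / (cc + d))
rr_ci      <- exp(log(risk_ratio) + c(-1, 1) * z * se_log_rr)

round(c(OR = odds_ratio, lo = or_ci[1], hi = or_ci[2]), 2)
round(c(RR = risk_ratio, lo = rr_ci[1], hi = rr_ci[2]), 2)
```

If a table lists the reference level first (for example "No" before "Yes" in rows, columns, or both), the cells no longer line up with a, b, c, d as laid out above, which is the situation the rev argument is meant to handle.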

Note

Markdown does not fully support multi-header tables; until such support is available, the recommended way to display cross-tables in .Rmd documents is to print them with `method = "render"`. See package vignettes for examples.
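
A minimal sketch of what this could look like inside an R Markdown code chunk. The chunk option results = 'asis' mentioned in the comment is an assumption based on common knitr usage; the package vignettes remain the authoritative reference for the recommended setup.

```r
library(summarytools)
data("tobacco")

ct <- ctable(tobacco$gender, tobacco$smoker)

# In an .Rmd chunk (typically with the chunk option results = 'asis'),
# render the table as HTML instead of relying on markdown table syntax
print(ct, method = "render")
```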

Author

Dominic Comtois, dominic.comtois@gmail.com

Examples

data("tobacco")
ctable(tobacco$gender, tobacco$smoker)
#> Cross-Tabulation, Row Proportions  
#> gender * smoker  
#> Data Frame: tobacco  
#> 
#> -------- -------- ------------- ------------- ---------------
#>            smoker           Yes            No           Total
#>   gender                                                     
#>        F            147 (30.1%)   342 (69.9%)    489 (100.0%)
#>        M            143 (29.2%)   346 (70.8%)    489 (100.0%)
#>     <NA>              8 (36.4%)    14 (63.6%)     22 (100.0%)
#>    Total            298 (29.8%)   702 (70.2%)   1000 (100.0%)
#> -------- -------- ------------- ------------- ---------------

# Use with() to simplify syntax
with(tobacco, ctable(gender, smoker))
#> Cross-Tabulation, Row Proportions  
#> gender * smoker  
#> Data Frame: tobacco  
#> 
#> -------- -------- ------------- ------------- ---------------
#>            smoker           Yes            No           Total
#>   gender                                                     
#>        F            147 (30.1%)   342 (69.9%)    489 (100.0%)
#>        M            143 (29.2%)   346 (70.8%)    489 (100.0%)
#>     <NA>              8 (36.4%)    14 (63.6%)     22 (100.0%)
#>    Total            298 (29.8%)   702 (70.2%)   1000 (100.0%)
#> -------- -------- ------------- ------------- ---------------

# Show column proportions, without totals
with(tobacco, ctable(smoker, diseased, prop = "c", totals = FALSE))
#> Cross-Tabulation, Column Proportions  
#> smoker * diseased  
#> Data Frame: tobacco  
#> 
#> -------- ---------- ------------- -------------
#>            diseased           Yes            No
#>   smoker                                       
#>      Yes              125 (55.8%)   173 (22.3%)
#>       No               99 (44.2%)   603 (77.7%)
#> -------- ---------- ------------- -------------

# Simple 2 x 2 table with odds ratio and risk ratio
with(tobacco, ctable(smoker, diseased, totals = FALSE, headings = FALSE,
                     prop = "r", OR = TRUE, RR = TRUE))
#> 
#> -------- ---------- ------------- -------------
#>            diseased           Yes            No
#>   smoker                                       
#>      Yes              125 (41.9%)   173 (58.1%)
#>       No               99 (14.1%)   603 (85.9%)
#> -------- ---------- ------------- -------------
#> 
#> ----------------------------------
#>  Odds Ratio   Lo - 95%   Hi - 95% 
#> ------------ ---------- ----------
#>     4.40        3.22       6.02   
#> ----------------------------------
#> 
#> ----------------------------------
#>  Risk Ratio   Lo - 95%   Hi - 95% 
#> ------------ ---------- ----------
#>     2.97        2.37       3.73   
#> ----------------------------------

# Grouped cross-tabulations
with(tobacco, stby(data = list(x = smoker, y = diseased), 
                   INDICES = gender, FUN = ctable))
#> NA detected in grouping variable(s); consider using useNA = TRUE
#> Cross-Tabulation, Row Proportions  
#> smoker * diseased  
#> Data Frame: tobacco  
#> Group: gender = F  
#> 
#> -------- ---------- ------------- ------------- --------------
#>            diseased           Yes            No          Total
#>   smoker                                                      
#>      Yes               62 (42.2%)    85 (57.8%)   147 (100.0%)
#>       No               49 (14.3%)   293 (85.7%)   342 (100.0%)
#>    Total              111 (22.7%)   378 (77.3%)   489 (100.0%)
#> -------- ---------- ------------- ------------- --------------
#> 
#> Group: gender = M  
#> 
#> -------- ---------- ------------- ------------- --------------
#>            diseased           Yes            No          Total
#>   smoker                                                      
#>      Yes               63 (44.1%)    80 (55.9%)   143 (100.0%)
#>       No               47 (13.6%)   299 (86.4%)   346 (100.0%)
#>    Total              110 (22.5%)   379 (77.5%)   489 (100.0%)
#> -------- ---------- ------------- ------------- --------------


if (FALSE) { # \dontrun{
ct <- ctable(tobacco$gender, tobacco$smoker)

# Show html results in browser
print(ct, method = "browser")

# Save results to html file
print(ct, file = "ct_gender_smoker.html")

# Save results to text file
print(ct, file = "ct_gender_smoker.txt")
} # }