Summarize continuous and categorical data in long format

This function makes a single table from both continuous and categorical data.

Usage

pt_demographics(
  data,
  cols_cont,
  cols_cat,
  span = NULL,
  units = NULL,
  table = NULL,
  stat_name = "Statistic",
  stat_width = 2,
  summarize_all = TRUE,
  all_name = "Summary",
  fun = dem_cont_fun,
  notes = pt_demographics_notes(),
  paneled = TRUE,
  denom = c("group", "total"),
  drop_miss = TRUE
)

Arguments

data: the data frame to summarize; the user should filter or subset so that data contains exactly the records to be summarized; pmtables will not add or remove rows prior to summarizing data
cols_cont: the continuous data columns to summarize; this argument may be specified as a character vector, comma-separated string or quosure
cols_cat: the categorical columns to summarize; this argument may be specified as a character vector, comma-separated string or quosure
span: variable name for column spanner
units: optional units for each summarized column; must be a named list where the names correspond with continuous data columns in data
table: a named list to use for renaming columns (see details and examples)
stat_name: name of statistic column
stat_width: width (in cm) of the statistic column
summarize_all: logical; if TRUE, summaries across all span levels will be appended to the right-hand side of the table
all_name: a character name for the all data summary invoked by summarize_all
fun: the function used to summarize the continuous data; the default is dem_cont_fun(); the result will be validated with validate_dem_fun()
notes: a character vector of notes to place under the table
paneled: logical; if TRUE, the table will be paneled with the covariate names; otherwise, the covariate names will appear as the left-most column with non-repeating names cleared and separated with hline (see examples).
denom: the denominator to use when calculating percent for each level; group uses the total number in the chunk being summarized; total uses the total number in the data set; historically, group has been used as the default.
drop_miss: logical; if TRUE, rows where the Statistic column is Missing will be dropped when the summarized data contain no missing values; set this to FALSE to retain these rows.
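
The two denom settings differ only in what each count is divided by. The following is a minimal base R sketch of that arithmetic, not the pmtables implementation; the data frame toy and its columns STUDY and SEX are hypothetical, and it assumes the "chunk being summarized" is one level of the span variable.

```r
# Hypothetical toy data: 10 records across two studies
toy <- data.frame(
  STUDY = c(rep("A", 4), rep("B", 6)),
  SEX   = c("F", "F", "M", "M", "F", "M", "M", "M", "M", "M")
)

# Counts of each SEX level within each STUDY (rows: SEX, columns: STUDY)
counts <- table(toy$SEX, toy$STUDY)

# Analogue of denom = "group": divide by the number of records in each study
pct_group <- 100 * prop.table(counts, margin = 2)

# Analogue of denom = "total": divide by the number of records in the data set
pct_total <- 100 * counts / nrow(toy)
```

With the group denominator, the percentages within each study column sum to 100; with the total denominator, the percentages across the whole table sum to 100.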

Value

An object with class pmtable; see class-pmtable.

Details

When a continuous data summary function (fun) is passed, the user should also pass a set of notes that explain the summary statistics produced by that function. If no notes are passed, no notes will appear under the table.

The categorical data is summarized using pt_cat_long(). The default summary function for continuous variables is dem_cont_fun(). Please review that documentation for details on the default summary for this table.

If you wish to define your own function, please ensure the output is in the same format as the output of dem_cont_fun() (see the examples, where it returns a one-row tibble). Any number of columns is acceptable.
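
As a rough illustration of that format, the sketch below defines a hypothetical summary function, iqr_fun, that reports the median and interquartile range. It assumes the required shape is a single row of character columns, as in the dem_cont_fun() output shown in the examples; the exact checks made by validate_dem_fun() are not reproduced here. It returns a base R data frame only to stay free of package dependencies; a tibble, as used in the examples, may be the safer choice in practice.

```r
# Hypothetical custom summary: median and interquartile range.
# The signature mirrors the new_fun example: value, name, and ...
iqr_fun <- function(value = seq(1, 5), name = "", ...) {
  # Drop missing values before summarizing
  value <- value[!is.na(value)]
  # First and third quartiles (default quantile type)
  q <- quantile(value, probs = c(0.25, 0.75), names = FALSE)
  # One row, one character column per statistic
  data.frame(
    Median = formatC(median(value), digits = 3, format = "g"),
    IQR = paste(formatC(q, digits = 3, format = "g"), collapse = " - "),
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
}

res <- iqr_fun(c(1, 2, 3, 4, NA))

# Informal check of the assumed shape: a single row of character columns
stopifnot(nrow(res) == 1L, all(vapply(res, is.character, logical(1))))
```

A function like this would be passed through the fun argument, together with notes that explain the statistics it reports.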

Examples


out <- pt_demographics(
  data = pmt_first,
  cols_cont = c(Age = "AGE", Weight = "WT"),
  cols_cat = c(Sex = "SEXf", Race = "ASIANf"),
  units = list(WT = "kg"),
  span = c(Study = "STUDYf")
)

out <- pt_demographics(
  data = pmt_first,
  cols_cont = "AGE,WT",
  cols_cat = "SEXf,ASIANf",
  paneled = FALSE,
  span = "FORMf"
)
tab <- stable(out)

pmtables:::pt_demographics_notes()
#> [1] "Categorical summary is count (percent)"
#> [2] "n: number of records summarized"       
#> [3] "SD: standard deviation"                
#> [4] "Min: minimum; Max: maximum"            

new_fun <- function(value = seq(1, 5), name = "", ...) {
  value <- value[!is.na(value)]
  tibble::tibble(
    `mean` = sig(mean(value)),
    `median` = sig(median(value)),
    `min-max` = paste0(sig(range(value)), collapse = " - ")
  )
}

out <- pt_demographics(
  data = pmt_first,
  cols_cont = "AGE,WT",
  cols_cat = "SEXf,ASIANf",
  fun = new_fun
)

pmtables:::dem_cont_fun(rnorm(20))
#> # A tibble: 1 × 4
#>   `Mean (SD)`   Median `Min / Max`  Missing
#>   <chr>         <chr>  <chr>        <chr>  
#> 1 -0.347 (1.07) -0.329 -2.94 / 1.15 0      
new_fun(rnorm(20))
#> # A tibble: 1 × 3
#>   mean   median `min-max`   
#>   <chr>  <chr>  <chr>       
#> 1 -0.228 -0.272 -2.09 - 1.78