
Summarize continuous and categorical data in long format
Source: R/demographics-table.R
This function makes a single table from both continuous and categorical data.
Usage
pt_demographics(
  data,
  cols_cont,
  cols_cat,
  span = NULL,
  units = NULL,
  table = NULL,
  stat_name = "Statistic",
  stat_width = 2,
  summarize_all = TRUE,
  all_name = "Summary",
  fun = dem_cont_fun,
  notes = pt_demographics_notes(),
  paneled = TRUE,
  denom = c("group", "total"),
  drop_miss = TRUE
)
Arguments
- data
the data frame to summarize; the user should filter or subset so that data contains exactly the records to be summarized; pmtables will not add or remove rows prior to summarizing data
- cols_cont
the continuous data columns to summarize; this argument may be specified as a character vector, comma-separated string or quosure
- cols_cat
the categorical columns to summarize; this argument may be specified as a character vector, comma-separated string or quosure
- span
variable name for column spanner
- units
optional units for each summarized column; must be a named list where the names correspond with continuous data columns in data
- table
a named list to use for renaming columns (see details and examples)
- stat_name
name of statistic column
- stat_width
width (in cm) of the statistic column
- summarize_all
logical; if TRUE, summaries across all span levels will be appended to the right hand side of the table
- all_name
a character name for the all data summary invoked by summarize_all
- fun
the summary function to use for summarizing the continuous data; the default is dem_cont_fun(). The result will be validated with validate_dem_fun().
- notes
a character vector of notes to place under the table
- paneled
logical; if TRUE, the table will be paneled with the covariate names; otherwise, the covariate names will appear as the left-most column with non-repeating names cleared and separated with hline (see examples).
- denom
the denominator to use when calculating percent for each level; group uses the total number in the chunk being summarized; total uses the total number in the data set; historically, group has been used as the default.
- drop_miss
rows where the Statistic column is Missing will be dropped if there are no missing values; set this to FALSE to retain these rows.
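The difference between the two denom choices can be illustrated with a small base R sketch. This is not the pmtables implementation; it only reproduces the arithmetic, assuming that the "chunk being summarized" corresponds to one level of the span variable. The data frame toy and its columns are made up for illustration.

```r
# Hypothetical toy data: four records across two studies
toy <- data.frame(
  study = c("A", "A", "A", "B"),
  sex   = c("F", "M", "M", "F")
)

# Counts of each sex level (rows) within each study (columns)
counts <- table(toy$sex, toy$study)

# denom = "group": percent relative to the number of records in each study
pct_group <- 100 * prop.table(counts, margin = 2)

# denom = "total": percent relative to the number of records in the data set
pct_total <- 100 * counts / nrow(toy)

round(pct_group, 1)
round(pct_total, 1)
```

With group denominators the percentages within each study sum to 100; with total denominators the percentages across the whole table sum to 100.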
Details
When a continuous data summary function (fun) is passed, the user should
also pass a set of notes that explain the summary statistics produced
by that function. If no notes are passed, no notes will appear under the
table.
The categorical data is summarized using pt_cat_long().
The default summary function for continuous variables is dem_cont_fun().
Please review that documentation for details on the default summary for this
table.
If you wish to define your own function, please ensure the output is in the same format. Any number of columns is acceptable.
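Before passing a custom function, it may help to check the shape of its output. The sketch below is a hypothetical helper, check_summary_shape(), not part of pmtables and not a reimplementation of validate_dem_fun(). It assumes that "the same format" means a single-row data frame whose columns are all character, which is what the printed dem_cont_fun() output suggests. The stand-in my_fun uses base R only; for real use, returning a tibble as in the package examples is the safer choice.

```r
# Hypothetical helper: checks the shape assumed above (one row, all character)
check_summary_shape <- function(fun, value = c(1, 2, NA, 4)) {
  res <- fun(value = value)
  stopifnot(
    is.data.frame(res),
    nrow(res) == 1,
    all(vapply(res, is.character, logical(1)))
  )
  invisible(res)
}

# Base R stand-in for a custom continuous-data summary function
my_fun <- function(value, ...) {
  value <- value[!is.na(value)]  # drop missing values before summarizing
  data.frame(
    Mean   = format(signif(mean(value), 3)),
    Median = format(signif(median(value), 3)),
    stringsAsFactors = FALSE
  )
}

res <- check_summary_shape(my_fun)
res
```

A function that returns numeric columns or more than one row would fail this check early, before any table is built.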
Examples
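library(pmtables)

# Named vectors rename the columns in the output table
out <- pt_demographics(
  data = pmt_first,
  cols_cont = c(Age = "AGE", Weight = "WT"),
  cols_cat = c(Sex = "SEXf", Race = "ASIANf"),
  units = list(WT = "kg"),
  span = c(Study = "STUDYf")
)

# Columns given as comma-separated strings; covariate names in the
# left-most column rather than in panels
out <- pt_demographics(
  data = pmt_first,
  cols_cont = "AGE,WT",
  cols_cat = "SEXf,ASIANf",
  paneled = FALSE,
  span = "FORMf"
)

tab <- stable(out)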
pmtables:::pt_demographics_notes()
#> [1] "Categorical summary is count (percent)"
#> [2] "n: number of records summarized"
#> [3] "SD: standard deviation"
#> [4] "Min: minimum; Max: maximum"
# Custom summary function for continuous data; returns a one-row tibble
new_fun <- function(value = seq(1,5), name = "", ...) {
  value <- value[!is.na(value)]
  tibble::tibble(
    `mean` = sig(mean(value)),
    `median` = sig(median(value)),
    `min-max` = paste0(sig(range(value)), collapse = " - ")
  )
}

out <- pt_demographics(
  data = pmt_first,
  cols_cont = "AGE,WT",
  cols_cat = "SEXf,ASIANf",
  fun = new_fun
)
pmtables:::dem_cont_fun(rnorm(20))
#> # A tibble: 1 × 4
#>   `Mean (SD)`   Median `Min / Max`  Missing
#>   <chr>         <chr>  <chr>        <chr>
#> 1 -0.347 (1.07) -0.329 -2.94 / 1.15 0
new_fun(rnorm(20))
#> # A tibble: 1 × 3
#>   mean   median `min-max`
#>   <chr>  <chr>  <chr>
#> 1 -0.228 -0.272 -2.09 - 1.78