R/h_response_biomarkers_subgroups.R
h_response_biomarkers_subgroups.RdHelper functions which are documented here separately to not confuse the user when reading about the user-facing functions.
h_rsp_to_logistic_variables(variables, biomarker)
h_logistic_mult_cont_df(variables, data, control = control_logistic())(named list of string)
list of additional analysis variables.
(string)
the name of the biomarker variable.
(data.frame)
the dataset containing the variables to summarize.
(named list)
controls for the response definition and the
confidence level produced by control_logistic().
h_rsp_to_logistic_variables() returns a named list of elements response, arm, covariates, and strata.
h_logistic_mult_cont_df() returns a data.frame containing estimates and statistics for the selected biomarkers.
h_rsp_to_logistic_variables(): helps with converting the "response" function variable list
to the "logistic regression" variable list. The reason is that currently there is an
inconsistency between the variable names accepted by extract_rsp_subgroups() and fit_logistic().
h_logistic_mult_cont_df(): prepares estimates for number of responses, patients and
overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple
biomarkers in a given single data set.
variables corresponds to names of variables found in data, passed as a named list and requires elements
rsp and biomarkers (vector of continuous biomarker variables) and optionally covariates
and strata.
library(dplyr)
library(forcats)
adrs <- tern_ex_adrs
adrs_labels <- formatters::var_labels(adrs)
adrs_f <- adrs %>%
filter(PARAMCD == "BESRSPI") %>%
mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")
# This is how the variable list is converted internally.
h_rsp_to_logistic_variables(
variables = list(
rsp = "RSP",
covariates = c("A", "B"),
strata = "D"
),
biomarker = "AGE"
)
#> $response
#> [1] "RSP"
#>
#> $arm
#> [1] "AGE"
#>
#> $covariates
#> [1] "A" "B"
#>
#> $strata
#> [1] "D"
#>
# For a single population, estimate separately the effects
# of two biomarkers.
df <- h_logistic_mult_cont_df(
variables = list(
rsp = "rsp",
biomarkers = c("BMRKR1", "AGE"),
covariates = "SEX"
),
data = adrs_f
)
df
#> biomarker biomarker_label n_tot n_rsp prop or lcl
#> 1 BMRKR1 Continuous Level Biomarker 1 200 164 0.82 0.9755036 0.8804862
#> 2 AGE Age 200 164 0.82 0.9952416 0.9462617
#> ucl conf_level pval pval_label
#> 1 1.080775 0.95 0.6352602 p-value (Wald)
#> 2 1.046757 0.95 0.8530389 p-value (Wald)
# If the data set is empty, still the corresponding rows with missings are returned.
h_coxreg_mult_cont_df(
variables = list(
rsp = "rsp",
biomarkers = c("BMRKR1", "AGE"),
covariates = "SEX",
strata = "STRATA1"
),
data = adrs_f[NULL, ]
)
#> biomarker biomarker_label n_tot n_tot_events median hr lcl ucl
#> 1 BMRKR1 Continuous Level Biomarker 1 0 0 NA NA NA NA
#> 2 AGE Age 0 0 NA NA NA NA
#> conf_level pval pval_label
#> 1 0.95 NA p-value (Wald)
#> 2 0.95 NA p-value (Wald)