Compute Summary Statistics on a Vector
A number of statistical summary functions are provided for use
with summary.formula and summarize (as well as
tapply and by themselves).
smean.cl.normal computes 3 summary variables: the sample mean and
lower and upper Gaussian confidence limits based on the t-distribution.
smean.sd computes the mean and standard deviation.
smean.sdl computes the mean plus or minus a constant times the
standard deviation.
smean.cl.boot is a very fast implementation of the basic
nonparametric bootstrap for obtaining confidence limits for the
population mean without assuming normality.
These functions all delete NAs automatically.
smedian.hilow computes the sample median and a selected pair of
outer quantiles having equal tail areas.
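The summaries described above can be approximated in base R. The following is only a sketch under stated assumptions, not the package source: the names `cl_normal_sketch`, `sdl_sketch`, and `hilow_sketch` are hypothetical, and the quantiles use R's default `quantile` method, which may not match the package exactly.

```r
# Hypothetical base-R helpers illustrating the documented summaries.

# Mean with t-based confidence limits: mean +/- t quantile * standard error
cl_normal_sketch <- function(x, conf.int = 0.95) {
  x <- x[!is.na(x)]                      # drop NAs, as the documented defaults do
  n <- length(x)
  xbar <- mean(x)
  se <- sd(x) / sqrt(n)                  # standard error of the mean
  mult <- qt((1 + conf.int) / 2, n - 1)  # documented default for mult
  c(Mean = xbar, Lower = xbar - mult * se, Upper = xbar + mult * se)
}

# Mean plus or minus a constant times the standard deviation
sdl_sketch <- function(x, mult = 2) {
  x <- x[!is.na(x)]
  xbar <- mean(x)
  s <- sd(x)
  c(Mean = xbar, Lower = xbar - mult * s, Upper = xbar + mult * s)
}

# Median with a pair of outer quantiles having equal tail areas
hilow_sketch <- function(x, conf.int = 0.95) {
  x <- x[!is.na(x)]
  probs <- c(0.5, (1 - conf.int) / 2, (1 + conf.int) / 2)
  q <- quantile(x, probs, names = FALSE)
  c(Median = q[1], Lower = q[2], Upper = q[3])
}
```

With `conf.int = 0.5`, `hilow_sketch` targets the 25th and 75th percentiles, which is the same relationship between `conf.int` and the outer quantiles that the text describes for `smedian.hilow`.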
Usage
smean.cl.normal(x, mult=qt((1+conf.int)/2,n-1), conf.int=.95, na.rm=TRUE)
smean.sd(x, na.rm=TRUE)
smean.sdl(x, mult=2, na.rm=TRUE)
smean.cl.boot(x, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)
smedian.hilow(x, conf.int=.95, na.rm=TRUE)
Arguments
- x
for summary functions smean.*, smedian.hilow, a numeric vector from which NAs will be removed automatically
- na.rm
defaults to TRUE unlike built-in functions, so that by default NAs are automatically removed
- mult
for smean.cl.normal is the multiplier of the standard error of the mean to use in obtaining confidence limits of the population mean (default is the appropriate quantile of the t distribution). For smean.sdl, mult is the multiplier of the standard deviation used in obtaining a coverage interval about the sample mean. The default is mult=2 to use plus or minus 2 standard deviations.
- conf.int
for smean.cl.normal and smean.cl.boot specifies the confidence level (0-1) for interval estimation of the population mean. For smedian.hilow, conf.int is the coverage probability the outer quantiles should target. When the default, 0.95, is used, the lower and upper quantiles computed are 0.025 and 0.975.
- B
number of bootstrap resamples for smean.cl.boot
- reps
set to TRUE to have smean.cl.boot return the vector of bootstrapped means as the reps attribute of the returned object
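To make the roles of B, conf.int, and reps concrete, here is a minimal base-R sketch of a nonparametric bootstrap for the mean. It is an illustration only: `boot_mean_sketch` is a hypothetical name, and taking the limits as quantiles of the bootstrapped means is an assumption about the method, not a statement of how smean.cl.boot is implemented.

```r
# Hypothetical sketch: bootstrap confidence limits for the population mean.
boot_mean_sketch <- function(x, conf.int = 0.95, B = 1000, reps = FALSE) {
  x <- x[!is.na(x)]                      # drop NAs first
  n <- length(x)
  # Draw B resamples of size n with replacement; keep the mean of each one
  means <- vapply(seq_len(B),
                  function(i) mean(x[sample.int(n, n, replace = TRUE)]),
                  numeric(1))
  # Assumed interval: equal-tailed quantiles of the bootstrapped means
  lims <- quantile(means, c((1 - conf.int) / 2, (1 + conf.int) / 2),
                   names = FALSE)
  res <- c(Mean = mean(x), Lower = lims[1], Upper = lims[2])
  if (reps) attr(res, "reps") <- means   # mirrors the documented reps attribute
  res
}

set.seed(1)
out <- boot_mean_sketch(rnorm(100), B = 500, reps = TRUE)
length(attr(out, "reps"))                # one bootstrapped mean per resample
```

Keeping the bootstrapped means as an attribute lets a caller combine replicates from separate groups, for example to form an interval for a difference in means.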
Author
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
Examples
set.seed(1)
x <- rnorm(100)
smean.sd(x)
#> Mean SD
#> 0.109 0.898
smean.sdl(x)
#> Mean Lower Upper
#> 0.109 -1.688 1.905
smean.cl.normal(x)
#> Mean Lower Upper
#> 0.1089 -0.0693 0.2871
smean.cl.boot(x)
#> Mean Lower Upper
#> 0.1089 -0.0582 0.2770
smedian.hilow(x, conf.int=.5) # 25th and 75th percentiles
#> Median Lower Upper
#> 0.114 -0.494 0.692
# Function to compute 0.95 confidence interval for the difference in two means
# g is grouping variable
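bootdif <- function(y, g) {
  g <- as.factor(g)
  # Bootstrapped means for the first and second levels of g
  a <- attr(smean.cl.boot(y[g==levels(g)[1]], B=2000, reps=TRUE),'reps')
  b <- attr(smean.cl.boot(y[g==levels(g)[2]], B=2000, reps=TRUE),'reps')
  # Observed difference in means: second level minus first level
  meandif <- diff(tapply(y, g, mean, na.rm=TRUE))
  # 0.025 and 0.975 quantiles of the bootstrapped differences
  a.b <- quantile(b-a, c(.025,.975))
  res <- c(meandif, a.b)
  names(res) <- c('Mean Difference','.025','.975')
  res
}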