
The work is done by the functions summarizeNumerics and summarizeFactors. Please see the help pages for those functions for complete details.

Usage

summarize(
  dat,
  alphaSort = FALSE,
  stats = c("mean", "sd", "skewness", "kurtosis", "entropy", "normedEntropy", "nobs",
    "nmiss"),
  probs = c(0, 0.5, 1),
  digits = 3,
  ...
)

Arguments

dat

A data frame

alphaSort

If TRUE, the columns are re-organized in alphabetical order. If FALSE, they are presented in the original order.

stats

A vector of desired summary statistics. Set stats = NULL to omit all summary statistics. Legal elements are c("min", "med", "max", "mean", "sd", "var", "skewness", "kurtosis", "entropy", "normedEntropy", "nobs", "nmiss"). The statistics c("entropy", "normedEntropy") are available only for factor variables, while the mean, variance, and so forth are calculated only for numeric variables. "nobs" is the number of observations with non-missing, finite scores (not NA, NaN, -Inf, or Inf). "nmiss" is the number of cases with values of NA. The default setting for probs causes c("min", "med", "max") to be included, so they need not be requested explicitly. To disable them, revise probs.

probs

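For numeric variables, probs is passed to the quantile function. The default is probs = c(0, 0.5, 1), and the resulting quantiles are labeled in the output as "min", "med", and "max". Other probabilities are labeled as percentiles, for example "pctile_25%". Set probs = NULL to omit quantiles from the output.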

digits

Number of decimal places to display. Defaults to 3.

...

Optional arguments that are passed to summarizeNumerics and summarizeFactors. For numeric variables, one can specify na.rm and unbiased. For discrete variables, the key argument is maxLevels, which determines the number of levels that will be reported in tables for discrete variables.
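As a rough illustration of the discrete-variable statistics named in stats and of the effect of maxLevels, the following base R sketch computes Shannon entropy from a frequency table, normalizes it by the maximum possible for the number of levels in the table, and collapses the table for display. The helpers entropyBits and collapseLevels are hypothetical and are not part of rockchalk. The base-2 logarithm and the "(All Others)" pooling rule are assumptions inferred from the output printed in the Examples, not taken from the package source.

```r
## Hypothetical helpers for illustration; these are not rockchalk functions.

## Shannon entropy in bits (log base 2 is an assumption inferred from the
## Examples output), ignoring empty levels.
entropyBits <- function(counts) {
    p <- counts[counts > 0] / sum(counts)
    -sum(p * log2(p))
}

## Keep the (maxLevels - 1) most frequent levels and pool the rest.
## Ties are broken by original position in the table.
collapseLevels <- function(counts, maxLevels = 5) {
    if (length(counts) <= maxLevels) return(counts)
    ord <- order(-counts, seq_along(counts))
    keep <- counts[ord[seq_len(maxLevels - 1)]]
    c(keep, "(All Others)" = sum(counts) - sum(keep))
}

## Counts for x2 as printed in the Examples
x2counts <- c(L = 8, M = 15, N = 16, O = 11, P = 15, Q = 12, R = 7, S = 16)
ent <- entropyBits(x2counts)
normed <- ent / log2(length(x2counts))
round(c(entropy = ent, normedEntropy = normed), 3)
collapsed <- collapseLevels(x2counts, maxLevels = 5)
collapsed
```

With the x2 counts from the Examples, the rounded values agree with the 2.945 and 0.982 printed by summarize(dat), and the collapsed table matches the displayed levels N, S, M, P followed by "(All Others)" with 38 cases.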

Value

Returns a list with two objects: 1) "numerics", the output from summarizeNumerics, a data frame with variable names on rows and summary statistics on columns; 2) "factors", the output from summarizeFactors, a list with summary information about each discrete variable. The on-screen display is governed by the print method print.summarize.
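To make the layout of the numeric component concrete, here is a base R sketch that builds a data frame with one row per numeric variable and one column per statistic. The helper numericSketch is hypothetical. It only imitates the row and column labels seen in the Examples and counts nobs and nmissing following the definitions given for the stats argument; rockchalk's own implementation may differ.

```r
## Hypothetical helper for illustration; not part of rockchalk.
numericSketch <- function(dat, probs = c(0, 0.5, 1)) {
    isNum <- vapply(dat, is.numeric, logical(1))
    oneVar <- function(x) {
        ok <- x[is.finite(x)]                 # drops NA, NaN, -Inf, Inf
        q <- quantile(ok, probs = probs, names = FALSE)
        ## Labels chosen to mimic those in the Examples output
        qnames <- paste0("pctile_", 100 * probs, "%")
        qnames[probs == 0] <- "min"
        qnames[probs == 0.5] <- "med"
        qnames[probs == 1] <- "max"
        names(q) <- qnames
        ## is.na() is also TRUE for NaN; how rockchalk counts NaN in
        ## "nmiss" is not stated here, so the toy data avoid NaN.
        c(q, mean = mean(ok), sd = sd(ok),
          nobs = length(ok), nmissing = sum(is.na(x)))
    }
    ## vapply gives statistics on rows; transpose so variables are rows
    as.data.frame(t(vapply(dat[isNum], oneVar, numeric(length(probs) + 4))))
}

toy <- data.frame(a = c(1, 2, 3, NA, 10),
                  b = c(5, Inf, 7, 9, 11),
                  g = factor(c("u", "v", "u", "v", "u")))
res <- numericSketch(toy)
res
```

Because the result is an ordinary data frame, a column such as res[["mean"]] can be passed directly to plotting or diagnostic code, which is the use this return structure is meant to support.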

Details

The main purpose is to generate a summary data structure that is more useful in subsequent data analysis. The numeric portion of the summary is a data frame that can be used in plots or other diagnostics.

The term "factors" is used throughout, but "discrete variables" would be more accurate. The factor summaries cover all logical, factor, ordered, and character variables.

Other variable types, such as Dates, will be ignored, with a warning.
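The split between numeric and discrete columns described above can be sketched in base R. classifyColumns is a hypothetical helper written only for illustration. The rule it applies (numeric; logical, factor, ordered, or character; anything else ignored with a warning) paraphrases this section and is not copied from the package source.

```r
## Hypothetical helper for illustration; not part of rockchalk.
classifyColumns <- function(dat) {
    kind <- vapply(dat, function(x) {
        if (is.numeric(x)) {
            "numeric"
        } else if (is.logical(x) || is.factor(x) || is.character(x)) {
            "discrete"      # is.factor() is TRUE for ordered factors too
        } else {
            "ignored"       # e.g. Date columns
        }
    }, character(1))
    ignored <- names(kind)[kind == "ignored"]
    if (length(ignored) > 0) {
        warning("ignored columns: ", paste(ignored, collapse = ", "))
    }
    kind
}

toy <- data.frame(num = c(1.5, 2.5),
                  flag = c(TRUE, FALSE),
                  grp = factor(c("a", "b")),
                  txt = c("x", "y"),
                  day = as.Date(c("2020-01-01", "2020-01-02")),
                  stringsAsFactors = FALSE)
kinds <- suppressWarnings(classifyColumns(toy))
kinds
```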

Author

Paul E. Johnson <pauljohn@ku.edu>

Examples

library(rockchalk)


set.seed(23452345)
N <- 100
x1 <- gl(12, 2, labels = LETTERS[1:12])
x2 <- gl(8, 3, labels = LETTERS[12:24])
x1 <- sample(x = x1, size=N, replace = TRUE)
x2 <- sample(x = x2, size=N, replace = TRUE)
z1 <- rnorm(N)
a1 <- rnorm(N, mean = 1.2, sd = 11.7)
a2 <- rpois(N, lambda = 10 + abs(a1))
a3 <- rgamma(N, 0.5, 4)
b1 <- rnorm(N, mean = 211.3, sd = 0.4)
dat <- data.frame(z1, a1, x2, a2, x1, a3, b1)
summary(dat)
#>        z1                a1                 x2           a2              x1    
#>  Min.   :-2.0855   Min.   :-21.5918   N      :16   Min.   : 4.00   I      :13  
#>  1st Qu.:-0.8893   1st Qu.: -9.4553   S      :16   1st Qu.:14.00   H      :12  
#>  Median :-0.1856   Median : -1.7660   M      :15   Median :20.50   L      :11  
#>  Mean   :-0.1872   Mean   : -0.3338   P      :15   Mean   :21.19   G      :10  
#>  3rd Qu.: 0.4332   3rd Qu.:  8.6615   Q      :12   3rd Qu.:26.25   J      :10  
#>  Max.   : 2.0231   Max.   : 38.0818   O      :11   Max.   :53.00   E      : 9  
#>                                       (Other):15                   (Other):35  
#>        a3                  b1       
#>  Min.   :3.330e-06   Min.   :210.1  
#>  1st Qu.:1.037e-02   1st Qu.:211.0  
#>  Median :6.276e-02   Median :211.3  
#>  Mean   :1.333e-01   Mean   :211.3  
#>  3rd Qu.:1.713e-01   3rd Qu.:211.6  
#>  Max.   :1.129e+00   Max.   :212.3  
#>                                     

summarize(dat)
#> Numeric variables
#>              z1        a1        a2        a3        b1   
#> min         -2.086   -21.592     4         0       210.101
#> med         -0.186    -1.766    20.500     0.063   211.322
#> max          2.023    38.082    53         1.129   212.291
#> mean        -0.187    -0.334    21.190     0.133   211.316
#> sd           0.956    12.810     9.054     0.184     0.422
#> skewness     0.161     0.463     0.837     2.440    -0.166
#> kurtosis    -0.606    -0.246     0.770     7.918    -0.240
#> nobs       100       100       100       100       100    
#> nmissing     0         0         0         0         0    
#> 
#> Nonnumeric variables
#>                   x2                     x1   
#>  N           : 16       I           : 13      
#>  S           : 16       H           : 12      
#>  M           : 15       L           : 11      
#>  P           : 15       G           : 10      
#>  (All Others): 38       (All Others): 54      
#>  nobs         : 100.000 nobs         : 100.000
#>  nmiss        :   0.000 nmiss        :   0.000
#>  entropy      :   2.945 entropy      :   3.503
#>  normedEntropy:   0.982 normedEntropy:   0.977

summarize(dat, digits = 4)
#> Numeric variables
#>               z1         a1         a2         a3         b1   
#> min         -2.0855   -21.5918     4          0        210.1006
#> med         -0.1856    -1.7660    20.5000     0.0628   211.3223
#> max          2.0231    38.0818    53          1.1294   212.2905
#> mean        -0.1872    -0.3338    21.1900     0.1333   211.3164
#> sd           0.9558    12.8097     9.0539     0.1842     0.4216
#> skewness     0.1606     0.4632     0.8366     2.4396    -0.1657
#> kurtosis    -0.6063    -0.2458     0.7702     7.9180    -0.2397
#> nobs       100        100        100        100        100     
#> nmissing     0          0          0          0          0     
#> 
#> Nonnumeric variables
#>                   x2                      x1    
#>  N           : 16        I           : 13       
#>  S           : 16        H           : 12       
#>  M           : 15        L           : 11       
#>  P           : 15        G           : 10       
#>  (All Others): 38        (All Others): 54       
#>  nobs         : 100.0000 nobs         : 100.0000
#>  nmiss        :   0.0000 nmiss        :   0.0000
#>  entropy      :   2.9445 entropy      :   3.5031
#>  normedEntropy:   0.9815 normedEntropy:   0.9772

summarize(dat, stats = c("min", "max", "mean", "sd"),
          probs = c(0.25, 0.75))
#> Numeric variables
#>                z1        a1        a2        a3        b1   
#> pctile_25%    -0.889    -9.455    14         0.010   211.009
#> pctile_75%     0.433     8.661    26.250     0.171   211.601
#> mean          -0.187    -0.334    21.190     0.133   211.316
#> sd             0.956    12.810     9.054     0.184     0.422
#> 
#> Nonnumeric variables
#>             x2               x1   
#>  N           : 16 I           : 13
#>  S           : 16 H           : 12
#>  M           : 15 L           : 11
#>  P           : 15 G           : 10
#>  (All Others): 38 (All Others): 54

summarize(dat, probs = c(0, 0.20, 0.80),
          stats = c("nobs", "mean", "med", "entropy"))
#> Numeric variables
#>                z1        a1        a2        a3        b1   
#> min           -2.086   -21.592     4         0       210.101
#> pctile_20%    -1.020   -11.849    14         0.008   210.967
#> pctile_80%     0.702     9.784    29         0.210   211.711
#> mean          -0.187    -0.334    21.190     0.133   211.316
#> nobs         100       100       100       100       100    
#> 
#> Nonnumeric variables
#>                  x2                 x1
#>  N           : 16   I           : 13  
#>  S           : 16   H           : 12  
#>  M           : 15   L           : 11  
#>  P           : 15   G           : 10  
#>  (All Others): 38   (All Others): 54  
#>  nobs   : 100.00    nobs   : 100.0    
#>  entropy:   2.94    entropy:   3.5    

summarize(dat, probs = c(0, 0.20, 0.50),
          stats = c("nobs", "nmiss", "mean", "entropy"), maxLevels=10)
#> Numeric variables
#>                z1        a1        a2        a3        b1   
#> min           -2.086   -21.592     4         0       210.101
#> pctile_20%    -1.020   -11.849    14         0.008   210.967
#> med           -0.186    -1.766    20.500     0.063   211.322
#> mean          -0.187    -0.334    21.190     0.133   211.316
#> nobs         100       100       100       100       100    
#> nmissing       0         0         0         0         0    
#> 
#> Nonnumeric variables
#>             x2                   x1
#>  L: 8            I           : 13  
#>  M: 15           H           : 12  
#>  N: 16           L           : 11  
#>  O: 11           G           : 10  
#>  P: 15           J           : 10  
#>  Q: 12           E           : 9   
#>  R: 7            A           : 7   
#>  S: 16           B           : 7   
#>                  F           : 6   
#>                  (All Others): 15  
#>  nobs   : 100.00 nobs   : 100.0    
#>  nmiss  :   0.00 nmiss  :   0.0    
#>  entropy:   2.94 entropy:   3.5    

dat.sum <- summarize(dat, probs = c(0, 0.20, 0.50),
                     stats = c("nobs", "nmiss", "mean", "entropy"), maxLevels=10)
dat.sum
#> Numeric variables
#>                z1        a1        a2        a3        b1   
#> min           -2.086   -21.592     4         0       210.101
#> pctile_20%    -1.020   -11.849    14         0.008   210.967
#> med           -0.186    -1.766    20.500     0.063   211.322
#> mean          -0.187    -0.334    21.190     0.133   211.316
#> nobs         100       100       100       100       100    
#> nmissing       0         0         0         0         0    
#> 
#> Nonnumeric variables
#>             x2                   x1
#>  L: 8            I           : 13  
#>  M: 15           H           : 12  
#>  N: 16           L           : 11  
#>  O: 11           G           : 10  
#>  P: 15           J           : 10  
#>  Q: 12           E           : 9   
#>  R: 7            A           : 7   
#>  S: 16           B           : 7   
#>                  F           : 6   
#>                  (All Others): 15  
#>  nobs   : 100.00 nobs   : 100.0    
#>  nmiss  :   0.00 nmiss  :   0.0    
#>  entropy:   2.94 entropy:   3.5    
## Inspect the unformatted structure of the objects in the returned list
dat.sum[["numerics"]]
#>              min    pctile_20%         med        mean nobs nmissing
#> z1 -2.085521e+00  -1.019580912  -0.1856309  -0.1872083  100        0
#> a1 -2.159177e+01 -11.849475853  -1.7659806  -0.3337909  100        0
#> a2  4.000000e+00  14.000000000  20.5000000  21.1900000  100        0
#> a3  3.333462e-06   0.008299486   0.0627584   0.1332853  100        0
#> b1  2.101006e+02 210.967102024 211.3223015 211.3163960  100        0
dat.sum[["factors"]]
#> $x2
#> $x2$table
#>  L  M  N  O  P  Q  R  S 
#>  8 15 16 11 15 12  7 16 
#> 
#> $x2$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     2.9445412     0.9815137 
#> 
#> 
#> $x1
#> $x1$table
#>  A  B  C  D  E  F  G  H  I  J  K  L 
#>  7  7  5  4  9  6 10 12 13 10  6 11 
#> 
#> $x1$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     3.5030656     0.9771554 
#> 
#> 
#> attr(,"class")
#> [1] "summarizedFactors"
#> attr(,"maxLevels")
#> [1] 10
#> attr(,"stats")
#> [1] "nobs"    "nmiss"   "mean"    "entropy"
#> attr(,"digits")
#> [1] 3

## Only quantile values, no summary stats for numeric variables
## Discrete variables get entropy
summarize(dat, 
          probs = c(0, 0.25, 0.50, 0.75, 1.0),
          stats = "entropy", digits = 2)
#> Numeric variables
#>                z1       a1       a2       a3       b1  
#> min           -2.09   -21.59     4        0      210.10
#> pctile_25%    -0.89    -9.46    14        0.01   211.01
#> med           -0.19    -1.77    20.50     0.06   211.32
#> pctile_75%     0.43     8.66    26.25     0.17   211.60
#> max            2.02    38.08    53        1.13   212.29
#> 
#> Nonnumeric variables
#>                  x2                 x1
#>  N           : 16   I           : 13  
#>  S           : 16   H           : 12  
#>  M           : 15   L           : 11  
#>  P           : 15   G           : 10  
#>  (All Others): 38   (All Others): 54  
#>  entropy: 2.9       entropy: 3.5      

## Quantiles and the mean for numeric variables.
## No diversity stats for discrete variables (entropy omitted)
summarize(dat, 
          probs = c(0, 0.25, 0.50, 0.75, 1.0),
          stats = "mean")
#> Numeric variables
#>                z1        a1        a2        a3        b1   
#> min           -2.086   -21.592     4         0       210.101
#> pctile_25%    -0.889    -9.455    14         0.010   211.009
#> med           -0.186    -1.766    20.500     0.063   211.322
#> pctile_75%     0.433     8.661    26.250     0.171   211.601
#> max            2.023    38.082    53         1.129   212.291
#> mean          -0.187    -0.334    21.190     0.133   211.316
#> 
#> Nonnumeric variables
#>             x2               x1   
#>  N           : 16 I           : 13
#>  S           : 16 H           : 12
#>  M           : 15 L           : 11
#>  P           : 15 G           : 10
#>  (All Others): 38 (All Others): 54

summarize(dat, 
          probs = NULL,
          stats = "mean")
#> Numeric variables
#>          z1        a1        a2        a3        b1   
#> mean    -0.187    -0.334    21.190     0.133   211.316
#> 
#> Nonnumeric variables
#>             x2               x1   
#>  N           : 16 I           : 13
#>  S           : 16 H           : 12
#>  M           : 15 L           : 11
#>  P           : 15 G           : 10
#>  (All Others): 38 (All Others): 54

## Note: output is not beautified by a print method
dat.sn <- summarizeNumerics(dat)
dat.sn
#>              min         med        max        mean         sd   skewness
#> z1 -2.085521e+00  -0.1856309   2.023081  -0.1872083  0.9558129  0.1605997
#> a1 -2.159177e+01  -1.7659806  38.081822  -0.3337909 12.8097455  0.4631669
#> a2  4.000000e+00  20.5000000  53.000000  21.1900000  9.0539293  0.8366467
#> a3  3.333462e-06   0.0627584   1.129405   0.1332853  0.1842008  2.4395965
#> b1  2.101006e+02 211.3223015 212.290534 211.3163960  0.4216084 -0.1656890
#>      kurtosis nobs nmissing
#> z1 -0.6063152  100        0
#> a1 -0.2458057  100        0
#> a2  0.7701532  100        0
#> a3  7.9180262  100        0
#> b1 -0.2396525  100        0
formatSummarizedNumerics(dat.sn)
#>            z1     a1     a2     a3     b1  
#> min       -2.09 -21.59   4      0    210.10
#> med       -0.19  -1.77  20.50   0.06 211.32
#> max        2.02  38.08  53      1.13 212.29
#> mean      -0.19  -0.33  21.19   0.13 211.32
#> sd         0.96  12.81   9.05   0.18   0.42
#> skewness   0.16   0.46   0.84   2.44  -0.17
#> kurtosis  -0.61  -0.25   0.77   7.92  -0.24
#> nobs     100    100    100    100    100   
#> nmissing   0      0      0      0      0   
formatSummarizedNumerics(dat.sn, digits = 5)
#>              z1        a1        a2        a3        b1   
#> min       -2.08552 -21.59177   4         0       210.10061
#> med       -0.18563  -1.76598  20.50000   0.06276 211.32230
#> max        2.02308  38.08182  53         1.12941 212.29053
#> mean      -0.18721  -0.33379  21.19000   0.13329 211.31640
#> sd         0.95581  12.80975   9.05393   0.18420   0.42161
#> skewness   0.16060   0.46317   0.83665   2.43960  -0.16569
#> kurtosis  -0.60632  -0.24581   0.77015   7.91803  -0.23965
#> nobs     100       100       100       100       100      
#> nmissing   0         0         0         0         0      

dat.summ <- summarize(dat)


dat.sf <- summarizeFactors(dat, maxLevels = 20)
dat.sf
#> $x1
#> $x1$table
#>  A  B  C  D  E  F  G  H  I  J  K  L 
#>  7  7  5  4  9  6 10 12 13 10  6 11 
#> 
#> $x1$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     3.5030656     0.9771554 
#> 
#> 
#> $x2
#> $x2$table
#>  L  M  N  O  P  Q  R  S 
#>  8 15 16 11 15 12  7 16 
#> 
#> $x2$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     2.9445412     0.9815137 
#> 
#> 
#> attr(,"class")
#> [1] "summarizedFactors"
#> attr(,"maxLevels")
#> [1] 20
#> attr(,"stats")
#> [1] "entropy"       "normedEntropy" "nobs"          "nmiss"        
#> attr(,"digits")
#> [1] 2
formatSummarizedFactors(dat.sf)
#>                   x1                    x2  
#>  A: 7                  L: 8                 
#>  B: 7                  M: 15                
#>  C: 5                  N: 16                
#>  D: 4                  O: 11                
#>  E: 9                  P: 15                
#>  F: 6                  Q: 12                
#>  G: 10                 R: 7                 
#>  H: 12                 S: 16                
#>  I: 13                                      
#>  J: 10                                      
#>  K: 6                                       
#>  L: 11                                      
#>  nobs         : 100.00 nobs         : 100.00
#>  nmiss        :   0.00 nmiss        :   0.00
#>  entropy      :   3.50 entropy      :   2.94
#>  normedEntropy:   0.98 normedEntropy:   0.98

## See actual values of factor summaries, without
## beautified printing
summarizeFactors(dat, maxLevels = 5)
#> $x1
#> $x1$table
#>  A  B  C  D  E  F  G  H  I  J  K  L 
#>  7  7  5  4  9  6 10 12 13 10  6 11 
#> 
#> $x1$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     3.5030656     0.9771554 
#> 
#> 
#> $x2
#> $x2$table
#>  L  M  N  O  P  Q  R  S 
#>  8 15 16 11 15 12  7 16 
#> 
#> $x2$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     2.9445412     0.9815137 
#> 
#> 
#> attr(,"class")
#> [1] "summarizedFactors"
#> attr(,"maxLevels")
#> [1] 5
#> attr(,"stats")
#> [1] "entropy"       "normedEntropy" "nobs"          "nmiss"        
#> attr(,"digits")
#> [1] 2
formatSummarizedFactors(summarizeFactors(dat, maxLevels = 5))
#>                   x1                    x2  
#>  I           : 13      N           : 16     
#>  H           : 12      S           : 16     
#>  L           : 11      M           : 15     
#>  G           : 10      P           : 15     
#>  (All Others): 54      (All Others): 38     
#>  nobs         : 100.00 nobs         : 100.00
#>  nmiss        :   0.00 nmiss        :   0.00
#>  entropy      :   3.50 entropy      :   2.94
#>  normedEntropy:   0.98 normedEntropy:   0.98

summarize(dat, alphaSort = TRUE) 
#> Numeric variables
#>              a1        a2        a3        b1        z1   
#> min        -21.592     4         0       210.101    -2.086
#> med         -1.766    20.500     0.063   211.322    -0.186
#> max         38.082    53         1.129   212.291     2.023
#> mean        -0.334    21.190     0.133   211.316    -0.187
#> sd          12.810     9.054     0.184     0.422     0.956
#> skewness     0.463     0.837     2.440    -0.166     0.161
#> kurtosis    -0.246     0.770     7.918    -0.240    -0.606
#> nobs       100       100       100       100       100    
#> nmissing     0         0         0         0         0    
#> 
#> Nonnumeric variables
#>                   x1                     x2   
#>  I           : 13       N           : 16      
#>  H           : 12       S           : 16      
#>  L           : 11       M           : 15      
#>  G           : 10       P           : 15      
#>  (All Others): 54       (All Others): 38      
#>  nobs         : 100.000 nobs         : 100.000
#>  nmiss        :   0.000 nmiss        :   0.000
#>  entropy      :   3.503 entropy      :   2.945
#>  normedEntropy:   0.977 normedEntropy:   0.982

summarize(dat, digits = 6, alphaSort = FALSE)
#> Numeric variables
#>                z1           a1           a2           a3           b1    
#> min         -2.085521   -21.591766     4            0.000003   210.100610
#> med         -0.185631    -1.765981    20.500000     0.062758   211.322302
#> max          2.023081    38.081822    53            1.129405   212.290534
#> mean        -0.187208    -0.333791    21.190000     0.133285   211.316396
#> sd           0.955813    12.809746     9.053929     0.184201     0.421608
#> skewness     0.160600     0.463167     0.836647     2.439596    -0.165689
#> kurtosis    -0.606315    -0.245806     0.770153     7.918026    -0.239653
#> nobs       100          100          100          100          100       
#> nmissing     0            0            0            0            0       
#> 
#> Nonnumeric variables
#>                   x2                        x1      
#>  N           : 16          I           : 13         
#>  S           : 16          H           : 12         
#>  M           : 15          L           : 11         
#>  P           : 15          G           : 10         
#>  (All Others): 38          (All Others): 54         
#>  nobs         : 100.000000 nobs         : 100.000000
#>  nmiss        :   0.000000 nmiss        :   0.000000
#>  entropy      :   2.944541 entropy      :   3.503066
#>  normedEntropy:   0.981514 normedEntropy:   0.977155


summarize(dat, maxLevels = 2)
#> Numeric variables
#>              z1        a1        a2        a3        b1   
#> min         -2.086   -21.592     4         0       210.101
#> med         -0.186    -1.766    20.500     0.063   211.322
#> max          2.023    38.082    53         1.129   212.291
#> mean        -0.187    -0.334    21.190     0.133   211.316
#> sd           0.956    12.810     9.054     0.184     0.422
#> skewness     0.161     0.463     0.837     2.440    -0.166
#> kurtosis    -0.606    -0.246     0.770     7.918    -0.240
#> nobs       100       100       100       100       100    
#> nmissing     0         0         0         0         0    
#> 
#> Nonnumeric variables
#>                   x2                     x1   
#>  N           : 16       I           : 13      
#>  (All Others): 84       (All Others): 87      
#>  nobs         : 100.000 nobs         : 100.000
#>  nmiss        :   0.000 nmiss        :   0.000
#>  entropy      :   2.945 entropy      :   3.503
#>  normedEntropy:   0.982 normedEntropy:   0.977

datsumm <- summarize(dat, stats = c("mean", "sd", "var", "entropy", "nobs"))

## Unbeautified numeric data frame, variables on the rows
datsumm[["numerics"]]
#>              min         med        max        mean         sd          var
#> z1 -2.085521e+00  -0.1856309   2.023081  -0.1872083  0.9558129   0.91357829
#> a1 -2.159177e+01  -1.7659806  38.081822  -0.3337909 12.8097455 164.08958047
#> a2  4.000000e+00  20.5000000  53.000000  21.1900000  9.0539293  81.97363636
#> a3  3.333462e-06   0.0627584   1.129405   0.1332853  0.1842008   0.03392994
#> b1  2.101006e+02 211.3223015 212.290534 211.3163960  0.4216084   0.17775367
#>    nobs
#> z1  100
#> a1  100
#> a2  100
#> a3  100
#> b1  100
## Beautified versions. 1. Show the saved formatted version:
attr(datsumm, "numeric.formatted")
#>        z1      a1      a2      a3      b1   
#> min   -2.086 -21.592   4       0     210.101
#> med   -0.186  -1.766  20.500   0.063 211.322
#> max    2.023  38.082  53       1.129 212.291
#> mean  -0.187  -0.334  21.190   0.133 211.316
#> sd     0.956  12.810   9.054   0.184   0.422
#> var    0.914 164.090  81.974   0.034   0.178
#> nobs 100     100     100     100     100    
## 2. Run formatSummarizedNumerics to re-specify digits:
formatSummarizedNumerics(datsumm[["numerics"]], digits = 10)
#>            z1             a1             a2             a3             b1      
#> min   -2.0855207951 -21.5917659362   4              0.0000033335 210.1006096596
#> med   -0.1856309007  -1.7659806343  20.5000000000   0.0627584000 211.3223015364
#> max    2.0230808757  38.0818218927  53              1.1294052546 212.2905342958
#> mean  -0.1872083340  -0.3337909315  21.1900000000   0.1332852775 211.3163960106
#> sd     0.9558128927  12.8097455271   9.0539293328   0.1842008171   0.4216084338
#> var    0.9135782858 164.0895804693  81.9736363636   0.0339299410   0.1777536715
#> nobs 100            100            100            100            100           

datsumm[["factors"]]
#> $x2
#> $x2$table
#>  L  M  N  O  P  Q  R  S 
#>  8 15 16 11 15 12  7 16 
#> 
#> $x2$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     2.9445412     0.9815137 
#> 
#> 
#> $x1
#> $x1$table
#>  A  B  C  D  E  F  G  H  I  J  K  L 
#>  7  7  5  4  9  6 10 12 13 10  6 11 
#> 
#> $x1$stats
#>          nobs         nmiss       entropy normedEntropy 
#>   100.0000000     0.0000000     3.5030656     0.9771554 
#> 
#> 
#> attr(,"class")
#> [1] "summarizedFactors"
#> attr(,"maxLevels")
#> [1] 5
#> attr(,"stats")
#> [1] "mean"    "sd"      "var"     "entropy" "nobs"   
#> attr(,"digits")
#> [1] 3
formatSummarizedFactors(datsumm[["factors"]])
#>                  x2                 x1
#>  N           : 16   I           : 13  
#>  S           : 16   H           : 12  
#>  M           : 15   L           : 11  
#>  P           : 15   G           : 10  
#>  (All Others): 38   (All Others): 54  
#>  nobs   : 100.00    nobs   : 100.0    
#>  entropy:   2.94    entropy:   3.5    
formatSummarizedFactors(datsumm[["factors"]], digits = 6, maxLevels = 10)
#>             x2                      x1
#>  L: 8               I           : 13  
#>  M: 15              H           : 12  
#>  N: 16              L           : 11  
#>  O: 11              G           : 10  
#>  P: 15              J           : 10  
#>  Q: 12              E           : 9   
#>  R: 7               A           : 7   
#>  S: 16              B           : 7   
#>                     F           : 6   
#>                     (All Others): 15  
#>  nobs   : 100.00000 nobs   : 100.00000
#>  entropy:   2.94454 entropy:   3.50307