R/quantcut.R
quantcut.Rd
Create a factor variable using the quantiles of a continuous variable.
quantcut(x, q = 4, na.rm = TRUE, ...)
x: Continuous variable.
q: Either an integer number of equally spaced quantile groups to
create, or a vector of quantiles used for creating groups. Defaults to
q=4, which is equivalent to q=seq(0, 1, by=0.25). See
quantile for details.
na.rm: Boolean indicating whether missing values should be removed when computing quantiles. Defaults to TRUE.
...: Optional arguments passed to cut.
Factor variable with one level for each quantile interval.
This function uses quantile to obtain the specified quantiles
of x, then calls cut to create a factor variable using
the intervals specified by these quantiles.
It handles cases where more than one quantile takes the same value, as in the tied-quantiles example below. In that case, fewer factor levels are generated than the specified number of quantile intervals.
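The quantile-then-cut idea described above can be sketched in a few lines of base R. This is a simplified illustration, not the gtools source: quantcut_sketch is a hypothetical name, and dropping duplicate breaks with unique is an assumption about one reasonable way to cope with ties. Where quantiles are tied, the grouping and level labels it produces may differ from those of quantcut.

```r
# Simplified sketch of the quantile-then-cut idea; NOT the gtools source.
# `quantcut_sketch` is a hypothetical name used only for illustration.
quantcut_sketch <- function(x, q = 4, na.rm = TRUE, ...) {
  # A single integer q is expanded to q equally spaced quantile groups,
  # e.g. q = 4 becomes the probabilities 0, 0.25, 0.5, 0.75, 1.
  if (length(q) == 1) {
    q <- seq(0, 1, length.out = q + 1)
  }
  # Quantile values that serve as the interval boundaries.
  breaks <- quantile(x, probs = q, na.rm = na.rm)
  # Tied quantiles give duplicate breaks, which cut() rejects, so drop them
  # (assumption: the real function treats ties more carefully than this).
  breaks <- unique(breaks)
  # include.lowest = TRUE keeps the minimum of x in the first interval.
  cut(x, breaks = breaks, include.lowest = TRUE, ...)
}

set.seed(1234)
x <- rnorm(1000)

quartiles <- quantcut_sketch(x)
table(quartiles)   # four groups of 250 observations each

tied <- quantcut_sketch(round(x), 10)
nlevels(tied)      # fewer than 10 levels because several deciles coincide
```

Because the sketch simply merges duplicate breaks, it shows why tied quantiles reduce the number of levels, but it should not be read as a drop-in replacement for quantcut.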
## create example data
set.seed(1234)
x <- rnorm(1000)
## cut into quartiles
quartiles <- quantcut(x)
table(quartiles)
#> quartiles
#>     [-3.4,-0.673] (-0.673,-0.0398]  (-0.0398,0.616]      (0.616,3.2]
#>               250              250              250              250
## cut into deciles
deciles.1 <- quantcut(x, 10)
table(deciles.1)
#> deciles.1
#>     [-3.4,-1.21]   (-1.21,-0.849]  (-0.849,-0.538]  (-0.538,-0.285]
#>              100              100              100              100
#> (-0.285,-0.0398]  (-0.0398,0.193]    (0.193,0.466]    (0.466,0.761]
#>              100              100              100              100
#>     (0.761,1.33]       (1.33,3.2]
#>              100              100
# or equivalently
deciles.2 <- quantcut(x, seq(0, 1, by = 0.1))
stopifnot(identical(deciles.1, deciles.2))
## show handling of 'tied' quantiles.
x <- round(x) # discretize to create ties
stem(x) # display the ties
#>
#> The decimal point is at the |
#>
#> -3 | 0000000000000
#> -2 |
#> -2 | 000000000000000000000000000000000000000000000000
#> -1 |
#> -1 | 00000000000000000000000000000000000000000000000000000000000000000000+181
#> -0 |
#> -0 |
#> 0 | 00000000000000000000000000000000000000000000000000000000000000000000+310
#> 0 |
#> 1 | 00000000000000000000000000000000000000000000000000000000000000000000+140
#> 1 |
#> 2 | 000000000000000000000000000000000000000000000000000000000000000
#> 2 |
#> 3 | 00000
#>
deciles <- quantcut(x, 10)
table(deciles) # note that there are only 5 groups (not 10) due to duplicates
#> deciles
#> [-3,-1)      -1       0       1   (1,3]
#>      61     261     390     220      68