recode a factor by "combining" levels
combineLevels merges several levels of a factor into a single level and returns a new factor variable. If a factor variable is currently coded with levels c("Male","Female","Man","M"), and the user needs to combine the redundant levels for males, this is the function to use. Doing this by hand is a surprisingly difficult problem in R.
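For comparison, the recoding described above can be sketched in base R alone. This is not combineLevels itself and says nothing about how it is implemented; it uses the list form of the levels replacement function, and the variable names sex and sexClean and the data are made up for illustration.

```r
# Illustrative data (an assumption, not from the package):
# four spellings, but only two real categories.
sex <- factor(c("Male", "Female", "Man", "M", "Female", "M"))

# Base R idiom: assign a named list to levels(). Each name is a new level,
# and each element lists the old levels that are folded into it.
sexClean <- sex
levels(sexClean) <- list(Male = c("Male", "Man", "M"), Female = "Female")

levels(sexClean)
# Cross-tabulate new against old to confirm the recoding.
table(sexClean, sex)
```

With the list form, every level that should survive has to be named explicitly; levels left out of the list generally become NA. That bookkeeping is what a helper such as combineLevels spares the user.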
Arguments
- fac
An R factor variable, either ordered or not.
- levs
The levels to be combined. Users may specify either a numeric vector of level positions, such as c(1,2,3), to combine the first three elements of levels(fac), or they may specify level names. Names can be given as a character vector of *correctly spelled* level names, such as c("Yes","Maybe","Always"), or as a subset of the output from levels, such as levels(fac)[1:3].
- newLabel
A character string that represents the label of the new level to be created when the levs values are combined.
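The three ways of writing levs all identify the same set of level names. A minimal base R sketch of that equivalence follows; resolveLevs is a hypothetical helper written for this illustration, not a rockchalk function, and the data are invented.

```r
# Hypothetical helper (not part of rockchalk): turn a `levs` argument
# into level names, and stop if any entry is not a level of `fac`.
resolveLevs <- function(fac, levs) {
  if (is.numeric(levs)) {
    levs <- levels(fac)[levs]   # positions -> names; out of range gives NA
  }
  unknown <- setdiff(levs, levels(fac))
  if (length(unknown) > 0) {
    stop("not levels of fac: ", paste(unknown, collapse = ", "))
  }
  levs
}

fac <- factor(c("Yes", "Maybe", "Always", "Never", "Yes"))
levels(fac)   # sorted alphabetically: "Always" "Maybe" "Never" "Yes"

byIndex  <- resolveLevs(fac, c(1, 2))
byName   <- resolveLevs(fac, c("Always", "Maybe"))
bySubset <- resolveLevs(fac, levels(fac)[1:2])

identical(byIndex, byName) && identical(byName, bySubset)   # TRUE
```

Numeric positions refer to the order of levels(fac), not to the order in which values appear in the data, so it is worth printing levels(fac) before using them.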
Details
If the factor is ordered (ordinal), levels may be combined only if they are adjacent. For a factor with levels c("Lo","Med","Hi","Extreme"), the responses "Lo" and "Med" can be combined, but "Lo" and "Hi" can NOT, because "Med" lies between them.
A non-ordered factor can be reorganized to combine any values, no matter what positions they occupy in the levels vector.
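The adjacency rule for ordered factors can be checked with a few lines of base R. The function areAdjacent below is a hypothetical sketch of such a check, not the test combineLevels actually performs.

```r
# Hypothetical check (for illustration only): are the requested levels
# consecutive positions in levels(fac)?
areAdjacent <- function(fac, levs) {
  pos <- match(levs, levels(fac))
  if (anyNA(pos)) {
    stop("some entries of levs are not levels of fac")
  }
  all(diff(sort(pos)) == 1)
}

resp <- factor(c("Lo", "Med", "Hi", "Extreme", "Med"),
               levels = c("Lo", "Med", "Hi", "Extreme"),
               ordered = TRUE)

areAdjacent(resp, c("Lo", "Med"))          # TRUE
areAdjacent(resp, c("Lo", "Hi"))           # FALSE: "Med" lies between them
areAdjacent(resp, c("Hi", "Med", "Lo"))    # TRUE: order of levs does not matter
```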
Author
Paul E. Johnson <pauljohn@ku.edu>
Examples
x <- c("M","A","B","C","A","B","A","M")
x <- factor(x)
levels(x)
#> [1] "A" "B" "C" "M"
x2a <- combineLevels(x, levs = c("M","A"), newLabel = "M_or_A")
#> The original levels A B C M
#> have been replaced by B C M_or_A
addmargins(table(x2a, x, exclude=NULL))
#>         x
#> x2a      A B C M Sum
#>   B      0 2 0 0   2
#>   C      0 0 1 0   1
#>   M_or_A 3 0 0 2   5
#>   Sum    3 2 1 2   8
x2b <- combineLevels(x, c(1,4), "M_or_A")
#> The original levels A B C M
#> have been replaced by B C M_or_A
addmargins(table(x2b, x, exclude=NULL))
#>         x
#> x2b      A B C M Sum
#>   B      0 2 0 0   2
#>   C      0 0 1 0   1
#>   M_or_A 3 0 0 2   5
#>   Sum    3 2 1 2   8
x3 <- combineLevels(x, levs = c("M","A","C"), newLabel = "MAC")
#> The original levels A B C M
#> have been replaced by B MAC
addmargins(table(x3, x, exclude=NULL))
#>      x
#> x3    A B C M Sum
#>   B   0 2 0 0   2
#>   MAC 3 0 1 2   6
#>   Sum 3 2 1 2   8
## Now an ordinal factor
z <- c("M","A","B","C","A","B","A","M")
z <- ordered(z)
levels(z)
#> [1] "A" "B" "C" "M"
table(z, exclude=NULL)
#> z
#> A B C M
#> 3 2 1 2
z2a <- combineLevels(z, levs = c(1,2), "Good")
#> The original levels A B C M
#> have been replaced by Good C M
addmargins(table(z2a, z, exclude = NULL))
#>       z
#> z2a    A B C M Sum
#>   Good 3 2 0 0   5
#>   C    0 0 1 0   1
#>   M    0 0 0 2   2
#>   Sum  3 2 1 2   8
z2b <- combineLevels(z, levs = c("A","B"), "AorB")
#> The original levels A B C M
#> have been replaced by AorB C M
addmargins(table(z2b, z, exclude = NULL))
#>       z
#> z2b    A B C M Sum
#>   AorB 3 2 0 0   5
#>   C    0 0 1 0   1
#>   M    0 0 0 2   2
#>   Sum  3 2 1 2   8