Convert Character-String Variables in a Data Frame to Factors
Converts the character variables (or a subset of these variables) in a data frame to factors, with optional control of the order of the resulting factor levels.
Usage
strings2factors(object, which, not, exclude.unique, levels, verbose, ...)
# S3 method for class 'data.frame'
strings2factors(object, which, not,
exclude.unique=TRUE, levels=list(), verbose=TRUE, ...)
Arguments
- object
a data frame or an object inheriting from the "data.frame" class.
- which
an optional character vector of names or column numbers of the character variables to be converted to factors; if absent, all character variables will be converted, except as excluded by the not and exclude.unique arguments (see below).
- not
an optional character vector of names or column numbers of character variables not to be converted to factors.
- exclude.unique
if TRUE (the default), character variables all of whose values are unique (i.e., all different from each other) are not converted to factors. Such variables, which would have as many levels as there are cases, are typically case identifiers and not categorical variables. If FALSE, character variables all of whose values are unique are converted to factors with a warning.
- levels
an optional named list, each element of which is a character vector of levels of the corresponding factor. This argument allows you to control the order of levels of the factor; if omitted, or if a particular factor is omitted from the list, the levels will be in the default alphabetic order.
- verbose
if TRUE (the default), the names of the character variables that were converted to factors are printed on the console.
- ...
not used.
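The interplay of the not, exclude.unique, and levels arguments can be sketched in base R. This is only an illustration of the documented behaviour, not the package's implementation: the helper convert_strings_sketch and the toy data frame d are hypothetical names introduced here, and the sketch ignores the which and verbose arguments.

```r
# Hypothetical helper (an assumption for illustration, NOT the car source):
# mimics the documented behaviour of the data-frame method using base R only.
convert_strings_sketch <- function(df, not = character(0),
                                   exclude.unique = TRUE, levels = list()) {
  # Candidate columns: character variables that are not listed in `not`
  is_chr <- vapply(df, is.character, logical(1))
  candidates <- setdiff(names(df)[is_chr], not)
  for (name in candidates) {
    x <- df[[name]]
    all_unique <- anyDuplicated(x) == 0
    # All-unique columns are probably case identifiers: skip them by default
    if (all_unique && exclude.unique) next
    if (all_unique) warning("all values of ", name, " are unique")
    # Use the supplied level order if there is one, else factor()'s default
    lev <- levels[[name]]
    df[[name]] <- if (is.null(lev)) factor(x) else factor(x, levels = lev)
  }
  df
}

# Toy data: `id` is all-unique, `size` is a genuine categorical variable
d <- data.frame(id = c("a", "b", "c", "d"),
                size = c("small", "large", "small", "medium"),
                score = c(1.5, 2.0, 3.5, 4.0),
                stringsAsFactors = FALSE)
d2 <- convert_strings_sketch(d,
  levels = list(size = c("small", "medium", "large")))
str(d2)
```

In this sketch, id stays a character variable because all of its values are unique, size becomes a factor with the requested level order, and the numeric score column is untouched.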
Author
John Fox jfox@mcmaster.ca
Examples
M <- Moore # from the carData package
M$partner <- as.character(Moore$partner.status)
M$fcat <- as.character(Moore$fcategory)
M$names <- rownames(M) # values are unique
str(M)
#> 'data.frame': 45 obs. of 7 variables:
#> $ partner.status: Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
#> $ conformity : int 8 4 8 7 10 6 12 4 13 12 ...
#> $ fcategory : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ fscore : int 37 57 65 20 36 18 51 44 31 36 ...
#> $ partner : chr "low" "low" "low" "low" ...
#> $ fcat : chr "low" "high" "high" "low" ...
#> $ names : chr "1" "2" "3" "4" ...
str(strings2factors(M))
#>
#> The following character variables were converted to factors
#> partner fcat
#> 'data.frame': 45 obs. of 7 variables:
#> $ partner.status: Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
#> $ conformity : int 8 4 8 7 10 6 12 4 13 12 ...
#> $ fcategory : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ fscore : int 37 57 65 20 36 18 51 44 31 36 ...
#> $ partner : Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
#> $ fcat : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ names : chr "1" "2" "3" "4" ...
str(strings2factors(M,
levels=list(partner=c("low", "high"), fcat=c("low", "medium", "high"))))
#>
#> The following character variables were converted to factors
#> partner fcat
#> 'data.frame': 45 obs. of 7 variables:
#> $ partner.status: Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
#> $ conformity : int 8 4 8 7 10 6 12 4 13 12 ...
#> $ fcategory : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ fscore : int 37 57 65 20 36 18 51 44 31 36 ...
#> $ partner : Factor w/ 2 levels "low","high": 1 1 1 1 1 1 1 1 1 1 ...
#> $ fcat : Factor w/ 3 levels "low","medium",..: 1 3 3 1 1 1 2 2 1 1 ...
#> $ names : chr "1" "2" "3" "4" ...
str(strings2factors(M, which="partner", levels=list(partner=c("low", "high"))))
#>
#> partner was converted to a factor'data.frame': 45 obs. of 7 variables:
#> $ partner.status: Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
#> $ conformity : int 8 4 8 7 10 6 12 4 13 12 ...
#> $ fcategory : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ fscore : int 37 57 65 20 36 18 51 44 31 36 ...
#> $ partner : Factor w/ 2 levels "low","high": 1 1 1 1 1 1 1 1 1 1 ...
#> $ fcat : chr "low" "high" "high" "low" ...
#> $ names : chr "1" "2" "3" "4" ...
str(strings2factors(M, not="partner", exclude.unique=FALSE))
#> Warning: all values of names are unique
#>
#> The following character variables were converted to factors
#> fcat names
#> 'data.frame': 45 obs. of 7 variables:
#> $ partner.status: Factor w/ 2 levels "high","low": 2 2 2 2 2 2 2 2 2 2 ...
#> $ conformity : int 8 4 8 7 10 6 12 4 13 12 ...
#> $ fcategory : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ fscore : int 37 57 65 20 36 18 51 44 31 36 ...
#> $ partner : chr "low" "low" "low" "low" ...
#> $ fcat : Factor w/ 3 levels "high","low","medium": 2 1 1 2 2 2 3 3 2 2 ...
#> $ names : Factor w/ 45 levels "1","10","11",..: 1 12 23 34 41 42 43 44 45 2 ...
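The change in integer codes between the default and the user-specified level orders in the str() output above can be reproduced with base R's factor() alone. The vector x below is a made-up example, not taken from the Moore data.

```r
# Made-up character vector (an assumption for illustration)
x <- c("low", "high", "medium", "low")

# Default: levels are the sorted unique values ("high", "low", "medium")
f_default <- factor(x)

# Explicit level order, as one would supply through the levels argument
f_ordered <- factor(x, levels = c("low", "medium", "high"))

levels(f_default)
levels(f_ordered)

# The underlying integer codes index into the levels, so they differ
as.integer(f_default)  # 2 1 3 2
as.integer(f_ordered)  # 1 3 2 1
```

The values themselves are unchanged; only the mapping from values to integer codes differs. That mapping determines, for example, which level is treated as the baseline in a model fit.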