This function sorts a character vector according to a locale-dependent lexicographic order.
Arguments
- str
a character vector
- decreasing
a single logical value; should the sort order be nondecreasing (FALSE, default, i.e., weakly increasing) or nonincreasing (TRUE)?
- na_last
a single logical value; controls the treatment of NAs in str. If TRUE, then missing values in str are put at the end; if FALSE, they are put at the beginning; if NA, then they are removed from the output
- ...
additional settings for opts_collator
- opts_collator
a named list with ICU Collator's options, see stri_opts_collator, NULL for default collation options
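The arguments above can be illustrated with a short sketch. The input vector and the variable names are made up for illustration; locale and strength are collator settings forwarded through `...`, and "en_US" is chosen only to make the ordering independent of the session's default locale.

```r
library(stringi)

# Hypothetical input with one missing value
x <- c("banana", NA, "apple", "cherry")

# na_last = TRUE: missing values are placed at the end
at_end <- stri_sort(x, na_last = TRUE, locale = "en_US")

# na_last = FALSE: missing values are placed at the beginning
at_start <- stri_sort(x, na_last = FALSE, locale = "en_US")

# na_last = NA: missing values are removed from the output
dropped <- stri_sort(x, na_last = NA, locale = "en_US")

# decreasing = TRUE: nonincreasing order of the remaining strings
reversed <- stri_sort(x, decreasing = TRUE, na_last = NA, locale = "en_US")

# Collator settings can be passed through `...` or as an explicit list
via_dots <- stri_sort(x, na_last = NA, locale = "en_US", strength = 2)
via_list <- stri_sort(x, na_last = NA,
    opts_collator = stri_opts_collator(locale = "en_US", strength = 2))
```

Passing settings through `...` is convenient for one-off calls, while building the list once with stri_opts_collator may be preferable when the same options are reused across several calls.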
Details
For more information on ICU's Collator and how to tune it up
in stringi, refer to stri_opts_collator.
As usual in stringi, non-character inputs are coerced to strings. See the examples below for the somewhat non-intuitive behavior of lexicographic sorting on numeric inputs.
This function uses a stable sort algorithm (STL's stable_sort),
which performs up to \(N \log^2 N\) element comparisons,
where \(N\) is the length of str.
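A minimal sketch of what stability means in practice, assuming a collator with strength = 1 (primary strength, which is expected to ignore case differences) so that "A" and "a" compare as equal. The input vectors are hypothetical.

```r
library(stringi)

# With strength = 1, "A" ties with "a" and "B" ties with "b"
x <- c("b", "A", "a", "B")
y <- c("B", "a", "A", "b")

# A stable sort keeps tied elements in their original relative order
res_x <- stri_sort(x, strength = 1, locale = "en_US")
res_y <- stri_sort(y, strength = 1, locale = "en_US")

print(res_x)
print(res_y)
```

In each result, elements that the collator treats as equal should appear in the same relative order as in the input, so the two outputs differ only in how the tied letters are arranged.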
References
Collation - ICU User Guide, https://unicode-org.github.io/icu/userguide/collation/
See also
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other locale_sensitive:
%s<%(),
about_locale,
about_search_boundaries,
about_search_coll,
stri_compare(),
stri_count_boundaries(),
stri_duplicated(),
stri_enc_detect2(),
stri_extract_all_boundaries(),
stri_locate_all_boundaries(),
stri_opts_collator(),
stri_order(),
stri_rank(),
stri_sort_key(),
stri_split_boundaries(),
stri_trans_tolower(),
stri_unique(),
stri_wrap()
Author
Marek Gagolewski and other contributors
Examples
stri_sort(c('hladny', 'chladny'), locale='pl_PL')
#> [1] "chladny" "hladny"
stri_sort(c('hladny', 'chladny'), locale='sk_SK')
#> [1] "hladny" "chladny"
stri_sort(sample(LETTERS))
#> [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
#> [20] "T" "U" "V" "W" "X" "Y" "Z"
stri_sort(c(1, 100, 2, 101, 11, 10)) # lexicographic order
#> [1] "1" "10" "100" "101" "11" "2"
stri_sort(c(1, 100, 2, 101, 11, 10), numeric=TRUE) # OK for integers
#> [1] "1" "2" "10" "11" "100" "101"
stri_sort(c(0.25, 0.5, 1, -1, -2, -3), numeric=TRUE) # incorrect: signs and fractional parts are not handled
#> [1] "-1" "-2" "-3" "0.5" "0.25" "1"