Check If a Data Stream Is Possibly in UTF-16 or UTF-32
Source: R/encoding_detection.R
These functions detect whether a given byte stream is valid UTF-16LE, UTF-16BE, UTF-32LE, or UTF-32BE.
Usage
stri_enc_isutf16be(str)
stri_enc_isutf16le(str)
stri_enc_isutf32be(str)
stri_enc_isutf32le(str)
Details
These functions are independent of the way R marks encodings in character strings (see Encoding and stringi-encoding). Most often, these functions act on raw vectors.
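A result of FALSE means that the input is definitely not valid UTF-16 or UTF-32. A result of TRUE is weaker evidence, because false positives are possible.
Also note that a data stream may sometimes be classified as both valid UTF-16LE and valid UTF-16BE, since the same bytes can form well-formed code units in either byte order.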
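The following is a minimal sketch of how these checks might be applied to raw vectors. The byte sequences are encoded by hand for illustration, and the variable names are hypothetical. It assumes the stringi package is installed. The swapped-byte-order UTF-16 result is printed rather than asserted, since that is the ambiguous case.

```r
library(stringi)

# "ab" encoded by hand: in UTF-16 each ASCII character takes two bytes,
# with the zero byte last (little endian) or first (big endian).
ab_utf16le <- as.raw(c(0x61, 0x00, 0x62, 0x00))
ab_utf16be <- as.raw(c(0x00, 0x61, 0x00, 0x62))

# Well-formed input checked against its own byte order.
le_as_le <- stri_enc_isutf16le(ab_utf16le)
be_as_be <- stri_enc_isutf16be(ab_utf16be)

# The same little-endian bytes checked as big endian: read this way they
# are the code units 0x6100 and 0x6200, so this is the case in which a
# stream may be accepted under both byte orders.
le_as_be <- stri_enc_isutf16be(ab_utf16le)

# An odd number of bytes cannot be UTF-16 in either byte order.
odd_len <- stri_enc_isutf16le(as.raw(c(0x61, 0x00, 0x62)))

# "a" in UTF-32LE; read as big endian the value is 0x61000000,
# which lies outside the Unicode code point range.
a_utf32le <- as.raw(c(0x61, 0x00, 0x00, 0x00))
a32_as_le <- stri_enc_isutf32le(a_utf32le)
a32_as_be <- stri_enc_isutf32be(a_utf32le)

print(c(le_as_le = le_as_le, be_as_be = be_as_be, le_as_be = le_as_be,
        odd_len = odd_len, a32_as_le = a32_as_le, a32_as_be = a32_as_be))
```

Because a TRUE result can be a false positive, it is probably best treated as one signal among several (for example, alongside a byte order mark or metadata from the data source) rather than as proof of the encoding.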
See also
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other encoding_detection:
about_encoding,
stri_enc_detect(),
stri_enc_detect2(),
stri_enc_isascii(),
stri_enc_isutf8()
Author
Marek Gagolewski and other contributors