Cleans up "untrusted text" that is to be written into a file as content. It protects against "bad" text strings in three contexts: 1) LaTeX documents, 2) HTML documents, or 3) text in a file name. It converts the input text to a string that will not cause failures in the eventual document.

escape(x, type = "tex")

Arguments

x

a string, or vector of strings (each of which is processed separately)

type

"tex" is default, could be "filename" or "html"

Value

corrected character vector

Details

The special in-document LaTeX symbols like the percent sign or dollar sign are escaped with a backslash, giving "\%" and "\$". In the R session, these will appear as double-backslashed symbols, while in a saved text file, there will only be the one desired slash.
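The difference between the printed form and the stored form can be checked directly in base R. The sketch below does not call escape(); the string s is an assumed stand-in for the kind of value it returns for LaTeX output.

```r
# Assumption: s stands in for an escaped result such as escape() might return
s <- "50\\% of \\$10"

print(s)        # console shows R's own escaping: [1] "50\\% of \\$10"
writeLines(s)   # the actual characters: 50\% of \$10
nchar(s)        # 12, so each backslash is stored only once

# Writing to a file and reading it back gives the same single-backslash text
tmp <- tempfile(fileext = ".tex")
writeLines(s, tmp)
identical(readLines(tmp), s)   # TRUE
```

The doubled backslash is only how print() displays a string; writeLines() and cat() show what will land in the file.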

If type = "html", only <, >, / and &, and quote characters are cleaned up. If the document is in Unicode, the much larger set of special-character replacements is no longer needed.
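As a rough illustration of this kind of replacement, here is a minimal base R sketch. It is not the package's implementation: html_escape_sketch is a hypothetical name, and it covers only &, <, > and the double quote, the cases whose replacements appear in the Examples. The slash is left out because this page does not show what it becomes.

```r
# Hypothetical helper, not part of the package: a sketch of entity replacement
html_escape_sketch <- function(x) {
  # The ampersand must go first, otherwise the "&" in "&lt;" would be re-escaped
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  gsub("\"", "&quot;", x, fixed = TRUE)
}

x2 <- c("a>b", "a<b", "a < c", 'Paul "pj" Johnson')
html_escape_sketch(x2)
html_escape_sketch("R&D <b>")   # "R&amp;D &lt;b&gt;"
```

Because gsub() is vectorized, each element of the input is processed separately, which matches the behaviour described for x.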

If type = "filename", then symbols that are not allowed in file names, such as "\" and "*", are removed or replaced. Do not use this on a full path, since it will obliterate the path separators.
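One way to respect the warning about full paths is to split the path first and clean only the final component. The sketch below uses a hypothetical stand-in, clean_name, in place of escape(x, type = "filename"); its replacement rules are an assumption and will differ from the real function.

```r
# Hypothetical stand-in for escape(x, type = "filename"); real rules may differ
clean_name <- function(x) {
  gsub("[\\\\/:*?\"<>| ]", "_", x, perl = TRUE)
}

p <- "reports/2024/my report*.txt"

# Cleaning the whole path destroys the separators
clean_name(p)   # "reports_2024_my_report_.txt"

# Cleaning only the file name keeps the directory part intact
safe <- file.path(dirname(p), clean_name(basename(p)))
safe            # "reports/2024/my_report_.txt"
```

With the real function, the same pattern would be file.path(dirname(p), escape(basename(p), type = "filename")).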

Author

Paul Johnson <pauljohn@ku.edu>

Examples

x1 <- c("_asdf&_&$", "asd adf asd_", "^ % & $asdf_")
escape(x1)
#> [1] "\\_asdf\\&\\_\\&\\$"            "asd adf asd\\_"                
#> [3] "\\\\verb|^| \\% \\& \\$asdf\\_"
x2 <- c("a>b", "a<b", "a < c", 'Paul "pj" Johnson')
escape(x2, type = "tex")
#> [1] "a$>$b"               "a$<$b"               "a $<$ c"            
#> [4] "Paul \"pj\" Johnson"
escape(x2, type = "html")
#> [1] "a&gt;b"                      "a&lt;b"                     
#> [3] "a &lt; c"                    "Paul &quot;pj&quot; Johnson"
escape(x2, type = "filename")
#> [1] "ab"              "ab"              "a__c"            "Paul_pj_Johnson"