Class Escaper

  • Direct Known Subclasses:
    CharEscaper, UnicodeEscaper

    @Beta
    @GwtCompatible
    public abstract class Escaper
    extends java.lang.Object
    An object that converts literal text into a format safe for inclusion in a particular context (such as an XML document). Typically (but not always), the inverse process of "unescaping" the text is performed automatically by the relevant parser.

    For example, an XML escaper would convert the literal string "Foo<Bar>" into "Foo&lt;Bar&gt;" to prevent "<Bar>" from being confused with an XML tag. When the resulting XML document is parsed, the parser API will return this text as the original literal string "Foo<Bar>".
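    The conversion described above can be illustrated with a small, self-contained sketch. The class XmlEscapeSketch and its escape method are hypothetical names used only for this illustration; this is not the library's implementation, and it covers only the three characters shown.

```java
// Hypothetical stand-in for an XML content escaper (illustration only).
final class XmlEscapeSketch {
    private XmlEscapeSketch() {}

    /** Replaces '&', '<' and '>' with the corresponding XML entity references. */
    static String escape(String s) {
        // s.length() throws NullPointerException for a null argument.
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Foo<Bar>")); // prints Foo&lt;Bar&gt;
    }
}
```

    An XML parser reading the escaped output would hand back the original literal text, so no explicit unescaping step appears in the sketch.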

    An Escaper instance is required to be stateless, and safe when used concurrently by multiple threads.

    Because, in general, escaping operates on the code points of a string and not on its individual char values, it is not safe to assume that escape(s) is equivalent to escape(s.substring(0, n)) + escape(s.substring(n)) for arbitrary n. This is because of the possibility of splitting a surrogate pair. The only case in which it is safe to escape strings and concatenate the results is if you can rule out this possibility, either by splitting an existing long string into short strings adaptively around surrogate pairs, or by starting with short strings already known to be free of unpaired surrogates.
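    The surrogate-pair hazard can be made concrete with a rough sketch. SurrogateSplitSketch, its escape method (which writes non-ASCII code points as numeric character references), and safeSplit are all hypothetical helpers invented for this illustration; they are assumptions, not part of this API.

```java
// Hypothetical code-point based escaper used to show why naive splitting is unsafe.
final class SurrogateSplitSketch {
    private SurrogateSplitSketch() {}

    /** Writes ASCII through unchanged and other code points as &#x...; references. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // codePointAt returns the bare surrogate value when the pair is incomplete.
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            }
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Moves the split index back by one if it would fall inside a surrogate pair. */
    static int safeSplit(String s, int n) {
        if (n > 0 && n < s.length()
                && Character.isHighSurrogate(s.charAt(n - 1))
                && Character.isLowSurrogate(s.charAt(n))) {
            return n - 1;
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b";   // 'a', U+1F600 as a surrogate pair, 'b'
        int k = safeSplit(s, 2);       // index 2 lies inside the pair, so k is 1
        System.out.println(escape(s.substring(0, k)) + escape(s.substring(k)));
    }
}
```

    In this sketch, splitting at index 2 directly would hand the escaper a string ending in an unpaired high surrogate, which it rejects; adjusting the split point first keeps the concatenated result equal to escape(s).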

    The two primary subclasses of this class are CharEscaper and UnicodeEscaper. They are heavily optimized for performance and greatly simplify the task of implementing new escapers. It is strongly recommended that when implementing a new escaper you extend one of these classes. If you find that you are unable to achieve the desired behavior using either of these classes, please contact the Java libraries team for advice.

    Several popular escapers are defined as constants in classes like HtmlEscapers, XmlEscapers, and SourceCodeEscapers. To create your own escapers, use CharEscaperBuilder, or extend CharEscaper or UnicodeEscaper.

    Since:
    15.0
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected Escaper()
      Constructor for use by subclasses.
    • Method Summary

      Modifier and Type Method Description
      Function<java.lang.String,java.lang.String> asFunction()
      Returns a Function that invokes escape(String) on this escaper.
      abstract java.lang.String escape(java.lang.String string)
      Returns the escaped form of a given literal string.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • Escaper

        protected Escaper()
        Constructor for use by subclasses.
    • Method Detail

      • escape

        public abstract java.lang.String escape(java.lang.String string)
        Returns the escaped form of a given literal string.

        Note that this method may treat input characters differently depending on the specific escaper implementation.

        • UnicodeEscaper handles UTF-16 correctly, including surrogate pairs. If the input is badly formed, the escaper should throw IllegalArgumentException.
        • CharEscaper handles Java characters independently and does not verify that the input is well-formed. A CharEscaper should not be used in situations where input is not guaranteed to be restricted to the Basic Multilingual Plane (BMP).
        Parameters:
        string - the literal string to be escaped
        Returns:
        the escaped form of string
        Throws:
        java.lang.NullPointerException - if string is null
        java.lang.IllegalArgumentException - if string contains badly formed UTF-16 or cannot be escaped for any other reason
      • asFunction

        public final Function<java.lang.String,java.lang.String> asFunction()
        Returns a Function that invokes escape(String) on this escaper.
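        The relationship between escape(String) and asFunction() can be sketched roughly as follows. SketchEscaper is a hypothetical stand-in for this class, and java.util.function.Function is used here in place of the Function type named in the signature above so that the example has no external dependencies; both choices are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class AsFunctionSketch {
    private AsFunctionSketch() {}

    /** Hypothetical mirror of an abstract escaper with a function view. */
    abstract static class SketchEscaper {
        protected SketchEscaper() {}

        abstract String escape(String string);

        /** Returns a function that delegates to escape(String) on this instance. */
        final Function<String, String> asFunction() {
            return this::escape;
        }
    }

    /** Stateless escaper for angle brackets only (illustration, not a full XML escaper). */
    static final SketchEscaper ANGLE = new SketchEscaper() {
        @Override
        String escape(String string) {
            return string.replace("<", "&lt;").replace(">", "&gt;");
        }
    };

    /** Applies the escaper to every element by passing its function view to map(). */
    static List<String> escapeAll(List<String> inputs) {
        return inputs.stream().map(ANGLE.asFunction()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(escapeAll(Arrays.asList("Foo<Bar>", "plain")));
    }
}
```

        The function view is convenient wherever an API accepts a transformation rather than an escaper, such as the stream map() call in the sketch.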