ArrayBasedUnicodeEscaper (Guava: Google Core Libraries for Java 20.0 API)

java.lang.Object
- com.google.common.escape.Escaper
  - com.google.common.escape.UnicodeEscaper
    - com.google.common.escape.ArrayBasedUnicodeEscaper

```
@Beta
@GwtCompatible
public abstract class ArrayBasedUnicodeEscaper
extends UnicodeEscaper
```
A UnicodeEscaper that uses an array to quickly look up replacement characters for a given code point. An additional safe range is provided that determines whether code points without specific replacements are to be considered safe and left unescaped or should be escaped in a general way.
A good example of usage of this class is HTML escaping, where the replacement array contains information about the named HTML entities such as `&amp;` and `&quot;`, while escapeUnsafe(int) is overridden to handle general escaping of the form `&#NNNNN;`.
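
The HTML case described above can be modelled without Guava. The following is a minimal plain-Java sketch of the lookup order (explicit replacement, then safe range, then general fallback), not Guava's implementation. The class name `HtmlEscapeSketch`, the printable-ASCII safe range, and the decimal `&#N;` fallback are assumptions made for illustration.

```java
/** Plain-Java sketch of array lookup plus a safe range; not Guava's implementation. */
final class HtmlEscapeSketch {
  // Replacement array indexed by code point; null means "no explicit replacement".
  private static final String[] REPLACEMENTS = new String['>' + 1];

  static {
    REPLACEMENTS['&'] = "&amp;";
    REPLACEMENTS['"'] = "&quot;";
    REPLACEMENTS['<'] = "&lt;";
    REPLACEMENTS['>'] = "&gt;";
  }

  // Assumed safe range for this sketch: printable ASCII, inclusive at both ends.
  private static final int SAFE_MIN = 0x20;
  private static final int SAFE_MAX = 0x7E;

  /** Returns the replacement for cp, or null if cp can be left unescaped. */
  static String escapeCodePoint(int cp) {
    if (cp < REPLACEMENTS.length && REPLACEMENTS[cp] != null) {
      return REPLACEMENTS[cp]; // an explicit replacement always wins
    }
    if (cp >= SAFE_MIN && cp <= SAFE_MAX) {
      return null; // inside the safe range: leave as is
    }
    return "&#" + cp + ";"; // general fallback, playing the role of escapeUnsafe(int)
  }

  /** Escapes a whole string one code point at a time (no surrogate validation). */
  static String escape(String s) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      String replacement = escapeCodePoint(cp);
      if (replacement == null) {
        out.appendCodePoint(cp);
      } else {
        out.append(replacement);
      }
      i += Character.charCount(cp);
    }
    return out.toString();
  }
}
```

Unlike the real class, this sketch does not reject unpaired surrogates; it only illustrates the order of the three checks.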
The size of the data structure used by ArrayBasedUnicodeEscaper is proportional to the highest valued code point that requires escaping. For example, a replacement map containing the single character '\u1000' will require approximately 16K of memory. If you need to create multiple escaper instances that have the same character replacement mapping, consider using ArrayBasedEscaperMap.
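
To see where the 16K figure can come from, here is a rough sketch of turning a replacement map into a lookup array indexed by character value. The helper `ReplacementArraySketch.toArray` is hypothetical and only approximates what a map-to-array conversion might look like. A single mapping for '\u1000' needs 0x1000 + 1 = 4097 slots; assuming 4 bytes per array reference, that is roughly 16K before any replacement text is stored.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper showing why the array size tracks the highest mapped character. */
final class ReplacementArraySketch {
  /** Builds a lookup array indexed by char value; unmapped slots stay null. */
  static char[][] toArray(Map<Character, String> map) {
    if (map.isEmpty()) {
      return new char[0][];
    }
    char max = Collections.max(map.keySet());
    // One slot per char value from 0 up to and including the highest mapped char.
    char[][] array = new char[max + 1][];
    for (Map.Entry<Character, String> entry : map.entrySet()) {
      char key = entry.getKey();
      array[key] = entry.getValue().toCharArray();
    }
    return array;
  }
}
```

Because the array is sparse for a map like this, sharing one array across several escaper instances avoids paying that cost more than once.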

Since:

15.0

Author:

David Beaumont

Constructor Summary

Constructors
Modifier	Constructor and Description
`protected`	`ArrayBasedUnicodeEscaper(ArrayBasedEscaperMap escaperMap, int safeMin, int safeMax, String unsafeReplacement)` Creates a new ArrayBasedUnicodeEscaper instance with the given replacement map and specified safe range.
`protected`	`ArrayBasedUnicodeEscaper(Map<Character,String> replacementMap, int safeMin, int safeMax, String unsafeReplacement)` Creates a new ArrayBasedUnicodeEscaper instance with the given replacement map and specified safe range.

Method Summary

Modifier and Type	Method and Description
`protected char[]`	`escape(int cp)` Escapes a single Unicode code point using the replacement array and safe range values.
`String`	`escape(String s)` Returns the escaped form of a given literal string.
`protected abstract char[]`	`escapeUnsafe(int cp)` Escapes a code point that has no direct explicit value in the replacement array and lies outside the stated safe range.
`protected int`	`nextEscapeIndex(CharSequence csq, int index, int end)` Scans a sub-sequence of characters from a given `CharSequence`, returning the index of the next character that requires escaping.

Methods inherited from class com.google.common.escape.UnicodeEscaper
codePointAt, escapeSlow

Methods inherited from class com.google.common.escape.Escaper
asFunction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - ArrayBasedUnicodeEscaper
```
protected ArrayBasedUnicodeEscaper(Map<Character,String> replacementMap,
                                   int safeMin,
                                   int safeMax,
                                   @Nullable
                                   String unsafeReplacement)
```
    Creates a new ArrayBasedUnicodeEscaper instance with the given replacement map and specified safe range. If safeMax < safeMin then no code points are considered safe.
    If a code point has no mapped replacement, it is checked against the safe range. If it lies outside that range, escapeUnsafe(int) is called; otherwise no escaping is performed.
    
    Parameters:
    
    replacementMap - a map of characters to their escaped representations
    
    safeMin - the lowest character value in the safe range
    
    safeMax - the highest character value in the safe range
    
    unsafeReplacement - the default replacement for unsafe characters or null if no default replacement is required
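
    The safe-range rule can be sketched as a plain inclusive bounds check. The helper name `isSafe` is an assumption for illustration; the point is that an inverted range (safeMax < safeMin) can never contain any code point, so nothing is treated as safe.

```java
/** Sketch of the inclusive safe-range test; names are illustrative only. */
final class SafeRangeSketch {
  static boolean isSafe(int cp, int safeMin, int safeMax) {
    // With safeMax < safeMin, no value can satisfy both comparisons.
    return cp >= safeMin && cp <= safeMax;
  }
}
```

    For example, `isSafe('a', 0x20, 0x7E)` is true, while `isSafe('a', 1, 0)` is false for every input.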
  - ArrayBasedUnicodeEscaper
```
protected ArrayBasedUnicodeEscaper(ArrayBasedEscaperMap escaperMap,
                                   int safeMin,
                                   int safeMax,
                                   @Nullable
                                   String unsafeReplacement)
```
    Creates a new ArrayBasedUnicodeEscaper instance with the given replacement map and specified safe range. If safeMax < safeMin then no code points are considered safe. This constructor is useful when explicit instances of ArrayBasedEscaperMap are used to allow the sharing of large replacement mappings.
    If a code point has no mapped replacement, it is checked against the safe range. If it lies outside that range, escapeUnsafe(int) is called; otherwise no escaping is performed.
    
    Parameters:
    
    escaperMap - the map of replacements
    
    safeMin - the lowest character value in the safe range
    
    safeMax - the highest character value in the safe range
    
    unsafeReplacement - the default replacement for unsafe characters or null if no default replacement is required
- Method Detail
  - escape
```
public final String escape(String s)
```
    Description copied from class: UnicodeEscaper
    
    Returns the escaped form of a given literal string.
    If you are escaping input in arbitrary successive chunks, then it is not generally safe to use this method. If an input string ends with an unmatched high surrogate character, then this method will throw IllegalArgumentException. You should ensure your input is valid UTF-16 before calling this method.
    Note: When implementing an escaper it is a good idea to override this method for efficiency by inlining the implementation of UnicodeEscaper.nextEscapeIndex(CharSequence, int, int) directly. Doing this for PercentEscaper more than doubled the performance for unescaped strings (as measured by CharEscapersBenchmark).
    
    Overrides:
    
    escape in class UnicodeEscaper
    
    Parameters:
    
    s - the literal string to be escaped
    
    Returns:
    
    the escaped form of the string
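
    When input arrives in chunks, one way to avoid the exception described above is to check whether a chunk ends with a high surrogate before escaping it. This is a small sketch using only `java.lang.Character`; the helper name `endsWithUnpairedHighSurrogate` is hypothetical.

```java
/** Sketch: detect a chunk that would end in the middle of a surrogate pair. */
final class ChunkCheckSketch {
  /** Returns true if s ends with a high surrogate that has no low surrogate after it. */
  static boolean endsWithUnpairedHighSurrogate(String s) {
    return !s.isEmpty() && Character.isHighSurrogate(s.charAt(s.length() - 1));
  }
}
```

    A caller could hold back that final char and prepend it to the next chunk, so that each string passed to the escaper contains only complete pairs.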
  - nextEscapeIndex
```
protected final int nextEscapeIndex(CharSequence csq,
                                    int index,
                                    int end)
```
    Description copied from class: UnicodeEscaper
    
    Scans a sub-sequence of characters from a given CharSequence, returning the index of the next character that requires escaping.
    Note: When implementing an escaper, it is a good idea to override this method for efficiency. The base class implementation determines successive Unicode code points and invokes UnicodeEscaper.escape(int) for each of them. If the semantics of your escaper are such that code points in the supplementary range are either all escaped or all unescaped, this method can be implemented more efficiently using CharSequence.charAt(int).
    Note however that if your escaper does not escape characters in the supplementary range, you should either continue to validate the correctness of any surrogate characters encountered or provide a clear warning to users that your escaper does not validate its input.
    See PercentEscaper for an example.
    
    Overrides:
    
    nextEscapeIndex in class UnicodeEscaper
    
    Parameters:
    
    csq - a sequence of characters
    
    index - the index of the first character to be scanned
    
    end - the index immediately after the last character to be scanned
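
    The code-point-by-code-point scan described above might look roughly like the following. This is a standalone sketch, not the library's code: the `needsEscaping` predicate stands in for the per-code-point escape check, and the sketch assumes the scanned range does not split a surrogate pair.

```java
import java.util.function.IntPredicate;

/** Sketch of scanning for the next code point that needs escaping. */
final class NextEscapeIndexSketch {
  static int nextEscapeIndex(
      CharSequence csq, int index, int end, IntPredicate needsEscaping) {
    while (index < end) {
      // Decode one code point (one or two chars) starting at index.
      int cp = Character.codePointAt(csq, index);
      if (needsEscaping.test(cp)) {
        break; // index points at the first char of the code point to escape
      }
      index += Character.charCount(cp);
    }
    return index; // equals end when nothing in the range needs escaping
  }
}
```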
  - escape
```
protected final char[] escape(int cp)
```
    Escapes a single Unicode code point using the replacement array and safe range values. If the given character does not have an explicit replacement and lies outside the safe range then escapeUnsafe(int) is called.
    
    Specified by:
    
    escape in class UnicodeEscaper
    
    Parameters:
    
    cp - the Unicode code point to escape if necessary
    
    Returns:
    
    the replacement characters, or null if no escaping was needed
  - escapeUnsafe
```
protected abstract char[] escapeUnsafe(int cp)
```
    Escapes a code point that has no direct explicit value in the replacement array and lies outside the stated safe range. Subclasses should override this method to provide generalized escaping for code points if required.
    Note that arrays returned by this method must not be modified once they have been returned. However it is acceptable to return the same array multiple times (even for different input characters).
    
    Parameters:
    
    cp - the Unicode code point to escape
    
    Returns:
    
    the replacement characters, or null if no escaping was required
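
    The note about returned arrays permits a pattern like the one below, where a single constant array is handed back for every unsafe code point. This is a hedged sketch in plain Java; the class name and the choice of '?' as the placeholder are assumptions for illustration.

```java
/** Sketch: reuse one replacement array for all unsafe code points. */
final class SharedReplacementSketch {
  // Shared instance; it must never be modified after it has been returned.
  private static final char[] PLACEHOLDER = {'?'};

  static char[] escapeUnsafe(int cp) {
    // The same array is returned regardless of cp, which avoids per-call allocation.
    return PLACEHOLDER;
  }
}
```

    Sharing is only safe because callers read the array and never write to it; code that needs to alter the result should copy it first.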
