Class CharMatcher
- java.lang.Object
- 
- com.google.common.base.CharMatcher
 
- 
- All Implemented Interfaces:
- Predicate<java.lang.Character>, java.util.function.Predicate<java.lang.Character>
 
@GwtCompatible(emulated=true)
public abstract class CharMatcher
extends java.lang.Object
implements Predicate<java.lang.Character>

Determines a true or false value for any Java char value, just as Predicate does for any Object. Also offers basic text processing methods based on this function. Implementations are strongly encouraged to be side-effect-free and immutable.

Throughout the documentation of this class, the phrase "matching character" is used to mean "any char value c for which this.matches(c) returns true".

Warning: This class deals only with char values, that is, BMP characters. It does not understand supplementary Unicode code points in the range 0x10000 to 0x10FFFF, which includes the majority of assigned characters, including important CJK characters and emoji.

Supplementary characters are encoded into a String using surrogate pairs, and a CharMatcher treats these just as two separate characters. countIn(java.lang.CharSequence) counts each supplementary character as 2 chars.

For up-to-date Unicode character properties (digit, letter, etc.) and support for supplementary code points, use ICU4J UCharacter and UnicodeSet (freeze() after building). For basic text processing based on UnicodeSet use the ICU4J UnicodeSetSpanner.

Example usages:

   String trimmed = whitespace().trimFrom(userInput);
   if (ascii().matchesAllOf(s)) { ... }

See the Guava User Guide article on CharMatcher.
- Since:
- 1.0
- Author:
- Kevin Bourrillion
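
The surrogate-pair warning in the class description can be observed with plain JDK calls. The following is a minimal sketch that does not use Guava: `SurrogatePairDemo` and `countMatchingChars` are hypothetical names, and the helper only imitates a char-by-char count in the spirit of countIn.

```java
import java.util.function.IntPredicate;

final class SurrogatePairDemo {
    // Hypothetical helper: counts matching UTF-16 code units one char at a time,
    // which is how a char-based matcher sees its input.
    static int countMatchingChars(CharSequence sequence, IntPredicate matchesChar) {
        int count = 0;
        for (int i = 0; i < sequence.length(); i++) {
            if (matchesChar.test(sequence.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // U+1F600 is a supplementary code point, stored as a surrogate pair.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(emoji.length());                          // 2 chars
        System.out.println(emoji.codePointCount(0, emoji.length())); // 1 code point
        // A "not ASCII" char predicate matches both surrogates separately.
        System.out.println(countMatchingChars(emoji, c -> c >= 128)); // prints 2
    }
}
```

When the input may contain supplementary characters and the count has to reflect user-visible characters, a code-point-aware API is the safer choice, as the class description recommends.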
 
- 
- 
Constructor Summary
Modifier, Constructor: Description
- protected CharMatcher(): Constructor for use by subclasses.
 - 
Method Summary
Modifier and Type, Method: Description
- CharMatcher and(CharMatcher other): Returns a matcher that matches any character matched by both this matcher and other.
- static CharMatcher any(): Matches any character.
- static CharMatcher anyOf(java.lang.CharSequence sequence): Returns a char matcher that matches any BMP character present in the given character sequence.
- boolean apply(java.lang.Character character): Deprecated. Provided only to satisfy the Predicate interface; use matches(char) instead.
- static CharMatcher ascii(): Determines whether a character is ASCII, meaning that its code point is less than 128.
- static CharMatcher breakingWhitespace(): Determines whether a character is a breaking whitespace (that is, a whitespace which can be interpreted as a break between words for formatting purposes).
- java.lang.String collapseFrom(java.lang.CharSequence sequence, char replacement): Returns a string copy of the input character sequence, with each group of consecutive matching BMP characters replaced by a single replacement character.
- int countIn(java.lang.CharSequence sequence): Returns the number of matching chars found in a character sequence.
- static CharMatcher digit(): Deprecated. Many digits are supplementary characters; see the class documentation.
- static CharMatcher forPredicate(Predicate<? super java.lang.Character> predicate): Returns a matcher with identical behavior to the given Character-based predicate, but which operates on primitive char instances instead.
- int indexIn(java.lang.CharSequence sequence): Returns the index of the first matching BMP character in a character sequence, or -1 if no matching character is present.
- int indexIn(java.lang.CharSequence sequence, int start): Returns the index of the first matching BMP character in a character sequence, starting from a given position, or -1 if no character matches after that position.
- static CharMatcher inRange(char startInclusive, char endInclusive): Returns a char matcher that matches any character in a given BMP range (both endpoints are inclusive).
- static CharMatcher invisible(): Deprecated. Most invisible characters are supplementary characters; see the class documentation.
- static CharMatcher is(char match): Returns a char matcher that matches only one specified BMP character.
- static CharMatcher isNot(char match): Returns a char matcher that matches any character except the BMP character specified.
- static CharMatcher javaDigit(): Deprecated. Many digits are supplementary characters; see the class documentation.
- static CharMatcher javaIsoControl(): Determines whether a character is an ISO control character as specified by Character.isISOControl(char).
- static CharMatcher javaLetter(): Deprecated. Most letters are supplementary characters; see the class documentation.
- static CharMatcher javaLetterOrDigit(): Deprecated. Most letters and digits are supplementary characters; see the class documentation.
- static CharMatcher javaLowerCase(): Deprecated. Some lowercase characters are supplementary characters; see the class documentation.
- static CharMatcher javaUpperCase(): Deprecated. Some uppercase characters are supplementary characters; see the class documentation.
- int lastIndexIn(java.lang.CharSequence sequence): Returns the index of the last matching BMP character in a character sequence, or -1 if no matching character is present.
- abstract boolean matches(char c): Determines a true or false value for the given character.
- boolean matchesAllOf(java.lang.CharSequence sequence): Returns true if a character sequence contains only matching BMP characters.
- boolean matchesAnyOf(java.lang.CharSequence sequence): Returns true if a character sequence contains at least one matching BMP character.
- boolean matchesNoneOf(java.lang.CharSequence sequence): Returns true if a character sequence contains no matching BMP characters.
- CharMatcher negate(): Returns a matcher that matches any character not matched by this matcher.
- static CharMatcher none(): Matches no characters.
- static CharMatcher noneOf(java.lang.CharSequence sequence): Returns a char matcher that matches any BMP character not present in the given character sequence.
- CharMatcher or(CharMatcher other): Returns a matcher that matches any character matched by either this matcher or other.
- CharMatcher precomputed(): Returns a char matcher functionally equivalent to this one, but which may be faster to query than the original; your mileage may vary.
- java.lang.String removeFrom(java.lang.CharSequence sequence): Returns a string containing all non-matching characters of a character sequence, in order.
- java.lang.String replaceFrom(java.lang.CharSequence sequence, char replacement): Returns a string copy of the input character sequence, with each matching BMP character replaced by a given replacement character.
- java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement): Returns a string copy of the input character sequence, with each matching BMP character replaced by a given replacement sequence.
- java.lang.String retainFrom(java.lang.CharSequence sequence): Returns a string containing all matching BMP characters of a character sequence, in order.
- static CharMatcher singleWidth(): Deprecated. Many such characters are supplementary characters; see the class documentation.
- java.lang.String toString(): Returns a string representation of this CharMatcher, such as CharMatcher.or(WHITESPACE, JAVA_DIGIT).
- java.lang.String trimAndCollapseFrom(java.lang.CharSequence sequence, char replacement): Collapses groups of matching characters exactly as collapseFrom(java.lang.CharSequence, char) does, except that groups of matching BMP characters at the start or end of the sequence are removed without replacement.
- java.lang.String trimFrom(java.lang.CharSequence sequence): Returns a substring of the input character sequence that omits all matching BMP characters from the beginning and from the end of the string.
- java.lang.String trimLeadingFrom(java.lang.CharSequence sequence): Returns a substring of the input character sequence that omits all matching BMP characters from the beginning of the string.
- java.lang.String trimTrailingFrom(java.lang.CharSequence sequence): Returns a substring of the input character sequence that omits all matching BMP characters from the end of the string.
- static CharMatcher whitespace(): Determines whether a character is whitespace according to the latest Unicode standard, as illustrated here.
 
- 
- 
- 
Constructor Detail
CharMatcher
protected CharMatcher()
Constructor for use by subclasses. When subclassing, you may want to override toString() to provide a useful description.
 
- 
 - 
Method Detail
any
public static CharMatcher any()
Matches any character.
- Since:
- 19.0 (since 1.0 as constant ANY)
 
 - 
none
public static CharMatcher none()
Matches no characters.
- Since:
- 19.0 (since 1.0 as constant NONE)
 
 - 
whitespace
public static CharMatcher whitespace()
Determines whether a character is whitespace according to the latest Unicode standard, as illustrated here. This is not the same definition used by other Java APIs. (See a comparison of several definitions of "whitespace".)
All Unicode White_Space characters are on the BMP and thus supported by this API.
Note: as the Unicode definition evolves, we will modify this matcher to keep it up to date.
- Since:
- 19.0 (since 1.0 as constant WHITESPACE)
 
 - 
breakingWhitespace
public static CharMatcher breakingWhitespace()
Determines whether a character is a breaking whitespace (that is, a whitespace which can be interpreted as a break between words for formatting purposes). See whitespace() for a discussion of that term.
- Since:
- 19.0 (since 2.0 as constant BREAKING_WHITESPACE)
 
 - 
ascii
public static CharMatcher ascii()
Determines whether a character is ASCII, meaning that its code point is less than 128.
- Since:
- 19.0 (since 1.0 as constant ASCII)
 
 - 
digit
@Deprecated
public static CharMatcher digit()
Deprecated. Many digits are supplementary characters; see the class documentation.
Determines whether a character is a BMP digit according to Unicode. If you only care to match ASCII digits, you can use inRange('0', '9').
- Since:
- 19.0 (since 1.0 as constant DIGIT)
 
 - 
javaDigit
@Deprecated
public static CharMatcher javaDigit()
Deprecated. Many digits are supplementary characters; see the class documentation.
Determines whether a character is a BMP digit according to Java's definition. If you only care to match ASCII digits, you can use inRange('0', '9').
- Since:
- 19.0 (since 1.0 as constant JAVA_DIGIT)
 
 - 
javaLetter
@Deprecated
public static CharMatcher javaLetter()
Deprecated. Most letters are supplementary characters; see the class documentation.
Determines whether a character is a BMP letter according to Java's definition. If you only care to match letters of the Latin alphabet, you can use inRange('a', 'z').or(inRange('A', 'Z')).
- Since:
- 19.0 (since 1.0 as constant JAVA_LETTER)
 
 - 
javaLetterOrDigit
@Deprecated
public static CharMatcher javaLetterOrDigit()
Deprecated. Most letters and digits are supplementary characters; see the class documentation.
Determines whether a character is a BMP letter or digit according to Java's definition.
- Since:
- 19.0 (since 1.0 as constant JAVA_LETTER_OR_DIGIT).
 
 - 
javaUpperCase
@Deprecated
public static CharMatcher javaUpperCase()
Deprecated. Some uppercase characters are supplementary characters; see the class documentation.
Determines whether a BMP character is upper case according to Java's definition.
- Since:
- 19.0 (since 1.0 as constant JAVA_UPPER_CASE)
 
 - 
javaLowerCase
@Deprecated
public static CharMatcher javaLowerCase()
Deprecated. Some lowercase characters are supplementary characters; see the class documentation.
Determines whether a BMP character is lower case according to Java's definition.
- Since:
- 19.0 (since 1.0 as constant JAVA_LOWER_CASE)
 
 - 
javaIsoControl
public static CharMatcher javaIsoControl()
Determines whether a character is an ISO control character as specified by Character.isISOControl(char).
All ISO control codes are on the BMP and thus supported by this API.
- Since:
- 19.0 (since 1.0 as constant JAVA_ISO_CONTROL)
 
 - 
invisible
@Deprecated
public static CharMatcher invisible()
Deprecated. Most invisible characters are supplementary characters; see the class documentation.
Determines whether a character is invisible; that is, if its Unicode category is any of SPACE_SEPARATOR, LINE_SEPARATOR, PARAGRAPH_SEPARATOR, CONTROL, FORMAT, SURROGATE, and PRIVATE_USE according to ICU4J.
See also the Unicode Default_Ignorable_Code_Point property (available via ICU).
- Since:
- 19.0 (since 1.0 as constant INVISIBLE)
 
 - 
singleWidth
@Deprecated
public static CharMatcher singleWidth()
Deprecated. Many such characters are supplementary characters; see the class documentation.
Determines whether a character is single-width (not double-width). When in doubt, this matcher errs on the side of returning false (that is, it tends to assume a character is double-width).
Note: as the reference file evolves, we will modify this matcher to keep it up to date.
See also UAX #11 East Asian Width.
- Since:
- 19.0 (since 1.0 as constant SINGLE_WIDTH)
 
 - 
is
public static CharMatcher is(char match)
Returns a char matcher that matches only one specified BMP character.
 - 
isNot
public static CharMatcher isNot(char match)
Returns a char matcher that matches any character except the BMP character specified.
To negate another CharMatcher, use negate().
 - 
anyOf
public static CharMatcher anyOf(java.lang.CharSequence sequence)
Returns a char matcher that matches any BMP character present in the given character sequence. Returns a bogus matcher if the sequence contains supplementary characters.
 - 
noneOf
public static CharMatcher noneOf(java.lang.CharSequence sequence)
Returns a char matcher that matches any BMP character not present in the given character sequence. Returns a bogus matcher if the sequence contains supplementary characters.
 - 
inRange
public static CharMatcher inRange(char startInclusive, char endInclusive)
Returns a char matcher that matches any character in a given BMP range (both endpoints are inclusive). For example, to match any lowercase letter of the English alphabet, use CharMatcher.inRange('a', 'z').
- Throws:
- java.lang.IllegalArgumentException - if endInclusive < startInclusive
 
 - 
forPredicate
public static CharMatcher forPredicate(Predicate<? super java.lang.Character> predicate)
Returns a matcher with identical behavior to the given Character-based predicate, but which operates on primitive char instances instead.
 - 
matches
public abstract boolean matches(char c)
Determines a true or false value for the given character.
 - 
negate
public CharMatcher negate()
Returns a matcher that matches any character not matched by this matcher.
- Specified by:
- negate in interface java.util.function.Predicate<java.lang.Character>
 
 - 
and
public CharMatcher and(CharMatcher other)
Returns a matcher that matches any character matched by both this matcher and other.
 - 
or
public CharMatcher or(CharMatcher other)
Returns a matcher that matches any character matched by either this matcher or other.
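
The three combinators negate(), and() and or() compose in the usual boolean way. The sketch below illustrates that composition with a hypothetical `CharTest` interface; it is a stand-in written for this example, not Guava's class, and `ComposeSketch`, `inRange` and `LETTER` are likewise illustrative names.

```java
final class ComposeSketch {
    // Hypothetical stand-in for a char matcher; not Guava's CharMatcher.
    interface CharTest {
        boolean matches(char c);

        // Matches exactly the chars this test rejects.
        default CharTest negate() {
            return c -> !matches(c);
        }

        // Matches chars accepted by both tests.
        default CharTest and(CharTest other) {
            return c -> matches(c) && other.matches(c);
        }

        // Matches chars accepted by at least one of the tests.
        default CharTest or(CharTest other) {
            return c -> matches(c) || other.matches(c);
        }
    }

    // Inclusive range test, analogous in spirit to inRange(char, char).
    static CharTest inRange(char start, char end) {
        return c -> start <= c && c <= end;
    }

    // ASCII letters, built by combining two ranges.
    static final CharTest LETTER = inRange('a', 'z').or(inRange('A', 'Z'));
}
```

With these definitions, `LETTER.negate()` accepts `'5'`, and `LETTER.and(inRange('a', 'f'))` accepts `'c'` but rejects `'G'`.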
 - 
precomputed
public CharMatcher precomputed()
Returns a char matcher functionally equivalent to this one, but which may be faster to query than the original; your mileage may vary. Precomputation takes time and is likely to be worthwhile only if the precomputed matcher is queried many thousands of times.
This method has no effect (returns this) when called in GWT: it's unclear whether a precomputed matcher is faster, but it certainly consumes more memory, which doesn't seem like a worthwhile tradeoff in a browser.
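
To make the time and memory tradeoff concrete, here is one possible way to precompute a char predicate, using a `java.util.BitSet` lookup table. This is only a sketch of the general idea: the documentation above does not say how Guava implements precomputed(), and `PrecomputeSketch` and `precompute` are hypothetical names.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

final class PrecomputeSketch {
    // Evaluates the predicate once for every possible char value and stores
    // the answers in a bit table, so each later query is a single bit lookup.
    static IntPredicate precompute(IntPredicate matcher) {
        BitSet table = new BitSet(Character.MAX_VALUE + 1);
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            if (matcher.test(c)) {
                table.set(c);
            }
        }
        // Assumption: callers only pass values in the char range 0..65535.
        return table::get;
    }
}
```

The up-front cost is 65,536 predicate calls plus a table of roughly 8 KiB, which is why this kind of precomputation tends to pay off only for matchers that are queried very often.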
 - 
matchesAnyOf
public boolean matchesAnyOf(java.lang.CharSequence sequence)
Returns true if a character sequence contains at least one matching BMP character. Equivalent to !matchesNoneOf(sequence).
The default implementation iterates over the sequence, invoking matches(char) for each character, until this returns true or the end is reached.
- Parameters:
- sequence - the character sequence to examine, possibly empty
- Returns:
- true if this matcher matches at least one character in the sequence
- Since:
- 8.0
 
 - 
matchesAllOf
public boolean matchesAllOf(java.lang.CharSequence sequence)
Returns true if a character sequence contains only matching BMP characters.
The default implementation iterates over the sequence, invoking matches(char) for each character, until this returns false or the end is reached.
- Parameters:
- sequence - the character sequence to examine, possibly empty
- Returns:
- true if this matcher matches every character in the sequence, including when the sequence is empty
 
 - 
matchesNoneOf
public boolean matchesNoneOf(java.lang.CharSequence sequence)
Returns true if a character sequence contains no matching BMP characters. Equivalent to !matchesAnyOf(sequence).
The default implementation iterates over the sequence, invoking matches(char) for each character, until this returns true or the end is reached.
- Parameters:
- sequence - the character sequence to examine, possibly empty
- Returns:
- true if this matcher matches no characters in the sequence, including when the sequence is empty
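
The short-circuit behavior and the empty-sequence results described for matchesAnyOf, matchesAllOf and matchesNoneOf can be sketched as follows. This is a JDK-only illustration written for this page, with the matcher modeled as an `IntPredicate`; `MatchQueriesSketch` is a hypothetical name and the code is not taken from Guava.

```java
import java.util.function.IntPredicate;

final class MatchQueriesSketch {
    // Stops at the first matching char; an empty sequence yields false.
    static boolean matchesAnyOf(IntPredicate matcher, CharSequence sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            if (matcher.test(sequence.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Stops at the first non-matching char; an empty sequence yields true.
    static boolean matchesAllOf(IntPredicate matcher, CharSequence sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            if (!matcher.test(sequence.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // The logical complement of matchesAnyOf; an empty sequence yields true.
    static boolean matchesNoneOf(IntPredicate matcher, CharSequence sequence) {
        return !matchesAnyOf(matcher, sequence);
    }
}
```

Note that for an empty sequence both `matchesAllOf` and `matchesNoneOf` return true, which matches the "including when the sequence is empty" wording above.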
 
 - 
indexIn
public int indexIn(java.lang.CharSequence sequence)
Returns the index of the first matching BMP character in a character sequence, or -1 if no matching character is present.
The default implementation iterates over the sequence in forward order calling matches(char) for each character.
- Parameters:
- sequence - the character sequence to examine from the beginning
- Returns:
- an index, or -1 if no character matches
 
 - 
indexIn
public int indexIn(java.lang.CharSequence sequence, int start)
Returns the index of the first matching BMP character in a character sequence, starting from a given position, or -1 if no character matches after that position.
The default implementation iterates over the sequence in forward order, beginning at start, calling matches(char) for each character.
- Parameters:
- sequence - the character sequence to examine
- start - the first index to examine; must be nonnegative and no greater than sequence.length()
- Returns:
- the index of the first matching character, guaranteed to be no less than start, or -1 if no character matches
- Throws:
- java.lang.IndexOutOfBoundsException - if start is negative or greater than sequence.length()
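
The contract above allows start to equal sequence.length(), in which case nothing is examined and -1 is returned. A minimal JDK-only sketch of that contract follows; `IndexInSketch` is a hypothetical name, the matcher is modeled as an `IntPredicate`, and the exception message is invented for illustration.

```java
import java.util.function.IntPredicate;

final class IndexInSketch {
    static int indexIn(IntPredicate matcher, CharSequence sequence, int start) {
        int length = sequence.length();
        // start == length is legal: it is a position, not an element index.
        if (start < 0 || start > length) {
            throw new IndexOutOfBoundsException("start: " + start + ", length: " + length);
        }
        for (int i = start; i < length; i++) {
            if (matcher.test(sequence.charAt(i))) {
                return i;
            }
        }
        return -1;
    }
}
```

For the input `"banana"` and a matcher for `'a'`, a start of 2 yields 3, a start of 6 yields -1, and a start of 7 throws.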
 
 - 
lastIndexIn
public int lastIndexIn(java.lang.CharSequence sequence)
Returns the index of the last matching BMP character in a character sequence, or -1 if no matching character is present.
The default implementation iterates over the sequence in reverse order calling matches(char) for each character.
- Parameters:
- sequence - the character sequence to examine from the end
- Returns:
- an index, or -1 if no character matches
 
 - 
countIn
public int countIn(java.lang.CharSequence sequence)
Returns the number of matching chars found in a character sequence.
Counts 2 per supplementary character, such as for whitespace().negate().
 - 
removeFrom
public java.lang.String removeFrom(java.lang.CharSequence sequence)
Returns a string containing all non-matching characters of a character sequence, in order. For example:
   CharMatcher.is('a').removeFrom("bazaar")
... returns "bzr".
 - 
retainFrom
public java.lang.String retainFrom(java.lang.CharSequence sequence)
Returns a string containing all matching BMP characters of a character sequence, in order. For example:
   CharMatcher.is('a').retainFrom("bazaar")
... returns "aaa".
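
removeFrom and retainFrom are mirror images: one keeps the non-matching chars, the other keeps the matching ones. The following JDK-only sketch shows that relationship; `FilterSketch` and its methods are hypothetical, with the matcher modeled as an `IntPredicate`.

```java
import java.util.function.IntPredicate;

final class FilterSketch {
    // Keeps the chars for which keep.test(c) is true, preserving their order.
    private static String filter(CharSequence sequence, IntPredicate keep) {
        StringBuilder out = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (keep.test(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Keeps only matching chars.
    static String retainFrom(IntPredicate matcher, CharSequence sequence) {
        return filter(sequence, matcher);
    }

    // Keeps only non-matching chars, i.e. retains with the negated matcher.
    static String removeFrom(IntPredicate matcher, CharSequence sequence) {
        return filter(sequence, matcher.negate());
    }
}
```

Applied to `"bazaar"` with a matcher for `'a'`, the two helpers reproduce the documented results `"bzr"` and `"aaa"`.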
 - 
replaceFrom
public java.lang.String replaceFrom(java.lang.CharSequence sequence, char replacement)
Returns a string copy of the input character sequence, with each matching BMP character replaced by a given replacement character. For example:
   CharMatcher.is('a').replaceFrom("radar", 'o')
... returns "rodor".
The default implementation uses indexIn(CharSequence) to find the first matching character, then iterates the remainder of the sequence calling matches(char) for each character.
- Parameters:
- sequence - the character sequence to replace matching characters in
- replacement - the character to append to the result string in place of each matching character in sequence
- Returns:
- the new string
 
 - 
replaceFrom
public java.lang.String replaceFrom(java.lang.CharSequence sequence, java.lang.CharSequence replacement)
Returns a string copy of the input character sequence, with each matching BMP character replaced by a given replacement sequence. For example:
   CharMatcher.is('a').replaceFrom("yaha", "oo")
... returns "yoohoo".
Note: If the replacement is a fixed string with only one character, you are better off calling replaceFrom(CharSequence, char) directly.
- Parameters:
- sequence - the character sequence to replace matching characters in
- replacement - the characters to append to the result string in place of each matching character in sequence
- Returns:
- the new string
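
As a rough illustration of the per-character replacement described above, the sketch below substitutes a replacement sequence for every matching char in a single pass. It is a simplified JDK-only version with hypothetical names (`ReplaceSketch`, `replaceFrom`), not the indexIn-based default implementation the documentation mentions.

```java
import java.util.function.IntPredicate;

final class ReplaceSketch {
    // Copies the input, substituting the replacement for each matching char.
    static String replaceFrom(IntPredicate matcher, CharSequence sequence,
                              CharSequence replacement) {
        StringBuilder out = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (matcher.test(c)) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With a matcher for `'a'`, this turns `"yaha"` into `"yoohoo"` for the replacement `"oo"`, and `"radar"` into `"rodor"` for the replacement `"o"`.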
 
 - 
trimFrom
public java.lang.String trimFrom(java.lang.CharSequence sequence)
Returns a substring of the input character sequence that omits all matching BMP characters from the beginning and from the end of the string. For example:
   CharMatcher.anyOf("ab").trimFrom("abacatbab")
... returns "cat".
Note that:
   CharMatcher.inRange('\0', ' ').trimFrom(str)
... is equivalent to String.trim().
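
The stated equivalence with String.trim() can be checked with a small JDK-only sketch: trimming every char in the range '\0' to ' ' from both ends gives the same result as `trim()`. `TrimSketch` and its `trimFrom` helper are hypothetical and model the matcher as an `IntPredicate`.

```java
import java.util.function.IntPredicate;

final class TrimSketch {
    static String trimFrom(IntPredicate matcher, CharSequence sequence) {
        int first = 0;
        int last = sequence.length() - 1;
        // Skip matching chars from the front.
        while (first <= last && matcher.test(sequence.charAt(first))) {
            first++;
        }
        // Skip matching chars from the back, never crossing 'first'.
        while (last >= first && matcher.test(sequence.charAt(last))) {
            last--;
        }
        return sequence.subSequence(first, last + 1).toString();
    }

    public static void main(String[] args) {
        String str = " \t hi there \n";
        // Chars from '\0' through ' ' are exactly the ones String.trim() strips.
        String viaSketch = trimFrom(c -> c <= ' ', str);
        System.out.println(viaSketch.equals(str.trim())); // prints true
    }
}
```

Matching chars in the interior of the string (the space in "hi there") are left alone; only the leading and trailing runs are dropped.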
 - 
trimLeadingFrom
public java.lang.String trimLeadingFrom(java.lang.CharSequence sequence)
Returns a substring of the input character sequence that omits all matching BMP characters from the beginning of the string. For example:
   CharMatcher.anyOf("ab").trimLeadingFrom("abacatbab")
... returns "catbab".
 - 
trimTrailingFrom
public java.lang.String trimTrailingFrom(java.lang.CharSequence sequence)
Returns a substring of the input character sequence that omits all matching BMP characters from the end of the string. For example:
   CharMatcher.anyOf("ab").trimTrailingFrom("abacatbab")
... returns "abacat".
 - 
collapseFrom
public java.lang.String collapseFrom(java.lang.CharSequence sequence, char replacement)
Returns a string copy of the input character sequence, with each group of consecutive matching BMP characters replaced by a single replacement character. For example:
   CharMatcher.anyOf("eko").collapseFrom("bookkeeper", '-')
... returns "b-p-r".
The default implementation uses indexIn(CharSequence) to find the first matching character, then iterates the remainder of the sequence calling matches(char) for each character.
- Parameters:
- sequence - the character sequence to replace matching groups of characters in
- replacement - the character to append to the result string in place of each group of matching characters in sequence
- Returns:
- the new string
 
 - 
trimAndCollapseFrom
public java.lang.String trimAndCollapseFrom(java.lang.CharSequence sequence, char replacement)
Collapses groups of matching characters exactly as collapseFrom(java.lang.CharSequence, char) does, except that groups of matching BMP characters at the start or end of the sequence are removed without replacement.
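
The difference between collapseFrom and trimAndCollapseFrom shows up only at the ends of the input. The sketch below illustrates both with JDK classes only; `CollapseSketch` and its methods are hypothetical, the matcher is modeled as an `IntPredicate`, and the single-pass loop is a simplification rather than the indexIn-based default implementation described for collapseFrom.

```java
import java.util.function.IntPredicate;

final class CollapseSketch {
    // Replaces each run of consecutive matching chars with one replacement char.
    static String collapseFrom(IntPredicate matcher, CharSequence sequence,
                               char replacement) {
        StringBuilder out = new StringBuilder(sequence.length());
        boolean inMatchingGroup = false;
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (matcher.test(c)) {
                // Emit the replacement only once per group.
                if (!inMatchingGroup) {
                    out.append(replacement);
                    inMatchingGroup = true;
                }
            } else {
                out.append(c);
                inMatchingGroup = false;
            }
        }
        return out.toString();
    }

    // Drops leading and trailing matching groups, then collapses the rest.
    static String trimAndCollapseFrom(IntPredicate matcher, CharSequence sequence,
                                      char replacement) {
        int first = 0;
        int last = sequence.length() - 1;
        while (first <= last && matcher.test(sequence.charAt(first))) {
            first++;
        }
        while (last >= first && matcher.test(sequence.charAt(last))) {
            last--;
        }
        return collapseFrom(matcher, sequence.subSequence(first, last + 1), replacement);
    }
}
```

For the input `"  a   b  "` with a space matcher and `'_'` as the replacement, `collapseFrom` yields `"_a_b_"` while `trimAndCollapseFrom` yields `"a_b"`.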
 - 
apply
@Deprecated
public boolean apply(java.lang.Character character)
Deprecated. Provided only to satisfy the Predicate interface; use matches(char) instead.
Description copied from interface: Predicate
Returns the result of applying this predicate to input (Java 8 users, see notes in the class documentation above). This method is generally expected, but not absolutely required, to have the following properties:
- Its execution does not cause any observable side effects.
- The computation is consistent with equals; that is, Objects.equal(a, b) implies that predicate.apply(a) == predicate.apply(b).
 
 - 
toString
public java.lang.String toString()
Returns a string representation of this CharMatcher, such as CharMatcher.or(WHITESPACE, JAVA_DIGIT).
- Overrides:
- toString in class java.lang.Object
 
 
- 
 
-