@Beta @GwtCompatible public final class Utf8 extends Object
The variant of UTF-8 implemented by this class is the restricted definition of UTF-8 introduced in Unicode 3.1. One implication of this is that it rejects "non-shortest form" byte sequences, even though the JDK decoder may accept them.
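To make "non-shortest form" concrete, here is a minimal JDK-only sketch. It is not this class's implementation, and the class and method names (`ShortestFormDemo`, `decodeTwoBytesLeniently`, `isShortestFormTwoBytes`) are hypothetical. It covers only the two-byte case: the bytes `C0 AF` carry the code point U+002F (`/`), which already fits in the single byte `2F`, so a strict reading of UTF-8 treats the two-byte form as ill-formed.

```java
final class ShortestFormDemo {

    // Hypothetical lenient decoder: extracts the payload bits of a two-byte
    // sequence without checking whether two bytes were actually needed.
    static int decodeTwoBytesLeniently(byte b1, byte b2) {
        return ((b1 & 0x1F) << 6) | (b2 & 0x3F);
    }

    // Strict check for the two-byte case only: the lead byte must look like
    // 110xxxxx, the continuation byte like 10xxxxxx, and the decoded code
    // point must be at least U+0080 (anything smaller fits in one byte).
    static boolean isShortestFormTwoBytes(byte b1, byte b2) {
        boolean leadOk = (b1 & 0xE0) == 0xC0;
        boolean continuationOk = (b2 & 0xC0) == 0x80;
        return leadOk && continuationOk && decodeTwoBytesLeniently(b1, b2) >= 0x80;
    }

    public static void main(String[] args) {
        byte[] overlongSlash = {(byte) 0xC0, (byte) 0xAF}; // overlong form of U+002F
        byte[] eAcute = {(byte) 0xC3, (byte) 0xA9};        // shortest form of U+00E9

        // The lenient decoder recovers U+002F from the overlong bytes.
        System.out.println(Integer.toHexString(
                decodeTwoBytesLeniently(overlongSlash[0], overlongSlash[1])));
        // The strict check rejects the overlong form and accepts U+00E9.
        System.out.println(isShortestFormTwoBytes(overlongSlash[0], overlongSlash[1]));
        System.out.println(isShortestFormTwoBytes(eAcute[0], eAcute[1]));
    }
}
```

The same principle applies to three- and four-byte sequences, which this sketch leaves out.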
| Modifier and Type | Method and Description | 
|---|---|
| static int | encodedLength(CharSequence sequence): Returns the number of bytes in the UTF-8-encoded form of sequence. |
| static boolean | isWellFormed(byte[] bytes): Returns true if bytes is a well-formed UTF-8 byte sequence according to Unicode 6.0. |
| static boolean | isWellFormed(byte[] bytes, int off, int len): Returns whether the given byte array slice is a well-formed UTF-8 byte sequence, as defined by isWellFormed(byte[]). |
public static int encodedLength(CharSequence sequence)
Returns the number of bytes in the UTF-8-encoded form of sequence. For a string, this method is equivalent to string.getBytes(UTF_8).length, but is more efficient in both time and space.
Throws: IllegalArgumentException - if sequence contains ill-formed UTF-16 (unpaired surrogates)
public static boolean isWellFormed(byte[] bytes)
Returns true if bytes is a well-formed UTF-8 byte sequence according to
 Unicode 6.0. Note that this is a stronger criterion than simply whether the bytes can be
 decoded. For example, some versions of the JDK decoder will accept "non-shortest form" byte
 sequences, but encoding never reproduces these. Such byte sequences are not considered
 well-formed.
 This method returns true if and only if Arrays.equals(bytes, new
 String(bytes, UTF_8).getBytes(UTF_8)) does, but is more efficient in both time and space.
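The round-trip equivalence stated above can be written out using only the JDK. This is a reference sketch of the documented criterion, not how this class implements the check, and the names `RoundTripCheck` and `isWellFormedByRoundTrip` are hypothetical. It allocates an intermediate String and a second byte array, which is the cost that isWellFormed is documented to avoid.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RoundTripCheck {

    // Decodes the bytes as UTF-8, re-encodes the result, and compares it with
    // the input. Bytes the decoder cannot map are replaced with U+FFFD, and an
    // accepted non-shortest form is re-encoded in its shortest form, so the
    // comparison fails in both cases.
    static boolean isWellFormedByRoundTrip(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(bytes, decoded.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] valid = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] overlong = {(byte) 0xC0, (byte) 0xAF}; // non-shortest form of '/'
        byte[] truncated = {(byte) 0xC3};             // lead byte with no continuation

        System.out.println(isWellFormedByRoundTrip(valid));     // true
        System.out.println(isWellFormedByRoundTrip(overlong));  // false
        System.out.println(isWellFormedByRoundTrip(truncated)); // false
    }
}
```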
public static boolean isWellFormed(byte[] bytes, int off, int len)
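Returns whether the given byte array slice is a well-formed UTF-8 byte sequence, as defined by isWellFormed(byte[]). Note that this can be false even when isWellFormed(bytes) is true, for example when the slice starts or ends in the middle of a multi-byte character.
Parameters:
bytes - the input buffer
off - the offset in the buffer of the first byte to read
len - the number of bytes to read from the buffer
Copyright © 2010-2014. All Rights Reserved.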