Utf8 (Guava: Google Core Libraries for Java 17.0 API)

java.lang.Object
- com.google.common.base.Utf8

```
@Beta
@GwtCompatible
public final class Utf8
extends Object
```
Low-level, high-performance utility methods related to the UTF-8 character encoding. UTF-8 is defined in section D92 of The Unicode Standard Core Specification, Chapter 3.
The variant of UTF-8 implemented by this class is the restricted definition of UTF-8 introduced in Unicode 3.1. One implication of this is that it rejects "non-shortest form" byte sequences, even though the JDK decoder may accept them.

Since:

16.0

Author:

Martin Buchholz, Clément Roux

Method Summary

Methods
Modifier and Type	Method and Description
`static int`	`encodedLength(CharSequence sequence)` Returns the number of bytes in the UTF-8-encoded form of `sequence`.
`static boolean`	`isWellFormed(byte[] bytes)` Returns `true` if `bytes` is a well-formed UTF-8 byte sequence according to Unicode 6.0.
`static boolean`	`isWellFormed(byte[] bytes, int off, int len)` Returns whether the given byte array slice is a well-formed UTF-8 byte sequence, as defined by `isWellFormed(byte[])`.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - encodedLength
```
public static int encodedLength(CharSequence sequence)
```
    Returns the number of bytes in the UTF-8-encoded form of `sequence`. For a string, this method is equivalent to `string.getBytes(UTF_8).length`, but is more efficient in both time and space.
    
    Throws:
    
    `IllegalArgumentException` - if `sequence` contains ill-formed UTF-16 (unpaired surrogates)
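
    The equivalence above can be illustrated with the JDK alone. The following is a minimal sketch, not Guava's implementation: the class `EncodedLengthSketch` and its methods are hypothetical names introduced here. It compares the encode-then-count reference with a counter that walks the UTF-16 code units without allocating a byte array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Guava.
final class EncodedLengthSketch {

    // Reference computation named in the documentation: encode, then count bytes.
    static int referenceEncodedLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Counts UTF-8 bytes directly from UTF-16 code units, without allocating.
    // Rejects unpaired surrogates, mirroring the documented exception.
    static int countUtf8Bytes(CharSequence seq) {
        int bytes = 0;
        for (int i = 0; i < seq.length(); i++) {
            char c = seq.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // U+0080..U+07FF
            } else if (Character.isHighSurrogate(c)) {
                if (i + 1 >= seq.length() || !Character.isLowSurrogate(seq.charAt(i + 1))) {
                    throw new IllegalArgumentException("Unpaired surrogate at index " + i);
                }
                bytes += 4;                       // supplementary code point
                i++;                              // skip the low surrogate
            } else if (Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            } else {
                bytes += 3;                       // rest of the BMP
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo";                  // 'é' needs two bytes in UTF-8
        System.out.println(referenceEncodedLength(s)); // 6
        System.out.println(countUtf8Bytes(s));         // 6
    }
}
```

    With Guava on the classpath, `Utf8.encodedLength(s)` is documented to return the same value as the reference computation for any well-formed string. The two approaches differ on unpaired surrogates: the JDK encoder substitutes a replacement byte, whereas the documented method throws.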
  - isWellFormed
```
public static boolean isWellFormed(byte[] bytes)
```
    Returns `true` if `bytes` is a well-formed UTF-8 byte sequence according to Unicode 6.0. Note that this is a stronger criterion than simply whether the bytes can be decoded. For example, some versions of the JDK decoder will accept "non-shortest form" byte sequences, but encoding never reproduces these. Such byte sequences are not considered well-formed.
    This method returns `true` if and only if `Arrays.equals(bytes, new String(bytes, UTF_8).getBytes(UTF_8))` does, but is more efficient in both time and space.
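
    The round-trip criterion above can be written out with the JDK alone. This is a sketch of the reference check only, not of Guava's implementation; `RoundTripCheck` and `roundTripsAsUtf8` are hypothetical names used for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Guava.
final class RoundTripCheck {

    // Decodes the bytes as UTF-8, re-encodes the result, and compares it with the input.
    // Ill-formed input is decoded with replacement characters, so it does not round-trip.
    static boolean roundTripsAsUtf8(byte[] bytes) {
        byte[] reencoded =
            new String(bytes, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(bytes, reencoded);
    }

    public static void main(String[] args) {
        byte[] wellFormed = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] truncated = {(byte) 0xE2, (byte) 0x82};   // first two bytes of a three-byte sequence
        byte[] strayContinuation = {(byte) 0x80};        // continuation byte with no lead byte

        System.out.println(roundTripsAsUtf8(wellFormed));        // true
        System.out.println(roundTripsAsUtf8(truncated));         // false
        System.out.println(roundTripsAsUtf8(strayContinuation)); // false
    }
}
```

    The check allocates a `String` and a second byte array on every call, which is the cost the documented method is described as avoiding.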
  - isWellFormed
```
public static boolean isWellFormed(byte[] bytes,
                   int off,
                   int len)
```
    Returns whether the given byte array slice is a well-formed UTF-8 byte sequence, as defined by `isWellFormed(byte[])`. Note that this can be `false` even when `isWellFormed(bytes)` is `true`, because a slice may begin or end in the middle of a multi-byte sequence.
    
    Parameters:
    `bytes` - the input buffer
    `off` - the offset in the buffer of the first byte to read
    `len` - the number of bytes to read from the buffer
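
    To see why a slice can be ill-formed when the whole array is well-formed, the following sketch applies the JDK round-trip criterion to a copied sub-range. It is an illustration under that assumption, not Guava's implementation; `SliceCheck` and `sliceRoundTrips` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Guava.
final class SliceCheck {

    // Applies the decode/re-encode round-trip test to bytes[off .. off + len).
    static boolean sliceRoundTrips(byte[] bytes, int off, int len) {
        byte[] slice = Arrays.copyOfRange(bytes, off, off + len);
        byte[] reencoded =
            new String(slice, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(slice, reencoded);
    }

    public static void main(String[] args) {
        // "aé" encodes to three bytes: 0x61, 0xC3, 0xA9.
        byte[] bytes = "a\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println(sliceRoundTrips(bytes, 0, 3)); // true: the whole array
        System.out.println(sliceRoundTrips(bytes, 0, 1)); // true: just 'a'
        System.out.println(sliceRoundTrips(bytes, 0, 2)); // false: cuts 'é' after its lead byte
        System.out.println(sliceRoundTrips(bytes, 2, 1)); // false: starts on a continuation byte
    }
}
```

    Copying the sub-range keeps the sketch short; the documented overload takes `off` and `len` so that callers can validate part of a buffer in place.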
