Interface Hasher

  • All Superinterfaces:
    PrimitiveSink

    @Beta
    @CanIgnoreReturnValue
    public interface Hasher
    extends PrimitiveSink
    A PrimitiveSink that can compute a hash code after reading the input. Each hasher should translate all multibyte values (putInt(int), putLong(long), etc.) to bytes in little-endian order.
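
    As an illustration of the little-endian rule, the sketch below uses only the JDK to show the byte sequence a conforming hasher would be expected to consume for putInt(int) and putLong(long). It is not Guava's implementation; the class and method names (LittleEndianSketch, intToBytes, longToBytes) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianSketch {
    // Bytes expected for putInt(value): least significant byte first.
    public static byte[] intToBytes(int value) {
        return ByteBuffer.allocate(Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
    }

    // Bytes expected for putLong(value): least significant byte first.
    public static byte[] longToBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    public static void main(String[] args) {
        byte[] b = intToBytes(0x01020304);
        System.out.printf("%02x %02x %02x %02x%n", b[0], b[1], b[2], b[3]); // prints 04 03 02 01
    }
}
```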

    Warning: The result of calling any methods after calling hash() is undefined.

    Warning: Using a specific character encoding when hashing a CharSequence with putString(CharSequence, Charset) is generally only useful for cross-language compatibility (otherwise prefer putUnencodedChars(java.lang.CharSequence)). However, the character encodings must be identical across languages. Also beware that Charset definitions may occasionally change between Java releases.

    Warning: Chunks of data that are put into the Hasher are not delimited. The resulting HashCode is dependent only on the bytes inserted, and the order in which they were inserted, not how those bytes were chunked into discrete put() operations. For example, the following three expressions all generate colliding hash codes:

    
     newHasher().putByte(b1).putByte(b2).putByte(b3).hash()
     newHasher().putByte(b1).putBytes(new byte[] { b2, b3 }).hash()
     newHasher().putBytes(new byte[] { b1, b2, b3 }).hash()
     

    If you wish to avoid this, you should either prepend or append the size of each chunk. Keep in mind that when dealing with char sequences, the encoded form of two concatenated char sequences is not equivalent to the concatenation of their encoded forms. Therefore, putString(CharSequence, Charset) should only be used consistently with complete sequences and not broken into chunks.
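
    The length-prefix advice can be sketched with the JDK alone. The example below uses java.security.MessageDigest as a stand-in for a Hasher, since it treats its input as an undelimited byte stream in the same way; the class and method names (ChunkFramingSketch, unframed, framed) and the choice of a 4-byte little-endian length prefix are assumptions made for illustration. With a Hasher, the corresponding idea would be a putInt(length) call before each putBytes call.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ChunkFramingSketch {
    // Feeds the chunks back to back; chunk boundaries leave no trace in the input.
    public static byte[] unframed(String... chunks) {
        MessageDigest md = sha256();
        for (String chunk : chunks) {
            md.update(chunk.getBytes(StandardCharsets.UTF_8));
        }
        return md.digest();
    }

    // Prepends each chunk's encoded length as a 4-byte little-endian int.
    public static byte[] framed(String... chunks) {
        MessageDigest md = sha256();
        for (String chunk : chunks) {
            byte[] bytes = chunk.getBytes(StandardCharsets.UTF_8);
            byte[] length = ByteBuffer.allocate(Integer.BYTES)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .putInt(bytes.length)
                    .array();
            md.update(length);
            md.update(bytes);
        }
        return md.digest();
    }

    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same bytes, different chunking: identical digests.
        System.out.println(Arrays.equals(unframed("ab", "c"), unframed("a", "bc"))); // true
        // With length prefixes the two inputs no longer share a byte stream.
        System.out.println(Arrays.equals(framed("ab", "c"), framed("a", "bc")));     // false
    }
}
```

    Each chunk here is a complete string that is encoded on its own in both variants, which is consistent with the caveat about not splitting a char sequence before encoding it.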

    Since:
    11.0
    Author:
    Kevin Bourrillion
    • Method Detail

      • putBytes

        Hasher putBytes(byte[] bytes)
        Description copied from interface: PrimitiveSink
        Puts an array of bytes into this sink.
        Specified by:
        putBytes in interface PrimitiveSink
        Parameters:
        bytes - a byte array
        Returns:
        this instance
      • putBytes

        Hasher putBytes(byte[] bytes,
                        int off,
                        int len)
        Description copied from interface: PrimitiveSink
        Puts a chunk of an array of bytes into this sink. bytes[off] is the first byte written, bytes[off + len - 1] is the last.
        Specified by:
        putBytes in interface PrimitiveSink
        Parameters:
        bytes - a byte array
        off - the start offset in the array
        len - the number of bytes to write
        Returns:
        this instance
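
        To make the off/len contract concrete, the sketch below checks that passing a range is equivalent to passing a copy of bytes[off] through bytes[off + len - 1]. It uses java.security.MessageDigest, whose update(byte[], int, int) takes the same three arguments, as a stand-in for a Hasher; the names RangeSketch, digestRange and digestCopy are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class RangeSketch {
    // Digest of bytes[off] .. bytes[off + len - 1], passed as a range.
    public static byte[] digestRange(byte[] bytes, int off, int len) {
        MessageDigest md = sha256();
        md.update(bytes, off, len);
        return md.digest();
    }

    // Digest of a copy that holds exactly the same bytes.
    public static byte[] digestCopy(byte[] bytes, int off, int len) {
        return sha256().digest(Arrays.copyOfRange(bytes, off, off + len));
    }

    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5};
        // Range (off = 1, len = 3) covers the bytes 2, 3, 4.
        System.out.println(Arrays.equals(digestRange(data, 1, 3), digestCopy(data, 1, 3))); // true
    }
}
```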
      • putBytes

        Hasher putBytes(ByteBuffer bytes)
        Description copied from interface: PrimitiveSink
        Puts the remaining bytes of a byte buffer into this sink. bytes.position() is the first byte written, bytes.limit() - 1 is the last. The position of the buffer will be equal to the limit when this method returns.
        Specified by:
        putBytes in interface PrimitiveSink
        Parameters:
        bytes - a byte buffer
        Returns:
        this instance
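
        The position/limit behavior described above can be observed with the JDK's MessageDigest.update(ByteBuffer), which documents the same contract (remaining bytes are consumed and the position ends at the limit). This is a stand-in for illustration rather than Guava code; BufferSketch and consume are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class BufferSketch {
    // Consumes the remaining bytes of the buffer and returns how many there were.
    public static int consume(ByteBuffer buffer) {
        int remaining = buffer.remaining();
        try {
            MessageDigest.getInstance("SHA-256").update(buffer);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        return remaining;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {10, 20, 30, 40, 50});
        buffer.position(1);
        buffer.limit(4); // only the bytes 20, 30, 40 remain
        int consumed = consume(buffer);
        System.out.println(consumed + " " + buffer.position() + " " + buffer.limit()); // prints 3 4 4
    }
}
```

        Callers that need to reuse the buffer afterwards would have to reset its position themselves, for example by passing buffer.duplicate() instead.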
      • putUnencodedChars

        Hasher putUnencodedChars(CharSequence charSequence)
        Equivalent to processing each char value in the CharSequence, in order. In other words, no character encoding is performed; the low byte and high byte of each char are hashed directly (in that order). The input must not be updated while this method is in progress.

        Warning: This method will produce different output than most other languages do when running the same hash function on the equivalent input. For cross-language compatibility, use putString(java.lang.CharSequence, java.nio.charset.Charset), usually with a charset of UTF-8. For other use cases, use putUnencodedChars.

        Specified by:
        putUnencodedChars in interface PrimitiveSink
        Since:
        15.0 (since 11.0 as putString(CharSequence)).
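
        The "low byte, then high byte" description can be written out directly. The sketch below is a JDK-only illustration of the byte sequence described for putUnencodedChars, not Guava's implementation; UnencodedCharsSketch and unencodedBytes are hypothetical names. For well-formed text the result happens to match the UTF-16LE encoding, and it differs from UTF-8, which is what most other languages would hash.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class UnencodedCharsSketch {
    // Low byte, then high byte, of every char; no charset is involved.
    public static byte[] unencodedBytes(CharSequence chars) {
        byte[] out = new byte[chars.length() * 2];
        for (int i = 0; i < chars.length(); i++) {
            char c = chars.charAt(i);
            out[2 * i] = (byte) c;             // low byte
            out[2 * i + 1] = (byte) (c >>> 8); // high byte
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "h\u00e9"; // 'h' followed by e-acute (U+00E9)
        byte[] raw = unencodedBytes(text);
        System.out.println(Arrays.toString(raw)); // [104, 0, -23, 0]
        System.out.println(Arrays.equals(raw, text.getBytes(StandardCharsets.UTF_16LE))); // true
        System.out.println(Arrays.equals(raw, text.getBytes(StandardCharsets.UTF_8)));    // false
    }
}
```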
      • putString

        Hasher putString(CharSequence charSequence,
                         Charset charset)
        Equivalent to putBytes(charSequence.toString().getBytes(charset)).

        Warning: This method, which reencodes the input before hashing it, is useful only for cross-language compatibility. For other use cases, prefer putUnencodedChars(java.lang.CharSequence), which is faster, produces the same output across Java releases, and hashes every char in the input, even if some are invalid.

        Specified by:
        putString in interface PrimitiveSink
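
        Two consequences of hashing the encoded form can be shown with the JDK alone: invalid chars are replaced during encoding, so distinct inputs can yield the same bytes, and encoding the pieces of a sequence separately is not the same as encoding the whole. The sketch below only reproduces the documented equivalence charSequence.toString().getBytes(charset); EncodedStringSketch, encoded and concat are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodedStringSketch {
    // The byte sequence that putString(charSequence, charset) is documented to hash.
    public static byte[] encoded(CharSequence charSequence, Charset charset) {
        return charSequence.toString().getBytes(charset);
    }

    // Plain concatenation of two byte arrays.
    public static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) {
        String high = "\uD83D"; // first half of a surrogate pair, invalid on its own
        String low = "\uDE00";  // second half, also invalid on its own

        // Two different invalid inputs are both replaced by '?' when encoded.
        System.out.println(Arrays.equals(
                encoded(high, StandardCharsets.UTF_8),
                encoded(low, StandardCharsets.UTF_8))); // true

        // Encoding the complete sequence differs from encoding its pieces.
        byte[] whole = encoded(high + low, StandardCharsets.UTF_8);
        byte[] pieces = concat(
                encoded(high, StandardCharsets.UTF_8),
                encoded(low, StandardCharsets.UTF_8));
        System.out.println(whole.length + " " + pieces.length); // prints 4 2
    }
}
```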
      • hash

        HashCode hash()
        Computes a hash code based on the data that have been provided to this hasher. The result is unspecified if this method is called more than once on the same instance.
      • hashCode

        @Deprecated
        int hashCode()
        Deprecated.
        This returns Object.hashCode(); you almost certainly mean to call hash().asInt().
        Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.

        The general contract of hashCode is:

        • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
        • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
        • It is not required that if two objects are unequal according to the Object.equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

        As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (The hashCode may or may not be implemented as some function of an object's memory address at some point in time.)

        Overrides:
        hashCode in class Object
        Returns:
        a hash code value for this object.
        See Also:
        Object.equals(java.lang.Object), System.identityHashCode(java.lang.Object)