T
- the type of instances that the BloomFilter
accepts@Beta public final class BloomFilter<T> extends Object implements Predicate<T>, Serializable
T
. A Bloom filter offers an approximate containment test
with one-sided error: if it claims that an element is contained in it, this might be in error,
but if it claims that an element is not contained in it, then this is definitely true.
If you are unfamiliar with Bloom filters, this nice tutorial may help you understand how they work.
The false positive probability (FPP
) of a Bloom filter is defined as the probability
that mightContain(Object) will erroneously return true
for an object that
has not actually been put in the BloomFilter
.
Bloom filters are serializable. They also support a more compact serial representation via the
writeTo(java.io.OutputStream)
and readFrom(java.io.InputStream, com.google.common.hash.Funnel<? super T>)
methods. Both serialized forms will continue to be
supported by future versions of this library. However, serial forms generated by newer versions
of the code may not be readable by older versions of the code (e.g., a serialized Bloom filter
generated today may not be readable by a binary that was compiled 6 months ago).
As of Guava 23.0, this class is thread-safe and lock-free. It internally uses atomics and compare-and-swap to ensure correctness when multiple threads are used to access it.
Modifier and Type | Method and Description |
---|---|
boolean |
apply(T input)
Deprecated.
Provided only to satisfy the
Predicate interface; use mightContain(T)
instead. |
long |
approximateElementCount()
Returns an estimate for the total number of distinct elements that have been added to this
Bloom filter.
|
BloomFilter<T> |
copy()
Creates a new
BloomFilter that's a copy of this instance. |
static <T> BloomFilter<T> |
create(Funnel<? super T> funnel,
int expectedInsertions)
Creates a
BloomFilter with the expected number of insertions and a default expected
false positive probability of 3%. |
static <T> BloomFilter<T> |
create(Funnel<? super T> funnel,
int expectedInsertions,
double fpp)
Creates a
BloomFilter with the expected number of insertions and expected false
positive probability. |
static <T> BloomFilter<T> |
create(Funnel<? super T> funnel,
long expectedInsertions)
Creates a
BloomFilter with the expected number of insertions and a default expected
false positive probability of 3%. |
static <T> BloomFilter<T> |
create(Funnel<? super T> funnel,
long expectedInsertions,
double fpp)
Creates a
BloomFilter with the expected number of insertions and expected false
positive probability. |
boolean |
equals(@Nullable Object object)
Indicates whether some other object is "equal to" this one.
|
double |
expectedFpp()
Returns the probability that mightContain(Object) will erroneously return
true for an object that has not actually been put in the BloomFilter . |
int |
hashCode()
Returns a hash code value for the object.
|
boolean |
isCompatible(BloomFilter<T> that)
Determines whether a given Bloom filter is compatible with this Bloom filter.
|
boolean |
mightContain(T object)
Returns
true if the element might have been put in this Bloom filter, false if this is definitely not the case. |
boolean |
put(T object)
Puts an element into this
BloomFilter . |
void |
putAll(BloomFilter<T> that)
Combines this Bloom filter with another Bloom filter by performing a bitwise OR of the
underlying data.
|
static <T> BloomFilter<T> |
readFrom(InputStream in,
Funnel<? super T> funnel)
Reads a byte stream, which was written by writeTo(OutputStream), into a
BloomFilter . |
static <T> Collector<T,?,BloomFilter<T>> |
toBloomFilter(Funnel<? super T> funnel,
long expectedInsertions)
Returns a
Collector expecting the specified number of insertions, and yielding a BloomFilter with false positive probability 3%. |
static <T> Collector<T,?,BloomFilter<T>> |
toBloomFilter(Funnel<? super T> funnel,
long expectedInsertions,
double fpp)
Returns a
Collector expecting the specified number of insertions, and yielding a BloomFilter with the specified expected false positive probability. |
void |
writeTo(OutputStream out)
Writes this
BloomFilter to an output stream, with a custom format (not Java
serialization). |
public BloomFilter<T> copy()
BloomFilter
that's a copy of this instance. The new instance is equal to
this instance but shares no mutable state.public boolean mightContain(T object)
true
if the element might have been put in this Bloom filter, false
if this is definitely not the case.@Deprecated public boolean apply(T input)
Predicate
interface; use mightContain(T)
instead.Predicate
input
(Java 8 users, see notes in the
class documentation above). This method is generally expected, but not absolutely
required, to have the following properties:
Objects.equal
(a, b)
implies that predicate.apply(a) ==
predicate.apply(b))
.
@CanIgnoreReturnValue public boolean put(T object)
BloomFilter
. Ensures that subsequent invocations of mightContain(Object)
with the same element will always return true
.object
has been added to the
filter. If the bits haven't changed, this might be the first time object
has
been added to the filter. Note that put(t)
always returns the opposite
result to what mightContain(t)
would have returned at the time it is called.void
return type})public double expectedFpp()
true
for an object that has not actually been put in the BloomFilter
.
Ideally, this number should be close to the fpp
parameter passed in create(Funnel, int, double), or smaller. If it is significantly higher, it is usually the
case that too many elements (more than expected) have been put in the BloomFilter
,
degenerating it.
public long approximateElementCount()
expectedInsertions
that was used when constructing the filter.public boolean isCompatible(BloomFilter<T> that)
that
- The Bloom filter to check for compatibility.public void putAll(BloomFilter<T> that)
that
- The Bloom filter to combine this Bloom filter with. It is not mutated.IllegalArgumentException
- if isCompatible(that) == false
public boolean equals(@Nullable Object object)
java.lang.Object
The equals
method implements an equivalence relation
on non-null object references:
x
, x.equals(x)
should return
true
.
x
and y
, x.equals(y)
should return true
if and only if
y.equals(x)
returns true
.
x
, y
, and z
, if
x.equals(y)
returns true
and
y.equals(z)
returns true
, then
x.equals(z)
should return true
.
x
and y
, multiple invocations of
x.equals(y)
consistently return true
or consistently return false
, provided no
information used in equals
comparisons on the
objects is modified.
x
,
x.equals(null)
should return false
.
The equals
method for class Object
implements
the most discriminating possible equivalence relation on objects;
that is, for any non-null reference values x
and
y
, this method returns true
if and only
if x
and y
refer to the same object
(x == y
has the value true
).
Note that it is generally necessary to override the hashCode
method whenever this method is overridden, so as to maintain the
general contract for the hashCode
method, which states
that equal objects must have equal hash codes.
equals
in class Object
object
- the reference object with which to compare.true
if this object is the same as the obj
argument; false
otherwise.Object.hashCode()
,
HashMap
public int hashCode()
java.lang.Object
HashMap
.
The general contract of hashCode
is:
hashCode
method
must consistently return the same integer, provided no information
used in equals
comparisons on the object is modified.
This integer need not remain consistent from one execution of an
application to another execution of the same application.
equals(Object)
method, then calling the hashCode
method on each of
the two objects must produce the same integer result.
Object.equals(java.lang.Object)
method, then calling the hashCode
method on each of the
two objects must produce distinct integer results. However, the
programmer should be aware that producing distinct integer results
for unequal objects may improve the performance of hash tables.
As much as is reasonably practical, the hashCode method defined by
class Object
does return distinct integers for distinct
objects. (This is typically implemented by converting the internal
address of the object into an integer, but this implementation
technique is not required by the
Java™ programming language.)
hashCode
in class Object
Object.equals(java.lang.Object)
,
System.identityHashCode(java.lang.Object)
public static <T> Collector<T,?,BloomFilter<T>> toBloomFilter(Funnel<? super T> funnel, long expectedInsertions)
Collector
expecting the specified number of insertions, and yielding a BloomFilter
with false positive probability 3%.
Note that if the Collector
receives significantly more elements than specified, the
resulting BloomFilter
will suffer a sharp deterioration of its false positive
probability.
The constructed BloomFilter
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of
ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed BloomFilter
will useexpectedInsertions
- the number of expected insertions to the constructed BloomFilter
; must be positiveCollector
generating a BloomFilter
of the received elementspublic static <T> Collector<T,?,BloomFilter<T>> toBloomFilter(Funnel<? super T> funnel, long expectedInsertions, double fpp)
Collector
expecting the specified number of insertions, and yielding a BloomFilter
with the specified expected false positive probability.
Note that if the Collector
receives significantly more elements than specified, the
resulting BloomFilter
will suffer a sharp deterioration of its false positive
probability.
The constructed BloomFilter
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of
ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed BloomFilter
will useexpectedInsertions
- the number of expected insertions to the constructed BloomFilter
; must be positivefpp
- the desired false positive probability (must be positive and less than 1.0)Collector
generating a BloomFilter
of the received elementspublic static <T> BloomFilter<T> create(Funnel<? super T> funnel, int expectedInsertions, double fpp)
BloomFilter
with the expected number of insertions and expected false
positive probability.
Note that overflowing a BloomFilter
with significantly more elements than specified,
will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of
ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed BloomFilter
will useexpectedInsertions
- the number of expected insertions to the constructed BloomFilter
; must be positivefpp
- the desired false positive probability (must be positive and less than 1.0)BloomFilter
public static <T> BloomFilter<T> create(Funnel<? super T> funnel, long expectedInsertions, double fpp)
BloomFilter
with the expected number of insertions and expected false
positive probability.
Note that overflowing a BloomFilter
with significantly more elements than specified,
will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of
ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed BloomFilter
will useexpectedInsertions
- the number of expected insertions to the constructed BloomFilter
; must be positivefpp
- the desired false positive probability (must be positive and less than 1.0)BloomFilter
public static <T> BloomFilter<T> create(Funnel<? super T> funnel, int expectedInsertions)
BloomFilter
with the expected number of insertions and a default expected
false positive probability of 3%.
Note that overflowing a BloomFilter
with significantly more elements than specified,
will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of
ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed BloomFilter
will useexpectedInsertions
- the number of expected insertions to the constructed BloomFilter
; must be positiveBloomFilter
public static <T> BloomFilter<T> create(Funnel<? super T> funnel, long expectedInsertions)
BloomFilter
with the expected number of insertions and a default expected
false positive probability of 3%.
Note that overflowing a BloomFilter
with significantly more elements than specified,
will result in its saturation, and a sharp deterioration of its false positive probability.
The constructed BloomFilter
will be serializable if the provided Funnel<T>
is.
It is recommended that the funnel be implemented as a Java enum. This has the benefit of
ensuring proper serialization and deserialization, which is important since equals(java.lang.Object)
also relies on object identity of funnels.
funnel
- the funnel of T's that the constructed BloomFilter
will useexpectedInsertions
- the number of expected insertions to the constructed BloomFilter
; must be positiveBloomFilter
public void writeTo(OutputStream out) throws IOException
BloomFilter
to an output stream, with a custom format (not Java
serialization). This has been measured to save at least 400 bytes compared to regular
serialization.
Use readFrom(InputStream, Funnel) to reconstruct the written BloomFilter.
IOException
public static <T> BloomFilter<T> readFrom(InputStream in, Funnel<? super T> funnel) throws IOException
BloomFilter
.
The Funnel
to be used is not encoded in the stream, so it must be provided here.
Warning: the funnel provided must behave identically to the one used to populate
the original Bloom filter!
IOException
- if the InputStream throws an IOException
, or if its data does not
appear to be a BloomFilter serialized using the writeTo(OutputStream) method.Copyright © 2010–2018. All rights reserved.