Commit 313c80c

from b to bloom
1 parent c132d50 commit 313c80c

1 file changed: 23 additions, 23 deletions

Diff for: data_structures/hashing/bloom_filter.py

@@ -4,61 +4,61 @@
 The use of this data structure is to test membership in a set.
 Compared to Python's built-in set() it is more space-efficient.
 In the following example, only 8 bits of memory will be used:
->>> b = Bloom(size=8)
->>> "Titanic" in b
+>>> bloom = Bloom(size=8)
+>>> "Titanic" in bloom
 False
 
 Initially the filter contains all zeros:
->>> b.bitstring
+>>> bloom.bitstring
 '00000000'
 
 When an element is added, two bits are set to 1
 since there are 2 hash functions in this implementation:
->>> b.add("Titanic")
->>> b.bitstring
+>>> bloom.add("Titanic")
+>>> bloom.bitstring
 '01100000'
->>> "Titanic" in b
+>>> "Titanic" in bloom
 True
 
 However, sometimes only one bit is added
 because both hash functions return the same value
->>> b.add("Avatar")
->>> b.format_hash("Avatar")
+>>> bloom.add("Avatar")
+>>> bloom.format_hash("Avatar")
 '00000100'
->>> b.bitstring
+>>> bloom.bitstring
 '01100100'
 
 Not added elements should return False ...
->>> "The Goodfather" in b
+>>> "The Goodfather" in bloom
 False
->>> b.format_hash("The Goodfather")
+>>> bloom.format_hash("The Goodfather")
 '00011000'
->>> "Interstellar" in b
+>>> "Interstellar" in bloom
 False
->>> b.format_hash("Interstellar")
+>>> bloom.format_hash("Interstellar")
 '00000011'
->>> "Parasite" in b
+>>> "Parasite" in bloom
 False
->>> b.format_hash("Parasite")
+>>> bloom.format_hash("Parasite")
 '00010010'
->>> "Pulp Fiction" in b
+>>> "Pulp Fiction" in bloom
 False
->>> b.format_hash("Pulp Fiction")
+>>> bloom.format_hash("Pulp Fiction")
 '10000100'
 
 but sometimes there are false positives:
->>> "Ratatouille" in b
+>>> "Ratatouille" in bloom
 True
->>> b.format_hash("Ratatouille")
+>>> bloom.format_hash("Ratatouille")
 '01100000'
 
 The probability increases with the number of added elements
->>> b.estimated_error_rate()
+>>> bloom.estimated_error_rate()
 0.140625
->>> b.add("The Goodfather")
->>> b.estimated_error_rate()
+>>> bloom.add("The Goodfather")
+>>> bloom.estimated_error_rate()
 0.390625
->>> b.bitstring
+>>> bloom.bitstring
 '01111100'
 """
 from hashlib import md5, sha256
