Skip to end of metadata
Go to start of metadata

You are viewing an old version of this content. View the current version.

Compare with Current View Version History

« Previous Version 4 Next »

Introduction

Have we ever wondered how websites validate millions of records within seconds? One of the solutions is using the Bloom filter

What is it ?

A bloom filter is a probabilistic data structure meant for checking whether a given entry does not occur in a list. It is meant to be quite fast, and is used as a way of not doing costly queries when it can be determined that no results will be returned.

Under the hood, the Bloom filter is just an array of bits with all bits set to zero initially. The below diagram depicts the Bloom filter of size 19:

On this page.

What are the Bloom Filter Operations ?

Insert

We can perform Insert operations in constant space and time complexity.  Bloom filter performs the below steps for Insert operation:

  1. Hash the input value

  2. Mod the result by the length of the array

  3. Set corresponding bit to 1

Let’s illustrate this by inserting two strings – John Doe and Jane Doe into the Bloom filter. Let’s assume that hash values for these two strings are 1355 and 8337, respectively. Now, let’s perform the modulo operation to get an index within the bounds of the bit array: 1355%19 = 6 and 8337%19 = 15.

The Bloom filter will look like this after inserting values:

Lookup

What are the False Positive Analysis ?

Where are used the Bloom Filter ?

  • No labels