Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Introduction

Have we ever wondered how websites validate millions of records within seconds? One of the solutions is using the Bloom filter

What is it ?

A bloom filter is a probabilistic data structure meant for checking whether a given entry does not occur in a list. It is meant to be quite fast, and is used as a way of not doing costly queries when it can be determined that no results will be returned.

Under the hood, the Bloom filter is just an array of bits with all bits set to zero initially. The below diagram depicts the Bloom filter of size 19:

Image Added

On this page.

Table of Contents
minLevel1
maxLevel7

What are the Bloom Filter Operations ?

Insert

We can perform Insert operations in constant space and time complexity.  Bloom filter performs the below steps for Insert operation:

  1. Hash the input value

  2. Mod the result by the length of the array

  3. Set corresponding bit to 1

Let’s illustrate this by inserting two strings – John Doe and Jane Doe into the Bloom filter. Let’s assume that hash values for these two strings are 1355 and 8337, respectively. Now, let’s perform the modulo operation to get an index within the bounds of the bit array: 1355%19 = 6 and 8337%19 = 15.

The Bloom filter will look like this after inserting values:

Image Added

Lookup

What are the False Positive Analysis ?

Where are used the Bloom Filter ?