Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

The Framework proposed in this space (Alex Xu) is not applied here : Getting started - a framework to propose...

Introduction - What is Hashing ?

A hash function is a function that maps one piece of data—typically describing some kind of object, often of arbitrary size—to another piece of data, typically an integer, known as hash code, or simply hash.

Hashing is widely used in algorithms, data structures, and cryptography.

  1. Hash Functions
    Hash functions take variable-length input data and produce a fixed-length output value.

    Code Block
    languagejava
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;
            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
    
    public static void main(String[] args) {
        String anotherTest = "java";
        String test = "test";
        String oneMoreTest = "dev";
        System.out.println(test.hashCode()); // output: 3556498
        System.out.println(test.hashCode()); // output: 3556498
        System.out.println(anotherTest.hashCode()); // output: 3254818
        System.out.println(oneMoreTest.hashCode()); // output: 99349
    }

  2. Hashing Algo
    Hashing is a technique or process of mapping keys, values into the hash table by using a hash function : Data Structure and Algorithms - Hash Table (tutorialspoint.com)

  3. Cryptographic Hash Functions
    Cryptographic hash functions are a specialized group of hash functions. They provide an increased level of security. Thus, they are used for cryptography purposes like password verification, data integrity validation, blockchain (cryptocurrencies).

On this page.

Table of Contents

The Rehashing problem : How to balance the load to x servers ?

Why do we need Hashing first ?

Image Added

But, hashing is a common way to balance the load to servers : serverIndex = hash(key) % N where N is the size of the server pool. (See an example in the page : the hashes for 4 servers - Hash % 4 servers).

But, the problem is when the servers pool is not static anymore ! New servers are added and existing servers could be removed.

If we removed 1 server, modulo changes : Hash % 3 servers. Most of keys are redistributed.

But, when 1 server goes offline, most cache clients will connect to the wrong servers to fetch data.

The solution : Consistent hashing

When a server goes offline, we need to remap, …