HAshing (Satish sir)

Uploaded by

ashishkumar581388


Hashing

Dr. Satish Kumar T

Data Structures and Application
Contents
• What is Hashing
• Components of Hashing
• How does Hashing work?
• What is a Hash function?
Types of Hash functions
Properties of a Good hash function
• Problem with Hashing:
What is collision?
How to handle Collisions?
• 1) Separate Chaining:
• 2) Open Addressing:
• 2.a) Linear Probing:
• 2.b) Quadratic Probing:
• 2.c) Double Hashing:
• What is Rehashing?
• Applications of Hash Data structure

What is Hashing

• Hashing is the process of generating a fixed-size output from a
variable-size input using mathematical formulas known as hash functions.
The output determines the index or location at which an item is stored
in a data structure.
• Hashing maps a large chunk of data into a small table using a hash
function, which is also known as a message digest function. It is a
technique for identifying a specific item within a collection of similar
items.
Components of Hashing
1. Key: A key can be anything, such as a string or an integer, that is fed
as input to the hash function, which determines an index or location for
storing the item in a data structure.
2. Hash Function: The hash function receives the input key and returns the
index of an element in an array called a hash table. The index is known
as the hash index.
3. Hash Table: A hash table is a data structure that maps keys to values
using a special function called a hash function. It stores the data in
an associative manner in an array where each data value has its own
unique index.

How does Hashing work?
• Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to
store it in a table.
• Each string acts as the key, and the string itself is also the value.
The question is where to store the value corresponding to each key.
• Step 1: A hash function (some mathematical formula) is used to
calculate the hash value, which acts as the index of the data structure
where the value will be stored.
• Step 2: So, let’s assign
“a” = 1, “b” = 2, ... etc., to all alphabetical characters.
• Step 3: Therefore, the numerical value of each string is the sum of the
values of its characters:
“ab” = 1 + 2 = 3,
“cd” = 3 + 4 = 7,
“efg” = 5 + 6 + 7 = 18
• Step 4: Now, assume that we have a table of size 7 to store these strings.
The hash function used here is the sum of the characters in the key
mod the table size. We can compute the location of a string in the array
by taking sum(string) mod 7.
• Step 5: So we will then store
“ab” in 3 mod 7 = 3,
“cd” in 7 mod 7 = 0, and
“efg” in 18 mod 7 = 4.

Figure: Mapping keys to indices of the array
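Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration, assuming lowercase keys only; the names `char_value` and `string_hash` are chosen here for the example and do not come from the slides.

```python
def char_value(ch: str) -> int:
    """Map 'a' -> 1, 'b' -> 2, ..., 'z' -> 26 (assumes a lowercase letter)."""
    return ord(ch) - ord("a") + 1


def string_hash(key: str, table_size: int) -> int:
    """Sum the character values of the key, then reduce mod the table size."""
    return sum(char_value(ch) for ch in key) % table_size


table_size = 7
table = [None] * table_size  # None marks an empty slot

for key in ["ab", "cd", "efg"]:
    table[string_hash(key, table_size)] = key

print(table)  # ['cd', None, None, 'ab', 'efg', None, None]
```

The sketch overwrites a slot if two keys land on the same index; it does not handle collisions.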

What is a Hash function?

• A hash function is a mathematical formula that maps a key to an index in
the hash table, which creates the mapping between key and value. The
result of the hash function is referred to as a hash value or hash.
• Types of Hash functions
  • Division Method.
  • Mid Square Method.
  • Folding Method.
  • Multiplication Method.

Division Method

h(k) = k mod M
Here, k is the key value, and
M is the size of the hash table.

Ex: k = 12345, M = 95
h(12345) = 12345 mod 95 = 90

Ex: k = 1276, M = 11
h(1276) = 1276 mod 11 = 0
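As a quick check of the two examples, the division method is a single remainder operation. The function name `division_hash` is an illustrative choice, and non-negative integer keys are assumed.

```python
def division_hash(k: int, m: int) -> int:
    """Division method: h(k) = k mod M, where M is the table size."""
    return k % m


print(division_hash(12345, 95))  # 90
print(division_hash(1276, 11))   # 0
```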
Mid Square Method

It involves two steps to compute the hash value:

• Square the value of the key k, i.e. k².
• Extract the middle r digits as the hash value.

h(k) = middle r digits of (k x k)
Here, k is the key value and r is the number of digits to extract.
• Ex: k = 60, r = 2
• k x k = 60 x 60 = 3600
• h(60) = 60, the middle two digits of 3600. The hash value obtained is 60.
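One possible way to code the mid square method is sketched below. The slides do not say how to pick the "middle" when it is not exactly centred, so this sketch assumes the window starts at position (number of digits - r) // 2; the name `mid_square_hash` is also an assumption.

```python
def mid_square_hash(k: int, r: int) -> int:
    """Square k and return the middle r digits of the square.

    Assumption: when the window cannot be centred exactly, it is
    shifted one digit to the left (integer division rounds down).
    """
    digits = str(k * k)
    if r >= len(digits):
        return int(digits)  # the square has no more than r digits
    start = (len(digits) - r) // 2
    return int(digits[start:start + r])


print(mid_square_hash(60, 2))  # 60, the middle two digits of 3600
```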
Digit Folding Method

It involves two steps to compute the hash value:

• Divide the key value k into a number of parts, i.e. k1, k2, k3, …, kn, where
each part has the same number of digits except for the last part, which can
have fewer digits than the other parts.
• Add the individual parts. The hash value is obtained by ignoring the last
carry, if any.

k = k1, k2, k3, k4, …, kn

s = k1 + k2 + k3 + k4 + … + kn, h(k) = s
Here, s is obtained by adding the parts of the key k.

Example:

k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(k) = 51
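The folding example can be reproduced with the short sketch below. It assumes that "ignoring the last carry" means keeping only the lowest `part_size` digits of the sum; both that interpretation and the name `folding_hash` are assumptions made for illustration.

```python
def folding_hash(k: int, part_size: int) -> int:
    """Split the decimal digits of k into parts, add them, drop any carry."""
    digits = str(k)
    # Cut the digit string left to right; the last part may be shorter.
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    total = sum(parts)
    # Assumption: dropping the carry keeps the lowest part_size digits.
    return total % (10 ** part_size)


print(folding_hash(12345, 2))  # 12 + 34 + 5 = 51
```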

Multiplication Method

This method involves the following steps

1. Choose a constant value A such that 0 < A < 1.
2. Multiply the key value k by A.
3. Extract the fractional part of kA.
4. Multiply the result of the above step by the size of the hash table i.e. M.
5. The resulting hash value is obtained by taking the floor of the result
obtained in step 4.


h(k) = floor(M (kA mod 1))

Here,
M is the size of the hash table.
k is the key value.
A is a constant value.


Example:
k = 12345
A = 0.357840
M = 100
h(12345) = floor[ 100 (12345*0.357840 mod 1)]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ]
= 53
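The same calculation can be sketched in Python. The function name `multiplication_hash` is illustrative, and ordinary floating-point arithmetic is assumed to be precise enough for keys of this size.

```python
import math


def multiplication_hash(k: int, a: float, m: int) -> int:
    """Multiplication method: h(k) = floor(M * (k*A mod 1)), with 0 < A < 1."""
    fractional = (k * a) % 1.0  # fractional part of k*A
    return math.floor(m * fractional)


print(multiplication_hash(12345, 0.357840, 100))  # 53
```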
Properties of a Good hash function

1. Efficiently computable.
2. Should uniformly distribute the keys (each table position is equally
likely for each key).
3. Should minimize collisions.
4. Should have a low load factor (number of items in the table divided by
the size of the table).

Problem with Hashing

• In the example above, the hash function is the sum of the letters.
Examined closely, the problem is easy to see: different strings can
produce the same hash value.

• For example, {“ab”, “ba”} both have the same hash value, and
{“cd”, “be”} also generate the same hash value, etc. This is known as a
collision, and it creates problems in searching, insertion, deletion, and
updating of values.
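A small sketch makes these collisions visible by grouping keys under their letter sum. The helper name `letter_sum` is hypothetical and lowercase keys are assumed.

```python
from collections import defaultdict


def letter_sum(key: str) -> int:
    """Sum of letter values with 'a' = 1, 'b' = 2, ... (lowercase only)."""
    return sum(ord(ch) - ord("a") + 1 for ch in key)


groups = defaultdict(list)
for key in ["ab", "ba", "cd", "be"]:
    groups[letter_sum(key)].append(key)

# Any group holding more than one key is a collision.
print(dict(groups))  # {3: ['ab', 'ba'], 7: ['cd', 'be']}
```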
What is collision?

• The hashing process generates a small number for a big key, so there is a
possibility that two keys could produce the same value. The situation
where a newly inserted key maps to an already occupied slot in the hash
table is called a collision, and it must be handled using some collision
handling technique.

How to handle Collisions?

• There are mainly two methods to handle collisions:

  • Separate Chaining
  • Open Addressing

Separate Chaining

• The idea is to make each cell of the hash table point to a linked list of
records that have the same hash function value. Chaining is simple but
requires additional memory outside the table.
• Example: We are given a hash function and have to insert some
elements into the hash table, using separate chaining as the
collision resolution technique.
Hash function = key % 5,
Elements = 12, 15, 22, 25 and 37.
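A minimal sketch of this example follows. It uses a Python list for each chain as a stand-in for a linked list and appends new keys at the end of the chain; both choices, and the names `build_chained_table` and `contains`, are assumptions made for illustration.

```python
def build_chained_table(keys, table_size):
    """Insert each key into the chain at index key % table_size."""
    table = [[] for _ in range(table_size)]  # one empty chain per slot
    for key in keys:
        table[key % table_size].append(key)
    return table


def contains(table, key):
    """Search only the chain that the key hashes to."""
    return key in table[key % len(table)]


table = build_chained_table([12, 15, 22, 25, 37], 5)
for index, chain in enumerate(table):
    print(index, chain)
# 0 [15, 25]
# 1 []
# 2 [12, 22, 37]
# 3 []
# 4 []
```

Keys 12, 22 and 37 all hash to index 2, so they share one chain, while 15 and 25 share the chain at index 0.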
