HASHING
• How to store Employee records, which consists of Telephone no.,
Name, City, Department, Salary, etc…
Telephone Name City Dept. Salary
2
Solution
• Use an Array:
• Search takes a linear time.
• If stored in sorted order this search can be done in O(log n) using
binary search but then insertion and deletion becomes costly because
we have to maintain the sorted order.
3
Solution
• Use an Linked List:
• Search takes again linear time to search for particular record as se need to go
through all the records until the required one is found.
• Use balanced BST to store records:
• Insertion takes O(log n).
• Search takes O(log n).
• Deletion takes O(log n).
4
Solution
• Create a Direct Access Table:
• Insertion takes O(1).
• Search takes O(1).
• Deletion takes O(1).
5
• Limitations:
• 1] Size of Table: m * 10^n
• Where, m = size of pointer to the record
n = digits in the telephone number
Telephone Memory Location
(Pointer)
…………………….
9856234178 0x1…
8695234512 0x2…
…………………….
Name City Dept …….
Tejas Pune IT
Name City Dept …….
Rahul Mumbai HR
• Limitations:
• 2] Integer may not hold the
size of n digits
6
Introduction to Hashing
• Provides O(1) time on average for insert, search and delete.
• Find items with keys matching a given search key
• Given an array A, containing n keys, and a search key x, find the
index i such as x=A[i]
• Example of Record
KEY Other Data
7
Continue..
• Hashing is an important Data Structure which is designed to use a
special function called the Hash function.
• Hash function is used to map a given value with a particular key for
faster access of elements.
• The efficiency of mapping depends of the efficiency of the hash
function used.
• Some examples of how hashing is used in our lives include:
• In universities, each student is assigned a unique roll number that can be
used to retrieve information about them.
• In libraries, each book is assigned a unique number that can be used to
determine information about the book, such as its exact position in the
library or the users it has been issued to etc.
8
Continue..
• Assume that you have an object and you want to assign a key to it
to make searching easy.
• To store the key/value pair, you can use a simple array like a data
structure where keys (integers) can be used directly as an index to
store values.
• However, in cases where the keys are large and cannot be used
directly as an index, you should use hashing.
9
Continue..
• In hashing, large keys are converted into small keys by using hash
functions.
• The values are then stored in a data structure called hash table.
• The idea of hashing is to distribute entries (key/value pairs)
uniformly across an array.
• Each element is assigned a key (converted key).
• By using that key you can access the element in O(1) time. Using
the key, the algorithm (hash function) computes an index that
suggests where an entry can be found or inserted.
10
Important Keywords:
• Hash Table:
• A data structure used for storing and retrieving data quickly.
Where each entry in hash table is made using hash function.
• Hash Function:
• It maps a big number or string into a small integer that can be
used as index in hash table.
• Used to place data in hash table and also to retrieve data
from hash table.
• A function that can be used to map a data set of an arbitrary
size to a data set of a fixed size, which falls into the hash
table.
• The values returned by a hash function are called hash values,
hash codes, hash sums, or simply hashes.
• Bucket:
• Each position of hash table is called bucket.
• Collision:
• It is a situation in which hash function returns the same
address for more than one record.
0 25
1 Bucket 16
2 42
3 33
4 54
Given: 25, 33, 54, 16, 42 and table
of size 5.
Hash Function is KEY mod 5
Key mod 5
25 mod 5 = 0
33 mod 5 = 3
54 mod 5 = 4
16 mod 5 = 1
42 mod 5 = 2
30 mod 5 = 0
This is collision.
Hash Table
11
Important Keywords:
• Probe:
• Each calculation of an address and test for success
is known as probe.
• Synonym:
• The set of keys that has the same location are
called synonyms.
• Ex: 25 and 30 are the synonyms.
• Overflow:
• When hash table becomes full and new record
needs to be inserted and resulting into more
collisions the it is called overflow.
• Perfect hash function:
• It is a function that maps distinct key elements into
the hash table with no collisions.
0 25
1 16
2 42
3 33
4 54
Entry of 25, 33, 54, 16, 42 in a table
of size 5 with probe 5
30
Where 30 mod 5 = 0
is a collision, synonym for 25 and it
also shows the overflow situation
Hash Table
Consider 25, 33, 54, 16, 42, 30.
Show Perfect Hash Function
12
• To achieve a good hashing mechanism, It is important to
have a good hash function with the following basic
requirements:
• Easy to compute
• Uniform distribution
• Less Collision
13
Important Keywords:
14
Types of Hash Functions
• Division Method
• Multiplication Method
• Extraction Method
• Mid Square Method
• Folding and Universal Method
15
Division Method
• It depends upon the remainder of division.
• Typically the divisor is table length.
• Ex: 54, 72, 89, 37 is to be placed in hash table & if table size 10.
0
1
2
3
4
5
6
7
8
9
Hash Function
h(key) = record % table_size
54 % 10 = 4
72 % 10 = 2
89 % 10 = 9
37 % 10 = 7
54
72
89
37
Disadvantage:
The consecutive keys map to
consecutive hash values in the
hash table. This leads to a poor
performance.
16
Multiplication Method
17
Extraction Method
• Some digits are extracted from the key to form the address location in
hash table.
• Ex:
Suppose 1st, 3rd and 4th digits from left are selected for hash key.
4 9 7 8 2 4
478
At 478 location in the
hash table of size 1000
the key can be stored.
18
Mid Square Method
• Square the key.
• Then extract middle part of the result.
• That will be the location of key element in hash table.
• Ex:
A] 3111 = Square(3111) B] 16 = Square(16)
9678321
At 783 location in the
hash table of size 1000
the key can be stored.
256
At 56 location in
the hash table.
19
Folding Method
• Two folding techniques:
• A] Fold Shift:
The key is divided into separate
parts based on size of required address.
Finally add those parts.
• Ex: 345678123
• And required address is of 3 digits.
• B] Fold Boundary:
The key is divided into separate
parts based on size of required address.
Then left & right parts are folded and
then add those parts.
• Ex: 345678123
345678123
345
678
123
-------
1146
345678123
543
678
321
-------
1542
20
Universal Hashing
• In this technique randomized algorithm is used to select the hash
function at random from family of hash functions with the help of
certain mathematical properties.
• Let U is the set of Universal Keys and H be the finite collection of Hash
functions mapping U into {0, 1, …….., m-1}.
• Say there is collision or your hash function perform badly for input
key or a pair (x, y) i.e. { (x,y) | h(x)=h(y)}.
U
0
.
.
m-1
Bad portion
of key
21
Universal Hashing
• The fundamental weakness of hash functions is collision so to avoid
that what we do we can choose the Universal hashing where the hash
function randomly at the runtime.
• Lets consider H = set of all hash functions h from U to {0,1,…,m-1}.
{ h | h : U  (0,1,…,m-1) }
H
Bad portion
22
Universal Hashing
23
24
Properties of Good Hash Functions
• Rules for choosing good hash function:
1. The hash function should be simple to compute.
2. Number of collisions should be less ideally no collision should occur (perfect
hash function).
3. Hash function should produce such keys which will get distributed uniformly
over an array.
4. The hash function should depend on every bit of the key and not on the
extracted portion of the key.
25
Collision Resolution Strategies
26
Separate chaining (open hashing)
• The open hashing is also known as separate chaining in which
collisions are stored outside the table.
• Separate chaining is one of the most commonly used collision
resolution techniques.
• It is usually implemented using linked lists. In separate chaining, each
element of the hash table is a linked list.
• To store an element in the hash table you must insert it into a specific
linked list.
• If there is any collision (i.e. two different elements have same hash
value) then store both the elements in the same linked list.
27
Input:
81, 0, 64, 25, 1, 36, 4, 49, 16, 9.
28
Advantages:
1. Simple to implement.
2. Hash table never fills up, we can always add more elements to chain.
3. Less sensitive to the hash function.
4. It is mostly used when it is unknown how many and how frequently
keys may be inserted or deleted
Disadvantages:
1. Wastage of space.
2. Cache performance of chaining is not good as keys are stored using
linked list.
3. If the chain becomes long then search time can become O(n) in worst
case.
4. Uses extra space for links.
29
Open Addressing
• A] Linear Probing with chaining (without
replacement):
• A separate chain table is maintained for
colliding data.
• The address of colliding data is stored with
first colliding element in chain table.
• We can place element at next available
bucket or slot and will keep track of that in
chain column.
Ex: 131, 3, 4, 21, 61, 6, 71, 8, 9.
Index Data Chain
0 -1
1 -1
2 -1
3 -1
4 -1
5 -1
6 -1
7 -1
8 -1
9 -1
131
3
4
21 mod 10 = 1 Collision
21
2
61 mod 10 = 1 Collision
61
5
6
71
7
8
9
71 mod 10 = 1 Collision
-1
Index Data Chain
0 -1 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 61 7
6 6 -1
7 71 -1
8 8 -1
9 9 -1
• Drawback: Finding the next empty location and that results into the element actually belonging to
that empty location cannot obtain its location.
30
Linear Probing with chaining (without replacement)
Ex: Consider elements as 21, 31, 41, 2, 6, 5, 7, 55 with
table size of 10.
Index Data Chain
0 -1
1 -1
2 -1
3 -1
4 -1
5 -1
6 -1
7 -1
8 -1
9 -1
21
41
2
31
2
5
3
6
-1
Index Data Chain
0 -1 -1
1 21 2
2 31 3
3 41 4
4 2 -1
5 5 8
6 6 -1
7 7 -1
8 55 -1
9 -1 -1
7
55
-1
4
8
31
B]Linear Probing with chaining (with replacement)
• Ex: 131, 21, 31, 4, 5, 2.
Index Data Chain
0 -1
1 -1
2 -1
3 -1
4 -1
5 -1
6 -1
7 -1
8 -1
9 -1
131
31
4
21 mod 10 = 1 Collision
21
2
31 mod 10 = 1 Collision
5
3
2 mod 10 = 2
Index Data Chain
0 -1
1 131 2
2
3 31 -1
4 4 -1
5 5 -1
6
7 -1
8 -1
9 -1
Replace 21 by 2 and
accordingly chain table will
be updated.
2 21 3
21 3
-1
6
Index Data Chain
0 -1 -1
1 131 6
2 2 -1
3 31 -1
4 4 -1
5 5 -1
6 21 3
7 -1 -1
8 -1 -1
9 -1 -1
32
B]Linear Probing with chaining (with replacement)
Ex: 131, 3, 4, 21, 61, 6, 71, 16, 85.
Index Data Chain
0 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 61 7
6 6 -1
7 71 -1
8 -1
9 -1
16 mod 10 = 6 85 mod 10 = 5
Index Data Chain
0 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 61 7
6 6 8
7 71 -1
8 16 -1
9 -1
Index Data Chain
0 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 85 -1
6 6 8
7 71 -1
8 16 -1
9 61 7
131, 3, 4, 21, 61, 6, 71
33
Quadratic Probing
• Problem of clustering
• Accumulation of Consecutive cells n hash table
• When we want to add element we have to go for
many probes means need to search for more
elements to put element in table.
Index Data
0 79
1 41
2 22
3 32
4 44
5 33
6
7
8 18
9 59
• Solution: Quadratic Probing
34
Clustering Example
• Ex: 2, 22, 12, 42, 32
Index Data Chain
0 -1
1 -1
2 2 3
3 22 4
4 12 5
5 42 6
6 32 -1
7 -1
8 -1
9 -1
Index Data
0
1
2 2
3
4
5
6
7
8
9
[Hash(key)+i^2] mod M
(2 + 0^2)mod 10 = 2
(2 + 1^2)mod 10 = 3
22
12
(2 + 2^2)mod 10 = 6
(2 + 3^2)mod 10 = 11  1
42
(2 + 4^2)mod 10 = 18  8
32
35
Quadratic Probing
• Example: 37, 90, 55, 22, 11, 17, 49, 87 with m = 10. Index Data
0 90
1 11
2 22
3
4
5 55
6
7 37
8
9
17
49
87
36
Clustering Example
• Ex: 2, 12, 22, 32, 42, 52, 62
Index Data
0
1 32
2 2
3 12
4
5
6 22
7 52
8 42
9
[Hash(key)+i^2] mod M
(2 + 6^2)mod 10 = 38  8
(2 + 4^2)mod 10 = 18  8
(2 + 7^2)mod 10 = 51  1
(2 + 8^2)mod 10 = 66  6
(2 + 9^2)mod 10 = 83  3
(2 + 10^2)mod 10 = 102  2
This is known as secondary
clustering.
Here we are unable to add
an element in quadrating
probing because the index
calculated is never empty.
37
Double Hashing
• Second hash function is applied to the key when collision occurs.
38
Example
K H1 H2
18
41
22
44
5 7 - (18 % 7) = 7 - 4 = 3
2 1
9 6
5 5
39
Example
K H1 H2
18 5 3
41 2 1
22 9 6
44 5 5
Index Data
0
1
2
3
4
5
6
7
8
9
10
11
12
18
For 41 = 2 + 0 * 1 = 2
41
For 22 = 9 + 0 * 6 = 9 22
For 44 = 5 + 0 * 5 = 5
For 44 = 5 + 1 * 5 = 10
42
40
Home Work
41

Hashing techniques, Hashing function,Collision detection techniques

  • 1.
  • 2.
    • How tostore Employee records, which consists of Telephone no., Name, City, Department, Salary, etc… Telephone Name City Dept. Salary 2
  • 3.
    Solution • Use anArray: • Search takes a linear time. • If stored in sorted order this search can be done in O(log n) using binary search but then insertion and deletion becomes costly because we have to maintain the sorted order. 3
  • 4.
    Solution • Use anLinked List: • Search takes again linear time to search for particular record as se need to go through all the records until the required one is found. • Use balanced BST to store records: • Insertion takes O(log n). • Search takes O(log n). • Deletion takes O(log n). 4
  • 5.
    Solution • Create aDirect Access Table: • Insertion takes O(1). • Search takes O(1). • Deletion takes O(1). 5
  • 6.
    • Limitations: • 1]Size of Table: m * 10^n • Where, m = size of pointer to the record n = digits in the telephone number Telephone Memory Location (Pointer) ……………………. 9856234178 0x1… 8695234512 0x2… ……………………. Name City Dept ……. Tejas Pune IT Name City Dept ……. Rahul Mumbai HR • Limitations: • 2] Integer may not hold the size of n digits 6
  • 7.
    Introduction to Hashing •Provides O(1) time on average for insert, search and delete. • Find items with keys matching a given search key • Given an array A, containing n keys, and a search key x, find the index i such as x=A[i] • Example of Record KEY Other Data 7
  • 8.
    Continue.. • Hashing isan important Data Structure which is designed to use a special function called the Hash function. • Hash function is used to map a given value with a particular key for faster access of elements. • The efficiency of mapping depends of the efficiency of the hash function used. • Some examples of how hashing is used in our lives include: • In universities, each student is assigned a unique roll number that can be used to retrieve information about them. • In libraries, each book is assigned a unique number that can be used to determine information about the book, such as its exact position in the library or the users it has been issued to etc. 8
  • 9.
    Continue.. • Assume thatyou have an object and you want to assign a key to it to make searching easy. • To store the key/value pair, you can use a simple array like a data structure where keys (integers) can be used directly as an index to store values. • However, in cases where the keys are large and cannot be used directly as an index, you should use hashing. 9
  • 10.
    Continue.. • In hashing,large keys are converted into small keys by using hash functions. • The values are then stored in a data structure called hash table. • The idea of hashing is to distribute entries (key/value pairs) uniformly across an array. • Each element is assigned a key (converted key). • By using that key you can access the element in O(1) time. Using the key, the algorithm (hash function) computes an index that suggests where an entry can be found or inserted. 10
  • 11.
    Important Keywords: • HashTable: • A data structure used for storing and retrieving data quickly. Where each entry in hash table is made using hash function. • Hash Function: • It maps a big number or string into a small integer that can be used as index in hash table. • Used to place data in hash table and also to retrieve data from hash table. • A function that can be used to map a data set of an arbitrary size to a data set of a fixed size, which falls into the hash table. • The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes. • Bucket: • Each position of hash table is called bucket. • Collision: • It is a situation in which hash function returns the same address for more than one record. 0 25 1 Bucket 16 2 42 3 33 4 54 Given: 25, 33, 54, 16, 42 and table of size 5. Hash Function is KEY mod 5 Key mod 5 25 mod 5 = 0 33 mod 5 = 3 54 mod 5 = 4 16 mod 5 = 1 42 mod 5 = 2 30 mod 5 = 0 This is collision. Hash Table 11
  • 12.
    Important Keywords: • Probe: •Each calculation of an address and test for success is known as probe. • Synonym: • The set of keys that has the same location are called synonyms. • Ex: 25 and 30 are the synonyms. • Overflow: • When hash table becomes full and new record needs to be inserted and resulting into more collisions the it is called overflow. • Perfect hash function: • It is a function that maps distinct key elements into the hash table with no collisions. 0 25 1 16 2 42 3 33 4 54 Entry of 25, 33, 54, 16, 42 in a table of size 5 with probe 5 30 Where 30 mod 5 = 0 is a collision, synonym for 25 and it also shows the overflow situation Hash Table Consider 25, 33, 54, 16, 42, 30. Show Perfect Hash Function 12
  • 13.
    • To achievea good hashing mechanism, It is important to have a good hash function with the following basic requirements: • Easy to compute • Uniform distribution • Less Collision 13
  • 14.
  • 15.
    Types of HashFunctions • Division Method • Multiplication Method • Extraction Method • Mid Square Method • Folding and Universal Method 15
  • 16.
    Division Method • Itdepends upon the remainder of division. • Typically the divisor is table length. • Ex: 54, 72, 89, 37 is to be placed in hash table & if table size 10. 0 1 2 3 4 5 6 7 8 9 Hash Function h(key) = record % table_size 54 % 10 = 4 72 % 10 = 2 89 % 10 = 9 37 % 10 = 7 54 72 89 37 Disadvantage: The consecutive keys map to consecutive hash values in the hash table. This leads to a poor performance. 16
  • 17.
  • 18.
    Extraction Method • Somedigits are extracted from the key to form the address location in hash table. • Ex: Suppose 1st, 3rd and 4th digits from left are selected for hash key. 4 9 7 8 2 4 478 At 478 location in the hash table of size 1000 the key can be stored. 18
  • 19.
    Mid Square Method •Square the key. • Then extract middle part of the result. • That will be the location of key element in hash table. • Ex: A] 3111 = Square(3111) B] 16 = Square(16) 9678321 At 783 location in the hash table of size 1000 the key can be stored. 256 At 56 location in the hash table. 19
  • 20.
    Folding Method • Twofolding techniques: • A] Fold Shift: The key is divided into separate parts based on size of required address. Finally add those parts. • Ex: 345678123 • And required address is of 3 digits. • B] Fold Boundary: The key is divided into separate parts based on size of required address. Then left & right parts are folded and then add those parts. • Ex: 345678123 345678123 345 678 123 ------- 1146 345678123 543 678 321 ------- 1542 20
  • 21.
    Universal Hashing • Inthis technique randomized algorithm is used to select the hash function at random from family of hash functions with the help of certain mathematical properties. • Let U is the set of Universal Keys and H be the finite collection of Hash functions mapping U into {0, 1, …….., m-1}. • Say there is collision or your hash function perform badly for input key or a pair (x, y) i.e. { (x,y) | h(x)=h(y)}. U 0 . . m-1 Bad portion of key 21
  • 22.
    Universal Hashing • Thefundamental weakness of hash functions is collision so to avoid that what we do we can choose the Universal hashing where the hash function randomly at the runtime. • Lets consider H = set of all hash functions h from U to {0,1,…,m-1}. { h | h : U  (0,1,…,m-1) } H Bad portion 22
  • 23.
  • 24.
  • 25.
    Properties of GoodHash Functions • Rules for choosing good hash function: 1. The hash function should be simple to compute. 2. Number of collisions should be less ideally no collision should occur (perfect hash function). 3. Hash function should produce such keys which will get distributed uniformly over an array. 4. The hash function should depend on every bit of the key and not on the extracted portion of the key. 25
  • 26.
  • 27.
    Separate chaining (openhashing) • The open hashing is also known as separate chaining in which collisions are stored outside the table. • Separate chaining is one of the most commonly used collision resolution techniques. • It is usually implemented using linked lists. In separate chaining, each element of the hash table is a linked list. • To store an element in the hash table you must insert it into a specific linked list. • If there is any collision (i.e. two different elements have same hash value) then store both the elements in the same linked list. 27
  • 28.
    Input: 81, 0, 64,25, 1, 36, 4, 49, 16, 9. 28
  • 29.
    Advantages: 1. Simple toimplement. 2. Hash table never fills up, we can always add more elements to chain. 3. Less sensitive to the hash function. 4. It is mostly used when it is unknown how many and how frequently keys may be inserted or deleted Disadvantages: 1. Wastage of space. 2. Cache performance of chaining is not good as keys are stored using linked list. 3. If the chain becomes long then search time can become O(n) in worst case. 4. Uses extra space for links. 29
  • 30.
    Open Addressing • A]Linear Probing with chaining (without replacement): • A separate chain table is maintained for colliding data. • The address of colliding data is stored with first colliding element in chain table. • We can place element at next available bucket or slot and will keep track of that in chain column. Ex: 131, 3, 4, 21, 61, 6, 71, 8, 9. Index Data Chain 0 -1 1 -1 2 -1 3 -1 4 -1 5 -1 6 -1 7 -1 8 -1 9 -1 131 3 4 21 mod 10 = 1 Collision 21 2 61 mod 10 = 1 Collision 61 5 6 71 7 8 9 71 mod 10 = 1 Collision -1 Index Data Chain 0 -1 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 61 7 6 6 -1 7 71 -1 8 8 -1 9 9 -1 • Drawback: Finding the next empty location and that results into the element actually belonging to that empty location cannot obtain its location. 30
  • 31.
    Linear Probing withchaining (without replacement) Ex: Consider elements as 21, 31, 41, 2, 6, 5, 7, 55 with table size of 10. Index Data Chain 0 -1 1 -1 2 -1 3 -1 4 -1 5 -1 6 -1 7 -1 8 -1 9 -1 21 41 2 31 2 5 3 6 -1 Index Data Chain 0 -1 -1 1 21 2 2 31 3 3 41 4 4 2 -1 5 5 8 6 6 -1 7 7 -1 8 55 -1 9 -1 -1 7 55 -1 4 8 31
  • 32.
    B]Linear Probing withchaining (with replacement) • Ex: 131, 21, 31, 4, 5, 2. Index Data Chain 0 -1 1 -1 2 -1 3 -1 4 -1 5 -1 6 -1 7 -1 8 -1 9 -1 131 31 4 21 mod 10 = 1 Collision 21 2 31 mod 10 = 1 Collision 5 3 2 mod 10 = 2 Index Data Chain 0 -1 1 131 2 2 3 31 -1 4 4 -1 5 5 -1 6 7 -1 8 -1 9 -1 Replace 21 by 2 and accordingly chain table will be updated. 2 21 3 21 3 -1 6 Index Data Chain 0 -1 -1 1 131 6 2 2 -1 3 31 -1 4 4 -1 5 5 -1 6 21 3 7 -1 -1 8 -1 -1 9 -1 -1 32
  • 33.
    B]Linear Probing withchaining (with replacement) Ex: 131, 3, 4, 21, 61, 6, 71, 16, 85. Index Data Chain 0 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 61 7 6 6 -1 7 71 -1 8 -1 9 -1 16 mod 10 = 6 85 mod 10 = 5 Index Data Chain 0 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 61 7 6 6 8 7 71 -1 8 16 -1 9 -1 Index Data Chain 0 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 85 -1 6 6 8 7 71 -1 8 16 -1 9 61 7 131, 3, 4, 21, 61, 6, 71 33
  • 34.
    Quadratic Probing • Problemof clustering • Accumulation of Consecutive cells n hash table • When we want to add element we have to go for many probes means need to search for more elements to put element in table. Index Data 0 79 1 41 2 22 3 32 4 44 5 33 6 7 8 18 9 59 • Solution: Quadratic Probing 34
  • 35.
    Clustering Example • Ex:2, 22, 12, 42, 32 Index Data Chain 0 -1 1 -1 2 2 3 3 22 4 4 12 5 5 42 6 6 32 -1 7 -1 8 -1 9 -1 Index Data 0 1 2 2 3 4 5 6 7 8 9 [Hash(key)+i^2] mod M (2 + 0^2)mod 10 = 2 (2 + 1^2)mod 10 = 3 22 12 (2 + 2^2)mod 10 = 6 (2 + 3^2)mod 10 = 11  1 42 (2 + 4^2)mod 10 = 18  8 32 35
  • 36.
    Quadratic Probing • Example:37, 90, 55, 22, 11, 17, 49, 87 with m = 10. Index Data 0 90 1 11 2 22 3 4 5 55 6 7 37 8 9 17 49 87 36
  • 37.
    Clustering Example • Ex:2, 12, 22, 32, 42, 52, 62 Index Data 0 1 32 2 2 3 12 4 5 6 22 7 52 8 42 9 [Hash(key)+i^2] mod M (2 + 6^2)mod 10 = 38  8 (2 + 4^2)mod 10 = 18  8 (2 + 7^2)mod 10 = 51  1 (2 + 8^2)mod 10 = 66  6 (2 + 9^2)mod 10 = 83  3 (2 + 10^2)mod 10 = 102  2 This is known as secondary clustering. Here we are unable to add an element in quadrating probing because the index calculated is never empty. 37
  • 38.
    Double Hashing • Secondhash function is applied to the key when collision occurs. 38
  • 39.
    Example K H1 H2 18 41 22 44 57 - (18 % 7) = 7 - 4 = 3 2 1 9 6 5 5 39
  • 40.
    Example K H1 H2 185 3 41 2 1 22 9 6 44 5 5 Index Data 0 1 2 3 4 5 6 7 8 9 10 11 12 18 For 41 = 2 + 0 * 1 = 2 41 For 22 = 9 + 0 * 6 = 9 22 For 44 = 5 + 0 * 5 = 5 For 44 = 5 + 1 * 5 = 10 42 40
  • 41.

Editor's Notes

  • #27 Collision resolution is the most important issue in hash table implementations.