Lecture 08 - Hash Tables

Uploaded by

mhmdmfsr07

Data Structures

and Algorithms

Lecture 7 – Hash Tables

SE1052 - Data Structures and Algorithms

• Understand standard data structures

• Explain the purpose of having different Data Structures

• Identify the time complexity of an algorithm

• Construct a solution to a problem in the form of an algorithm

• Solve a problem using standard Data Structures

Hash Table

• A Hash Table is a data structure that maps keys to values for efficient
lookup.
Why Hash Tables are important

• If you have 100 employees with Emp IDs from 0 to 99, what is the
best way to assign employees to the array?
• If you have 100 employees with Emp IDs from 00000 to 99999, what
is the best way to assign employees to the array?
Hash Tables - Hashing

• A hash table stores each data element using an associated key


• The key is later used to find the element efficiently
• Hash tables convert the key into an index via an arithmetic function and
then place the data at this index
• This conversion is referred to as “hashing”: applying an arithmetic function
to a key to map it to a location (index) in an array for storing the data associated
with that key
• The arithmetic function is called the hashing function
• The location it maps a key to is called the hash index
Hashing

• So a hash table has two major components:


• Array: (table) to store the data
• Hash function: to map keys to integer indexes in the array
• When a new element is to be added, both a key and data must be provided to the
hash table
• The hash table hashes the key and stores both the key and data at the calculated
hash index
• Keys must be unique
• Thus the time complexity of an insertion is the time it takes to perform the hash
calculation, O(?), plus the O(1) array access time
• Retrieving data works the same way: hash the key, then access the array at the
hash index
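The two components described above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own implementation: the class name `SimpleHashTable`, the choice of `key % size` as the hash function, and the employee names are all assumptions, and collisions are simply rejected here because collision handling is covered later.

```python
class SimpleHashTable:
    """Array plus hash function; collisions are rejected (assumption for brevity)."""

    def __init__(self, size):
        self.slots = [None] * size  # the array (table) that stores the data

    def _hash(self, key):
        # Hash function: maps an integer key to an index in [0, size - 1].
        return key % len(self.slots)

    def put(self, key, data):
        index = self._hash(key)
        entry = self.slots[index]
        if entry is not None and entry[0] != key:
            raise KeyError(f"collision at index {index}")
        self.slots[index] = (key, data)  # store both the key and the data

    def get(self, key):
        entry = self.slots[self._hash(key)]
        if entry is None or entry[0] != key:
            raise KeyError(key)
        return entry[1]


employees = SimpleHashTable(100)
employees.put(40512, "Alice")  # hypothetical employee, stored at index 12
employees.put(99999, "Bob")    # hypothetical employee, stored at index 99
```

Both `put` and `get` do one hash calculation followed by one array access, which is the cost described in the bullets above.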
Properties of a Good Hash Function

A good hash function should:


1. Return indexes that fit within the size of the array
i.e., [0 .. arrayLength-1]
2. Be fast to compute
The hash function is a critical factor in access time
3. Be repeatable (i.e., always return same index) for a given key
4. Distribute keys evenly over the full range of the array
This is to minimise collisions, a major issue in hash tables
Hash Functions – Division Method

• The simplest method


• h(x) = x mod M

• Calculate the hash values of keys 1234 and 5462 when M is 97
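The exercise above can be checked with a short sketch. The function name `division_hash` is an assumption used only for illustration.

```python
def division_hash(x, m):
    """Division method: h(x) = x mod M."""
    return x % m


h_1234 = division_hash(1234, 97)  # 1234 - 12 * 97 = 70
h_5462 = division_hash(5462, 97)  # 5462 - 56 * 97 = 30
```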


Hash Functions – Multiplication Method

Step 1: Choose a constant A such that 0 < A < 1.


Step 2: Multiply the key k by A.
Step 3: Extract the fractional part of kA.
Step 4: Multiply the result of Step 3 by the size of the hash table (m) and take the
floor of the product.

h(k) = ⌊m (kA mod 1)⌋


Where k – key
A – a constant between 0 and 1
m – size of the hash table
Hash Functions – Multiplication Method

Example

Given a hash table of size 1000, map the key 12345 to an appropriate location in the
hash table.
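One way to work this example is sketched below. The slide does not specify A, so the value 0.618033 (a truncation of (√5 − 1)/2, a commonly suggested choice) is an assumption, as is the function name `multiplication_hash`; a different A would give a different index.

```python
import math


def multiplication_hash(k, m, a):
    """Multiplication method: floor(m * fractional part of k * a)."""
    fractional = (k * a) % 1.0  # Step 3: fractional part of kA
    return math.floor(m * fractional)  # Step 4: scale by table size, take floor


A = 0.618033  # assumed constant, not given in the lecture
index = multiplication_hash(12345, 1000, A)
# 12345 * 0.618033 = 7629.617385, fractional part 0.617385, times 1000 -> 617
```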
Mid-Square Method

• Step 1: Square the value of the key. That is, find k^2.
• Step 2: Extract the middle r digits of the result obtained in Step 1.
• Suppose we have a hash table of size 100 (i.e., we need a two-digit hash value)
and we want to hash the key 456.
• Step 1: Square the key:
456^2 = 207936
• Step 2: Extract the middle digits: the squared result is 207936. To get the hash
value, extract the middle two digits (in this case, the 3rd and 4th digits), which gives 79.
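The two steps can be sketched as follows. The function name `mid_square_hash` is an assumption, and so is the convention for picking the "middle" when the digits cannot be centred exactly (this sketch leans toward the left).

```python
def mid_square_hash(key, r):
    """Mid-square method: square the key, then take the middle r digits."""
    squared = str(key * key)          # Step 1: square the key
    start = (len(squared) - r) // 2   # position of the first middle digit
    return int(squared[start:start + r])  # Step 2: extract r middle digits


hash_456 = mid_square_hash(456, 2)  # 456^2 = 207936, middle two digits are 79
```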
Folding Method

• Step 1: Divide the key value into a number of parts. That is, divide k into parts k1,
k2, ..., kn, where each part has the same number of digits except the last part,
which may have fewer digits than the other parts.

• Step 2: Add the individual parts. That is, obtain the sum of k1 + k2 + ... + kn. The
hash value is produced by ignoring the last carry, if any.
Folding Method

• Given a hash table of 100 locations, calculate the hash value using folding method
for keys 5678, 321, and 34567.
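• Since there are 100 memory locations to address, we will break the key into parts
where each part (except the last) will contain two digits. The hash values are:
5678: 56 + 78 = 134, hash value 34 (last carry ignored)
321: 32 + 1 = 33, hash value 33
34567: 34 + 56 + 7 = 97, hash value 97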
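A short sketch of the folding method for these three keys follows. The function name `folding_hash` and its `part_digits` parameter are assumptions for illustration; ignoring the last carry is implemented as a remainder modulo 10^part_digits.

```python
def folding_hash(key, part_digits=2):
    """Folding method: split the key into fixed-width parts, add, drop the carry."""
    digits = str(key)
    # Step 1: split into parts of part_digits digits; the last part may be shorter.
    parts = [int(digits[i:i + part_digits])
             for i in range(0, len(digits), part_digits)]
    # Step 2: add the parts and ignore the last carry.
    return sum(parts) % (10 ** part_digits)


results = {key: folding_hash(key) for key in (5678, 321, 34567)}
```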
Collision

• A collision occurs when two (or more) keys map to the same hash index
• Collision resolving methods
1. Open addressing
2. Chaining
Open Addressing

• All elements are stored within the hash table array itself; if a collision
occurs, the algorithm searches for another empty slot according to a
specific probe sequence.
• Linear probing
• Quadratic probing
• Double hashing
Linear probing

• In this method, the hash table contains two types of values: sentinel
values (e.g., –1) and data values.
• When a key is mapped to a location that holds the sentinel value, the
location is free and the data value can be stored there.
• When a key is mapped to a location that already holds a data value, a
new location must be found using linear probing.
Linear probing

• The hash function used to resolve a collision is


h(k, i) = [h’(k) + i] mod m
where
h’(k) = k mod m
m – size of the table
i – probe number (0, 1, ..., m–1)
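The probe sequence can be sketched as below. The table size of 10 and the keys already in the table are illustrative assumptions (the lecture's own table is not reproduced here), and the names `EMPTY` and `linear_probe_insert` are hypothetical.

```python
EMPTY = -1  # sentinel value marking a free slot


def linear_probe_insert(table, key):
    """Insert key using h(k, i) = (h'(k) + i) mod m with h'(k) = k mod m."""
    m = len(table)
    home = key % m
    for i in range(m):                # probe number i = 0, 1, ..., m - 1
        index = (home + i) % m
        if table[index] == EMPTY:     # free slot found
            table[index] = key
            return index
    raise OverflowError("hash table is full")


table = [EMPTY] * 10
for key in (72, 27, 36, 24, 63, 81):  # assumed keys already stored
    linear_probe_insert(table, key)

# 92 maps to index 2, which is taken; indexes 3 and 4 are also taken.
slot_for_92 = linear_probe_insert(table, 92)
```

With these assumed contents, key 92 ends up three slots past its home index, which shows how linear probing tends to build clusters of occupied slots.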
Linear probing - example

• Try to store key 92,


Quadratic Probing

• The following hash function is used to resolve a collision:


h(k, i) = [h’(k) + c1·i + c2·i^2] mod m
• where m is the size of the hash table, h’(k) = (k mod m), i is the probe
number that varies from 0 to m–1, and c1 and c2 are nonzero
constants.
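A sketch of quadratic probing follows. The constants c1 = 1 and c2 = 3, the table size of 10, and the stored keys are all assumptions chosen for illustration, as is the function name `quadratic_probe_insert`.

```python
EMPTY = -1  # sentinel value marking a free slot


def quadratic_probe_insert(table, key, c1=1, c2=3):
    """Insert key using h(k, i) = (h'(k) + c1*i + c2*i^2) mod m."""
    m = len(table)
    home = key % m
    for i in range(m):  # probe number i = 0, 1, ..., m - 1
        index = (home + c1 * i + c2 * i * i) % m
        if table[index] == EMPTY:
            table[index] = key
            return index
    # Unlike linear probing, the sequence may miss free slots entirely.
    raise OverflowError("no free slot found along the probe sequence")


table = [EMPTY] * 10
for key in (72, 27, 36, 24, 63, 81):  # assumed keys; none of these collide
    quadratic_probe_insert(table, key)

# 101 maps to index 1 (taken by 81); probe i = 1 gives (1 + 1 + 3) mod 10 = 5.
slot_for_101 = quadratic_probe_insert(table, 101)
```

Because the step size grows with i, colliding keys are spread further apart than with linear probing, at the cost that some free slots may never be probed.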
Quadratic Probing - example
Collision Resolution by Chaining

• In chaining, each hash table entry holds a pointer to a linked list


• When a collision occurs, the new key-value pair is simply added to the
linked list

Figure: the initial hash table, and the table after inserting the keys 7, 24, 18, 52, 36, 54, 11,
and 23 into a chained hash table of 9 memory locations using h(k) = k mod m.
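The insertions described above can be reproduced with a short sketch. Two assumptions are made here: a Python list stands in for each linked list, and new keys are added at the end of a chain (adding at the front is equally valid). The function name `build_chained_table` is hypothetical.

```python
def build_chained_table(keys, m):
    """Chaining: each table entry holds a chain of the keys that hash to it."""
    table = [[] for _ in range(m)]   # m empty chains
    for key in keys:
        table[key % m].append(key)   # h(k) = k mod m; add to the end of the chain
    return table


chained = build_chained_table([7, 24, 18, 52, 36, 54, 11, 23], 9)
# Index 0 holds 18, 36, 54; index 7 holds 7, 52; indexes 2, 5, 6 hold one key each.
```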
