Unit 7

Uploaded by

aaptakai

Unit 7

Hashing-7
Prof. Yogendra singh, Assistant Professor
Computer Science & Engineering
UNIT-7
Hashing
Hash Table Organizations

Hash table organization is a fundamental concept in data structures and
involves the arrangement and management of data within a hash table.
Here's a breakdown of key aspects of hash table organization.

1 Hash Function: The hash function is at the core of a hash table. It maps
keys to indices in the hash table's array. A good hash function aims to
distribute keys evenly across the array to minimize collisions.
2 Array Size: The size of the array used in the hash table affects its
performance. A larger array can reduce the likelihood of collisions but may
consume more memory. The array size is typically chosen based on the
expected number of elements and the desired trade-off between memory
usage and performance.
3 Collision Resolution: Collisions occur when two keys are mapped to the
same index by the hash function. There are several methods for resolving
collisions:
Separate Chaining: Each bucket in the hash table array contains a
linked list (or another data structure) to handle multiple elements hashed to
the same index.
Open Addressing: In this approach, when a collision occurs, the
algorithm probes the array for an alternative location to store the collided
element. This can involve linear probing, quadratic probing, or other
techniques.
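Robin Hood Hashing: A variant of open addressing that reduces the variance
in probe lengths. During insertion, if the incoming element has travelled
farther from its home slot than the element currently occupying the probed
slot, the two are swapped and probing continues with the displaced element.
Elements close to their home slot thus give way to elements that are far
from theirs.

Cuckoo Hashing: This method uses two or more hash functions, often with
one table per function, so that every key has a small fixed set of candidate
slots. When all candidate slots for a new key are occupied, one resident
element is evicted and relocated to one of its own alternative slots,
repeating as necessary.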
4 Load Factor: The load factor of a hash table is the ratio of the number
of elements stored in the table to the total number of slots (or buckets)
in the array. It influences the likelihood of collisions and affects the
efficiency of operations. A common practice is to resize the hash table
when the load factor exceeds a certain threshold to maintain
performance.

5 Resizing: As the number of elements stored in the hash table increases
or decreases, resizing may be necessary to maintain an appropriate load
factor and performance. Resizing involves creating a new array (larger when
growing, smaller when shrinking) and rehashing every element from the old
array into the new one, because each index depends on the array size.
6 Key-Value Pairs: Hash tables often store key-value pairs, where each key
is associated with a value. The hash function is applied to the keys to
determine their storage location in the array.
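
As a rough illustration of how the hash function, array size, open addressing, and load factor fit together, here is a minimal sketch of a table that resolves collisions by linear probing. The class name LinearProbingTable and its methods are hypothetical names chosen for this example; deletion and resizing are deliberately left out.

```python
class LinearProbingTable:
    """Open-addressing hash table with linear probing (no deletion, no resizing)."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity   # each slot holds (key, value) or None
        self.count = 0                   # number of stored pairs

    def load_factor(self):
        # Ratio of stored elements to total slots.
        return self.count / len(self.slots)

    def _probe(self, key):
        """Yield slot indices starting at the key's home slot, wrapping around."""
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def put(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None:          # free slot: store the new pair
                self.slots[i] = (key, value)
                self.count += 1
                return
            if self.slots[i][0] == key:        # same key: overwrite the value
                self.slots[i] = (key, value)
                return
        raise RuntimeError("table is full")

    def get(self, key, default=None):
        for i in self._probe(key):
            if self.slots[i] is None:          # an empty slot ends the search
                return default
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return default
```

With a capacity of 8, the integer keys 1 and 9 share home slot 1 (in CPython a small non-negative integer hashes to itself), so the second one inserted is placed in slot 2.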

Overall, effective hash table organization requires careful consideration of
the hash function, collision resolution strategy, array size, load factor, and
resizing policy to achieve optimal performance and memory usage for a
given application.
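
Of the collision resolution strategies listed above, Robin Hood hashing is probably the least obvious, so a small sketch of its insertion step may help. The function name robin_hood_insert is hypothetical, and the sketch assumes non-negative integer keys, a home slot of key % len(slots), a key that is not already present, and at least one free slot.

```python
def robin_hood_insert(slots, key):
    """Insert key into slots (a list using None for empty) with Robin Hood hashing.

    Assumptions for this sketch: non-negative integer keys, the key is not
    already stored, and the table has at least one free slot.
    """
    m = len(slots)
    index = key % m
    distance = 0                      # how far `key` currently is from its home slot
    while slots[index] is not None:
        resident = slots[index]
        resident_distance = (index - resident % m) % m
        if resident_distance < distance:
            # The resident is closer to its home slot than the incoming key,
            # so it gives up the slot and becomes the key being placed.
            slots[index], key = key, resident
            distance = resident_distance
        index = (index + 1) % m
        distance += 1
    slots[index] = key
```

Inserting 0, 1, and 8 into eight empty slots leaves 8 in slot 1 and moves 1 to slot 2. Plain linear probing would instead leave 1 in slot 1 and put 8 in slot 2, two steps from its home slot; the swap evens the distances out to one step each.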
Hashing Functions

Hashing functions play a crucial role in hash tables and other hashing-based
data structures. They are responsible for converting keys (such as
strings or integers) into indices within the hash table's array. Here are some
common characteristics and types of hashing functions.
Characteristics of a Good Hash Function

Deterministic: A hashing function should always produce the same hash
value for the same input key.

Efficient: Hashing functions should be computationally efficient to calculate,
ensuring that hashing operations can be performed quickly.

Uniform Distribution: Ideally, a hashing function should evenly distribute
keys across the hash table's array, reducing the likelihood of collisions.

Minimal Collisions: While collisions (multiple keys mapping to the same
hash value) are inevitable whenever there are more possible keys than slots,
a good hashing function should keep them rare, especially among keys that
are similar to one another.
Common Hash Functions
1 Division Method: One of the simplest hashing functions involves taking
the remainder of the key divided by the size of the hash table's array:

hash(key) = key % array_size

While easy to implement, this method may lead to clustering and poor
distribution, especially if the array size is not prime.

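2 Multiplication Method: This method involves multiplying the key by a
constant A with 0 < A < 1, taking the fractional part of the product, and
then multiplying by the size of the array:

hash(key) = floor(array_size * ((key * A) mod 1))

Unlike the division method, the choice of array size is not critical here.
A commonly suggested constant is A = (sqrt(5) - 1) / 2, approximately 0.618.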
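
The two formulas above translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, keys are assumed to be non-negative integers, and the default value of A is the commonly suggested (sqrt(5) - 1) / 2 rather than anything fixed by the slides. The helper slots_used counts how many distinct slots a set of keys lands in, which makes the effect of a non-prime array size visible.

```python
import math


def division_hash(key, array_size):
    # hash(key) = key % array_size
    return key % array_size


def multiplication_hash(key, array_size, A=(math.sqrt(5) - 1) / 2):
    # hash(key) = floor(array_size * ((key * A) mod 1))
    fractional = (key * A) % 1.0
    return math.floor(array_size * fractional)


def slots_used(keys, hash_fn, array_size):
    """Number of distinct slots the keys are mapped to (more is better)."""
    return len({hash_fn(k, array_size) for k in keys})


keys = [10 * i for i in range(1, 11)]        # 10, 20, ..., 100

print(slots_used(keys, division_hash, 10))   # all ten keys share one slot
print(slots_used(keys, division_hash, 11))   # a prime size spreads them out
```

With array size 10, every multiple of 10 maps to slot 0, while the prime size 11 gives each of these ten keys its own slot.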

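3 Universal Hashing: Universal hashing involves randomly selecting a
hashing function from a family of hash functions when the table is created.
For a universal family, any two distinct keys collide with probability at
most 1/array_size over the random choice, so no fixed set of keys is bad for
every function in the family.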
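
One well-known universal family for integer keys has the form h(key) = ((a * key + b) mod p) mod array_size, where p is a prime larger than every key and a, b are chosen at random. The sketch below assumes keys are non-negative integers smaller than p; the names PRIME and make_universal_hash are hypothetical.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1; assumed to be larger than any key used


def make_universal_hash(num_slots, rng=random):
    """Pick one function at random from the family ((a*k + b) mod p) mod m."""
    a = rng.randrange(1, PRIME)   # a must be non-zero
    b = rng.randrange(0, PRIME)

    def h(key):
        return ((a * key + b) % PRIME) % num_slots

    return h


# Each call draws a fresh (a, b) pair, i.e. a different member of the family.
h1 = make_universal_hash(16)
h2 = make_universal_hash(16)
```

Once drawn, a function such as h1 is deterministic; the randomness lies only in which function is picked.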

4 Cryptographic Hash Functions: These hashing functions are designed to
produce a fixed-size hash value (digest) from an input of arbitrary size.
Examples include SHA-256 and the older MD5. Cryptographic hash functions
are designed for properties such as collision resistance and preimage
resistance, making them suitable for security applications. MD5's collision
resistance has been broken, so it should not be used where security matters.
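
For reference, Python's standard hashlib module exposes SHA-256 directly. The helper names below are hypothetical, and reducing a digest to a bucket index is shown only to connect the idea to hash tables; a cryptographic hash is usually far slower than a hash table needs.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a string key to a bucket using the first 8 bytes of its digest."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


print(sha256_hex(b"hello"))      # fixed-size output regardless of input size
print(bucket_for("Alice", 8))    # an index in the range 0..7
```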

5 Custom Hash Functions: Depending on the characteristics of the keys
and the application requirements, custom hashing functions may be
designed to achieve better distribution and performance.

Choosing an appropriate hashing function depends on factors
such as the nature of the keys, the size of the hash table, and
performance considerations. It's essential to evaluate the distribution of
hash values and collision rates to ensure the effectiveness of the chosen
hashing function.
Static and Dynamic Hashing

Static Hashing:

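Static hashing is a hashing technique where the number of slots in the hash
table is fixed when the table is created and does not change afterwards.
When the complete set of keys is known in advance, the hash function can be
chosen so that each key maps to its own unique slot; such a function is
called a perfect hash function. In that case there are no collisions, which
simplifies retrieval operations and guarantees constant-time access. A
fixed-size table built without this knowledge can still have collisions and
needs one of the collision resolution methods described earlier.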

Here's an example of static hashing:

Suppose we have a set of keys that we want to store in a hash table:

Keys: {10, 22, 31, 4, 15, 28, 17, 88, 59}

We want to create a hash table to store these keys with no collisions.
We decide to use static hashing with a hash function that maps each key
directly to the slot whose index equals the key.

1 Hash Function: For this example, the hash function is simple: it
uses the key itself as the slot index. For example:

hash(key) = key
In this case, the hash value of a key is the key itself.

2 Hash Table Creation: Based on the range of keys, we create a hash table
with slots corresponding to each possible key value. In this example, the
keys range from 4 to 88, so we create a hash table with slots from 4 to 88
(85 slots for 9 keys).

3 Insertion: We insert each key into its corresponding slot in the hash
table based on the hash function. Since distinct keys always map to distinct
slots, each key is inserted directly into its assigned slot.

Slot 4: 4
Slot 10: 10
Slot 15: 15
Slot 17: 17
Slot 22: 22
Slot 28: 28
Slot 31: 31
Slot 59: 59
Slot 88: 88

As you can see, each key is stored directly in its assigned slot without any
collisions.

4 Retrieval: Retrieving a key from a static hash table is straightforward. We
calculate the hash value of the key, which is the key itself, and then access
the corresponding slot in the hash table.

For example, if we want to retrieve the key 15, we calculate the hash value:

hash(15) = 15

We access slot 15 in the hash table, which contains the key 15.
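
The four steps above can be sketched in a few lines. Because Python lists start at index 0, the sketch shifts each slot number by the smallest key, so slot 4 in the text becomes list index 0; the names slot and contains are hypothetical helpers.

```python
keys = [10, 22, 31, 4, 15, 28, 17, 88, 59]

low, high = min(keys), max(keys)
table = [None] * (high - low + 1)   # one slot per possible key from 4 to 88


def slot(key):
    # hash(key) = key, shifted so that the smallest key lands at list index 0
    return key - low


# Insertion: each key goes straight into its own slot, so nothing collides.
for k in keys:
    table[slot(k)] = k


def contains(key):
    # Retrieval: compute the slot and look there; no probing is needed.
    return low <= key <= high and table[slot(key)] == key


print(contains(15))   # 15 was inserted
print(contains(16))   # 16 was not
```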

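Static hashing is efficient for datasets where the keys are known in advance
and do not change frequently. With a collision-free hash function such as
the one in this example, it provides constant-time access without the need
for collision resolution mechanisms. The direct mapping used here trades
space for simplicity: 76 of the 85 slots stay empty. Static hashing may not
be suitable for dynamic datasets where keys are inserted or deleted
frequently, as the table size is fixed and changing it means rebuilding the
whole table.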
Dynamic Hashing:

Dynamic hashing is a technique used in computer science to handle data
that might grow or shrink unpredictably. It's particularly useful in scenarios
where you need to efficiently store and retrieve data in a hash table, but
you don't know beforehand how many items will be stored or what their
distribution will be like.

In dynamic hashing, the number of buckets in the hash table is not fixed.
Instead, it adjusts dynamically based on the number of items being stored
and retrieved. This helps in maintaining a good balance between space
efficiency and lookup efficiency.

Here's how dynamic hashing works with a simplified example:

Let's say we're implementing a hash table to store the names of students
along with their corresponding grades in a class. We want to efficiently
retrieve the grade of any student given their name.

1 Initialization: Initially, we start with a small number of buckets in our
hash table. Let's say we start with 4 buckets, numbered 1 to 4.

2 Hashing: Each student's name is hashed to determine which bucket it
should go into. For simplicity, let's use a basic hash function that takes
the first letter of the student's name and maps it to a bucket by its
position in the alphabet, wrapping around after the last bucket (A to 1,
B to 2, C to 3, D to 4, E to 1, and so on).

3 Insertion: We start inserting students into buckets based on their
hashed values. For example:

"Alice" hashes to bucket 1
"Bob" hashes to bucket 2
"Charlie" hashes to bucket 3
"David" hashes to bucket 4
"Eva" also hashes to bucket 1 (hash collision)

4 Collision Handling: When two or more items hash to the same bucket,
we typically use techniques like chaining (maintaining a linked list of
items in each bucket) or open addressing (probing for an empty bucket
nearby) to handle collisions.
5 Expansion: As more items are inserted into the hash table, if the load
factor (the ratio of the number of items to the number of buckets)
exceeds a certain threshold, we dynamically increase the number of
buckets. For example, if the load factor exceeds 0.75, we can double the
number of buckets to 8.
6 Rehashing: When expanding the number of buckets, all existing items
need to be rehashed and redistributed into the new bucket structure.
This ensures that the distribution remains balanced and efficient.
7 Lookup: When we want to retrieve the grade of a student, we hash
their name to find the corresponding bucket and then search within
that bucket for the student's name.
8 Deletion and Contraction: Similarly, if the number of items decreases
significantly, we can dynamically reduce the number of buckets to save
space. This involves redistributing items and possibly rehashing again to
maintain efficiency.
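
The eight steps above can be put together as a small sketch. It uses chaining for collisions, the first-letter hash from the example, and the 0.75 expansion threshold mentioned in step 5. The class name GradeBook, the 0.25 contraction threshold, and the minimum of 4 buckets are assumptions made for this illustration, and buckets are numbered from 0 rather than 1.

```python
class GradeBook:
    MAX_LOAD = 0.75     # expansion threshold from step 5
    MIN_LOAD = 0.25     # contraction threshold (assumed for this sketch)
    MIN_BUCKETS = 4     # never shrink below the initial size

    def __init__(self):
        self.buckets = [[] for _ in range(self.MIN_BUCKETS)]
        self.count = 0

    def _index(self, name, num_buckets):
        # First-letter hash: A -> 0, B -> 1, ..., wrapping around.
        return (ord(name[0].upper()) - ord("A")) % num_buckets

    def _resize(self, num_buckets):
        # Rehash every stored pair into a fresh list of buckets.
        old = self.buckets
        self.buckets = [[] for _ in range(num_buckets)]
        for chain in old:
            for name, grade in chain:
                self.buckets[self._index(name, num_buckets)].append((name, grade))

    def insert(self, name, grade):
        chain = self.buckets[self._index(name, len(self.buckets))]
        for i, (existing, _) in enumerate(chain):
            if existing == name:            # update an existing student
                chain[i] = (name, grade)
                return
        chain.append((name, grade))
        self.count += 1
        if self.count / len(self.buckets) > self.MAX_LOAD:
            self._resize(2 * len(self.buckets))

    def lookup(self, name):
        chain = self.buckets[self._index(name, len(self.buckets))]
        for existing, grade in chain:
            if existing == name:
                return grade
        return None

    def delete(self, name):
        chain = self.buckets[self._index(name, len(self.buckets))]
        for i, (existing, _) in enumerate(chain):
            if existing == name:
                del chain[i]
                self.count -= 1
                break
        else:
            return False                    # name was not stored
        if (len(self.buckets) > self.MIN_BUCKETS
                and self.count / len(self.buckets) < self.MIN_LOAD):
            self._resize(len(self.buckets) // 2)
        return True
```

In this sketch the fourth insertion pushes the load factor to 1.0, so the table doubles to 8 buckets before "Eva" arrives, and "Eva" then no longer shares a bucket with "Alice".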

Dynamic hashing ensures that the hash table can adapt to
changing storage requirements while still providing efficient lookup and
insertion times. It's a powerful technique used in many real-world
applications where the size and distribution of data can vary
unpredictably.
Extendible Hashing

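Directory: An array of pointers to buckets, indexed by d bits of the hash
value, where d is called the global depth.

Buckets: Store actual entries. Each bucket has a local depth, the number of
hash bits shared by all entries in it. Several directory entries may point
to the same bucket.

Directory Doubling: When a bucket overflows and its local depth already
equals the global depth, the directory size is doubled. Otherwise the
directory keeps its size and only some of its pointers change.

Bucket Splitting: Only the overflowing bucket is split, with its entries
divided based on one more bit from the hash value, reducing the overhead
compared to resizing the entire table.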
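
The interplay between directory doubling and bucket splitting is easier to follow in code. The sketch below is one possible arrangement, not a reference implementation: it indexes the directory by the low-order bits of Python's built-in hash, and the names Bucket, ExtendibleHashTable, and bucket_capacity are hypothetical. It assumes that no more than bucket_capacity keys share exactly the same hash value, since such keys could never be separated by splitting.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth   # number of hash bits shared by its keys
        self.items = {}


class ExtendibleHashTable:
    def __init__(self, bucket_capacity=2):
        self.bucket_capacity = bucket_capacity
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _dir_index(self, key):
        # Use the low-order `global_depth` bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._dir_index(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._dir_index(key)]
            if key in bucket.items or len(bucket.items) < self.bucket_capacity:
                bucket.items[key] = value
                return
            self._split(bucket)          # overflow: split, then retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Directory doubling: entry i + 2**d starts as a copy of entry i.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bit = 1 << bucket.local_depth    # the extra bit that separates the halves
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        # Repoint the directory entries whose new bit is 1 to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & bit:
                self.directory[i] = sibling
        # Move the keys whose new bit is 1 into the sibling.
        for key in list(bucket.items):
            if hash(key) & bit:
                sibling.items[key] = bucket.items.pop(key)
```

Inserting the integer keys 0 to 7 with a capacity of 2 doubles the directory once (when key 4 arrives), while the later overflow caused by key 5 only splits a bucket, because that bucket's local depth is still below the global depth.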
Linear Hashing

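Buckets: Organized in a sequence, numbered 0, 1, 2, and so on.

Level: Indicates the current round of splitting.

Splitting Rule: Buckets are split one at a time in a linear order, tracked
by a split pointer, which allows for gradual growth.

Hash Functions: Two hash functions, h and h', are used, where h' has twice
the range of h. Buckets that have not yet been split in the current round
are addressed with h, and buckets that have already been split are addressed
with h'. When a bucket overflows, the bucket at the split pointer is split,
which is not necessarily the bucket that overflowed. The overflowing bucket
keeps its extra entries in an overflow chain until its turn comes.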
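
A compact sketch of these rules follows. It takes h(key) = hash(key) mod (n0 * 2^level) and h'(key) = hash(key) mod (n0 * 2^(level + 1)), where n0 is the initial number of buckets; this choice of hash functions, the class name LinearHashTable, and the use of a plain Python list to stand in for a bucket plus its overflow chain are all assumptions made for the illustration.

```python
class LinearHashTable:
    def __init__(self, initial_buckets=2, bucket_capacity=2):
        self.n0 = initial_buckets
        self.capacity = bucket_capacity
        self.level = 0                  # current round of splitting
        self.split_pointer = 0          # next bucket to be split
        self.buckets = [[] for _ in range(initial_buckets)]

    def _address(self, key):
        h = hash(key)
        index = h % (self.n0 << self.level)               # h
        if index < self.split_pointer:                    # already split this round
            index = h % (self.n0 << (self.level + 1))     # h'
        return index

    def insert(self, key):
        bucket = self.buckets[self._address(key)]
        if key in bucket:
            return
        bucket.append(key)              # entries past `capacity` act as overflow
        if len(bucket) > self.capacity:
            self._split_next()

    def _split_next(self):
        # Split the bucket at the split pointer, whichever bucket overflowed.
        old = self.buckets[self.split_pointer]
        self.buckets[self.split_pointer] = []
        self.buckets.append([])         # its new partner goes at the end
        modulus = self.n0 << (self.level + 1)
        for key in old:
            self.buckets[hash(key) % modulus].append(key)
        self.split_pointer += 1
        if self.split_pointer == self.n0 << self.level:
            self.level += 1             # round finished: every bucket was split
            self.split_pointer = 0

    def contains(self, key):
        return key in self.buckets[self._address(key)]
```

Starting from 2 buckets of capacity 2, inserting 0, 2, and 4 overflows bucket 0 and splits it into buckets 0 and 2. A later overflow of bucket 1 triggers the second split, which completes the round and raises the level to 1.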
Conclusion

Hashing is a fundamental technique in computer science for efficient data
retrieval. Understanding the various hash table organizations, hashing
functions, and the differences between static and dynamic hashing is
crucial for designing effective and scalable data structures.
www.paruluniversity.ac.in
