Lecture 4 Hash Table Stu

Uploaded by

vohaidung19122006

HASH TABLE
Lê Viết Tuấn
Email: [email protected]
Homepage: vt-le.github.io

Hash tables
Hash functions
Collision Resolution
• Chaining
• Open addressing
Applications of Hash Table

Direct-address tables

Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.
Suppose that an application needs a dynamic set in which each element has a distinct key drawn from the universe $U = \{0, 1, \dots, m-1\}$, where m is not too large.
To represent the dynamic set, you can use an array, or direct-address table, denoted by $T[0 : m-1]$, in which each position, or slot, corresponds to a key in the universe U.
Slot k points to an element in the set with key k. If the set contains no element with key k, then $T[k] = \text{NULL}$.
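The direct-address table described above can be sketched in a few lines of C++. This is only an illustration: the `Element` type, the class name, and the choice of non-owning pointers are assumptions, not part of the slides.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type: a key plus some satellite data.
struct Element {
    std::size_t key;
    int data;
};

// Direct-address table T[0 : m-1]: slot k points to the element with key k,
// or holds nullptr (the slides' NULL) when no such element is stored.
class DirectAddressTable {
public:
    explicit DirectAddressTable(std::size_t m) : slots_(m, nullptr) {}

    // Each operation indexes the array directly with the key.
    // at() throws std::out_of_range if the key lies outside the universe.
    const Element* search(std::size_t k) const { return slots_.at(k); }
    void insert(const Element* x) { slots_.at(x->key) = x; }
    void remove(std::size_t k) { slots_.at(k) = nullptr; }

private:
    std::vector<const Element*> slots_;  // non-owning pointers, one per key
};
```

Every operation takes constant time, but the array needs one slot per possible key, which is why the technique only suits a small universe.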
Hash tables

With direct addressing, an element with key k is stored in slot k.
With hashing, we use a hash function h to compute the slot number from the key k, so that the element goes into slot $h(k)$.
The hash function h maps the universe U of keys into the slots of a hash table $T[0 : m-1]$:
$h : U \to \{0, 1, \dots, m-1\}$
where the size m of the hash table is typically much less than $|U|$.
The hash function reduces the range of array indices and hence the size of the array.

Hash tables

We will refer to the table size as TSize.
The common convention is to have the table run from 0 to $TSize - 1$.
Each key is mapped into some number in the range 0 to $TSize - 1$ and placed in the appropriate cell.
The mapping is called a hash function, which ideally should be simple to compute and should ensure that any two distinct keys get different cells.

Hash tables

We seek a hash function that distributes the keys evenly among the cells.
For example
• john hashes to 3
• phil hashes to 4
• dave hashes to 6
• mary hashes to 7

Cell  Contents
0
1
2
3     john 25000
4     phil 31250
5
6     dave 27500
7     mary 28200
8
HASH FUNCTIONS

Division

$h(K) = K \bmod TSize$
if K is a number, where $TSize = \text{sizeof}(table)$.
It is best if TSize is a prime number; otherwise, $h(K) = (K \bmod p) \bmod TSize$ for some prime $p > TSize$ can be used.
However, nonprime divisors may work equally well as prime divisors provided they do not have prime factors less than 20 (Lum et al. 1971).
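As a rough illustration, the two division formulas translate directly into C++. The function names are hypothetical, and the code assumes unsigned integer keys.

```cpp
#include <cstdint>

// h(K) = K mod TSize; works best when tsize is prime.
std::uint64_t hashDivision(std::uint64_t key, std::uint64_t tsize) {
    return key % tsize;
}

// h(K) = (K mod p) mod TSize for some prime p > TSize,
// the alternative given for a nonprime table size.
std::uint64_t hashDivisionPrime(std::uint64_t key, std::uint64_t p,
                                std::uint64_t tsize) {
    return (key % p) % tsize;
}
```

For instance, with a table of 101 cells the key 1234 lands in cell 22; with a table of 100 cells and p = 103 it lands in cell 1.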

Folding

The key is divided into several parts.
These parts are combined or folded together and are often transformed in a certain way to create the target address.
There are two types of folding:
• Shift folding.
• Boundary folding.

Shift folding

In shift folding, they are put underneath one another and then processed. For example:
• A social security number (SSN) 123-45-6789 can be divided into three parts, 123, 456, 789, and then these parts can be added.
  • The resulting number, 1,368, can be divided modulo TSize or, if the size of the table is 1,000, the first three digits can be used for the address.
• Another possibility is to divide the same number 123-45-6789 into five parts (say, 12, 34, 56, 78, and 9), add them, and divide the result modulo TSize.
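A minimal sketch of shift folding, assuming the key is given as a string of decimal digits with the dashes already removed. The function names and the fixed part width are illustrative choices, not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Adds consecutive groups of `width` decimal digits, taken left to right.
// The last group may be shorter, as in the five-part example (12, 34, 56, 78, 9).
std::uint64_t shiftFoldSum(const std::string& digits, std::size_t width) {
    assert(width > 0);
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < digits.size(); i += width) {
        sum += std::stoull(digits.substr(i, width));
    }
    return sum;
}

// Reduces the folded sum modulo the table size to obtain the address.
std::uint64_t hashShiftFold(const std::string& digits, std::size_t width,
                            std::uint64_t tsize) {
    return shiftFoldSum(digits, width) % tsize;
}
```

With the SSN from the slide, `shiftFoldSum("123456789", 3)` gives 1368 and `shiftFoldSum("123456789", 2)` gives 189.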
Boundary folding

The key is seen as being written on a piece of paper that is folded on the borders between different parts of the key.
In this way, every other part will be put in the reverse order.

Figure source: https://blacksquareprintmedia.co.uk/7-types-of-paper-folds-for-leaflets-and-flyers/

Boundary folding

Consider the same three parts of the SSN: 123, 456, and 789.
• The first part, 123, is taken in the same order.
• Then the piece of paper with the second part is folded underneath it so that 123 is aligned with 654, which is the second part, 456, in reverse order.
• When the folding continues, 789 is aligned with the two previous parts.
• The result is 123 + 654 + 789 = 1,566.
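Boundary folding differs from shift folding only in that every other part is reversed before it is added. The sketch below assumes a digit-string key and a fixed part width; the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Adds consecutive groups of `width` digits, reversing every second group
// (the 2nd, 4th, ...) to mimic folding the paper on the part boundaries.
std::uint64_t boundaryFoldSum(const std::string& digits, std::size_t width) {
    assert(width > 0);
    std::uint64_t sum = 0;
    bool reversePart = false;
    for (std::size_t i = 0; i < digits.size(); i += width) {
        std::string part = digits.substr(i, width);
        if (reversePart) {
            std::reverse(part.begin(), part.end());
        }
        sum += std::stoull(part);
        reversePart = !reversePart;
    }
    return sum;
}
```

For the SSN example, `boundaryFoldSum("123456789", 3)` computes 123 + 654 + 789 = 1566, which can then be reduced modulo TSize.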

Mid-Square Function

In the mid-square method, the key is squared and the middle or mid part of the result is used as the address.
For example, if
• The key is 3,121
• $3{,}121^2 = 9{,}740{,}641$
• For the 1,000-cell table
  • $h(3{,}121) = 406$, which is the middle part of $3{,}121^2$.

Mid-Square Function

In practice, it is more efficient to choose a power of 2 for the size of the table and extract the middle part of the bit representation of the square of a key.
If we assume that the size of the table is 1,024, then, in this example, the binary representation of $3{,}121^2$ is the bit string 100101001010000101100001, whose middle ten bits are 0101000010.
This middle part, the binary number 0101000010, is equal to 322. This part can easily be extracted by using a mask and a shift operation.
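The mask-and-shift extraction can be sketched as follows. Centering the window on the square's actual bit length, rather than on a fixed word width, is an assumption made here so that the code reproduces the slide's numbers; the function names are hypothetical.

```cpp
#include <cstdint>

// Number of significant bits in x (0 for x == 0).
unsigned bitLength(std::uint64_t x) {
    unsigned n = 0;
    while (x != 0) {
        ++n;
        x >>= 1;
    }
    return n;
}

// Mid-square hash for a table of 2^tableBits cells (tableBits < 64):
// square the key, shift away the low-order bits, and mask off tableBits bits.
std::uint64_t hashMidSquare(std::uint32_t key, unsigned tableBits) {
    std::uint64_t square = static_cast<std::uint64_t>(key) * key;
    unsigned length = bitLength(square);
    unsigned shift = (length > tableBits) ? (length - tableBits) / 2 : 0;
    std::uint64_t mask = (std::uint64_t{1} << tableBits) - 1;
    return (square >> shift) & mask;
}
```

For the key 3,121 and a 1,024-cell table (`tableBits = 10`), the square has 24 bits, the shift is 7, and the result is 322.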
COLLISION RESOLUTION

Collision Resolution

For almost all hash functions, more than one key can be assigned to the same position.
• For example, if the hash function h1 applied to names returns the ASCII value of the first letter of each name (i.e., h1(name) = name[0]), then all names starting with the same letter are hashed to the same position.
This problem can be solved by finding a function that distributes names more uniformly in the table.
• For example, the function h2 could add the first two letters (i.e., h2(name) = name[0] + name[1]), which is better than h1.
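The name-hashing functions discussed in this section are easy to try out. The sketch below writes h1 and h2 as defined above, plus a variant h3 that sums every letter; the handling of short or empty names is an assumption, and the results would still need to be reduced modulo the table size.

```cpp
#include <string>

// h1(name) = name[0]: every name with the same first letter collides.
unsigned h1(const std::string& name) {
    return name.empty() ? 0u : static_cast<unsigned char>(name[0]);
}

// h2(name) = name[0] + name[1]: spreads names somewhat better than h1.
unsigned h2(const std::string& name) {
    unsigned sum = h1(name);
    if (name.size() > 1) {
        sum += static_cast<unsigned char>(name[1]);
    }
    return sum;
}

// h3(name) = name[0] + ... + name[strlen(name) - 1]: uses every letter,
// yet names made of the same letters in a different order still collide.
unsigned h3(const std::string& name) {
    unsigned sum = 0;
    for (unsigned char c : name) {
        sum += c;
    }
    return sum;
}
```

For example, "john" and "jane" collide under h1 but not under h2, while "abc" and "cba" collide even under h3.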

Collision Resolution

But even if all the letters are considered (i.e., h3(name) = name[0] + ··· + name[strlen(name) - 1]), the possibility of hashing different names to the same location still exists.
Increasing the table size may lead to better hashing, but not always!
There are scores of strategies that attempt to avoid hashing multiple keys to the same location.

COLLISION RESOLUTION BY CHAINING
Separate chaining

Each nonempty slot points to a linked list, and all the elements that hash to the same slot go into that slot's linked list.
Slot j contains a pointer to the head of the list of all stored elements with hash value j. If there are no such elements, then slot j contains NIL.

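Separate chaining

struct ChainedHashTable
{
private:
    IntSLL* table;      // array of singly linked lists, one per slot
    int capacity;       // number of slots
    int size;           // number of stored keys
    unsigned int hash(int const& key) const;
public:
    ChainedHashTable(int cap = 101)
    {
        capacity = cap;
        table = new IntSLL[capacity];
        size = 0;
    }
    ~ChainedHashTable() { . . . }   // body elided on the slide
    // Declarations for the member functions defined on the following slides.
    void insert(int const& k);
    int remove(int const& k);
    bool search(int const& k) const;
};

Separate chaining

unsigned int ChainedHashTable::hash(int const& key) const
{
    // key % capacity is negative for negative keys; shift it into
    // [0, capacity) so that the result is always a valid slot index.
    int r = key % capacity;
    return (r < 0) ? r + capacity : r;
}

void ChainedHashTable::insert(int const& k)
{
    unsigned int bin = hash(k);
    if (!table[bin].isInList(k))    // ignore keys that are already stored
    {
        table[bin].addToHead(k);
        size++;
    }
}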
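The slides use a list type `IntSLL` without defining it. The following is a minimal stand-in, assuming only the three member functions that `ChainedHashTable` calls (`isInList`, `addToHead`, and a `deleteNode` that reports whether a node was removed); building it on `std::forward_list` is a choice made for this sketch, not the slides' implementation.

```cpp
#include <algorithm>
#include <forward_list>

// Hypothetical stand-in for the IntSLL class used by ChainedHashTable.
class IntSLL {
public:
    // Inserts value at the front of the list.
    void addToHead(int value) { items_.push_front(value); }

    // Returns true if value occurs somewhere in the list.
    bool isInList(int value) const {
        return std::find(items_.begin(), items_.end(), value) != items_.end();
    }

    // Removes the first node holding value; returns true if one was removed.
    bool deleteNode(int value) {
        auto prev = items_.before_begin();
        for (auto it = items_.begin(); it != items_.end(); ++prev, ++it) {
            if (*it == value) {
                items_.erase_after(prev);
                return true;
            }
        }
        return false;
    }

private:
    std::forward_list<int> items_;
};
```

Because the class is default-constructible, `new IntSLL[capacity]` in the constructor of `ChainedHashTable` works unchanged.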
Separate chaining

int ChainedHashTable::remove(int const& k)
{
    unsigned int bin = hash(k);
    if (table[bin].deleteNode(k))
    {
        size--;
        return 1;
    }
    return 0;
}

bool ChainedHashTable::search(int const& k) const
{
    unsigned int bin = hash(k);
    return (table[bin].isInList(k));
}

Coalesced chaining

In this method, the first available position is found for a key colliding with another key, and the index of this position is stored with the key already in the table.
In this way, a sequential search down the table can be avoided by directly accessing the next element on the linked list.
Each position pos of the table stores an object with two members: info for a key and next with the index of the next key that is hashed to pos.
Available positions can be marked by, say, -2 in next; -1 can be used to indicate the end of a chain.

Coalesced chaining

Coalesced hashing puts a colliding key in the last available position of the table.

Coalesced chaining

Coalesced hashing that uses a cellar.
• Noncolliding keys are stored in their home positions.
• Colliding keys are put in the last available slot of the cellar and added to the list starting from their home position.
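A compact sketch of coalesced hashing without a cellar, using the info/next layout and the -2 and -1 markers mentioned above. The class name, the hash function key mod capacity, and the absence of deletion are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

constexpr int kAvailable = -2;   // next == -2 marks an available position
constexpr int kEndOfChain = -1;  // next == -1 marks the end of a chain

struct Cell {
    int info;  // the key stored in this position
    int next;  // index of the next key on the chain
};

class CoalescedHashTable {
public:
    explicit CoalescedHashTable(int capacity)
        : cells_(static_cast<std::size_t>(capacity), Cell{0, kAvailable}),
          lastFree_(capacity - 1) {}

    // Returns the position holding key, or -1 if key is absent.
    int find(int key) const {
        int pos = home(key);
        if (cells_[pos].next == kAvailable) return -1;
        while (pos != kEndOfChain) {
            if (cells_[pos].info == key) return pos;
            pos = cells_[pos].next;
        }
        return -1;
    }

    // Returns false if key is already present or the table is full.
    bool insert(int key) {
        int pos = home(key);
        if (cells_[pos].next == kAvailable) {  // noncolliding key: use home
            cells_[pos] = Cell{key, kEndOfChain};
            return true;
        }
        // Follow the chain that passes through the home position to its end.
        while (cells_[pos].info != key && cells_[pos].next != kEndOfChain) {
            pos = cells_[pos].next;
        }
        if (cells_[pos].info == key) return false;
        // Scan downward for the last available position in the table.
        while (lastFree_ >= 0 && cells_[lastFree_].next != kAvailable) {
            --lastFree_;
        }
        if (lastFree_ < 0) return false;
        cells_[lastFree_] = Cell{key, kEndOfChain};
        cells_[pos].next = lastFree_;  // link the new key onto the chain
        return true;
    }

private:
    int home(int key) const {
        int m = static_cast<int>(cells_.size());
        int r = key % m;
        return (r < 0) ? r + m : r;
    }

    std::vector<Cell> cells_;
    int lastFree_;  // every position above this index is known to be taken
};
```

In a 10-cell table, inserting 13, 23, 33 and 19 in that order stores them at positions 3, 9, 8 and 7: the key 19 finds its home position 9 taken by 23 and joins the same chain, which is the coalescing that gives the method its name.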
OPEN ADDRESSING

Open Addressing

When a key collides with another key, the collision is resolved by finding an available table entry other than the position (address) to which the colliding key is originally hashed.
If position h(K) is occupied, then the positions in the probing sequence
$norm(h(K) + p(1)),\ norm(h(K) + p(2)),\ \dots,\ norm(h(K) + p(i)),\ \dots$
are tried in turn until an available position is found, where p is the probing function and norm normalizes the result to a valid table index (typically by reducing it modulo the table size).

Linear probing

Assume we are inserting into bin k:
• If bin k is empty, we occupy it
• Otherwise, check bin k + 1, k + 2, and so on, until an empty bin is found
  • If we reach the end of the array, we start at the front (bin 0)
Linear probing

Consider a hash table with M = 16 bins.
Given a 3-digit hexadecimal number:
• The least-significant digit is the primary hash function (bin)
• Example: for $6B72A_{16}$, the initial bin is A and the jump size is 3

Linear probing

Insert these numbers into this initially empty hash table: 19A, 207, 3AD, 488, 5BA, 680, 74C, 826, 946, ACD, B32, C8B, DBE, E9C

Bin:  0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F

Linear probing

Having completed these insertions:
• The load factor is $\lambda = 14/16 = 0.875$
• The average number of probes is $38/14 \approx 2.71$

Bin:  0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Key:  680  D59  B32  E93  -    -    826  207  488  946  19A  5BA  74C  3AD  ACD  C8B

Linear probing

The simplest method is linear probing, for which $p(i) = i$.
$h(k, i) = (h(k) + i) \bmod m$
for $i = 0, 1, \dots, m-1$.
The value of $h(k)$ determines the entire probe sequence, and so assuming that $h(k)$ can take on any value in $\{0, 1, \dots, m-1\}$, linear probing allows only m distinct probe sequences.
LINEAR PROBING - SEARCHING

Linear probing

Searching: start at the appropriate bin, and search forward until
• 1. The item is found,
• 2. An empty bin is found, or
• 3. We have traversed the entire array

Bin:  0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Key:  680  D59  B32  E93  -    -    826  207  488  946  19A  5BA  74C  3AD  ACD  C8B
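The insertion and search rules for linear probing can be sketched as a small class. The class name, the separate occupancy flags, and the use of `key % m` as the primary hash (which picks the least-significant hexadecimal digit when m = 16) are assumptions for this illustration.

```cpp
#include <cstddef>
#include <vector>

class LinearProbingTable {
public:
    explicit LinearProbingTable(std::size_t m) : keys_(m, 0), used_(m, false) {}

    // Stores key in the first empty bin at or after its home bin, wrapping
    // around to bin 0. Returns the bin used, or -1 if the table is full.
    // Duplicates are not checked in this sketch.
    int insert(unsigned key) {
        const std::size_t m = keys_.size();
        for (std::size_t i = 0; i < m; ++i) {
            std::size_t bin = (key % m + i) % m;  // h(k, i) = (h(k) + i) mod m
            if (!used_[bin]) {
                keys_[bin] = key;
                used_[bin] = true;
                return static_cast<int>(bin);
            }
        }
        return -1;
    }

    // Returns the bin holding key, or -1 if key is absent.
    int search(unsigned key) const {
        const std::size_t m = keys_.size();
        for (std::size_t i = 0; i < m; ++i) {
            std::size_t bin = (key % m + i) % m;
            if (!used_[bin]) return -1;                           // empty bin
            if (keys_[bin] == key) return static_cast<int>(bin);  // found
        }
        return -1;  // traversed the entire array
    }

private:
    std::vector<unsigned> keys_;
    std::vector<bool> used_;
};
```

Inserting the fourteen keys listed earlier (written as `0x19A`, `0x207`, and so on) into a 16-bin table places 946 in bin 9 and C8B in bin F, and a search for a key that was never inserted, such as 123, stops at the first empty bin it reaches.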

Linear probing

Searching for C8B: the search starts at bin B and examines bins B, C, D and E (holding 5BA, 74C, 3AD and ACD) before finding C8B in bin F.

ERASING
Erasing

We cannot simply remove elements from the hash table.
• For example, consider erasing 3AD: bin D would become empty, so a later search for ACD, which hashes to bin D but is stored in bin E, would stop at the empty bin and wrongly report that ACD is absent.

Bin:  0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Key:  680  D59  B32  E93  -    -    826  207  488  946  19A  5BA  74C  3AD  ACD  C8B

Erasing

In general, assume:
• The currently removed object has created a hole at index hole
• The object we are checking is located at the position index and has a hash value of hash

Erasing

The first possibility is that hole < index
• In this case, the object at index can be moved into the hole only if its hash value is either
  • less than or equal to hole, or
  • greater than index
• Remember: if we are checking the object at location index, this means that all entries between hole and index are both occupied and could not have been copied into the hole
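Erasing with hole filling can be sketched as below. The code covers both the case hole < index and the wrap-around case hole > index, in which the hash value must be both greater than index and less than or equal to hole. The free-function names, the use of -1 as an empty marker, and the restriction to non-negative keys are assumptions of this sketch.

```cpp
#include <cstddef>
#include <vector>

constexpr int kEmptyBin = -1;  // assumption: keys are non-negative

// Linear-probing insert; returns false if the table is full.
bool lpInsert(std::vector<int>& bins, int key) {
    const std::size_t m = bins.size();
    for (std::size_t i = 0; i < m; ++i) {
        std::size_t bin = (static_cast<std::size_t>(key) % m + i) % m;
        if (bins[bin] == kEmptyBin) { bins[bin] = key; return true; }
    }
    return false;
}

// Returns the bin holding key, or -1 if key is absent.
int lpFind(const std::vector<int>& bins, int key) {
    const std::size_t m = bins.size();
    for (std::size_t i = 0; i < m; ++i) {
        std::size_t bin = (static_cast<std::size_t>(key) % m + i) % m;
        if (bins[bin] == kEmptyBin) return -1;
        if (bins[bin] == key) return static_cast<int>(bin);
    }
    return -1;
}

// Erases key and refills the hole so that later searches still succeed.
bool lpErase(std::vector<int>& bins, int key) {
    int found = lpFind(bins, key);
    if (found < 0) return false;
    const std::size_t m = bins.size();
    std::size_t hole = static_cast<std::size_t>(found);
    std::size_t index = (hole + 1) % m;
    // Check the occupied bins that follow the hole, at most m - 1 of them.
    for (std::size_t step = 1; step < m && bins[index] != kEmptyBin;
         ++step, index = (index + 1) % m) {
        std::size_t hash = static_cast<std::size_t>(bins[index]) % m;
        bool movable = (hole < index)
                           ? (hash <= hole || hash > index)   // no wrap-around
                           : (hash <= hole && hash > index);  // wrapped around
        if (movable) {
            bins[hole] = bins[index];  // copy the object into the hole
            hole = index;              // its old position is the new hole
        }
    }
    bins[hole] = kEmptyBin;
    return true;
}
```

Starting from the table built by inserting the fourteen listed keys, erasing 3AD moves ACD into bin D and C8B into bin E, and the wrap-around rule then lets DBE move into bin F.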
Erasing

The other possibility is that we wrapped around the end of the array, that is, hole > index
• In this case, the hash value of the object at index must be both
  • greater than index, and
  • less than or equal to hole
In either case, if the move is successful, the position of the moved object becomes the new hole to be filled.

Quadratic probing

Quadratic function:
$p(i) = i^2$
$h(k, i) = (h(k) + i^2) \bmod m$
Another quadratic function:
$p(i) = h(k) + (-1)^{i-1} \lfloor (i+1)/2 \rfloor^2$
for $i = 1, 2, \dots, m$
This formula can be expressed in a simpler form as the sequence of probes
$h(k) + i^2,\ h(k) - i^2$
for $i = 1, 2, \dots, m$
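Both quadratic probing functions can be written as small helpers that return the i-th probe position. The function names are hypothetical, `home` stands for h(k) and is assumed to be smaller than m, and each result is reduced modulo m.

```cpp
#include <cstddef>

// i-th probe for p(i) = i^2: (h(k) + i^2) mod m.
std::size_t probeSquare(std::size_t home, std::size_t i, std::size_t m) {
    return (home + (i * i) % m) % m;
}

// i-th probe (i >= 1) of the alternating sequence
// h(k) + 1, h(k) - 1, h(k) + 4, h(k) - 4, ..., reduced modulo m.
std::size_t probeAlternating(std::size_t home, std::size_t i, std::size_t m) {
    std::size_t j = (i + 1) / 2;       // floor((i + 1) / 2)
    std::size_t offset = (j * j) % m;  // j^2 reduced modulo m
    if (i % 2 == 1) {
        return (home + offset) % m;    // odd i: the sign (-1)^(i-1) is +
    }
    return (home + m - offset) % m;    // even i: the sign is -
}
```

With m = 7 and h(k) = 0, the alternating probes for i = 1, ..., 6 visit positions 1, 6, 4, 3, 2, 5, that is, every other cell of the table exactly once.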

Double hashing

Double hashing uses a hash function of the form
$h(k, i) = (h_1(k) + i\,h_2(k)) \bmod m$
where both $h_1$ and $h_2$ are auxiliary hash functions.
The initial probe goes to position $T[h_1(k)]$, and successive probe positions are offset from previous positions by the amount $h_2(k)$, modulo m.

Double hashing

Insertion by double hashing. The hash table has size 13 with $h_1(k) = k \bmod 13$ and $h_2(k) = 1 + (k \bmod 11)$.
Since $14 \equiv 1 \pmod{13}$ and $14 \equiv 3 \pmod{11}$, the key 14 goes into empty slot 9, after slots 1 and 5 are examined and found to be occupied.
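The insertion example can be reproduced with a short sketch. The function name and the -1 empty marker are assumptions; so are the previously stored keys 79, 69, 72, 98 and 50 used in the usage note, which are chosen only so that slots 1 and 5 are occupied before 14 arrives.

```cpp
#include <vector>

constexpr int kEmptySlot = -1;  // assumption: keys are non-negative

// Inserts key using h(k, i) = (h1(k) + i * h2(k)) mod m with
// h1(k) = k mod m and h2(k) = 1 + (k mod mPrime).
// Returns the slot used, or -1 if no empty slot was found in m probes.
int insertDoubleHash(std::vector<int>& table, int key, int mPrime) {
    const int m = static_cast<int>(table.size());
    const int start = key % m;          // h1(k)
    const int step = 1 + key % mPrime;  // h2(k), always in [1, mPrime]
    for (int i = 0; i < m; ++i) {
        int slot = (start + i * step) % m;
        if (table[slot] == kEmptySlot) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

With a table of size 13 and `mPrime = 11`, inserting 79, 69, 72, 98 and 50 fills slots 1, 4, 7, 5 and 11. Inserting 14 then probes slot 1, slot 5, and finally the empty slot 9, matching the example above.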
Double hashing

In order for the entire hash table to be searched, the value $h_2(k)$ must be relatively prime to the hash-table size m.
One way to ensure this is to let m be prime and to design $h_2$ so that it always returns a positive integer less than m. For example
• We could choose m prime and let
$h_1(k) = k \bmod m$
$h_2(k) = 1 + (k \bmod m')$
where $m'$ is chosen to be slightly less than m (say, $m - 1$).

Q&A
