ANJUMAN COLLEGE
OF ENGINEERING AND
TECHNOLOGY,
SADAR, NAGPUR.
DATA STRUCTURE AND PROGRAM
DESIGN
2017-18
Computer science and
Engineering
Seminar on:
Hashing
Guided by: Prof. Farhina Sheikh
COMPILED BY:
• Aafaqueahmad Khan
• Asad Fazlani
• Danish Sheikh
• Himanshu Wasnik
• Irfan Sheikh
• Jafar Sheikh
• Rafiuddin
• Rohit Raut
• Sahil Khan
• Sourabh Phulpagar
HASHING
CONTENTS
INTRODUCTION
HASHING
HASH FUNCTION
CHARACTERISTICS OF HASH
FUNCTION
HASH TABLE
STATIC HASHING
TYPES OF HASH FUNCTION
DIVISION METHOD
INTRODUCTION
HASHING
Hashing is the process of indexing and
retrieving elements (data) in a data structure so
that an element can be found quickly
using its hash key.
In hashing, the time required to search for an element
does not depend on the number of elements stored.
With a hash-based data structure, an element is
found in constant time on average.
Hashing provides very fast access to records
for certain search conditions, such as equality on the key.
HASH FUNCTION
A hash function is a function that
takes a piece of data (the key) as input
and outputs an integer (the hash value),
which maps the data to a particular
index in the hash table.
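As a rough illustration of the definition above, the sketch below maps a string key to a table index by summing character codes. The function name `simple_hash` and the character-sum scheme are assumptions chosen for illustration; they are not taken from the slides.

```python
def simple_hash(key: str, table_size: int) -> int:
    """Return a slot index in range(table_size) for a string key.

    Hypothetical example scheme: sum the character codes, then take
    the remainder modulo the table size.
    """
    total = 0
    for ch in key:
        total += ord(ch)       # add the character's code point
    return total % table_size  # fold the sum into a valid index


print(simple_hash("ab", 10))   # 97 + 98 = 195, and 195 % 10 = 5
```

A character sum is easy to compute but distributes keys poorly (anagrams always collide), so it is only meant to show the key-to-index mapping.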
CHARACTERISTICS OF
HASH FUNCTION
• It should be easy and quick to
compute.
• It should ideally be mathematically
one-to-one on the set of relevant key
values.
• The number of collisions should be small
when placing the records in the hash
table.
• It should achieve an even distribution of
the key values that actually occur across
the hash table.
HASH TABLE
A hash table, also known as a hash map, is a
data structure that uses a hash function
to map each key to the location of its
associated value.
It is a table in which an item can be
found in O(1) time on average, using a hash
function to form an address from the key.
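The idea can be sketched as a small class. This is a minimal illustration, not a production design: the class name `HashTable`, the default size of 8, and the use of Python's built-in `hash()` with list-based chaining are all assumptions made for the example.

```python
class HashTable:
    """Minimal hash table sketch; collisions are kept in per-slot lists."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [[] for _ in range(size)]  # one list per slot

    def _index(self, key):
        # Form an address from the key using a hash function.
        return hash(key) % self.size

    def put(self, key, value):
        bucket = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self.slots[self._index(key)]:
            if existing_key == key:
                return value
        return default


table = HashTable()
table.put(3205, "employee A")
print(table.get(3205))  # employee A
```

Lookup inspects only the one slot the key hashes to, which is why the search time does not grow with the total number of stored items as long as the slots stay short.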
STATIC HASHING
Primary area: the number of primary pages is fixed; pages are allocated
sequentially and never deallocated (say M buckets).
• A simple hash function: h(K) = f(K) mod M
Overflow area: disjoint from the primary area. It
keeps buckets that hold records whose keys
map to a full bucket.
• Adding the address of an overflow bucket to a
primary-area bucket is called chaining.
A collision does not cause a problem as long as
there is still room in the mapped bucket.
Overflow occurs during insertion when a record is
hashed to a bucket that is already full.
EXAMPLE
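Assume f(k) = k. Let M = 5. So, h(k) = k mod 5.
Bucket factor = 3 records

Bucket   Primary area    Overflow area
0        35, 60
1        6, 46
2        12, 57, 62      17
3        33
4        44

Key 17 hashes to bucket 2 (17 mod 5 = 2), which is already full, so it is placed in the overflow area.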
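The example can be reproduced with a short sketch. The function name `static_hash_insert` and the insertion order of the keys are assumptions (the slide shows only the final layout); the bucket count of 5 and bucket factor of 3 come from the example.

```python
def static_hash_insert(keys, num_buckets=5, bucket_factor=3):
    """Place keys into fixed primary buckets, spilling to overflow when full."""
    primary = [[] for _ in range(num_buckets)]
    overflow = [[] for _ in range(num_buckets)]
    for key in keys:
        b = key % num_buckets               # h(k) = k mod M with f(k) = k
        if len(primary[b]) < bucket_factor:
            primary[b].append(key)          # room left in the primary bucket
        else:
            overflow[b].append(key)         # bucket full: chain to overflow
    return primary, overflow


# Assumed insertion order; 17 arrives after bucket 2 is already full.
keys = [35, 60, 6, 46, 12, 57, 62, 33, 44, 17]
primary, overflow = static_hash_insert(keys)
print(primary)   # [[35, 60], [6, 46], [12, 57, 62], [33], [44]]
print(overflow)  # [[], [], [17], [], []]
```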
PROBLEMS OF STATIC
HASHING
• The main problem with static hashing is that the number
of buckets is fixed:
   • Long overflow chains can develop and degrade
   performance.
   • On the other hand, if a file shrinks greatly, a lot of
   bucket space will be wasted.
• There are other hashing techniques that
allow the hash index to grow and shrink
dynamically. These include:
   • Linear hashing
   • Extendible hashing
TYPES OF HASH
FUNCTION:
A. Division method
B. Mid-square method
C. Folding method
DIVISION METHOD
In this method, we choose a number m (the
number of memory locations) larger than the number n
of keys (i.e. the number of records, since the keys uniquely
determine the records). The number m is usually
chosen to be a prime number or a number
without small divisors, since this frequently
minimizes the number of collisions. The hash function H is
defined by:
H(K) = K (mod m)   or   H(K) = K (mod m) + 1
Here K (mod m) denotes the remainder when K
is divided by m. The second formula is used
when the hash addresses should range from 1 to m
rather than from 0 to m - 1.
For example, a company has 68 employees who
have been assigned 4-digit employee
numbers. Assume the table L consists of
100 two-digit memory addresses: 00, 01, 02, ..., 99.
We apply the above hash function to the
employee numbers:
3205, 7148, 2345
Here, we choose a prime number m close to 99,
such as 97. Then
H(3205) = 4
H(7148) = 67
H(2345) = 17
That is, dividing 3205 by 97 gives a
remainder of 4, dividing 7148 by 97 gives a
remainder of 67, and dividing 2345 by 97
gives a remainder of 17. In the other case,
where the memory addresses begin with 01
rather than 00, we use the hash function
H(K) = K (mod m) + 1, which gives:
H(3205) = 4 + 1 = 5
H(7148) = 67 + 1 = 68
H(2345) = 17 + 1 = 18
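The employee-number example can be checked with a few lines of code. The function name `division_hash` and the `one_based` flag are hypothetical names introduced for this sketch; m = 97 and the three keys come from the example.

```python
def division_hash(key: int, m: int, one_based: bool = False) -> int:
    """Division-method hash: the remainder of key divided by m.

    With one_based=True, 1 is added so that addresses start at 1
    instead of 0.
    """
    address = key % m
    return address + 1 if one_based else address


for employee_number in (3205, 7148, 2345):
    print(employee_number,
          division_hash(employee_number, 97),
          division_hash(employee_number, 97, one_based=True))
# 3205 4 5
# 7148 67 68
# 2345 17 18
```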
Ques:- Given the following list of elements:
{63, 92, 84, 32, 90, 69, 97, 91}
Use the division method of hashing to place the
elements in a hash table.
Solution:-
• Formula:
   • h(k) = k mod m
Given,
• k = key element.
   • The key elements are:
   63 92 84 32 90 69 97 91
• m = total number of buckets.
   • The total number of buckets is 8 because we have 8
   key elements.
Calculation:-
h(63) = 63 % 8 = 7
h(92) = 92 % 8 = 4
h(84) = 84 % 8 = 4
h(32) = 32 % 8 = 0
h(90) = 90 % 8 = 2
h(69) = 69 % 8 = 5
h(97) = 97 % 8 = 1
h(91) = 91 % 8 = 3
Keys 92 and 84 both hash to bucket 4 (a collision), and no key hashes to bucket 6.
Tabular representation:-

Sr. No.   Key element (k)   h(k) = k mod m   Result
1.        63                63 % 8           7
2.        92                92 % 8           4
3.        84                84 % 8           4
4.        32                32 % 8           0
5.        90                90 % 8           2
6.        69                69 % 8           5
7.        97                97 % 8           1
8.        91                91 % 8           3

Bucket   Keys
0        32
1        97
2        90
3        91
4        92, 84
5        69
6        (empty)
7        63
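The bucket assignment for this exercise can be verified with a short script. It is a sketch under two assumptions: the fourth key is taken as 32, as in the calculation step, and keys that share a bucket are simply kept together in a list, since the exercise does not say how collisions are resolved. Note that 92 % 8 evaluates to 4, so 92 and 84 land in the same bucket.

```python
keys = [63, 92, 84, 32, 90, 69, 97, 91]  # 32 as used in the calculation step
m = 8                                    # total number of buckets

buckets = {i: [] for i in range(m)}
for k in keys:
    buckets[k % m].append(k)             # division method: h(k) = k mod m

for index, contents in buckets.items():
    print(index, contents)
# 0 [32]
# 1 [97]
# 2 [90]
# 3 [91]
# 4 [92, 84]
# 5 [69]
# 6 []
# 7 [63]
```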
