DS(U5)
UNIT V
Tables: Rectangular tables - Jagged tables – Inverted tables - Symbol tables – Static tree tables - Dynamic tree
tables - Hash tables. Files: queries - Sequential organization – Index techniques. External sorting: External
storage devices – Sorting with tapes and disks.
2 MARKS
1. What are the methods available in storing sequential files?
The methods available in storing sequential files are:
Straight merging,
Natural merging,
Polyphase sort,
Distribution of Initial runs.
2. What is Symbol table? (Nov 12)
A symbol table is a data structure that contains information about identifiers.
A symbol table has two fields:
Identifier name
Memory location
Juke-box systems, with large numbers of removable disks, a few drives, and a mechanism for
automatic loading/unloading of disks, are available for storing large volumes of data.
17. Write short notes on Magnetic Tapes.
Non-volatile, used primarily for backup (to recover from disk failure), and for archival data.
Sequential-access – much slower than disk.
Very high capacity (40 to 300 GB tapes available).
Hold large volumes of data and provide high transfer rates.
Few GB for DAT (Digital Audio Tape) format, 10-40 GB with DLT (Digital Linear Tape) format, 100
GB+ with Ultrium format, and 330 GB with Ampex helical scan format.
Transfer rates from few to 10s of MB/s.
18. Write short notes on Hash tables. (Nov 14)
In hashing, the address or location of an identifier X is obtained by computing a function f(X).
The address it gives is referred to as the hash address of X.
These addresses refer to a hash table (HT), which is divided into a number of buckets,
HT(0) ... HT(B-1). Each bucket can hold S records, i.e. each bucket consists of S slots, and
each slot can hold one record.
19. Write short notes on Floppy Disks.
Mylar plastic, usually 5¼ or 8 inches in diameter, coated with magnetic material.
8 to 26 sectors/track with 128 to 512 bytes each.
Capacity between 125 KB and 1 MB, and transmission rate is 5 to 10 characters/msec.
1. Explain in detail about Symbol tables with its applications. (May 14)
SYMBOL TABLES-DEFINITION
⚫ A symbol table is a set of locations containing a record for each identifier, with fields for the attributes of
the identifier.
⚫ The attributes stored in a symbol table are:
⚫ DATA TYPE: Numeric or Character
⚫ SCOPE: Where in the program the identifier is valid
⚫ ARGUMENT VALUES: The argument values that are used or returned in the program
⚫ An essential function of a compiler is to record the identifiers and the related information
about their attribute types
SYMBOL TABLES-REPRESENTATION
⚫ A typical symbol table is represented as:
Index   Identifier   Memory location
1       x
2       y
3       z
[Figure: two possible binary search trees for the identifiers if, repeat, loop, while.]
Cost of the first tree:  2.3 + 4.3 + 5.2 + 15.1 = 43
Cost of the second tree: 2.2 + 4.2 + 5.2 + 15.2 = 52
⚫ Another application of binary trees with minimal external path length is to obtain an optimal
set of codes for messages M1 ... Mn+1.
⚫ Corresponding codes are 000, 001, 01 and 1 for messages M1, M2, M3 and M4 respectively.
These codes are called Huffman codes.
⚫ For any tree in L with root node T and depth greater than 1, WEIGHT(T) is the sum of weights of all
external nodes in T
PROCEDURE HUFFMAN
1. Procedure HUFFMAN(L,n)
// L is a list of n single node binary trees as described above //
2. for I <-1 to n-1 do //loop n-1 times//
3. call GETNODE (T) //create a new binary tree//
4. LCHILD (T) <- LEAST (L) //by combining the trees//
5. RCHILD (T) <- LEAST (L) //with two smallest weights//
6. WEIGHT (T) <- WEIGHT (LCHILD (T)) + WEIGHT (RCHILD(T))
7. Call INSERT (L,T)
8. end
9. end HUFFMAN
The way this algorithm works is illustrated by the example below.
Suppose we are given the weights q1=2, q2=3, q3=5, q4=7, q5=9 and q6=13. Then the sequence of trees we
would get is as shown in the figure.
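The combining loop of procedure HUFFMAN can be sketched in Python (a minimal illustration; a heap stands in for the list L and the LEAST/INSERT operations, and only the total weighted external path length is computed, not the tree itself):

```python
import heapq

def huffman_cost(weights):
    """Repeatedly combine the two smallest-weight trees, as in procedure
    HUFFMAN; returns the total weighted external path length."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    for _ in range(len(weights) - 1):      # loop n-1 times
        a = heapq.heappop(heap)            # LEAST(L)
        b = heapq.heappop(heap)            # LEAST(L)
        t = a + b                          # WEIGHT(T) of the combined tree
        total += t
        heapq.heappush(heap, t)            # INSERT(L, T)
    return total

print(huffman_cost([2, 3, 5, 7, 9, 13]))   # -> 93
```

The merge weights for this example are 5, 10, 16, 23 and 39, whose sum 93 is the weighted external path length of the Huffman tree.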
⚫ If the binary search tree contains the identifiers a1, a2, ..., an with a1 < a2 < ... < an, and the probability of
searching for each ai is pi,
⚫ then the total cost of any binary search tree is Σ(1 ≤ i ≤ n) pi · level(ai), when only successful searches
are made.
EXAMPLE:
⚫ The possible binary search trees for the identifier set (a1,a2,a3) = (do,if,stop) are :
[Figure: binary search trees built from the month names JAN ... DEC. One tree is built by inserting the
names in calendar order; the other, built by inserting them in alphabetical order (APR, AUG, DEC, FEB,
JAN, JULY, JUNE, MAR, MAY, NOV, OCT, SEPT), degenerates into a skewed chain.]
The best search method, the binary search technique, involves a number of comparisons and has a search time of
O(log2 n). Another approach is to compute the location of the desired record. The nature of this computation
depends on the key set and the memory-space requirements of the desired record.
This key-to-address transformation problem is defined as a mapping or hashing function H, which maps the
key space (K) into an address space (A).
Bucket   Slot 1   Slot 2
1        A        A2
2        0        0
3        0        0
4        D        0
5        0        0
As an example, consider the hash table HT with b = 26 buckets, each bucket having exactly two slots,
i.e. s = 2. The hash function f must map each of the possible identifiers into one of the numbers 1 – 26. Here, A-
Z corresponds to the numbers 1-26 respectively, then the function f is defined by: f(X) = the first character of
X. The identifiers A, B, C… will be hashed into buckets 1, 2, 3…
The identifiers A, A1, A2 are synonyms. For example, if A and A1 are already stored in the bucket and A2
is to be stored, then an overflow occurs, since s = 2.
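The bucket-and-slot behaviour above can be sketched in Python (a minimal illustration, assuming f(X) = first character of X and s = 2 slots per bucket; overflow records are simply reported, not stored):

```python
# Hash table with b = 26 buckets and s = 2 slots per bucket.
# f(X) = position of the first character of X in A..Z (1-26).
def f(identifier):
    return ord(identifier[0].upper()) - ord('A') + 1

def insert(table, identifier):
    """Place the identifier in the bucket chosen by f; report an
    overflow if both slots of that bucket are already full."""
    bucket = table.setdefault(f(identifier), [])
    if len(bucket) >= 2:          # s = 2: bucket already full
        return "overflow"
    bucket.append(identifier)
    return "stored"

table = {}
print(insert(table, "A"))    # stored in bucket 1
print(insert(table, "A1"))   # stored in bucket 1 (synonym of A)
print(insert(table, "A2"))   # overflow: bucket 1 already holds two records
```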
HASHING FUNCTIONS
1. Linear Probing
Assume a table of size 7 and the hash function h(key) = key mod 7, with 23 already inserted at position
23 mod 7 = 2.
Index: 0  1  2  3  4  5  6
Value: –  –  23 –  –  –  –
Next, we insert 50 and its position is 1 and the arrangement is as follows.
Index: 0  1  2  3  4  5  6
Value: –  50 23 –  –  –  –
Then we insert 30 and its position is 2, but the bucket number 2 is already occupied by 23. So, collision has
occurred. Therefore, the value 30 gets next available cell, which is 3. The orientation is as follows.
Index: 0  1  2  3  4  5  6
Value: –  50 23 30 –  –  –
Similarly, when we wish to insert 38, and we face collision again. Now it is placed at the next available cell,
which is 4. The arrangement of data elements is as follows.
Index: 0  1  2  3  4  5  6
Value: –  50 23 30 38 –  –
Overhead in this technique is the time taken for finding the next available cell.
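The linear-probing insertions above (table size 7, h(key) = key mod 7) can be reproduced with a short sketch:

```python
def linear_insert(table, key):
    """Insert key with h(k) = k mod len(table); on a collision, probe the
    successive cells h+1, h+2, ... until a free cell is found."""
    m = len(table)
    i = key % m
    while table[i] is not None:     # find the next available cell
        i = (i + 1) % m
    table[i] = key
    return i

table = [None] * 7
for k in (23, 50, 30, 38):
    print(k, "->", linear_insert(table, k))
print(table)   # [None, 50, 23, 30, 38, None, None]
```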
2. Quadratic Probing
In this case, when the collision occurs at hash address h, then this method searches the table at location h+1,
h+2, and h+9. The hash function will now be defined as
When we wish to insert 23, we can easily insert at location 3 as shown below.
Index: 0  1  2  3  4  5  6  7  8  9
Value: –  –  –  23 –  –  –  –  –  –
After 81 (81 mod 10 = 1) is inserted:
Index: 0  1  2  3  4  5  6  7  8  9
Value: –  81 –  23 –  –  –  –  –  –
Now we want to insert 93 and as the position 3 is already occupied, collision takes place. So, the cell with
distance one apart is checked and if it is free then the new data element is placed, which is as shown below.
Index: 0  1  2  3  4  5  6  7  8  9
Value: –  81 –  23 93 –  –  –  –  –
Now we wish to insert 113, in this case the position 3 and 4 are already occupied, so the cell with distance 4 is
checked and it is found empty then the new value is placed at location 7
Index: 0  1  2  3  4  5  6  7  8  9
Value: –  81 –  23 93 –  –  113 – –
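The quadratic-probing insertions above (table size 10, probing h+1, h+4, h+9, ...) can be reproduced with a short sketch:

```python
def quadratic_insert(table, key):
    """Insert key with h(k) = k mod len(table); on a collision probe
    h+1^2, h+2^2, h+3^2, ... (quadratic probing)."""
    m = len(table)
    h = key % m
    i = 0
    while table[(h + i * i) % m] is not None:
        i += 1
    pos = (h + i * i) % m
    table[pos] = key
    return pos

table = [None] * 10
for k in (23, 81, 93, 113):
    quadratic_insert(table, k)
print(table)   # 81 at index 1, 23 at 3, 93 at 4, 113 at 7
```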
[Figure: an overflow node, consisting of a Key field and a Link field.]
Each overflow location in the overflow area consists of two major parts, OR and KEY.
OR contains the address of the next location in the chain; KEY is the key of the record contained at that
overflow location.
A pointer to a chain of overflow records is included in each bucket, and for the ith bucket, this pointer
is designated by PTRi.
If there is no overflow record in the bucket, then PTRi has the value NULL. Otherwise it has the value of
the address of the first record in the overflow chain for that bucket.
ALGORITHM: DIRECT_INSERT
Given a record R with key X, it is required to insert R into the direct file with n primary buckets B1, B2, ..., Bn,
in which a particular bucket Bi contains m record locations Bi1, Bi2, ..., Bim. If a record is resident at
location Bij, then its key is denoted by Kij. If no record is present, then the key field is represented by a
negative number.
Mp = number of primary buckets = 8; Bp = 3
A primary bucket can hold up to a maximum of Bp records and an overflow bucket can hold up to Bv
records.
The hashing function uses the division method to obtain a position for a key, i.e. KEY mod TABSIZE. When
more than Bp keys are hashed into a location L, an overflow bucket is allocated and associated with the
primary bucket at location L.
Length Of Search (LOS) refers to the number of buckets which must be accessed to retrieve a key for eg.
LOS (1345) =3
ALOS = [ Σ from i=0 to ∞ of (i+1)·Ni ] / N
For the example file:
ALOS = [1(16) + 2(5) + 3(1)] / 22 = 29/22
ALOS = 1.318
As overflow buckets continue to be added the ALOS will rise and the access performance will
deteriorate. For that, the following hashing methods are used,
1. Linear hashing.
2. Virtual hashing.
LINEAR HASHING:-
The hashing function H0 suggests placing key 3820 at position 4, which has a full primary bucket.
Rather than letting the ALOS rise above 1.318, the table is doubled in size, i.e. Mp is changed from 8 to 16.
At the same time the IN_USE table is doubled in size and bits 8 through 15 of this table are set to 0.
Rather than adding key 3820 as an overflow, all entries at 4th place are rehashed using H1. When
function H1 is used to split bucket 4, keys 6652 and 76 are moved to bucket 12. Since 12th position is
now in use, IN_USE [12] is set to 1. Using H1, the suggested location for 3820 is position 12.
A distribution-dependent hashing function derives the address of a key from the distribution of the key set.
Given a sample set S of the key space K (S ⊂ K), it is required to find a hashing function H which maps the
elements of S to the address space.
The required function can be obtained from the discrete cumulative distribution function Fz(x) = P(Z ≤ x).
To find the address we have to follow digit analysis and piece-wise linear function.
DIGIT ANALYSIS:
Digit analysis is a hashing transformation which is, in a sense, distribution dependent.
Digits or bits of the original key are selected and then shifted in order to form addresses.
As an example, the key 123456789 would be transformed to the address 7654 if the digits in positions 4
through 7 were selected and their order reversed.
For a given key set, the same key positions and the same rearrangement pattern must be used
consistently.
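The digit-analysis transformation can be sketched as follows (positions are 1-based from the left, as in the example above):

```python
def digit_analysis(key, positions):
    """Select the given digit positions (1-based, left to right) from the
    key and reverse their order to form the address."""
    digits = str(key)
    selected = [digits[p - 1] for p in positions]
    return int("".join(reversed(selected)))

print(digit_analysis(123456789, [4, 5, 6, 7]))   # -> 7654
```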
A PIECE-WISE LINEAR FUNCTION:
It is the second distribution-dependent hashing function.
The key space consists of the integers in the interval (a, d); this interval is divided into j equal
subintervals of length L, i.e. L = (d - a)/j.
The interval location i of a given key x is found by the formula i = 1 + [(x - a)/L].
The following algorithm illustrates how the piece-wise linear function for indirect addressing is
calculated.
ALGORITHM:
Piece-wise. Given j, a, d, m and n as previously defined, and a key set {x1, x2, ..., xn}, it is required to calculate
the interval length L and the frequencies and cumulative frequencies Ni, Gi, 1 ≤ i ≤ j, for the piece-wise
linear function.
1. [Initialize array N to zero]
Repeat for i=1,2,……j
Ni<-0
2. [Determine interval length and interval frequency]
L<-(d-a)/j
Repeat for k=1,2,……n
i <- 1 + [(Xk - a)/L]
Ni<- Ni+1
3. [Calculate interval cumulative frequencies]
G1<- N1
Repeat for i=2,3,…….j
Gi <- Gi-1 + Ni
4. [Finished]
Exit.
From the given parameters, the following assignment statement can be used to calculate an address H
from a key X in interval (a,d)
i <- 1 + [(x - a)/L]
if Gi ≠ 0
then H(x) <- [m(Gi + ((x - a)/L - i)Ni)/n]
else H(x) <- 1
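Steps 1-3 of the algorithm and the address computation above can be sketched in Python (the key set, interval parameters and table size m below are made-up illustrative values; keys are assumed to lie in [a, d)):

```python
def piecewise_tables(keys, a, d, j):
    """Steps 1-3 of algorithm PIECE-WISE: interval length L, interval
    frequencies N[i] and cumulative frequencies G[i] (1-based arrays)."""
    L = (d - a) / j
    N = [0] * (j + 1)
    for x in keys:
        i = 1 + int((x - a) // L)   # interval location of key x
        N[i] += 1
    G = [0] * (j + 1)
    G[1] = N[1]
    for i in range(2, j + 1):
        G[i] = G[i - 1] + N[i]
    return L, N, G

def piecewise_hash(x, keys, a, d, j, m):
    """H(x) = floor(m(G[i] + ((x-a)/L - i)N[i])/n), as in the text."""
    n = len(keys)
    L, N, G = piecewise_tables(keys, a, d, j)
    i = 1 + int((x - a) // L)
    if G[i] != 0:
        return int(m * (G[i] + ((x - a) / L - i) * N[i]) / n)
    return 1

keys = [3, 12, 18, 47, 55, 91]
print(piecewise_tables(keys, 0, 100, 5))    # L = 20.0, N and G per interval
print(piecewise_hash(47, keys, 0, 100, 5, 10))
```

Note that G[i] + ((x-a)/L - i)·N[i] is just the interpolated rank of x within its interval, so H spreads the n keys roughly uniformly over the m addresses.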
ALGORITHM MULTIPLE FREQUENCY:
Let Nik be the number of keys in interval Ii of the kth key set, and let Gik be the number of keys less than a+iL
in the same key set k. It is required to calculate H, the value of the function H(x), for a key x using q
frequency distribution mappings.
Magnetic-disk
Data is stored on spinning disk, and read/written magnetically
NOTE: Diagram is schematic, and simplifies the structure of actual disk drives
Read-write head
Positioned very close to the platter surface (almost touching it)
Reads or writes magnetically encoded information.
Surface of platter divided into circular tracks
Over 16,000 tracks per platter on typical hard disks
Each track is divided into sectors.
A sector is the smallest unit of data that can be read or written.
Sector size typically 512 bytes
Typical sectors per track: 200 (on inner tracks) to 400 (on outer tracks)
To read/write a sector
disk arm swings to position head on right track
platter spins continually; data is read/written as sector passes under head
Head-disk assemblies
multiple disk platters on a single spindle (typically 2 to 4)
One head per platter, mounted on a common arm.
Cylinder i consists of ith track of all the platters
Performance Measures of Disks
Access time – the time it takes from when a read or write request is issued to when data transfer
begins. Consists of:
Seek time – time it takes to reposition the arm over the correct track.
Average seek time is 1/2 the worst case seek time.
Would be 1/3 if all tracks had the same number of sectors, and we ignore the
time to start and stop arm movement
4 to 10 milliseconds on typical disks
Rotational latency – time it takes for the sector to be accessed to appear under the head.
Average latency is 1/2 of the worst case latency.
4 to 11 milliseconds on typical disks (5400 to 15000 r.p.m.)
Data-transfer rate – the rate at which data can be retrieved from or stored to the disk.
4 to 8 MB per second is typical
Multiple disks may share a controller, so rate that controller can handle is also important
E.g. ATA-5: 66 MB/second, SCSI-3: 40 MB/s
Fiber Channel: 256 MB/s
Mean time to failure (MTTF) – the average time the disk is expected to run continuously without any
failure.
Typically 3 to 5 years
Probability of failure of new disks is quite low, corresponding to a
“theoretical MTTF” of 30,000 to 1,200,000 hours for a new disk
E.g., an MTTF of 1,200,000 hours for a new disk means that given 1000 relatively new
disks, on an average one will fail every 1200 hours
MTTF decreases as disk ages
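The components above can be combined into a rough estimate of the time to read one sector (the drive parameters used here are illustrative assumptions, not measurements):

```python
def access_time_ms(seek_ms, rpm, transfer_mb_per_s, sector_bytes=512):
    """Estimated time to read one sector: seek time + average rotational
    latency (half a revolution) + transfer time for the sector."""
    latency_ms = (60_000 / rpm) / 2                          # half a rotation
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + latency_ms + transfer_ms

# Assumed drive: 8 ms average seek, 7200 r.p.m., 4 MB/s transfer rate.
print(round(access_time_ms(8, 7200, 4), 3))   # dominated by seek and latency
```

As the numbers show, seek time and rotational latency dominate; the transfer of the 512-byte sector itself takes well under a millisecond.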
Optical storage
non-volatile, data is read optically from a spinning disk using a laser
CD-ROM (640 MB) and DVD (4.7 to 17 GB) most popular forms
Write-once, read-many (WORM) optical disks used for archival storage (CD-R and DVD-R)
Multiple write versions also available (CD-RW, DVD-RW, and DVD-RAM)
Magnetic tape storage
Few GB for DAT (Digital Audio Tape) format, 10-40 GB with DLT (Digital Linear Tape) format, 100
GB+ with Ultrium format, and 330 GB with Ampex helical scan format
Transfer rates from few to 10s of MB/s
Currently the cheapest storage medium
Tapes are cheap, but cost of drives is very high
Very slow access time in comparison to magnetic disks and optical disks
Limited to sequential access.
Some formats (Accelis) provide faster seek (10s of seconds) at cost of lower capacity
Used mainly for backup, for storage of infrequently used information, and as an off-line medium for
transferring information from one system to another.
Tape jukeboxes used for very large capacity storage
(terabyte (10^12 bytes) to petabyte (10^15 bytes))
9. Explain sequential file organization / Explain in detail about indexing techniques (Nov 12, Nov 13,14)
FILE ORGANIZATION
The database is stored as a collection of files. Each file is a sequence of records. A record is a sequence of
fields.
Fields: Account_Number, Branch_Name and Balance.
Collection of Records in a file is described by the following diagram,
Indexed sequential files are important for applications where data needs to be accessed sequentially and
randomly using the index.
An indexed sequential file allows fast access to a specific record.
E.g.:- A company may store details about its employees as an indexed sequential file.
Sometimes the file is accessed,
Sequentially, for e.g.:- When the whole of the file is processed to produce pay slips at the end of the month.
Randomly, may be an employee changes address, or a female employee gets married and changes her
surname.
Disadvantage of Sequential Files - The retrieval of a record from a sequential file, on average, requires
access to half the records in the file, making such enquiries not only inefficient but very time consuming for
large files. To improve the query response time of a sequential file, a type of indexing technique can be added.
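One simple indexing technique keeps a sparse index with one (key, block) entry per block of the sorted file; a lookup binary-searches the index and then scans only a single block. A minimal sketch, with hypothetical account numbers:

```python
import bisect

# Sequential file: records sorted by key, grouped into blocks of 3.
blocks = [
    [(101, "A-101"), (112, "A-112"), (125, "A-125")],
    [(140, "A-140"), (153, "A-153"), (167, "A-167")],
    [(180, "A-180"), (194, "A-194"), (210, "A-210")],
]
index = [block[0][0] for block in blocks]   # first key of each block

def lookup(key):
    """Binary-search the sparse index, then scan one block sequentially."""
    b = bisect.bisect_right(index, key) - 1
    if b < 0:
        return None
    for k, record in blocks[b]:
        if k == key:
            return record
    return None

print(lookup(153))   # found after scanning only block 1
print(lookup(500))   # not present
```

Instead of scanning half the file on average, a query touches log2(number of blocks) index entries plus one block.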
For a large file this is a costly and inefficient process; instead, the records that overflow the logical area
are shifted into a designated overflow area, and a pointer in the associated index entry of the logical area
points to the overflow location.
This is illustrated below (figure).
Record 615 is inserted in the original logical block causing a record to be moved to an overflow block.
▪ This cylinder index shows that on cylinder 15 the largest key that will be found is 2000. If ISAM is
seeking record 1880, an examination of cylinder 15 takes place. The read/write mechanism moves to
cylinder 15, selects track 0, and consults the track index. Then track 1 is selected, and thus the record is found.
PRIME AREA
The file itself along with the track indexes is called the prime area.
OVERFLOW RECORDS IN ISAM
The records are forced into the overflow area as a result of insertions. For example, if the records at the end of a
track are
…….. 26 28 30 31 33 35 37
And record 34 is to be added, then the track will be changed to,
……… 26 28 30 31 33 34 35
And record 37 will be dropped off the end. The track’s highest key is now 35 and the track index is changed
accordingly. The question, of course, is what to do with record 37 that was dropped. If it is added to the next
track, it will cause the record at the end of that track to be dropped at the end and a domino effect will cascade
through all the records on the file.
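The track-overflow behaviour in the example above can be sketched as follows (a simplified model in which the overflow area is just a list):

```python
import bisect

def add_to_track(track, overflow, key, capacity):
    """Insert key in sorted order on the track; if the track then exceeds
    its capacity, the record with the largest key is pushed to the
    overflow area instead of cascading onto the next track."""
    bisect.insort(track, key)
    if len(track) > capacity:
        overflow.append(track.pop())    # record dropped off the end
    return track, overflow

track = [26, 28, 30, 31, 33, 35, 37]
overflow = []
add_to_track(track, overflow, 34, capacity=7)
print(track)      # [26, 28, 30, 31, 33, 34, 35]
print(overflow)   # [37]
```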
In actual fact, there are two entries for each track on a given cylinder. We shall designate them as ‘N’ and
‘0’ entries, where “N” denotes a normal entry and “0” an overflow entry. Before overflow records are added to
the file, both entries are the same. For eg:- the same track index for cylinder 6 of a file might appear as
N 0 N 0 N
1 120 1 120 2 200 2 200 3 250 …….
As indicated by the track index, the largest key to be found on track 2 is 200. Now suppose record 185
is to be added to this track forcing record 200 off the end into the overflow area.
Track 2 now becomes,
130 145 150 ……………………. 130 185 190
As the largest key on track 2 is now 190, the N entry for this track in the index must be changed to 190
as follows.
N 0 N 0 N
1 120 1 120 2 190 2 200 3 250 …………….
Suppose further that record 200 is placed in an overflow area on track 1 and is the first record on this
overflow track. If this location is designated as 10:1, the track index should be changed as follows.
N 0 N 0 N
1 120 1 120 2 190 10:1 200 3 250 ………….
In effect, then, record 200 has become the first of many possible records in the overflow area.
If record 186 is added to track 2, forcing 190 off the end into the overflow area and leaving the track as
130 145 150 …………………… 180 185 186
Then record 190 will be added as the second record in the overflow area, namely 10:2, and the
overflow entry on the track index will be replaced by 10:2 so that the track index becomes
N 0 N 0 N
1 120 1 120 2 180 10:2 200 3 250 ………………….
Note that in the 0 entry the 200 is not changed as it still represents the largest record key in the
overflow area. In fact the previous entry 10:1 is added to the latest record to be added to the overflow area,
record 190, so that it is not lost. The overflow area now looks like
# 200 10:1 190 ……………………..
This multiple key access has many real-time applications in databases, like employee details, hospital
patient details and student details, to perform effective searches for complicated queries.
A key which is used to access each record in a file is known as a primary key
Usually the index or the serial number of a record is maintained as a primary key.
All the other fields in the file are assumed as secondary index items (commonly known as secondary
keys).
These secondary keys are useful for handling queries based on the value of the items.
Let us consider a hospital management system for our information retrieval example, as shown below.
Fig (ii)
The above figure illustrates two multilists – one for patient’s doctor and another for drug prescribed.
To provide a clear picture, lists for only three item values are shown, and a unique method is used for
representing the links for each item in the diagram.
In the above two multilists, each has 3 fields namely: name of the item value, link field and length of
the list.
Obviously for this example, it is more efficient to retrieve the two records corresponding to patients
taking CYROL and examine whether each patient’s doctor is Novak.
Memory allocation for multilist structure
Normally, these address fields contain absolute auxiliary memory addresses or the primary key value. An auxiliary
memory address provides quicker access but is affected by the physical movement of the records. The primary key
value remains unaffected by the physical movement of the records, but it is slower to access a record.
Fig (iii)
The fig(iii)shows a primary key type of linkage for the doctor index using only first five records.
Advantages:
Simplicity of programming and flexibility in performing updates.
Disadvantages:
The greatest disadvantage of the multilist organization is the time taken to respond to a conjunctive query.
All the records corresponding to the shortest list must be individually brought into main memory
for examination.
Fig (v)
The above figure shows a partially inverted list of the hospitalization record and ward level only.
This type of list is used to handle queries like “How many patients are in recovery?”
An inverted list can appear as a sequential, indexed sequential or direct file, depending on the time to
respond to a query.
Likewise, here in fig (iv), the patient’s name list is very long and it cannot be stored in main
memory for ready access. But in the case of fig(v), The patient’s ward list has 5 sublists and so it can be stored
and retrieved from the main memory itself.
Advantages:
The major advantage of an inverted list is its ability to handle queries with conjunctive terms.
The statistics concerning the number of times a secondary index item has been used can be easily kept.
Disadvantages:
The secondary items being inverted generally have to be included in both the inverted list and the master
file.
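A conjunctive query against inverted lists reduces to intersecting the lists for the two item values (a sketch with hypothetical record numbers; the index contents are made up):

```python
# Inverted lists: secondary-key value -> set of record numbers.
doctor_index = {"Novak": {2, 5, 9}, "Kelly": {1, 3}}
drug_index = {"CYROL": {2, 7}, "ASPIRIN": {1, 5, 9}}

def conjunctive_query(index_a, value_a, index_b, value_b):
    """Records satisfying both conditions: intersect the two inverted lists,
    without touching any master-file record that cannot qualify."""
    return index_a.get(value_a, set()) & index_b.get(value_b, set())

print(conjunctive_query(doctor_index, "Novak", drug_index, "CYROL"))   # {2}
```

Only the records in the intersection need to be fetched from the master file, which is the advantage over the multilist organization noted above.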
Comparison and Tradeoff in the Design of Multikey File
Both inverted files and multi-list files have
An index for each secondary key.
An index entry for each distinct value of the secondary key.
In either file organization,
The index may be tabular or tree-structured.
The entries in an index may or may not be sorted.
Thus an inversion index may have variable-length entries whereas a multi-list index has fixed-length entries
12. Explain the concept of virtual memory
VIRTUAL MEMORY
Some large programs cannot fit in main memory for execution. The usual solution is to introduce
management schemes that intelligently allocate portions of memory to users as necessary for the
efficient running of their programs. Virtual memory is used to achieve this goal.
One type of system which provides logical extension is called as virtual memory system.
When a program is executing and referencing data, all virtual addresses are translated automatically by
the operating system into real main memory addresses
There are three types of virtual memory system
Paging is a memory management technique in which virtual address space is split into fixed length blocks
called pages.
Main memory space is divided into physical sections of equal size subsection called page frames.
The virtual address in a paging system is divided into two components, ‘p’ and ‘d’:
p - page number
d - page offset
OS maintains a page table for each process.
Page table shows the frame location for each page of the process
If the bit is set invalid, the page will not be in the main memory.
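The (p, d) translation described above can be sketched as follows (the page size and page-table contents are illustrative assumptions):

```python
PAGE_SIZE = 1024   # assumed page size in bytes

# Page table: page number -> frame number (None = page not in main memory).
page_table = {0: 5, 1: 2, 2: None, 3: 7}

def translate(virtual_address):
    """Split the address into page number p and offset d, then map p to
    its frame; a missing frame means the reference causes a page fault."""
    p, d = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(p)
    if frame is None:
        return "page fault"
    return frame * PAGE_SIZE + d

print(translate(1030))   # page 1, offset 6 -> frame 2 -> physical 2054
print(translate(2100))   # page 2 is not resident -> page fault
```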
Advantages of paging
Paging eliminates external fragmentation.
Segmentation
Segmentation is a memory management scheme that divides the program into smaller blocks called
segments.
Segment can be defined as a logical grouping of information such as arrays or data area.
Segments are of variable size.
Logical address using segmentation consist of two parts
(i) Segment number(s)
(ii) Offset(d)
Segmentation eliminates internal fragmentation but suffers from external fragmentation.
The segment table is used for mapping the two-dimensional user-defined addresses into one-dimensional
physical addresses.
Segment table consists of,
(i) Segment base( contains the starting physical address where the segment resides in main
memory)
(ii) Segment limit( specifies the length of the segment)
Advantages of segmentation
Segmentation is visible to the user.
Difference between Paging and Segmentation:
PAGING                                   SEGMENTATION
(i) Division is performed by the OS      (i) Division is performed by the user
The VSAM index and data are assigned to distinct blocks of virtual storage called control intervals.
The control interval contains a number of empty index and data blocks, which are used when a data block
overflows. The index entry I1 indicates that the highest key value of data block I2 is 73 and holds the
pointer to data block I2.
HANDLING OVERFLOW
Suppose the records to be added have the key values 55 and 60. These records will logically be added into
data block I2. However, since I2 has a block size of 4,
only one record can be added without an overflow. The solution used in VSAM is to split the logical block
I2 into two blocks, say I2 and D7.
The records are inserted in the correct logical sequence.
In VSAM, a number of control intervals are grouped together into a control area.
An index exists for each control area. A control interval can be viewed as a track and a control area as a
cylinder of the index-sequential organization.
Two types of VSAM file:
Key sequenced file
Entry sequenced file
CONTROL INTERVAL
Fig.: Logical Blocks of Control Interval
Control information is placed at the end of the Control interval.
Record definition and Control interval definition are present in the Control Information.
CONTROL AREA
Control interval are logically grouped together to form a Control Area.
A set of indices is created for each control area and each particular set point to the control interval.
The effect of removing the record with key A8 from the third control interval of the
file is shown in the following figure.
Note that if record B3 is removed from this interval, the third entry in the first sequence-set record
and the first entry in the index-set record must be altered to indicate that B1 is now the largest key
in that particular control interval and control area.
For eg.:- Suppose a record with key A9 is added to the third control interval as shown in the figure. The result
of this insertion is that records B1 and B3 are moved, displacing some of the free space area.
VSAM handles this situation by performing a control-interval split which is almost identical to a data-block
split in a CDC scope indexed sequential file. In a control-interval split, stored records in the control interval are
moved to an empty control interval in the same control area and the new record is inserted in its proper key
sequence.
Just how the interval is split depends on the type of processing that is taking place. For a sequential insertion,
the new record is placed in the new control interval, if possible, and all subsequent records are placed in the
new control interval. Such a control-interval split is as shown in the figure.
If the new record is too large to be placed in the new control interval, it and all remaining records
in the original control interval are placed in the new control interval.
In many cases, a table holding a collection of records must support efficient lookup by more than one key
value. For example, we may have a set of customer records consisting of a name field, an address field and a
phone number field. If we use a simple array, sorted by name, we have good support for searching on the
name field, but not for address or phone number searches:
8. Table as an ADT
A table can be thought of as an abstract data type. Given a set of index values I, and a base type T, a table is a
function M from I to T that supports the operations:
- access evaluate the function at any index value (retrieval)
- assignment modify the value of M(I) for any index value I
- creation define a new function M
- clearing remove all elements from I, so M’s domain is empty
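These four operations map directly onto a dictionary in most languages; a minimal Python sketch:

```python
# Table ADT: creation, assignment, access, clearing.
table = {}                      # creation: a new function M with empty domain

table["name"] = "Ada"           # assignment: modify M(I) for index I
table["rank"] = "Captain"

print(table["name"])            # access: evaluate M at an index value

table.clear()                   # clearing: M's domain becomes empty
print(len(table))               # 0
```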
Rectangular table:
Table Data
The table is made of 2000 rows, each row represents the data of one name
Each row is divided into 4 fields
Each of the 4 fields has its own name. The field names are: name, rank, gender, year
Tables Are Very Common
Tables are a very common structure for computer data
Number of fields is small (categories)
Number of rows can be millions or billions
e.g. email inbox: one row = one message, fields: date, subject, from, to, ...
e.g. craigslist: one row = one thing for sale: description, price, seller, listing date, ...
11 MARKS
APRIL 2011 (ARREAR)
1. Explain about tables & its types. (Pg. No. 54) (Qn. No. 14)
NOV 2011(REGULAR)
1. Discuss about external storage devices. (Pg. No. 32) (Qn. No. 8)
MAY 2012(ARREAR)
1. What do you mean by hashing? Explain the various hashing functions (Pg. No. 18) (Qn. No. 3)
MAY 2014(ARREAR)
1. Explain in detail about symbol tables and hash tables. (Pg. No. 7) (Qn. No. 1)
NOV 2014(REGULAR)
1. Classify hashing functions and explain each with an example. (Pg. No. 18) (Qn. No. 3)
2. Explain the collision resolution techniques in hashing. (Pg. No. 18) (Qn. No. 3)
3. Explain in detail about sequential and direct file access. (Pg. No. 36) (Qn. No. 9) (Pg. No. 22) (Qn. No. 5)
MAY 2015(ARREAR)
1. Define hash function and explain its methods in detail. (Pg. No. 18) (Qn. No. 3)
Nov 2015(REGULAR)
1. Explain Static tree Tables.
2. Briefly explain sorting performed on tapes and disks.
MAY 2016
1. Write detailed notes on sequential file organization.
2. Explain any two external sorting techniques.