CS8492 DBMS - Part A & Part B

This document discusses concepts related to database management systems and relational databases. It defines key terms like database management system, database schema, and data models. It also describes relational database concepts such as the relational model, tables, rows, keys like primary keys and foreign keys, and Structured Query Language (SQL). Specific topics covered include data definition language, data manipulation language, data control language, static and dynamic SQL, and relational algebra operations.

Uploaded by

RAJA.S 41
Copyright
© © All Rights Reserved

CS8492 - DATABASE MANAGEMENT SYSTEMS

UNIT I - RELATIONAL DATABASES


PART - A
1. Define database management system and its applications. (Nov/Dec 2008, 2014)
Database management system (DBMS) is a collection of interrelated data and a set of
programs to access those data.
Applications:
 Banking
 Airlines
 Universities
 Credit card transactions
 Telecommunication
 Finance
 Manufacturing and Sales
 Human resources
2. What are the advantages of using a DBMS? What is the purpose of database
management system? (Nov/Dec 2014)
 Controlling redundancy
 Restricting unauthorized access
 Providing multiple user interfaces
 Enforcing integrity constraints
 Providing backup and recovery
3. What are the disadvantages of file processing system? (May/ June 2016)
 Data redundancy and inconsistency
 Difficulty in accessing data
 Atomicity of updates
 Concurrent access by multiple users
 Security problems
4. Specify the various levels of abstraction in a database.
 Physical level - Describes how the data are actually stored.
 Logical level - Describes what data are stored and what relationships exist among
those data.
 View level - Describes only part of the database.
5. What is meant by instance of database and schema?
 Instance - Collection of information stored in the database at a particular moment is
called an instance of the database.
 Schema - The overall design of the database is called database schema.
6. What is a data model? Mention its types.
Data model is a collection of conceptual tools for describing data, data relationships,
data semantics and consistency constraints.
 Relational model
 ER model
 Object based data model
 Object Relational data model
 Semi structured data model
 Network data model
 Hierarchical data model
7. What is relational model?
The relational model represents the database as a collection of relations. A relation is
nothing but a table of values. Every row in the table represents a collection of related data
values. These rows in the table denote a real-world entity or relationship.
8. What are the advantages of relational model?
 Simplicity: A relational data model is simpler than the hierarchical and network
model.
 Structural Independence: The relational database is only concerned with data and
not with a structure. This can improve the performance of the model.
 Easy to use: The relational model is easy as tables consisting of rows and columns
are quite natural and simple to understand
 Query capability: It makes it possible for a high-level query language like SQL to
avoid complex database navigation.
 Data independence: The structure of a database can be changed without having to
change any application.
 Scalable: A relational database can grow in both the number of records (rows) and
the number of fields (columns) without losing its usability.
9. Define keys. Mention its types.
A key is an attribute or set of attributes used to uniquely identify any record or row of
data in a table. Keys are also used to establish and identify relationships between tables.
Types:
 Primary key
 Candidate key
 Super key
 Foreign key
10. Define a primary key.
A column (or set of columns) in a table whose values uniquely identify the rows in the
table. A primary key value cannot be NULL.
11. What is a foreign key?
A column in a table that does not uniquely identify rows in that table, but is used as a
link to matching columns in other tables.
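The primary key and foreign key rules above can be tried out with a small sketch. It uses Python's built-in sqlite3 module; the department and student tables, their columns, and the sample rows are illustrative assumptions, not part of the syllabus. Note that SQLite enforces foreign keys only after the PRAGMA shown below is issued.

```python
import sqlite3

# Hypothetical schema: department(dept_id PK), student(roll_no PK, dept_id FK).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'CSE')")
conn.execute("INSERT INTO student VALUES (101, 'Asha', 1)")


def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same roll_no as an existing row: violates the primary key.
duplicate_pk_ok = try_insert("INSERT INTO student VALUES (101, 'Ravi', 1)")
# dept_id 99 does not exist in department: violates the foreign key.
dangling_fk_ok = try_insert("INSERT INTO student VALUES (102, 'Ravi', 99)")
# New roll_no and an existing dept_id: accepted.
valid_row_ok = try_insert("INSERT INTO student VALUES (102, 'Ravi', 1)")
print(duplicate_pk_ok, dangling_fk_ok, valid_row_ok)
```

Only the last insert succeeds, because it supplies a new primary key value and references a department row that exists.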
12. What is super key?
A super key is a set of one or more attributes that collectively allows us to identify
uniquely an entity in the entity set.
13. Define candidate key.
A candidate key is an attribute or a minimal set of attributes that can uniquely identify
a tuple.
14. Give the reasons why null values might be introduced into the database.
 The NULL value is appropriate in two cases:
o The value of an attribute is unknown.
o There is no value associated with the attribute. However, the null value is not
the same as zero.
15. Define relational algebra.
The relational algebra is a procedural query language. It consists of a set of operations
that take one or two relations as input and produce a new relation as output.
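As a rough illustration of operations that take relations in and give a relation back, the sketch below models a relation as a list of Python dicts. The function names select, project and natural_join, and the sample student and department data, are assumptions made for this example only.

```python
# Relations are modelled as lists of dicts (one dict per tuple); the data is illustrative.
student = [
    {"roll_no": 1, "name": "Asha", "dept_id": 10},
    {"roll_no": 2, "name": "Ravi", "dept_id": 20},
    {"roll_no": 3, "name": "Mala", "dept_id": 10},
]
department = [
    {"dept_id": 10, "dept_name": "CSE"},
    {"dept_id": 20, "dept_name": "ECE"},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Natural join: combine tuples that agree on all common attributes."""
    result = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                result.append({**l_row, **r_row})
    return result


# Names of students in the CSE department: join, then select, then project.
cse_names = project(
    select(natural_join(student, department), lambda row: row["dept_name"] == "CSE"),
    ["name"],
)
print(cse_names)  # [{'name': 'Asha'}, {'name': 'Mala'}]
```

Each function returns a new relation, so the operations can be nested freely, which is the property that makes the algebra composable.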
16. Define SQL.
 SQL stands for Structured Query Language.
 It is designed for managing data in a relational database management system
(RDBMS).
 SQL is a database language used for database creation and deletion, fetching rows,
modifying rows, etc.
 SQL is based on relational algebra and tuple relational calculus.
17. What are the various data base languages in SQL? (April/May 2018)
 Data Definition Language (DDL)
o Commands that define a database, including creating, altering, and dropping
tables and establishing constraints
 Data Manipulation Language (DML)
o Commands that maintain and query a database
 Data Control Language (DCL)
o Commands that control a database, including administering privileges and
committing data
18. Differentiate between procedural DML and nonprocedural DML.
 Procedural DML - It requires a user to specify what data are needed and how to get
those data.
 Nonprocedural DML - It requires a user to specify what data are needed without
specifying how to get those data.
19. List the aggregate functions in SQL.
 AVG – calculates the average of a set of values.
 COUNT – counts rows in a specified table or view.
 MIN – gets the minimum value in a set of values.
 MAX – gets the maximum value in a set of values.
 SUM – calculates the sum of values.
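A minimal sketch of the five aggregate functions, run through Python's sqlite3 module. The marks table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of marks per student and course.
conn.execute("CREATE TABLE marks (stud_id INTEGER, course_id TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [(1, "DBMS", 80), (2, "DBMS", 60), (3, "DBMS", 70), (1, "OS", 90)],
)

# All five aggregates computed over the DBMS rows only.
row = conn.execute(
    "SELECT COUNT(*), MIN(mark), MAX(mark), SUM(mark), AVG(mark) "
    "FROM marks WHERE course_id = 'DBMS'"
).fetchone()
print(row)  # (3, 60, 80, 210, 70.0)
```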
20. What is embedded SQL?
Embedded SQL statements are SQL statements written inline with the program source
code of the host language. The embedded SQL statements are parsed by an embedded SQL
preprocessor and replaced by host-language calls to a code library. The output from the
preprocessor is then compiled by the host compiler. This allows programmers to embed SQL
statements in programs written in any number of languages such as: C/C++, COBOL and
Fortran.
21. Define dynamic SQL.
Dynamic SQL is a programming technique that enables you to build SQL statements
dynamically at runtime. We can create more general purpose, flexible applications by using
dynamic SQL because the full text of a SQL statement may be unknown at compilation.
22. What is static SQL and how is it different from dynamic SQL. (Nov/Dec’17)
 Static or Embedded SQL are SQL statements in an application that do not change
at runtime and, therefore, can be hard-coded into the application.
 Dynamic SQL is SQL statements that are constructed at runtime; for example, the
application may allow users to enter their own queries.
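The contrast can be sketched in Python with sqlite3. This is only an analogy: Python has no embedded-SQL preprocessor, so "static" here simply means that the statement text is fixed in the source. The student table, the helper names and the column whitelist are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, branch TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "CSE"), (2, "Ravi", "ECE"), (3, "Mala", "CSE")],
)

# "Static" style: the statement text is fixed in the source; only the bound value varies.
STATIC_QUERY = "SELECT name FROM student WHERE branch = ? ORDER BY id"


def names_in_branch(branch):
    return [r[0] for r in conn.execute(STATIC_QUERY, (branch,))]


# "Dynamic" style: the statement text is assembled at run time from the caller's choices.
ALLOWED_COLUMNS = {"id", "name", "branch"}


def find_students(filters):
    """Build the WHERE clause at run time; column names are checked against a whitelist."""
    for column in filters:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
    sql = "SELECT name FROM student"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += " ORDER BY id"
    return [r[0] for r in conn.execute(sql, tuple(filters.values()))]


print(names_in_branch("CSE"))
print(find_students({"branch": "CSE", "name": "Mala"}))
```

In the dynamic version the values are still passed as bound parameters, and only whitelisted column names are spliced into the text, which keeps user input out of the SQL string itself.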

PART - B
1. Explain the three different groups of data models with suitable example. (April/May’19)
2. State and explain the architecture of DBMS. (Nov/Dec’17)
3. Differentiate between foreign key constraints and referential integrity constraints with
suitable example. (Nov/Dec’17)
4. Write the DDL, DML, DCL commands for the student database.
Which contains student details: name, id, DOB, branch, DOJ.
Course details: Course name, Course id, Stud.id, Faculty name, id, marks. (Nov/Dec’17)
5. Define relational algebra. Explain various relational algebraic operations with example.
(Nov/Dec 2016, April/May 2017)
6. Explain the select, project, Cartesian product and join operations in relational algebra with
an example. (April/May’18)
7. Explain the aggregate functions in SQL with an example. (April/May’18)
8. Describe about the static and dynamic SQL in detail. (April/May’19)
UNIT II - DATABASE DESIGN
PART - A
1. What is an entity relationship model? (May/ June 2016)
The entity relationship model is a collection of basic objects called entities and
relationship among those objects. An entity is a thing or object in the real world that is
distinguishable from other objects.
2. Define an entity and entity set.
 Entities:
o Entity -a thing (animate or inanimate) of independent physical or conceptual
existence and distinguishable.
o Example: In the University database context, an individual student, faculty
member, a class room, courses are entities.
 Entity Set or Entity Type:
o Collection of entities all having the same properties.
o Example:
 Student entity set – collection of all student entities.
 Course entity set – collection of all course entities.
3. What is an attribute?
 Attributes - Each entity is described by a set of attributes/properties.
 Example: student entity
o StudName – name of the student.
o RollNumber – the roll number of the student.
o Sex – the gender of the student etc.
 All entities in an Entity set/type have the same set of attributes.
4. What is derived attributes?
 Derived attributes are those created by a formula or by a summary operation on other
attributes.
 An attribute which can be derived from other attributes of the entity type is
known as derived attribute. e.g.; Age (can be derived from DOB). In ER diagram,
derived attribute is represented by dashed oval.
5. What is a multivalued attribute?
An attribute that can hold more than one value for a given entity. For example,
Phone_No (can be more than one for a given student). In ER diagram, multivalued attribute
is represented by double oval.
6. What is composite attribute?
An attribute composed of several other attributes is called a composite attribute.
For example, Address attribute of student Entity type consists of Street, City, State, and
Country. In ER diagram, composite attribute is represented by an oval comprising of ovals.
7. Define cardinality.
 The number of times an entity of an entity set participates in a relationship set
is known as cardinality.
 Cardinality can be of different types:
 One to one
 One to many
 Many to one
 Many to many
8. Define weak and strong entity sets. (April/May 2009, April/May 2018)
 Weak entity set: An entity set that does not have a key attribute of its own is called
a weak entity set.
 Strong entity set: An entity set that has a primary key is termed a strong entity set.
9. Define participation constraint.
 Participation Constraint is applied on the entity participating in the relationship set.
 Total Participation – Each entity in the entity set must participate in the
relationship. If each student must enroll in a course, the participation of student will
be total. Total participation is shown by double line in ER diagram.
 Partial Participation – The entity in the entity set may or may NOT participate
in the relationship. If some courses are not enrolled by any of the student, the
participation of course will be partial.
10. Define Specialization and Aggregation.
 Specialization is the process of designating subgroupings within an entity set. It is a
top-down process.
 Specialization is represented by a triangle labelled ISA. The label ISA stands for “is a”
and represents, for example, that a customer is a person.
 Aggregation is a special kind of association that specifies a whole/part relationship
between the aggregate (whole) and a component part.
11. Give the distinction between primary key, candidate key and super key. (Nov/Dec
2006, 2009)
 Super key – any set of one or more attributes whose values uniquely identify a tuple
in a relation.
 Candidate key – a minimal super key, that is, a super key from which no attribute can
be removed without losing uniqueness.
 Primary key – the candidate key chosen by the database designer as the principal
means of identifying tuples. It cannot be NULL.
12. Define functional dependency. Nov/Dec2010, Apr/ May 2015
 A functional dependency is denoted by an arrow "→". The functional dependency of
Y on X is represented by X → Y.
 X determines Y, or Y is functionally dependent on X.
 Dependent - It is displayed on the right side of the functional dependency.
 Determinant - It is displayed on the left side of the functional dependency.
 X – Determinant, Y – Dependent
13. List the functional dependency rules.
 Reflexivity Rule
o A → B, if B is a subset of A.
 Transitivity Rule
o If A → B and B → C, then A → C i.e. a transitive relation
 Augmentation Rule
o AC → BC, if A → B
 Union Rule
o If A → B and A → C then A → BC
 Decomposition Rule
o If A → BC then A → B and A → C
 Pseudotransitivity Rule
o If A → B and CB → D then AC → D
14. Define attribute closure.
 Attribute closure of an attribute set can be defined as set of attributes which can be
functionally determined from it.
 Attribute closure is represented by X+
 Means that set of attributes determined by X
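The closure X+ can be computed by repeatedly applying the given dependencies until nothing new is added. The sketch below assumes each dependency is written as a pair of Python sets (left side, right side); the function name closure and the sample relation are illustrative.

```python
def closure(attributes, fds):
    """Return the closure of `attributes` under the functional dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side being a set of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left side is already derivable, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Illustrative relation R(A, B, C, D) with A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, fds)))       # ['A', 'B', 'C']
print(sorted(closure({"A", "D"}, fds)))  # ['A', 'B', 'C', 'D']
```

Because the closure of {A, D} contains every attribute of R, {A, D} is a super key of this example relation, while {A} alone is not.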
15. Why certain functional dependencies are called trivial functional dependencies?
A functional dependency FD: X → Y is called trivial if Y is a subset of X. In general, a
dependency FD: X → Y means that the values of Y are determined by the values of X: two
tuples sharing the same values of X will necessarily have the same values of Y. When Y is a
subset of X this holds automatically in every relation, so the dependency carries no
information.
16. Define normalization.
 Normalization is the process of organizing the data in the database.
 Normalization is used to minimize the redundancy from a relation or set of relations.
It is also used to eliminate the undesirable characteristics like Insertion, Update and
Deletion Anomalies.
 Normalization divides the larger table into the smaller table and links them using
relationship.
 The normal form is used to reduce redundancy from the database table.
17. Define data Anomalies.
Data anomalies are inconsistencies in the data stored in a database as a result of an
operation such as update, insertion, and/or deletion. Such inconsistencies may arise when
a particular record is stored in multiple locations and not all of the copies are updated.
18. Define 1NF.
 A relation is in 1NF if every attribute contains only atomic values.
 It states that an attribute of a table cannot hold multiple values. It must hold only
single-valued attribute.
 First normal form disallows the multi-valued attribute, composite attribute, and their
combinations.
19. Define 2NF.
 To be in 2NF, a relation must be in 1NF.
 In the second normal form, all non-key attributes are fully functional dependent on the
primary key.
 No partial dependency. i.e., proper subset of candidate key -> nonprime attributes.
An attribute that is not part of any candidate key is known as non-prime attribute.
20. Define 3NF.
 A relation is in 3NF if it is in 2NF and contains no transitive dependency of a
non-prime attribute on a candidate key. (NPA -> NPA)
 3NF is used to reduce the data duplication. It is also used to achieve the data integrity.
 If there is no transitive dependency for non-prime attributes, then the relation must be
in third normal form.
 A relation is in third normal form if it holds at least one of the following conditions for
every non-trivial functional dependency X → Y.
 X is a super key.
 Y is a prime attribute, i.e., each element of Y is part of some candidate key.
21. Define BCNF.
 BCNF is the advance version of 3NF. It is stricter than 3NF.
 A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super
key of the table.
 For BCNF, the table should be in 3NF, and for every FD, the LHS is a super key.
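The "LHS is a super key" test can be sketched with an attribute-closure helper. The function names closure and bcnf_violations are assumptions for this example; the relation R(A, B, C, D) with AB → CD and B → C is the one given in Part B of this unit.

```python
def closure(attributes, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) set pairs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the non-trivial FDs whose left side is not a super key of `schema`."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, never a violation
        if not closure(lhs, fds) >= set(schema):
            violations.append((lhs, rhs))
    return violations


schema = {"A", "B", "C", "D"}
fds = [({"A", "B"}, {"C", "D"}), ({"B"}, {"C"})]
print(bcnf_violations(schema, fds))  # [({'B'}, {'C'})]
```

Here AB is a super key, so AB → CD is fine, but the closure of B is only {B, C}, so B → C violates BCNF.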
22. Define Multivalued Dependency
 A multivalued dependency is a full constraint between two sets of attributes in a
relation.
 For a dependency A →→ B, if for a single value of A multiple values of B exist,
independently of the other attributes, then the relation has a multivalued dependency.
23. Define 4NF.
 A relation is in 4NF if it is in Boyce-Codd normal form and has no non-trivial
multivalued dependency other than those whose left side is a super key.
 For a dependency A →→ B, if for a single value of A multiple values of B exist, then
the relation has a multivalued dependency.
24. Define 5NF.
 A relation is in 5NF if it is in 4NF, contains no non-trivial join dependency, and any
joining of its projections is lossless.
 5NF is satisfied when all the tables are broken into as many tables as possible in order
to avoid redundancy.
 5NF is also known as Project-join normal form (PJ/NF).
25. List the decomposition properties. (April/May 2017)
 Lossless: Data should not be lost or created when splitting relations up
 Dependency preservation: It is desirable that FDs are preserved when splitting
relations up
 Normalization to 3NF is always lossless and dependency preserving
 Normalization to BCNF is lossless, but may not preserve all dependencies
26. Differentiate 3NF and BCNF. (Nov/Dec 2010, April/May2011)
3NF | BCNF
There should be no transitive dependency, that is, no non-prime attribute should be transitively dependent on the candidate key. | For any dependency A -> B, A should be a super key of the relation.
It is less strong than BCNF. | It is comparatively stronger than 3NF.
All functional dependencies are preserved. | Functional dependencies may or may not all be preserved.
Lossless decomposition can be achieved together with dependency preservation. | Lossless decomposition can be achieved, but sometimes only by giving up dependency preservation.
27. Why 4NF is more desirable than BCNF?
4NF is more desirable than BCNF because it reduces the repetition of information. If
we consider a BCNF schema not in 4NF we observe that decomposition into 4NF does not
lose information provided that lossless join decomposition is used, yet redundancy is reduced.

PART - B
1. Construct an E-R diagram for a car-insurance company whose customers own one or more
cars each. Each car has associated with it zero to any number of recorded accidents. State any
assumptions you make. (Nov/Dec’18)
2. Consider the following case study describing the academic functioning of a college:
 A college has many departments
 A department would have many students as well as employs many faculty members
 A student can register into various courses; similarly a course can be registered by
many students
 A student lives in a single hostel but a hostel accommodate many students
 A department offers many courses but a particular course is offered by a particular
department
 A faculty teaches many courses. A course is taught by many faculties.
Model a E-R diagram for the above scenario. (April/May’19)
3. Briefly discuss about the concept of functional dependency. (April/May’19)
4. What is normalization? Explain in detail about all the normal forms. (April/May’19)
5. For the following relation schema R and set of functional dependencies F: R(A, B, C, D), F
= { AC ->E, B ->D, E ->A}. List all candidate keys. (Nov/Dec’19)
6. Give an example of a relation that is in 3NF but not in BCNF. How will you convert that
relation into BCNF. (Nov/Dec’18)
7. Given a relation R( A, B, C, D) and Functional Dependency set FD = { AB → CD, B → C
}, determine whether the given R is in 2NF? If not convert it into 2 NF.
8. Given a relation R (P, Q, R, S, T) and Functional Dependency set FD = {PQ → R, S →
T}, determine whether the given R is in 2NF? If not convert it into 2 NF.
9. Given a relation R(X, Y, Z) and Functional Dependency set FD = {X → Y and Y → Z},
determine whether the given R is in 3NF? If not convert it into 3 NF.
UNIT III - TRANSACTIONS
PART - A
1. What is transaction? Mention its properties. (Nov/Dec’14, Nov/Dec’17)
 Collections of operations that form a single logical unit of work are called
transactions.
 The properties of transactions are:
 Atomicity
 Consistency
 Isolation
 Durability
2. What are the two operations for accessing data in transaction?
 Read(x) - transfers the data item x from the database to a local buffer belonging to
the transaction.
 Write(x) - transfers the data item x from the local buffer back to the database.
3. List out the transaction states?
 Active
 Partially Committed
 Failed
 Aborted
 Committed.
4. What is the need for concurrency?
 Improved throughput and resource utilization
 Reduced waiting time.
5. Define lock? Mention its mode. (Nov2009, Nov2011)
 A lock is the most common mechanism used to implement the requirement that a
transaction may access a data item only if it is currently holding a lock on that item.
 The modes of lock are:
 Shared /Read
 Exclusive /Write
6. Define lock table?
 The system maintains a record of the items that are currently locked in a lock table,
which can be organized as a hash file.
7. Define the phases of two phase locking protocol. (April/May 2009)
 Growing phase: a transaction may obtain locks but not release any lock.
 Shrinking phase: a transaction may release locks but may not obtain any new locks.
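The two phases amount to one rule: once a transaction has released any lock, it may not acquire another. A minimal sketch of that check follows; the function name and the ("lock" / "unlock", item) encoding of a transaction's actions are assumptions for this example.

```python
def follows_two_phase_locking(operations):
    """Check that no lock is acquired after the first unlock.

    `operations` is a list of (action, item) pairs for one transaction,
    where action is either "lock" or "unlock".
    """
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True  # the growing phase is over
        elif action == "lock" and shrinking:
            return False  # a lock requested in the shrinking phase breaks 2PL
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(follows_two_phase_locking(two_phase))      # True
print(follows_two_phase_locking(not_two_phase))  # False
```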
8. What are the differences between an exclusive lock and a shared lock?
Exclusive Lock | Shared Lock
An exclusive lock is a lock on a data item when a transaction is about to perform the write operation. | A shared lock allows more than one transaction to read the data items.
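How the two modes interact can be sketched with a toy lock table kept in a Python dict (itself a hash structure). The class name LockTable, its methods, and the "S" / "X" mode labels are assumptions for this example. A real lock manager would also queue waiting requests, which is left out here.

```python
class LockTable:
    """Toy lock table: maps each data item to its lock mode and its holders."""

    def __init__(self):
        self._locks = {}  # item -> (mode, set of transaction ids)

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if it conflicts."""
        if item not in self._locks:
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = self._locks[item]
        if holders == {txn}:
            # The sole holder may keep or upgrade its own lock.
            new_mode = "X" if "X" in (held_mode, mode) else "S"
            self._locks[item] = (new_mode, holders)
            return True
        if held_mode == "S" and mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving an exclusive lock conflicts

    def release(self, txn, item):
        """Drop the lock held by txn on item, if any."""
        if item in self._locks:
            holders = self._locks[item][1]
            holders.discard(txn)
            if not holders:
                del self._locks[item]


table = LockTable()
print(table.request("T1", "A", "S"))  # True: first lock on A
print(table.request("T2", "A", "S"))  # True: shared with T1
print(table.request("T3", "A", "X"))  # False: conflicts with the shared locks
```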
9. When is a transaction rolled back?
 Any changes that the aborted transaction made to the database must be undone.
 Once the changes caused by an aborted transaction have been undone, then the
transaction has been rolled back.
10. What are two pitfalls (problem) of lock-based protocols? (April/May’11)
 Deadlock
 Starvation
11. What benefit does strict two-phase locking provide? What disadvantages result?
Because it produces only cascadeless schedules, recovery is very easy. But the set of
schedules obtainable is a subset of those obtainable from plain two phase locking, thus
concurrency is reduced.
12. Differentiate strict two phase locking protocol and rigorous two phase locking
protocol. (May/June 2016)
 In strict two phase locking protocol all exclusive mode locks taken by a transaction are
held until that transaction commits.
 Rigorous two phase locking protocol requires that all locks be held until the
transaction commits.
13. When is a schedule said to be cascadeless?
A schedule is said to be cascadeless or avoid cascading roll back if every transaction
in the schedule reads only items that were written by committed transactions.
14. Give the reasons for allowing concurrency? (Nov/Dec 2017)
Concurrency is allowed because, if the transactions run serially, a short
transaction may have to wait for a preceding long transaction to complete, which can lead to
unpredictable delays in running a transaction. Concurrent execution reduces these
unpredictable delays.
15. What is serializability? Mention its types. (April/May 2018)
 A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule.
 Different forms of schedule equivalence give rise to the notions of:
 Conflict serializability
 View serializability
16. What is serializable schedule? (April/May 2017)
 To process transactions concurrently, the database server must execute some
component statements of one transaction, then some from other transactions, before
continuing to process further operations from the first.
 The order in which the component operations of the various transactions are
interleaved is called the schedule.
 A schedule is serializable if its effect is equivalent to that of some serial schedule of
the same transactions.
17. When are two operations in a schedule said to conflict?
Two operations conflict when all of the following hold:
 The two operations belong to different transactions.
 The two operations access the same item x.
 At least one of the operations is write-item (x).
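These three conditions are what a precedence-graph test for conflict serializability is built on: each conflicting pair adds an edge from the earlier transaction to the later one, and a cycle means the schedule is not conflict serializable. The sketch below is one possible encoding; the function names and the (transaction, operation, item) tuples are assumptions. The schedule used is the one given in Part B of this unit, R1(A), R2(A), R1(B), R2(B), R3(B), W1(A), W2(B).

```python
def precedence_edges(schedule):
    """Return edges (Ti, Tj) for every conflicting pair where Ti's operation comes first.

    `schedule` is a list of (transaction, operation, item) with operation "R" or "W".
    """
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by `edges`."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        if any(visit(nxt) for nxt in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)


S = [(1, "R", "A"), (2, "R", "A"), (1, "R", "B"), (2, "R", "B"),
     (3, "R", "B"), (1, "W", "A"), (2, "W", "B")]
edges = precedence_edges(S)
print(sorted(edges))     # [(1, 2), (2, 1), (3, 2)]
print(has_cycle(edges))  # True, so S is not conflict serializable
```

R2(A) before W1(A) gives the edge T2 → T1, and R1(B) before W2(B) gives T1 → T2, which together form the cycle.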
18. What is meant by log-based recovery? (April 2009)
 The most widely used structure for recording database modifications is the log.
 The log is a sequence of log records, recording all the update activities in the
database. There are several types of log records.
19. Define blocks? What are its types?
The database system resides permanently on nonvolatile storage, and is partitioned
into fixed length storage units called blocks.
 Physical blocks: The input and output operations are done in block units. The blocks
residing on the disk are referred to as physical blocks.
 Buffer blocks: The blocks residing temporarily in main memory are referred to as
buffer blocks
20. Suppose that there is a database system that never fails. Is a recovery manager
required for this system?
Even in this case the recovery manager is needed to perform roll-back of aborted
transactions.
21. List the four conditions for deadlock. (Nov/Dec 2016)
 Mutual exclusion: at least one resource must be held in a non-sharable mode.
 Hold and wait: there must be a process holding one resource and waiting for another.
 No preemption: resources cannot be preempted.
 Circular wait: there must exist a set of processes [p1, p2, ..., pn] such that p1 is
waiting for p2, p2 for p3, and so on up to pn, which is waiting for p1.
PART - B
1. List the ACID properties. Explain the usefulness of each. (April/May’19)
2. Explain with an example the properties that must be satisfied by a transaction.
(April/May’18)
3. During transaction execution, it passes through several states, until it finally commits or
aborts. List all possible sequences of states through which a transaction may pass. Explain
why each state transition may occur. (April/May’18)
4. Brief the states of a transaction with a neat diagram. (Nov/Dec’19)
5. What is concurrency control? How it is implemented in DBMS? Briefly elaborate with
suitable diagrams and examples. (April/May’19)
6. State and explain the lock based concurrency control with suitable example. (Nov/Dec’17)
7. Explain conflict and view serializability. (April/May’18)
8. Discuss elaborately the two-phase locking protocol that ensures serializability.
(April/May’19, Nov/Dec’19, April/May’18)
9. Discuss in detail about testing of serializability. (April/May’19)
10. Check whether the given schedule S is conflict serializable or not.
S : R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)
11. Check whether the given schedule S is conflict serializable and recoverable or not

12. Check whether the given schedule is view serializable or not.


13. What is deadlock? Explain the four conditions for deadlock with an example.
(April/May’19)
14. Narrate the actions that are considered for deadlock detection and the recovery from
deadlock. (Nov/Dec’19)
15. Outline the concept of deferred and immediate modification versions of the log based
recovery scheme. (April/May’19)
UNIT IV - IMPLEMENTATION TECHNIQUES
PART - A
1. What is meant by software and hardware RAID systems? May/June 2016
RAID can be implemented with no change at the hardware level, using only software
modification. Such RAID implementations are called software RAID systems and the
systems with special hardware support are called hardware RAID systems.
2. What is the use of RAID? April/May2009, Nov/Dec2010
A variety of disk-organization techniques, collectively called redundant arrays of
independent disks are used to improve the performance and reliability.
3. What are the factors to be taken into account when choosing a RAID level?
 Monetary cost of extra disk storage requirements.
 Performance requirements in terms of number of I/O operations
 Performance when a disk has failed and Performances during rebuild.
4. Define hot swapping?
Hot swapping permits the removal of faulty disks and their replacement by new ones without
turning the power off. Hot swapping reduces the mean time to repair.
5. Define seek time.
The time for repositioning the arm is called the seek time, and it increases with the
distance that the arm must move.
6. Define rotational latency time.
The time spent waiting for the sector to be accessed to appear under the head is called
the rotational latency time.
7. What is called mirroring?
The simplest approach to introducing redundancy is to duplicate every disk. This
technique is called mirroring or shadowing.
8. What is an index?
An index is a structure that helps to locate desired records of a relation quickly,
without examining all records.
9. Define Primary index and Secondary Index
 Primary index - In a sequentially ordered file, the index whose search key specifies the
sequential order of the file. It is also called a clustering index. The search key of a
primary index is usually but not necessarily the primary key.
 Secondary index - An index whose search key specifies an order different from the
sequential order of the file. It is also called a non-clustering index.
10. What are called index-sequential files?
The files that are ordered sequentially with a primary index on the search key are
called index-sequential files.
11. Distinguish between primary and secondary index.
Primary Index | Secondary Index
Index on a set of fields that includes the unique primary key and is guaranteed not to contain duplicates. | Index that is not a primary index and may have duplicates.
Requires the rows in data blocks to be ordered on the index key. | Does not have an impact on how the rows are actually organized in data blocks.
There is only one primary index. | There can be multiple secondary indices.
There are no duplicates in the primary key. | There can be duplicates in the secondary index.
12. What is Multilevel Index?
 If primary index does not fit in memory, access becomes expensive.
 To reduce number of disk accesses to index records, treat primary index kept on disk
as a sequential file and construct a sparse index on it.
o outer index – a sparse index of primary index
o inner index – the primary index file
 If even outer index is too large to fit in main memory, yet another level of index can
be created, and so on.
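A two-level lookup can be sketched as follows: the sparse outer index picks the inner index block, and only that block is then searched. The block contents, the record pointer strings and the function name lookup are assumptions for this example; Python lists stand in for disk blocks.

```python
from bisect import bisect_right

# Inner (primary) index: sorted (search_key, record_pointer) entries, split into blocks.
inner_blocks = [
    [(5, "r5"), (12, "r12"), (18, "r18")],
    [(23, "r23"), (31, "r31"), (40, "r40")],
    [(47, "r47"), (52, "r52"), (60, "r60")],
]
# Outer index: sparse, one entry per inner block (the first key of that block).
outer_keys = [block[0][0] for block in inner_blocks]


def lookup(key):
    """Return the record pointer for `key`, or None if the key is absent."""
    # Find the last inner block whose first key is <= key.
    block_no = bisect_right(outer_keys, key) - 1
    if block_no < 0:
        return None  # key is smaller than every indexed key
    for k, pointer in inner_blocks[block_no]:
        if k == key:
            return pointer
    return None


print(lookup(31))  # r31
print(lookup(24))  # None
```

Only the small outer index has to stay in memory; each lookup then touches a single inner block instead of scanning the whole primary index.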
13. State the difference between B tree and B+ tree indexing.
B tree | B+ tree
All internal and leaf nodes have data pointers. | Only leaf nodes have data pointers.
Since not all keys are available at the leaves, search often takes more time. | All keys are at leaf nodes, hence search is faster and accurate.
No duplicate keys are maintained in the tree. | Duplicate keys are maintained and all keys are present at the leaf level.
Leaf nodes are not stored as a structural linked list. | Leaf nodes are stored as a structural linked list.
14. What is B-Tree?
 A B-tree eliminates the redundant storage of search-key values.
 It allows search key values to appear only once.
15. What is a B+-Tree index?
A B+-Tree index takes the form of a balanced tree in which every path from the root
of the tree to a leaf of the tree is of the same length.
16. What are the disadvantages of B-Tree over B+ Tree? (Nov/Dec16)
 Only small fraction of all search-key values are found early
 Non-leaf nodes are larger. Thus, B-Trees typically have greater depth than
corresponding B+-Tree
 Insertion and deletion more complicated than in B+-Trees
 Implementation is harder than B+-Trees.
17. What is a hash index?
A hash index organizes the search keys, with their associated pointers, into a hash file
structure.
18. What are the two main goals of parallelism?
 Load-balance multiple small accesses, so that the throughput of such accesses
increases.
 Parallelize large accesses so that the response time of large accesses is reduced.
19. What is hashing file organization?
In the hashing file organization, a hash function is computed on some attribute of each
record. The result of the hash function specifies in which block of the file the record should
be placed.
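As a minimal sketch of this placement rule, the Python below assumes a file of eight blocks, the hash function h(K) = K mod 8 and a few made-up records; the names `NUM_BLOCKS`, `h` and `place` are hypothetical.

```python
NUM_BLOCKS = 8  # hypothetical number of blocks (buckets) in the file

def h(search_key):
    """Hash function on the chosen attribute: maps a key to a block number."""
    return search_key % NUM_BLOCKS

def place(records, key_of):
    """Put every record into the block selected by the hash of its key."""
    blocks = [[] for _ in range(NUM_BLOCKS)]
    for record in records:
        blocks[h(key_of(record))].append(record)
    return blocks

records = [
    {"id": 17, "name": "A"},
    {"id": 23, "name": "B"},
    {"id": 9, "name": "C"},
]
blocks = place(records, key_of=lambda r: r["id"])

for number, block in enumerate(blocks):
    if block:
        print(number, [r["id"] for r in block])
```

To find a record later, the same hash function is applied to the search key and only the selected block is read.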
20. Differentiate static and dynamic hashing. (Apr/May 15) (Nov/Dec14, 15)
Static Hashing | Dynamic Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same address. | In dynamic hashing, the hash function is made to produce a large number of values, and only a few are used initially.
The number of buckets provided remains unchanged at all times, i.e. it is fixed. | Dynamic hashing provides a mechanism in which data buckets are added and removed dynamically and on demand, i.e. the number of buckets is not fixed.
Space and overhead are higher. | Minimum space and less overhead.
As the file grows, performance decreases. | Performance does not degrade as the file grows.
21. List out the mechanisms to avoid collision during hashing. (Nov/Dec 16)
 In overflow chaining, the overflow buckets of a given bucket are chained together in a
linked list. It is also called closed hashing.
 An alternative, called open hashing, which does not use overflow buckets, is not
suitable for database applications.
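Overflow chaining can be sketched as follows. The bucket capacity, the number of buckets and the class names `Bucket` and `HashFile` are assumptions chosen for illustration; a real system would chain disk blocks rather than Python objects.

```python
class Bucket:
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []     # (key, value) pairs stored in this bucket
        self.overflow = None  # next bucket in the overflow chain

    def insert(self, record):
        if len(self.records) < self.capacity:
            self.records.append(record)
            return
        if self.overflow is None:  # bucket full: chain a new overflow bucket
            self.overflow = Bucket(self.capacity)
        self.overflow.insert(record)

    def find(self, key):
        bucket = self
        while bucket is not None:  # walk the linked list of buckets
            for k, value in bucket.records:
                if k == key:
                    return value
            bucket = bucket.overflow
        return None

    def chain_length(self):
        rest = self.overflow.chain_length() if self.overflow is not None else 0
        return 1 + rest

class HashFile:
    def __init__(self, num_buckets=4, capacity=2):
        self.buckets = [Bucket(capacity) for _ in range(num_buckets)]

    def insert(self, key, value):
        self.buckets[key % len(self.buckets)].insert((key, value))

    def find(self, key):
        return self.buckets[key % len(self.buckets)].find(key)

f = HashFile()
for k in (1, 5, 9, 13, 2):  # 1, 5, 9 and 13 all hash to bucket 1
    f.insert(k, f"rec{k}")

print(f.buckets[1].chain_length())  # primary bucket plus its overflow bucket
print(f.find(13))
```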
22. What is known as heap file organization? (Nov/Dec 2009)
In the heap file organization, any record can be placed anywhere in the file where
there is space for the record. There is no ordering of records. There is a single file for each
relation.
23. What is known as sequential file organization? (April/May 2009)
In the sequential file organization, the records are stored in sequential order,
according to the value of a “search key” of each record.
24. What is called query processing?
Query processing refers to the range of activities involved in extracting data from a
database.
25. What is called a query evaluation plan?
A sequence of primitive operations that can be used to evaluate a query is a query
evaluation plan or a query execution plan.
26. Define Query optimization. (May/June 16)
Query optimization refers to the process of finding the lowest cost method of
evaluating a given query.
27. State the need for Query Optimization. (Apr/May 15)
The query optimizer attempts to determine the most efficient way to execute a given
query by considering the possible query plans.
28. What are Cost Components of Query Execution?
The cost of executing the query includes the following components:
 Access cost to secondary storage.
 Storage cost.
 Computation cost.
 Memory usage cost.
 Communication cost.
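The way an optimizer could compare plans using the cost components above can be sketched as follows. The plan names and all cost figures are invented for illustration, and adding the components with equal weight is an assumption; real optimizers estimate these values from catalog statistics.

```python
from dataclasses import dataclass

@dataclass
class PlanCost:
    name: str
    access: float         # access cost to secondary storage
    storage: float        # cost of storing intermediate results
    computation: float    # CPU cost
    memory: float         # memory usage cost
    communication: float  # cost of shipping data between sites

    def total(self):
        return (self.access + self.storage + self.computation
                + self.memory + self.communication)

def choose_plan(plans):
    """Query optimization in miniature: pick the plan with the lowest total cost."""
    return min(plans, key=PlanCost.total)

# Two hypothetical evaluation plans for the same query
plans = [
    PlanCost("full scan + filter", access=1000, storage=0,
             computation=50, memory=5, communication=0),
    PlanCost("index lookup", access=40, storage=0,
             computation=10, memory=5, communication=0),
]
best = choose_plan(plans)
print(best.name, best.total())
```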
PART - B
1. Explain the various levels of RAID systems. (Nov/ Dec 2019)
2. Explain about B trees indexing concepts with an example.
3. What is hashing? Explain static hashing and dynamic hashing with an example.
(April/May 2018, 2019)
4. Describe the structure of B+ tree and give the algorithm for search in the B+ tree with
example. (April/May 2019)
5. Write short note on secondary storage devices.
6. Discuss in detail about Query optimization with neat diagram.
7. What is RAID? Briefly discuss about RAID. (April/May 2019)
8. Suppose that we are using extendable hashing on a file that contains records with the
following search key values: 2, 3, 5, 7, 11, 17, 19, 23, 29, 31. Show the extendable hash
structure for this file if the hash function is h(x) = x mod 8 and the buckets can hold three
records.
9. Consider the below B+ tree, Show the form of the tree after performing following
operations.
(i) Insert 9, 10 and 8. (ii) Delete 23 and 19.
10. Construct a B+ tree for the cases order 6 and order 8 for the following set of key values.
2, 3, 5, 7, 11, 17, 19, 23, 29, 31
11. With simple algorithms explain the computing of Nested-loop join and Block nested-loop
join.
12. Sketch and concisely explain the basic steps in Query Processing. (Nov/ Dec 2019)
UNIT V - ADVANCED TOPICS
PART - A
1. Define Distributed Database Systems. (Nov/Dec 16)
A distributed database system is a database spread over multiple machines (also referred
to as sites or nodes) that are interconnected by a network. The database is shared by users
on the multiple machines.
2. What are Goals of Distributed Database system?
 Reliability: In a distributed database system, if one system fails or stops working
for some time, another system can complete the task.
 Availability: In a distributed database system, availability can be maintained even if a
server fails, because another system is available to serve the client request.
 Performance: Performance can be improved by distributing the database over different
locations, so that the data is available at every location and is easy to maintain.
3. What is a homogeneous distributed database and a heterogeneous distributed database?
 A homogeneous distributed database has identical software and hardware running all
databases instances, and may appear through a single interface as if it were a single
database.
 A heterogeneous distributed database may have different hardware, operating
systems, database management systems, and even data models for different databases.
4. Define fragmentation in Distributed Database.
 The system partitions the relation into several fragments and stores each fragment at
a different site.
 Two approaches:
o Horizontal Fragmentation
o Vertical Fragmentation
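The two approaches can be illustrated with a small Python sketch. The `account` relation, its attribute names and the helper functions are made up for this example: horizontal fragmentation splits the rows by a predicate, while vertical fragmentation splits the columns and repeats the key so the relation can be rebuilt.

```python
account = [
    {"acc_no": 1, "branch": "Chennai", "balance": 500},
    {"acc_no": 2, "branch": "Madurai", "balance": 900},
    {"acc_no": 3, "branch": "Chennai", "balance": 250},
]

def horizontal_fragment(relation, predicate):
    """Keep whole tuples that satisfy the predicate (a selection)."""
    return [row for row in relation if predicate(row)]

def vertical_fragment(relation, attributes):
    """Keep only the listed attributes of every tuple (a projection)."""
    return [{a: row[a] for a in attributes} for row in relation]

# Horizontal fragments: one per branch, each stored at a different site
chennai = horizontal_fragment(account, lambda r: r["branch"] == "Chennai")
madurai = horizontal_fragment(account, lambda r: r["branch"] == "Madurai")

# Vertical fragments: both keep the key acc_no
branch_part = vertical_fragment(account, ["acc_no", "branch"])
balance_part = vertical_fragment(account, ["acc_no", "balance"])

# Reconstruction: union for horizontal fragments, join on the key for vertical ones
rebuilt_h = sorted(chennai + madurai, key=lambda r: r["acc_no"])
by_key = {r["acc_no"]: r for r in balance_part}
rebuilt_v = [{**r, **by_key[r["acc_no"]]} for r in branch_part]

print(rebuilt_h == account, rebuilt_v == account)
```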
5. Define Database replication.
Database replication can be used on many database management systems, usually
with a master/slave relationship between the original and the copies. The master logs the
updates, which then ripple through to the slaves. The slave outputs a message stating that it
has received the update successfully, thus allowing the sending of subsequent updates.
6. What is Object database System?
An object database is a database management system in which information is
represented in the form of objects as used in object oriented programming. Object-relational
databases are a hybrid of both approaches.
7. Define ODMG Object model?
The ODMG object model is the data model upon which the object definition language
(ODL) and object query language (OQL) are based.
8. Define ODL.
 Object Definition Language (ODL) is the specification language that defines the
interface to object types conforming to the ODMG Object Model. Its purpose is to
define the structure of an Entity-relationship diagram.
 The ODL language is used to create object specifications: classes and interfaces.
 Specific language bindings specify how ODL constructs can be mapped to constructs
in specific programming languages, such as C++ and Java.
9. Define Information Retrieval.
It is an activity of obtaining information resources relevant to an information need
from a collection of information resources.
10. Define Relevance Ranking. (Nov/Dec 14)
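Relevance ranking is the ordering of retrieved documents by their estimated relevance to
the query. For Web pages, the search engine may also use links, for example by trying to
determine the theme of the site that a link is coming from.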
11. Can we have more than one constructor in a class? If yes, explain the need for such a
situation. (Nov/Dec 15)
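Yes. A class can have more than one constructor, for example a default constructor and a
constructor with parameters, so that objects can be created with different initial values.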
12. Define XML Database.
An XML database is a data persistence software system that allows data to be stored
in XML format. These data can then be queried, exported and serialized into the desired
format. XML databases are usually associated with document-oriented databases.
13. How does the concept of an object in the object-oriented model differ from the
concept of an entity in the entity-relationship model? (Nov/Dec 16)
An entity is simply a collection of variables or data items. An object is an
encapsulation of data as well as the methods (code) to operate on the data. The data members
of an object are directly visible only to its methods. The outside world can gain access to the
object’s data only by passing pre-defined messages to it and these messages are implemented
by the methods.
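The contrast can be sketched in Python. The `Account` class and its fields are hypothetical, and Python only discourages outside access to double-underscore attributes (by name mangling) rather than strictly forbidding it, so this illustrates the idea, not enforced encapsulation.

```python
# An entity: simply a collection of data items, with no behaviour attached.
account_entity = {"acc_no": 101, "balance": 500}

# An object: the data is wrapped together with the methods that operate on it.
class Account:
    def __init__(self, acc_no, balance):
        self.__acc_no = acc_no    # intended to be visible only to the methods
        self.__balance = balance

    def deposit(self, amount):
        """A pre-defined message the outside world may send to the object."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def balance(self):
        return self.__balance

acct = Account(101, 500)
acct.deposit(250)
print(acct.balance())

# The entity's data can be changed by anyone, with no checks at all.
account_entity["balance"] = -1
```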
14. Is XML Hierarchical?
XML documents have a hierarchical structure and can conceptually be interpreted as a
tree structure, called an XML tree. XML documents must contain a root element (one that is
the parent of all other elements). All elements in an XML document can contain subelements,
text and attributes.
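The tree structure can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. The sample `emp` document and the helper `outline` are assumptions made for this sketch; indentation in the result reflects the depth of each element below the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with a single root element <emp>
doc = """<emp>
  <ename>Ravi</ename>
  <children>
    <child>
      <name>Anu</name>
    </child>
  </children>
</emp>"""

root = ET.fromstring(doc)

def outline(element, depth=0):
    """List every element tag, indented by its depth in the XML tree."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its subelements
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(root)
print("\n".join(tree_lines))
```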
15. What is DTD?
A document type definition (DTD) contains a set of rules that can be used to validate
an XML file. After you have created a DTD, you can edit it manually, adding declarations that
define elements, attributes, entities, and notations, and how they can be used for any XML
files that reference the DTD file.
16. Define XML schema.
XML Schema defines a number of built-in types such as string, integer, decimal, date,
and boolean. In addition, it allows user-defined types; these may be simple types with added
restrictions, or complex types constructed using constructors such as complexType and
sequence.
17. What is the use of XML Schema?
XML Schema is commonly known as XML Schema Definition (XSD). It is used to
describe and validate the structure and the content of XML data. XML schema defines the
elements, attributes and data types. Schema element supports Namespaces.
18. What are XPath and XQuery?
XPath is a syntax for defining parts of an XML document. It uses path expressions to
navigate through elements and attributes in an XML document, and it contains a library of
standard functions. XPath is a major element in XSLT and in XQuery. XQuery is a query
language for XML data that is built on XPath expressions.
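Path expressions can be tried out with Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath. The `company` document below is a made-up sample used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data
root = ET.fromstring("""<company>
  <emp id="e1"><ename>Ravi</ename><skill>SQL</skill></emp>
  <emp id="e2"><ename>Mala</ename><skill>XML</skill><skill>SQL</skill></emp>
</company>""")

# Path expression: every <ename> that is a child of an <emp> under the root
names = [e.text for e in root.findall("./emp/ename")]

# Path expression with an attribute predicate: skills of the employee with id e2
e2_skills = [s.text for s in root.findall("./emp[@id='e2']/skill")]

print(names)
print(e2_skills)
```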
19. What are the Types of Queries in IR systems?
 Keyword Queries
 Boolean Queries
 Phrase Queries
 Proximity Queries
 Wildcard Queries
 Natural Language Queries
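Several of these query types can be sketched over a tiny positional inverted index. The three sample documents and the function names are assumptions for illustration; proximity and natural language queries are left out to keep the sketch short.

```python
docs = {
    1: "database systems store data",
    2: "distributed database systems",
    3: "information retrieval systems rank documents",
}

# Positional inverted index: term -> {doc_id: [positions of the term]}
index = {}
for doc_id, text in docs.items():
    for position, term in enumerate(text.split()):
        index.setdefault(term, {}).setdefault(doc_id, []).append(position)

def keyword(term):
    """Keyword query: documents that contain the term."""
    return set(index.get(term, {}))

def boolean_and(a, b):
    """Boolean query: a AND b."""
    return keyword(a) & keyword(b)

def boolean_not(a):
    """Boolean query: NOT a."""
    return set(docs) - keyword(a)

def phrase(first, second):
    """Phrase query: 'first second' as adjacent words, in that order."""
    hits = set()
    for doc_id in boolean_and(first, second):
        if any(p + 1 in index[second][doc_id] for p in index[first][doc_id]):
            hits.add(doc_id)
    return hits

def wildcard(prefix):
    """Wildcard query of the form 'prefix*'."""
    matches = (set(postings) for term, postings in index.items()
               if term.startswith(prefix))
    return set().union(*matches)

print(keyword("database"), boolean_and("database", "distributed"))
print(phrase("database", "systems"), wildcard("data"))
```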
20. What are the retrieval models?
 Statistical models - Boolean, vector space, and probabilistic
 Semantic model
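As an illustration of the vector space model, the sketch below represents the query and each document as a term-frequency vector and ranks documents by cosine similarity. The sample documents are made up, and raw term frequency is used as the weight for simplicity; practical systems usually apply TF-IDF weighting instead.

```python
import math
from collections import Counter

def vector(text):
    """Term-frequency vector of a text (whitespace tokenization)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank(query, docs):
    """Return ids of matching documents, most similar to the query first."""
    q = vector(query)
    scored = [(cosine(q, vector(text)), doc_id) for doc_id, text in docs.items()]
    ordered = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in ordered if score > 0]

docs = {
    "d1": "database index structures",
    "d2": "hash index and b+ tree index",
    "d3": "xml schema",
}
ranking = rank("hash index", docs)
print(ranking)
```

Document d3 shares no term with the query, so it receives a score of zero and is left out of the ranking.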
21. Write the System Failure Modes
 Failure of a site.
 Loss of messages.
 Failure of a communication link.
 Network partition
22. Write the Similarities between E/R and ODL.
 Both support all multiplicities of relationships
 Both support inheritance
PART - B
1. Explain in detail the Client - Server Architecture for DDBMS.
2. Explain in detail (i) Information Retrieval (ii) Transaction processing (Nov/Dec 14)
3. Explain about Object Oriented Databases and XML Databases.
4. Write short notes on Distributed Transactions. (Nov/Dec 14)
5. Suppose an Object Oriented database had an object A, which references object B, which in
turn references object C. Assume all objects are on disk initially. Suppose a program first
dereferences A, then dereferences B by following the reference from A, and then finally
dereferences C. Show the objects that are represented in memory after each dereference,
along with their state. (Nov/Dec 15)
6. Suppose that you have been hired as a consultant to choose a database system for your
client’s application. For each of the following applications, state what type of database
system (relational, persistent programming language–based OODB, object relational; do not
specify a commercial product) you would recommend. Justify your recommendation.
(i) A computer-aided design system for a manufacturer of airplanes.
(ii) A system to track contributions made to candidates for public office.
(iii) An information system to support the making of movies. (Nov/Dec 16)
7. Give the DTD for an XML representation of the following nested relational schema
Emp = (ename, ChildrenSet setof(Children), SkillsSet setof(Skills))
Children = (name, Birthday)
Birthday = (day, month, year)
Skills = (type, ExamsSet setof(Exams)).
Exams = (year, city) (Nov/Dec 16)
8. Explain various queries in IR Systems with an example.
9. Explain XML Schema with an example.
10. Explain ODMG – Object Model in detail.
