ADBMS 100% Crack Notes in 3 Hours


1. Define Distributed Database

Ans:
A distributed database (DDB) is a collection of multiple, logically interrelated databases
distributed over a computer network.

2. Define Distributed Computing

Ans:
A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks.

3. What is the main objective of a Database?

Main Objectives of a Database

Data Management:
The primary objective of a database is to manage data efficiently. This involves storing, organizing, retrieving, and updating data in a structured and systematic way.

Data Integrity:
Ensuring the accuracy and consistency of data is crucial. Databases enforce rules
(like constraints, keys, and triggers) to maintain data integrity and prevent errors
like duplication or invalid entries.

Data Security:
Protecting sensitive data from unauthorized access or breaches is a key objective.
Databases implement access controls, encryption, and auditing mechanisms to
secure data.

Data Availability:
A database aims to make data readily available to authorized users. This means
ensuring that data can be accessed quickly and reliably whenever needed.
4. What are the two main resources that need to be managed in Distributed
Environments?

In distributed environments, the two main resources that need to be managed are:

1. Computational Resources: This includes processing power, memory, and storage
across different nodes. Management focuses on load balancing, resource allocation,
and ensuring fault tolerance.
2. Communication Resources: This involves the network infrastructure that connects
nodes. Management focuses on optimizing bandwidth, reducing latency, and ensuring
secure data transmission.

6. Explain Functional Layers of a Centralized DBMS.

1. Interface Layer

● User Interfaces: This component is responsible for interaction between the end-user or
application and the DBMS. It provides a means to execute queries and view results.
● View Management: It manages different views of the database, ensuring that users can see
only the data they are authorized to access. The relational calculus is often used here to
define queries.

2. Control Layer

● Semantic Integrity Control: Ensures that data adheres to certain rules or constraints. For
example, it might enforce that a "birthdate" column cannot have a future date.
● Authorization Checking: Handles security by ensuring that only authorized users can access
or modify the data. It often relies on relational calculus to verify these permissions.

3. Compilation Layer

● Query Decomposition and Optimization: Breaks down complex queries into simpler
operations and optimizes them for efficient execution. It converts high-level queries into a
series of relational algebra operations.
● Access Plan Management: Creates and manages the execution plans for queries,
determining the most efficient way to retrieve or update data.

4. Execution Layer

● Access Plan Execution Control: Manages the actual execution of the access plan generated
in the compilation layer.
● Algebra Operation Execution: Carries out the operations based on relational algebra, such
as selection, projection, and join. This is where data is actually retrieved or updated.

5. Data Access Layer

● Buffer Management: Handles data buffering, which involves temporarily storing data in
memory to optimize access speed.
● Access Methods: Manages the methods used to retrieve and store data, such as indexing or
scanning. This layer is responsible for efficiently fetching or updating the data in the
database.

6. Consistency Layer

● Concurrency Control: Ensures that multiple transactions can occur simultaneously without
leading to data inconsistency. It uses various algorithms to manage concurrent access to the
database.
● Logging: Maintains a log of all operations for recovery purposes. If something goes wrong,
the log can be used to restore the database to a consistent state.
7. Explain Client/Server Reference Architecture.

Client Side
● User Interface: The front-end where users interact with the system through a
GUI or command-line.
● Application Program: Sends SQL queries to the DBMS, executing data management
tasks.
● Client DBMS: Prepares and sends queries to the server, receiving results for the
application.
● Communication Software: Manages the network transmission of SQL queries and
results between client and server.

Server Side (System)

● Communication Software: On the server side, this software receives SQL queries
from the client and sends results back after processing. It manages the network
communication with the client.
● Semantic Data Controller: Ensures that the data complies with the defined
semantics, enforcing constraints and ensuring data integrity.
● Query Optimizer: Analyzes SQL queries to find the most efficient way to execute
them, considering factors like indexes, join methods, and data distribution.
● Transaction Manager: Manages database transactions, ensuring that they are
executed reliably and follow the ACID (Atomicity, Consistency, Isolation,
Durability) properties.
● Recovery Manager: Responsible for restoring the database to a consistent state in
case of failures, such as system crashes or power outages.
● Runtime Support Processor: Handles the execution of operations and procedures
required by the SQL queries during runtime, working with the other components to
execute and optimize the queries.

This is a two-level architecture where the functionality is divided into servers and clients.
The server functions primarily encompass data management, query processing, optimization
and transaction management. Client functions mainly include the user interface. However,
clients may also perform some functions, such as consistency checking and local transaction
management.

8. Explain ANSI/SPARC Architecture.

The ANSI/SPARC Architecture is a high-level standard for designing database
management systems (DBMS). It is characterized by its three-layer design:
1. External Level: This is the highest level, closest to the user. It presents a
simplified view of the database, hiding internal details and providing a consistent
interface to users. The external level is responsible for:
○ Defining views and queries
○ Providing data independence
○ Concealing physical storage details
2. Conceptual Level: This level represents the logical structure of the database,
independent of the physical storage. It defines the relationships between data
entities, ensuring data consistency and integrity. The conceptual level:
○ Specifies the database schema
○ Defines data semantics and relationships
○ Is independent of the physical storage and access methods
3. Internal Level: This is the lowest level, responsible for physical storage and access
to the data. It defines how data are stored, indexed, and retrieved. The internal
level:
○ Specifies storage structures and access methods
○ Optimizes data retrieval and storage
○ Is specific to the underlying hardware and operating system

The ANSI/SPARC Architecture aims to provide a useful abstraction, simplifying
database access and management at varying levels of complexity. It emphasizes data
independence, allowing changes to the physical storage or access methods without
affecting the logical structure of the database.

9. Explain Distributed Database Reference Architecture


1. Global Schema

● Overview: Represents the unified view of the entire database system as seen by
users. It defines the structure, relationships, and constraints of data across the
distributed system.
● Function: Provides a consistent interface for users, abstracting the complexity of
data distribution and allowing seamless querying and manipulation.

2. Fragmentation Schema

● Overview: Defines how the global data is divided into smaller pieces called
fragments, distributed across different sites.
● Function: Improves performance by storing data closer to where it's needed and
enables parallel processing of different fragments across sites.

3. Allocation Schema

● Overview: Specifies where each data fragment is stored within the distributed
system, mapping fragments to physical locations.
● Function: Optimizes query performance by reducing data transfer and ensures
efficient resource use based on access patterns and site capacities.

4. Local Mapping Schema

● Overview: Maps the global schema to the local schema at each site, allowing
independent data management while staying consistent with the global schema.
● Function: Translates global queries into operations on the local database, bridging
global and local levels.

5. DBMS of Site1 and Site2

● Overview: Each site has its own DBMS, managing the data stored locally and
operating independently.
● Function: Handles local query processing, transaction management, and recovery,
while coordinating with the global DBMS to maintain system-wide consistency.
6. Local Databases at Site1 and Site2

● Overview: The physical databases at each site, storing a portion of the global data
as defined by fragmentation and allocation schemas.
● Function: Managed by the local DBMS, these databases allow independent querying
and updating while maintaining overall system integrity.

10. Explain Components of a Distributed DBMS.

The different components of DDBMS are as follows:

• Computer workstations or remote devices (sites or nodes) that form the network
system. The distributed database system must be independent of the computer
system hardware.

• Network hardware and software components that reside in each workstation or
device. The network components allow all sites to interact and exchange data.
Because the components—computers, operating systems, network hardware, and so
on—are likely to be supplied by different vendors, it is best to ensure that
distributed database functions can be run on multiple platforms.

• Communications media that carry the data from one node to another. The DDBMS
must be communications media-independent; that is, it must be able to support
several types of communications media.

• The transaction processor (TP), which is the software component found in each
computer or device that requests data. The transaction processor receives and
processes the application’s data requests (remote and local). The TP is also known
as the application processor (AP) or the transaction manager (TM).

• The data processor (DP), which is the software component residing on each
computer or device that stores and retrieves data located at the site. The DP is
also known as the data manager (DM). A data processor may even be a centralized
DBMS.
The following Figure illustrates the placement of the components and the
interaction among them. The communication among TPs and DPs shown in the figure
is made possible through a specific set of rules, or protocols, used by the DDBMS.

The protocols determine how the distributed database system will:

• Interface with the network to transport data and commands between data
processors (DPs) and transaction processors (TPs).

• Synchronize all data received from DPs (TP side) and route retrieved data to the
appropriate TPs (DP side).

20. Explain Data Independence.

Data independence is a key concept in database management systems (DBMS) that allows
users and applications to interact with data without being affected by changes to the
database structure. It allows for the separation of data storage and data access.
11. Explain Multidatabase System Architecture.

Multi-DBMS Architectures

This is an integrated database system formed by a collection of two or more autonomous
database systems.

A Multi-DBMS can be expressed through six levels of schemas:

Multi-database View Level − Depicts multiple user views, each comprising a subset of the
integrated distributed database.
Multi-database Conceptual Level − Depicts the integrated multi-database, comprising the
global logical multi-database structure definitions.
Multi-database Internal Level − Depicts the data distribution across different sites and
the multi-database to local data mapping.
Local database View Level − Depicts the public view of local data.
Local database Conceptual Level − Depicts the local data organization at each site.
Local database Internal Level − Depicts the physical data organization at each site.
12. Explain Horizontal and Vertical Fragmentation with example.

Horizontal Fragmentation

Definition:

● Horizontal Fragmentation involves dividing a database table into multiple fragments, where
each fragment contains a subset of rows (records) based on certain criteria. All fragments
share the same set of attributes (columns) but differ in the rows they store.

Consider a CUSTOMER table with the following schema:

CUSTOMER(CUST_ID, NAME, CITY, AGE)

Suppose we want to horizontally fragment the CUSTOMER table based on the CITY attribute.

● Fragment 1: Customers from "New York":

CUSTOMER_NY(CUST_ID, NAME, CITY, AGE)

● Fragment 2: Customers from all other cities:

CUSTOMER_OTHER(CUST_ID, NAME, CITY, AGE)

Vertical Fragmentation

Definition:

● Vertical Fragmentation involves dividing a database table into multiple fragments, where
each fragment contains a subset of columns (attributes). Each fragment typically includes
the primary key to allow reassembly of the original table when needed.

Example:

Using the same CUSTOMER table:


CUSTOMER(CUST_ID, NAME, CITY, AGE)

● Suppose we want to vertically fragment the CUSTOMER table into two fragments:

○ Fragment 1: Contains identification and basic details:

CUSTOMER_BASIC(CUST_ID, NAME)

○ Fragment 2: Contains additional details:

CUSTOMER_DETAILS(CUST_ID, CITY, AGE)
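Both fragmentation styles, and the reconstruction property they must satisfy, can be sketched with a toy in-memory relation. This is plain Python for illustration only (the sample rows are invented), not DDBMS code:

```python
# Toy CUSTOMER(CUST_ID, NAME, CITY, AGE) relation as a list of dicts.
customers = [
    {"CUST_ID": 1, "NAME": "Alice", "CITY": "New York", "AGE": 30},
    {"CUST_ID": 2, "NAME": "Bob",   "CITY": "London",   "AGE": 45},
    {"CUST_ID": 3, "NAME": "Carol", "CITY": "New York", "AGE": 27},
]

# Horizontal fragmentation: a selection on CITY splits the rows.
customer_ny    = [r for r in customers if r["CITY"] == "New York"]
customer_other = [r for r in customers if r["CITY"] != "New York"]

# Vertical fragmentation: projections that both keep the primary key CUST_ID.
customer_basic   = [{"CUST_ID": r["CUST_ID"], "NAME": r["NAME"]} for r in customers]
customer_details = [{"CUST_ID": r["CUST_ID"], "CITY": r["CITY"], "AGE": r["AGE"]}
                    for r in customers]

# Reconstruction: union for horizontal fragments, join on CUST_ID for vertical.
horizontal_rebuilt = customer_ny + customer_other
details_by_id = {r["CUST_ID"]: r for r in customer_details}
vertical_rebuilt = [{**b, **details_by_id[b["CUST_ID"]]} for b in customer_basic]
```

Note how the vertical fragments both carry CUST_ID: without the shared key, the original rows could not be reassembled by a join.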
13. Explain Components of an MDBS.

1. User

● The user interacts with the MDBS by sending requests and receiving responses.
The user is typically unaware of the underlying multiple databases being managed by
the system.

2. MDBS Layer

● MDBS Layer: This is the central component that manages interactions between the
user and the underlying component DBMSs. It handles user requests, coordinates
operations across multiple databases, and consolidates the results into a unified
response.
● System Responses: The MDBS layer aggregates and processes the data from
different component DBMSs and sends the final result back to the user.
● User Requests: The MDBS layer receives queries from the user, which may involve
accessing multiple databases.

3. Component DBMS

● Component DBMS: These are the individual database management systems that the
MDBS coordinates. Each Component DBMS manages its own database
independently.
● Each Component DBMS processes parts of the query as instructed by the MDBS
layer and then sends the required data back to the MDBS layer.
14. Explain coarse grained and fine-grained parallelism.

Coarse-Grained Parallelism

● Overview: Coarse-grained parallelism involves dividing a task into large, independent
chunks that can run in parallel with minimal interaction.
● Characteristics:
○ Large tasks with infrequent communication.
○ Lower synchronization overhead.
● Use Cases: Ideal for distributed systems where communication costs are high, such
as batch processing or large-scale simulations.

Fine-Grained Parallelism

● Overview: Fine-grained parallelism breaks a task into smaller units that frequently
interact and synchronize.
● Characteristics:
○ Small tasks with frequent communication.
○ Higher synchronization overhead.
● Use Cases: Best suited for shared memory systems or real-time applications where
rapid, coordinated processing is needed.
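As a toy illustration (the task, chunk sizes, and worker count are arbitrary choices for this sketch, not DDBMS internals), the same summation can be split both ways with Python's concurrent.futures:

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1000))

def coarse_sum(data, workers=4):
    # Coarse-grained: a few large chunks, one task per worker,
    # almost no communication between tasks.
    chunk = len(data) // workers
    parts = [data[i * chunk:(i + 1) * chunk] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return sum(ex.map(sum, parts))

def fine_sum(data):
    # Fine-grained: one tiny task per element, so scheduling and
    # synchronization overhead dominates the useful work.
    with ThreadPoolExecutor(max_workers=4) as ex:
        return sum(ex.map(lambda x: x, data))

print(coarse_sum(data), fine_sum(data))  # both print 499500
```

Both versions compute the same result; the difference is where the time goes, which is why coarse granularity suits high-latency distributed systems and fine granularity suits shared memory.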

15. Explain Test and Set(M) and Compare and Swap(M) atomic instructions.

Test-And-Set (M):

● Purpose: To manage access to a shared resource in concurrent programming by using a
mutual exclusion mechanism.
● How It Works:
○ Initial State: The memory location M is initially set to 0.
○ Operation: When a process executes Test-and-set(M), it sets M to 1 and
returns the old value of M.
○ Outcome:
■ If the return value is 0, the process has successfully acquired the mutex.
■ If the return value is 1, the mutex is already held by another process, so it
must try again later.
○ Releasing the Mutex: This is done by setting M back to 0, allowing other
processes to acquire the mutex.
Compare-and-Swap (M, V1, V2):

● Purpose: Provides a way to perform conditional updates to shared variables atomically,
crucial for implementing lock-free data structures.
● How It Works:
○ Operation: Compare-and-swap(M, V1, V2) atomically checks if the value of M
equals V1.
■ If true, it sets M to V2 and returns success.
■ If false, it leaves M unchanged and returns failure.
○ Use Case:
■ With M = 0 initially, CAS(M, 0, 1) behaves like Test-and-set(M).
■ Alternatively, CAS(M, 0, id) can be used to not only acquire the mutex but
also to record which thread or process id has the mutex.
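Real Test-and-set and Compare-and-swap are single hardware instructions; the sketch below only simulates their semantics in Python, with a Lock standing in for the hardware's atomicity guarantee:

```python
import threading

class AtomicCell:
    """Simulated atomic memory location M (a Lock stands in for hardware atomicity)."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def test_and_set(self):
        # Set M to 1 and return the OLD value, in one indivisible step.
        with self._guard:
            old = self._value
            self._value = 1
            return old

    def compare_and_swap(self, expected, new):
        # If M == expected, store new and report success; otherwise leave M alone.
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

M = AtomicCell(0)
assert M.test_and_set() == 0      # first caller sees 0: mutex acquired
assert M.test_and_set() == 1      # second caller sees 1: must retry later

M = AtomicCell(0)
assert M.compare_and_swap(0, 42)       # acquire and record owner id 42
assert not M.compare_and_swap(0, 7)    # fails: M is now 42, not 0
```

The final two calls show the CAS(M, 0, id) idiom from the notes: the stored value doubles as a record of which process holds the mutex.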

16. Explain optimization techniques for Data Servers.


Indexing:

● Definition: Use indexes to speed up data retrieval by providing quick access paths to
data.
● Example: Indexing columns frequently used in WHERE clauses improves query
performance.

Caching:

● Definition: Store frequently accessed data in memory to reduce database load and
improve response times.
● Example: Implementing a caching layer for common queries.

Query Optimization:

● Definition: Refine SQL queries to reduce execution time and resource usage.
● Example: Avoiding SELECT * and using specific columns instead.

Load Balancing:

● Definition: Distribute incoming requests evenly across multiple servers to prevent
any single server from becoming a bottleneck.
● Example: Using a load balancer to route requests to the least busy server.
Partitioning:

● Definition: Split large tables into smaller, more manageable pieces to improve query
performance.
● Example: Horizontal partitioning based on date ranges in a transaction table.

17. Explain Speed up and scale up.

Speedup

● Definition: Speedup measures how much faster a fixed task completes when more
resources (processors, sites) are added. It is the ratio of the execution time on the
original system to the execution time on the larger system; linear speedup means N
times the resources finish the same task N times faster.

Scaleup

● Definition: Scaleup measures the ability of a system to handle an N-times-larger
task when resources are increased N times. Linear scaleup means the execution time
stays constant as problem size and resources grow together.
18. Explain throughput and response time.

Throughput

● Definition: Throughput is the amount of work a system can process in a given period
of time.
● Example: In a web server, throughput could be measured as the number of requests
handled per second.

Response Time

● Definition: Response time is the time taken for a system to respond to a request
from the moment it is made.
● Example: In a database query, response time is the time from submitting the query
to receiving the result.
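The two metrics can be measured for a toy request handler as follows (the handler and its 1 ms of simulated work are invented for the sketch):

```python
import time

def handle_request():
    time.sleep(0.001)  # pretend work: ~1 ms per request

N = 50
latencies = []
start = time.perf_counter()
for _ in range(N):
    t0 = time.perf_counter()
    handle_request()
    latencies.append(time.perf_counter() - t0)  # response time of one request
elapsed = time.perf_counter() - start

throughput = N / elapsed            # requests completed per second
avg_response = sum(latencies) / N   # seconds per request
print(f"{throughput:.0f} req/s, {avg_response * 1000:.2f} ms avg response")
```

Note the two numbers need not move together: adding servers can raise throughput without improving the response time of any single request.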
19. Consider an engineering firm that has offices in Boston, Waterloo, Paris and
San Francisco. They run projects at each of these sites and would like to
maintain a database of their employees, the projects and other related data.
Assuming that the database is relational, we can store this information in two
relations: EMP (ENO, ENAME, TITLE) and PROJ (PNO, PNAME,
BUDGET). We also introduce a third relation to store salary information: SAL
(TITLE, AMT) and a fourth relation ASG, which indicates which employees
have been assigned to which projects, for what duration, and with what
responsibility: ASG (ENO, PNO, RESP, DUR). Write down the SQL query to find
the names of employees who worked on a project for more than 24 months.

Ans:

SELECT E.ENAME
FROM EMP E
JOIN ASG A ON E.ENO = A.ENO
WHERE A.DUR > 24;
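The query can be tried against a small in-memory SQLite database (the sample rows below are made up for the demonstration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE EMP (ENO INTEGER PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE ASG (ENO INTEGER, PNO INTEGER, RESP TEXT, DUR INTEGER);
    INSERT INTO EMP VALUES (1, 'J. Doe', 'Elect. Eng.'), (2, 'M. Smith', 'Analyst');
    INSERT INTO ASG VALUES (1, 10, 'Manager', 36), (2, 10, 'Analyst', 12);
""")

# The query from the answer above: employees assigned for more than 24 months.
rows = con.execute("""
    SELECT E.ENAME
    FROM EMP E
    JOIN ASG A ON E.ENO = A.ENO
    WHERE A.DUR > 24
""").fetchall()

print(rows)  # [('J. Doe',)]
```

Only the 36-month assignment satisfies DUR > 24, so a single name is returned.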

21. Explain Design Issues in DDBs.

When designing a distributed database system (DDBS), two main issues to consider are
data fragmentation and data allocation:
Data fragmentation: This is the process of splitting relations into smaller parts.
The fragments must be able to be reconstructed into the original relation without
losing any data.
Data allocation: This is the process of determining how the fragments should be
allocated to the sites of the network.

Other challenges to consider when designing a distributed system include:
Heterogeneity, Scalability, Openness, Concurrency, Security, and Failure handling.
Data replication is a popular fault tolerance technique for DDBS. It involves storing
separate copies of the database at multiple sites.
