Unit - 1 Notes
Data Storage: A database system provides a structured and organized method for storing vast
amounts of data, ranging from simple text and numbers to multimedia content like images,
audio, and video.
Data Retrieval: Users can easily search and retrieve specific data from the database, using
queries and filters. This makes it efficient to extract information needed for various tasks and
analyses.
Data Integrity and Security: Database systems employ mechanisms to ensure data integrity,
preventing data duplication and maintaining consistency. Additionally, access controls and
authentication mechanisms are implemented to secure sensitive data from unauthorized access.
Data Manipulation: Database systems support various operations like inserting new data,
updating existing records, and deleting unwanted information, enabling users to manage the
data effectively.
Data Sharing and Collaboration: Multiple users can access the same database
simultaneously, facilitating collaboration and information sharing within an organization.
Data Concurrency Control: Database systems ensure that multiple users can access and
manipulate data concurrently without causing inconsistencies or conflicts.
Data Recovery: Database systems often include backup and recovery mechanisms to protect
against data loss due to system failures, human errors, or disasters.
Data Analysis and Reporting: By organizing data in a structured manner, database systems
enable the use of analytical tools and techniques to derive insights, generate reports, and make
informed decisions.
Scalability: Database systems can handle increasing amounts of data and user demands by
scaling up or out, depending on the requirements.
Overall, database systems play a critical role in managing and leveraging data effectively,
making them an integral part of modern information systems and applications.
In the context of a database system, the terms "database schema" and "database instances" refer
to fundamental concepts related to the organization and content of data.
Database Schema:
A database schema defines the logical and structural layout of a database. It represents the
overall blueprint of how the data is organized, the relationships between different data
elements, and the constraints imposed on the data. A schema is essential for ensuring data
integrity, consistency, and efficient data retrieval.
Key components of a database schema include:
Tables: Tables are the primary structures within a database and represent entities, such as
customers, products, or orders. Each table consists of rows and columns where data is stored.
Columns: Columns, also known as attributes, represent individual data fields within a table.
Each column has a specific data type, defining the kind of data it can store (e.g., text, numbers,
dates).
Primary Key: A primary key is a unique identifier for each row in a table. It ensures that each
record can be uniquely identified and helps establish relationships between tables.
Foreign Key: A foreign key is a column in one table that refers to the primary key of another
table. It establishes relationships between tables, enabling the creation of links between related
data.
Constraints: Constraints define rules and conditions that the data in the database must follow,
ensuring data consistency and accuracy. Examples include unique constraints, check
constraints, and not-null constraints.
Database Instances:
A database instance, also known as a database state, refers to the actual data stored in a database
at a particular point in time. It represents the current snapshot of the data contained within the
database schema. When data is inserted, updated, or deleted in the database, the instance
changes to reflect the modifications.
For example, consider a database schema that includes two tables: "Customers" and "Orders."
The schema defines the structure of these tables, the relationships between them (e.g., the
"Orders" table may have a foreign key referring to the "Customers" table), and any constraints
on the data.
A database instance would be the specific set of data currently present in the "Customers" and
"Orders" tables, including all the rows and their values. As new customers are added or orders
are placed, the database instance is updated to reflect these changes.
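The distinction can be made concrete with a small sketch. It assumes SQLite through Python's built-in sqlite3 module, and the column names (customer_id, name, order_id, amount) and the sample row values are illustrative assumptions, not part of these notes. The CREATE TABLE statements play the role of the schema; the rows returned by the SELECT statements are the instance at that moment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Schema: structure, keys, and constraints. It changes rarely.
conn.executescript("""
CREATE TABLE Customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE Orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES Customers(customer_id),
    amount      REAL CHECK (amount > 0)
);
""")

def instance(connection):
    """Return the current database instance: every row of every table."""
    return {
        "Customers": connection.execute(
            "SELECT * FROM Customers ORDER BY customer_id").fetchall(),
        "Orders": connection.execute(
            "SELECT * FROM Orders ORDER BY order_id").fetchall(),
    }

before = instance(conn)  # empty tables: still a valid instance of the schema

conn.execute("INSERT INTO Customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO Orders VALUES (10, 1, 250.0)")
after = instance(conn)   # same schema, different instance
```

Comparing `before` and `after` shows the instance changing while the schema stays fixed.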
3. Views of data
A database system is a collection of interrelated data and a set of programs that allows users to
access and modify these data. A major purpose of a database system is to provide users with
an abstract view of the data. That is, the system hides certain details of how data are stored and
maintained.
In the traditional database management system architecture, the data abstraction is organized
into three levels:
Physical Level: This level is concerned with the physical storage details of data on the storage
medium. It deals with how the data is stored in terms of bytes, blocks, pages, and how it is
physically organized on disk or other storage devices. At this level, the database management
system interacts directly with the storage system, and it is aware of hardware-specific details.
Logical Level: The logical level sits above the physical level and is concerned with the logical
representation of data, independent of the physical storage details. At this level, the data is
organized into tables, views, and relationships, and the database management system provides
an abstract view of the data. Users and application programs interact with the database at the
logical level using high-level query languages like SQL.
View Level: The view level is the highest level of data abstraction. It involves creating
customized views of the data for specific users or applications. Views allow users to see a
subset of the data or present the data in a format that is most relevant to their needs. The view
level abstracts away the underlying complexity of the data model and provides a simplified
view tailored to the specific requirements of different users.
The concept of data abstraction and the three levels are crucial for managing complex databases
efficiently. They provide a way to hide the implementation details from users and application
developers, enabling them to work with data at higher levels of abstraction without needing to
understand the intricacies of data storage and management.
By separating the physical, logical, and view levels, database management systems can achieve
better data independence, making it easier to modify the database structure or storage without
affecting the applications that use the data. This modular approach simplifies database
development, maintenance, and scalability. Additionally, it enhances security by controlling
access to the data at different levels based on user privileges.
[Figure: the three levels of data abstraction, from top to bottom: View Level, Logical Level, Physical Level]
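A minimal sketch of the logical and view levels, assuming SQLite via Python's sqlite3 module. The table layout (ID, Name, Dept, Salary), the view name EmployeeDirectory, and the sample rows are hypothetical. The physical level does not appear in the code at all, which is the point: the program never sees pages or files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table as the schema defines it.
conn.execute(
    "CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT, Dept TEXT, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [(1, "John", "Sales", 52000.0), (2, "Meera", "HR", 61000.0)],
)

# View level: a tailored window onto the same data that hides Salary.
conn.execute("CREATE VIEW EmployeeDirectory AS SELECT ID, Name, Dept FROM Employees")

cursor = conn.execute("SELECT * FROM EmployeeDirectory ORDER BY ID")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
```

A user who is given access only to EmployeeDirectory works with three columns and never learns that Salary exists.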
4. Database Languages
Database languages are programming languages specifically designed for interacting with
databases. These languages provide a standardized and efficient way for users, applications,
and database administrators to perform various operations on the data stored in a database.
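There are primarily three types of database languages:
Data Definition Language (DDL): Defines and modifies the database schema, for example with
CREATE, ALTER, and DROP statements.
Data Manipulation Language (DML): Retrieves and modifies the data, for example with SELECT,
INSERT, UPDATE, and DELETE statements. For example:
INSERT INTO Employees (ID, FirstName, LastName, Age) VALUES (1, 'John', 'Doe', 35);
Data Control Language (DCL): Controls access to the data, for example with GRANT and REVOKE
statements.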
SQL (Structured Query Language) is the most widely used database language and covers all
three types: DDL, DML, and DCL. Other database systems may have their own languages or
extensions for interacting with the data, but the principles of DDL, DML, and DCL generally
apply across them.
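The three categories can be sketched side by side. This assumes SQLite through Python's sqlite3 module and reuses the Employees columns from the INSERT statement in this section. SQLite has no user accounts, so the DCL statement is only shown as text, and the user name report_user is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL (Data Definition Language): defines or changes the schema.
conn.execute("""
    CREATE TABLE Employees (
        ID        INTEGER PRIMARY KEY,
        FirstName TEXT,
        LastName  TEXT,
        Age       INTEGER
    )
""")

# DML (Data Manipulation Language): reads and changes the rows.
conn.execute(
    "INSERT INTO Employees (ID, FirstName, LastName, Age) VALUES (1, 'John', 'Doe', 35)"
)
conn.execute("UPDATE Employees SET Age = 36 WHERE ID = 1")
rows = conn.execute("SELECT ID, FirstName, LastName, Age FROM Employees").fetchall()

# DCL (Data Control Language): grants or revokes privileges.
# Not executed here because SQLite does not implement GRANT/REVOKE.
dcl_example = "GRANT SELECT ON Employees TO report_user;"
```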
5. Database System Architecture
Database system architecture refers to the overall design and structure of a database
management system (DBMS). It encompasses the components, modules, and interactions that
make up the database system, facilitating data storage, retrieval, manipulation, and
management. A well-designed architecture ensures the efficient and reliable functioning of the
database system, optimizing performance, scalability, and security. The architecture of a
database system typically consists of the following key components:
Data Storage:
This component involves the physical storage of data on disk or other storage media. The data
is organized into data files or tables, each containing records with attributes or fields. Various
data structures and techniques are used to optimize data storage and access, such as indexing,
partitioning, and clustering.
Query Processor:
The query processor is a vital part of the database system architecture responsible for parsing,
optimizing, and executing user queries. When a user submits a query (e.g., SQL statement), the
query processor analyzes the query, determines the most efficient way to retrieve the data, and
optimizes the query execution plan. This involves selecting the appropriate indexes and access
paths to minimize the query's response time.
Transaction Manager:
The transaction manager handles transactions, which are sequences of one or more database
operations that are treated as a single unit of work. The transaction manager ensures the
atomicity, consistency, isolation, and durability (ACID properties) of transactions, preventing
data inconsistencies and ensuring data integrity.
Buffer Manager:
The buffer manager is responsible for managing the buffer pool, which is an area of memory
used to cache frequently accessed data pages from the disk. The buffer manager reduces disk
I/O by keeping commonly used data in memory, improving overall database performance.
Recovery Manager:
The recovery manager is responsible for ensuring the database's recoverability in the event of
a system failure or crash. It manages the logging and undo/redo mechanisms to support data
recovery to a consistent state after failure.
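The query processor's choice of access path can be observed directly. The sketch below assumes SQLite via Python's sqlite3 module; the table, the generated rows, and the index name idx_employees_lastname are illustrative. It asks the optimizer for its plan before and after an index exists. The exact wording of the plan text differs between SQLite versions, so the code only collects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (ID INTEGER PRIMARY KEY, LastName TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Employees (LastName, Age) VALUES (?, ?)",
    [(f"Name{i}", 20 + i % 40) for i in range(1000)],
)

QUERY = "SELECT ID, Age FROM Employees WHERE LastName = ?"

def plan(connection):
    """Return the optimizer's plan as text (the last column of each plan row)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("Name42",)).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_without_index = plan(conn)   # no index available: the table is scanned

conn.execute("CREATE INDEX idx_employees_lastname ON Employees (LastName)")
plan_with_index = plan(conn)      # the optimizer can now search via the index
```

Printing the two plan strings shows the same query text being executed through two different access paths, which is the optimization step described above.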
Overall, the architecture of a database system is designed to provide a robust and reliable
platform for storing, managing, and accessing data efficiently, catering to the needs of various
users and applications while maintaining data integrity and security.
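Atomicity, the first of the ACID properties handled by the transaction manager, can be illustrated with a small sketch. It assumes SQLite via Python's sqlite3 module; the Accounts table, its balances, and the transfer function are hypothetical. A transfer is two updates that must succeed or fail together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (ID INTEGER PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(connection, src, dst, amount):
    """Move `amount` between accounts as one unit of work; return True on commit."""
    try:
        with connection:  # commits on success, rolls back if an exception escapes
            connection.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE ID = ?", (amount, dst))
            connection.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE ID = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(connection):
    return connection.execute("SELECT ID, Balance FROM Accounts ORDER BY ID").fetchall()

ok = transfer(conn, 1, 2, 30)        # fits within the balance
failed = transfer(conn, 1, 2, 500)   # the debit violates the CHECK constraint
final_balances = balances(conn)      # [(1, 70), (2, 80)]
```

In the failed transfer the credit to account 2 had already been applied when the debit was rejected; the rollback undoes it, so no partial transfer is ever visible.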
6. Database Users and Administrator
In a database system, there are different types of users who interact with the database, each
having specific roles and responsibilities. The two primary categories of users are regular
database users and the database administrator.
Application Users: Application users are users who interact with the database indirectly
through applications or software systems. These applications are developed to perform specific
tasks, and they use database queries to access and manipulate data on behalf of the application
users.
Regular database users typically have limited access to the database, only able to perform
specific operations based on the permissions granted by the database administrator. They are
not involved in the administration or management of the database itself.
The database administrator (DBA) is responsible for:
Database Security: Setting up and managing user accounts, access permissions, and
authentication to ensure data security and prevent unauthorized access.
Data Backup and Recovery: Implementing regular data backups to protect against data loss
due to hardware failures, software errors, or disasters. The DBA is also responsible for
developing strategies for data recovery.
Schema Management: Designing, creating, and modifying the database schema to ensure data
integrity, efficiency, and ease of use.
User Support: Assisting end-users and application developers in using the database
effectively, troubleshooting issues, and providing support as needed.
Database Upgrades and Patches: Managing database software upgrades and applying
patches to address security vulnerabilities and bug fixes.
The database administrator has access to higher privileges and capabilities to manage the entire
database system, making sure it meets the organization's needs and adheres to industry best
practices. Their role is critical in maintaining data integrity, security, and availability while
optimizing database performance for smooth and efficient operations.
7. Distributed Database
A distributed database is a type of database system that stores data across multiple physical
locations or nodes, connected through a computer network. In a distributed database, data is
distributed and replicated across different servers or sites, allowing for more efficient data
management, improved scalability, fault tolerance, and enhanced performance. Distributed
databases are commonly used in large-scale systems where a centralized database would
become a bottleneck or a single point of failure.
Data Replication: Data may be replicated across multiple nodes to improve data availability
and fault tolerance. Replication ensures that copies of data exist in different locations, reducing
the risk of data loss in case of server failures or network outages.
Data Fragmentation: Data fragmentation involves dividing data into smaller parts or
fragments and storing each fragment on different nodes. This approach can improve query
performance by allowing parallel processing and reducing data access latency.
Load Balancing: Distributed databases aim to distribute data and processing load evenly
across nodes to avoid overloading any specific server. Load balancing helps ensure optimal
performance and resource utilization.
Advantages of Distributed Databases:
Improved performance and scalability: Distributing data across multiple nodes allows for
parallel processing and better performance for large-scale applications with high data volumes
and user concurrency.
Increased fault tolerance: Data replication and distribution reduce the risk of data loss and
system failures. If one node fails, the data is still available from other replicas.
Enhanced availability: Distributed databases can remain operational even if some nodes are
offline or experiencing issues, ensuring continuous access to data.
Geographic distribution: Distributed databases enable data to be stored close to the end-users,
reducing data access latency and improving user experience.
Challenges of Distributed Databases:
Complexity: Designing, managing, and maintaining a distributed database system can be more
complex than a centralized database.
Data consistency: Ensuring data consistency across distributed nodes while supporting
concurrent access can be challenging and may require careful planning and coordination.
Network latency: Data transfer over the network can introduce latency, impacting response
times and overall performance.
Data security: Securing data across multiple nodes and network communication requires
robust security measures and encryption protocols.
Despite the challenges, distributed databases offer significant advantages for large-scale and
geographically dispersed systems, making them a vital component of modern data management
strategies.
Distributed Data: The database is divided into smaller fragments or partitions, and each
fragment is distributed across multiple nodes or servers. These nodes can be located in
different geographic locations or data centers.
Data Replication: To improve data availability and fault tolerance, some distributed
databases use data replication. Data is duplicated across multiple nodes so that if one node
fails, the data can still be accessed from other nodes.
Distributed Query Processing: Distributed database systems need to handle queries that
span multiple nodes and retrieve data from various locations. Query optimizers determine
the most efficient way to execute queries, considering data distribution, network latency,
and other factors.
Load Balancing: To evenly distribute the workload among nodes and avoid bottlenecks,
load balancing mechanisms are employed. Load balancing helps optimize resource
utilization and improves the overall performance of the distributed database system.
Security and Access Control: Distributed database systems must implement robust
security measures to protect data privacy and prevent unauthorized access. Access control
mechanisms are used to define user permissions and restrict data access based on roles and
privileges.
A distributed database is a set of logically interrelated databases that are stored on computers
at several geographically different sites and are linked by means of a computer network.
These interrelated databases work together to perform certain specific tasks.
Distributed computer systems work by splitting a large task into a number of smaller ones that
can then be solved in a coordinated fashion. Each processing element can be managed
independently and can develop its own applications.
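Fragmentation and replication can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not how any particular distributed DBMS places data: the node names, the replication factor of 2, and the hash-then-neighbour placement rule are all illustrative choices.

```python
import zlib

NODES = ["node-a", "node-b", "node-c"]   # hypothetical site names
REPLICATION_FACTOR = 2                   # each row is stored on this many nodes

def placement(key):
    """Pick the nodes that hold `key`: a primary chosen by hash, then its neighbour."""
    start = zlib.crc32(str(key).encode()) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]

def distribute(rows):
    """Fragment rows horizontally across nodes, storing the replicas as well."""
    storage = {node: {} for node in NODES}
    for key, value in rows.items():
        for node in placement(key):
            storage[node][key] = value
    return storage

def read(storage, key, failed=()):
    """Read from the first replica whose node is not in `failed`."""
    for node in placement(key):
        if node not in failed:
            return storage[node][key]
    raise LookupError(f"no live replica for {key!r}")

customers = {cid: f"customer-{cid}" for cid in range(1, 7)}
cluster = distribute(customers)

primary_of_3 = placement(3)[0]
value_after_failure = read(cluster, 3, failed={primary_of_3})
```

Even with the primary node for key 3 marked as failed, the read succeeds from the replica, which is the fault-tolerance benefit described above. The cost is that every write now has to reach two nodes, which is where the consistency challenge comes from.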
9. Database Model
Database models are conceptual frameworks that define the structure, organization, and
relationships of data within a database. These models provide a way to represent and
understand how data is stored and accessed in a database system. Different types of
database models have been developed over time, each with its own characteristics and
use cases. Some of the common database models are:
Hierarchical Model:
In the hierarchical model, data is organized in a tree-like structure, where each record
(or data element) has a parent-child relationship with other records. The top-level record
represents the root, and child records can have only one parent. This model was
prevalent in early database systems and is now less commonly used.
Network Model:
The network model extends the hierarchical model by allowing records to have multiple
parent records, forming a more complex interconnected network of relationships. This
model was also widely used in the early days of databases but has been largely replaced
by other models like the relational model.
Relational Model:
The relational model is the most widely used database model today. It organizes data
into tables, where each table represents an entity (e.g., customers, products) and each
row in the table represents a specific record or instance of that entity. The relationships
between entities are established through foreign keys, which reference the primary keys
of related tables.
Object-Oriented Model:
The object-oriented model represents data as objects, similar to the concepts in
object-oriented programming. Each object encapsulates data and behaviors (methods)
related to that data. This model is often used in object-oriented databases, which
combine the benefits of object-oriented programming with data storage and retrieval.
Object-Relational Model:
The object-relational model combines elements of both the relational and
object-oriented models. It extends the relational model to support complex data types,
user-defined data types, and object-oriented features like inheritance and polymorphism.
NoSQL Models:
NoSQL (Not Only SQL) databases do not adhere to the traditional relational model and
instead use various non-relational data models to store and manage data. Common
NoSQL models include:
Document Model: Data is stored as documents (e.g., JSON, XML) that can have nested
structures and different attributes.
Key-Value Model: Data is stored as key-value pairs, enabling fast and simple data
access.
Columnar Model: Data is stored in columns rather than rows, allowing for efficient
data retrieval and analytics.
Each database model has its strengths and weaknesses, and the choice of model depends
on the specific requirements of the application and the nature of the data being stored
and accessed.
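To compare the models, the same small data set can be laid out in several of them. The sketch below uses plain Python structures as stand-ins for real database engines; the customer and order fields are hypothetical.

```python
import json

# Relational model: flat tables linked by keys.
customers = [{"customer_id": 1, "name": "Asha"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "Keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "Mouse"},
]

def to_document(customer, all_orders):
    """Document model: nest the related orders inside the customer record."""
    return {
        "name": customer["name"],
        "orders": [
            {"order_id": o["order_id"], "item": o["item"]}
            for o in all_orders
            if o["customer_id"] == customer["customer_id"]
        ],
    }

document = to_document(customers[0], orders)

# Key-value model: an opaque value looked up by a single key.
key_value_store = {
    f"customer:{c['customer_id']}": json.dumps(to_document(c, orders))
    for c in customers
}

# Columnar model: one array per column instead of one record per row.
columnar_orders = {column: [o[column] for o in orders] for column in orders[0]}
```

The relational layout needs a join to collect a customer's orders, the document layout answers that question with one lookup, and the columnar layout makes it cheap to read a single column such as `item` across all orders.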
10. Entity Relationship Model
The Entity-Relationship (ER) model is a high-level data model used to represent the conceptual
view of a database. It provides a graphical and intuitive representation of the structure of a
database, focusing on the entities (objects or things) within the system and the relationships
between them. The ER model is widely used during the initial phases of database design to
capture the data requirements and to create a blueprint for the database schema before
implementation.
Entity:
An entity is a real-world object or concept with a distinct identity, represented as a rectangle in
the ER diagram. Each entity is uniquely identifiable and can have attributes that describe its
properties.
Attributes:
Attributes are the characteristics or properties of an entity, represented as ovals or ellipses in
the ER diagram. Attributes define the specific details or information associated with the entity.
For example, in a "Customer" entity, attributes could include "CustomerID," "Name," "Email,"
and "Address."
Relationships:
Relationships depict the associations between two or more entities, represented as diamonds
connected by lines to the related entities. Relationships describe how entities are related to
each other and can have cardinality and optionality constraints.
Cardinality: Cardinality represents the number of instances of one entity that can be related
to the number of instances of another entity through the relationship. Common cardinalities
include one-to-one (1:1), one-to-many (1:N), and many-to-many (N:M).
Optionality: Optionality refers to whether a relationship is mandatory or optional for an entity.
It is denoted as either total (mandatory) or partial (optional) participation.
Primary Key:
A primary key is an attribute or a combination of attributes that uniquely identifies each
instance (row) of an entity. It is represented by an underline in the ER diagram.
Weak Entity:
A weak entity is an entity that cannot be uniquely identified by its attributes alone. It relies on
a relationship with a strong entity (owner entity) for its identification. Weak entities are
represented with a double rectangle in the ER diagram.
Identifying Relationship:
An identifying relationship is a type of relationship where a weak entity's existence is
dependent on its association with a strong entity. In the notation used here (rectangles, ovals,
diamonds), the identifying relationship is drawn as a double diamond; crow's-foot style
notations instead use a solid line connecting the weak entity to the strong entity.
The ER model is typically represented graphically using ER diagrams, which use various
symbols and notations to represent entities, attributes, and relationships. ER diagrams are
helpful for visualizing the database schema and understanding the data requirements and
relationships between entities in a clear and concise manner.
Overall, the Entity-Relationship model provides a foundation for database design and helps
database designers, developers, and stakeholders to conceptualize and communicate the
structure and organization of data in a database system.
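One common way to carry a weak entity and its identifying relationship into tables is sketched below, assuming SQLite via Python's sqlite3 module. The Dependent entity, its attributes, and the sample names are hypothetical. The weak entity's primary key includes the owner's key, and deleting the owner removes its dependents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
-- Strong (owner) entity: identified by its own primary key.
CREATE TABLE Employee (
    EmployeeID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL
);
-- Weak entity: DependentName alone is not unique, so the key borrows EmployeeID.
CREATE TABLE Dependent (
    EmployeeID    INTEGER NOT NULL
                  REFERENCES Employee(EmployeeID) ON DELETE CASCADE,
    DependentName TEXT NOT NULL,
    PRIMARY KEY (EmployeeID, DependentName)
);
""")

conn.execute("INSERT INTO Employee VALUES (1, 'Ravi'), (2, 'Lena')")
# The same dependent name under two different owners is allowed.
conn.execute("INSERT INTO Dependent VALUES (1, 'Sam'), (2, 'Sam')")
conn.commit()

# Removing the owner removes the weak entity instances that depend on it.
conn.execute("DELETE FROM Employee WHERE EmployeeID = 1")
remaining = conn.execute("SELECT EmployeeID, DependentName FROM Dependent").fetchall()
```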
E-R Diagram
Entity:
Entities are represented as rectangles in the diagram. Each entity has a name that describes the
real-world object it represents.
Example:
Attribute:
Attributes are represented as ovals or ellipses connected to the entity they belong to. They
describe the properties or characteristics of the entity.
Example:
In this example, "EmployeeID" is an attribute of the "Employee" entity, and (PK) denotes that
it is the primary key attribute.
Relationship:
Relationships are represented as diamond shapes connecting two or more entities. They
describe the associations between entities.
Example:
In this example, the relationship "Works_On" connects the "Employee" and "Project" entities,
indicating that employees work on projects.
Cardinality Notation:
Cardinality notation is used to represent the number of instances of one entity that can be
associated with the number of instances of another entity through a relationship. Common
cardinalities include one-to-one (1:1), one-to-many (1:N), and many-to-many (N:M).
Example:
In this example, the cardinality "(1:N)" indicates that one employee can work on multiple
projects, but each project is associated with one employee.
E-R diagrams provide a visual representation of the database structure, making it easier for
designers, developers, and stakeholders to understand the data model and relationships within
the database system. They serve as a valuable tool in the database design process.
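As a rough sketch of how the Employee, Works_On, Project example with (1:N) cardinality might be turned into tables, the code below places a foreign key on the "many" side. It assumes SQLite via Python's sqlite3 module; the attributes Name, ProjectID, and Title and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    EmployeeID INTEGER PRIMARY KEY,   -- the (PK) attribute
    Name       TEXT
);
CREATE TABLE Project (
    ProjectID  INTEGER PRIMARY KEY,
    Title      TEXT,
    -- Works_On (1:N): the foreign key sits on the "many" side.
    EmployeeID INTEGER REFERENCES Employee(EmployeeID)
);
INSERT INTO Employee VALUES (1, 'Ravi');
INSERT INTO Project VALUES (100, 'Payroll', 1), (101, 'Intranet', 1);
""")

projects_of_ravi = conn.execute(
    "SELECT p.Title FROM Employee e "
    "JOIN Project p ON p.EmployeeID = e.EmployeeID "
    "WHERE e.Name = 'Ravi' ORDER BY p.ProjectID"
).fetchall()
```

If Works_On were many-to-many instead, the usual mapping would be a separate Works_On table holding one (EmployeeID, ProjectID) pair per row.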
Hierarchical Model :
This is one of the oldest data models; it was developed by IBM in the 1960s.
In a hierarchical model, data are viewed as a collection of tables, or segments, that
form a hierarchical relation. The data is organized into a tree-like structure in which each
record other than the root has one parent record and may have many children. Even if the
segments are connected in a chain-like structure by logical associations, the instance structure
can be a fan structure with multiple branches. We call the logical associations directional
associations.
In the hierarchical model, the segment pointed to by a logical association is called the child
segment and the other segment is called the parent segment. A segment without a parent is
called the root, and segments that have no children are called leaves. The main disadvantage
of the hierarchical model is that it can represent only one-to-one and one-to-many
relationships between the nodes; many-to-many relationships cannot be modelled directly.
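The parent, child, root, and leaf terminology can be sketched with a small tree class in Python. The class name Segment and the college data are hypothetical and only illustrate the structure; real hierarchical systems such as IMS are far more elaborate.

```python
class Segment:
    """A record in a hierarchical database: one parent at most, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # None only for the root
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_leaf(self):
        return not self.children

    def path(self):
        """Names from the root down to this segment, the only access route in a tree."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

# Hypothetical data: a college (root) with two departments and one student.
college = Segment("College")
cs = Segment("CS", parent=college)
ee = Segment("EE", parent=college)
john = Segment("John", parent=cs)

john_path = john.path()                                            # ['College', 'CS', 'John']
leaves = [s.name for s in (college, cs, ee, john) if s.is_leaf()]  # ['EE', 'John']
```

Because each segment stores a single parent, John cannot be attached to both CS and EE without duplicating his record, which is the many-to-many limitation in practice.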
Applications of hierarchical model :
Hierarchical models are generally used as semantic models in practice, as many real-world
phenomena are hierarchical in nature, such as biological, political, or social
structures.
Hierarchical models are also commonly used as physical models because of the inherent
hierarchical structure of disk storage systems (tracks, cylinders, etc.). Examples include
Information Management System (IMS) by IBM and NOMAD by NCSS.
Example 1: Consider the below Student database system hierarchical model.
In the figure, we have a few students and a few course enrollments. A course can be
assigned to a single student only, but a student can enroll in any number of courses, so
the relationship becomes one-to-many. We can represent the given hierarchical model with
the relational tables below:
FACULTY Table
STUDENT Table
Akash Reddy CA B
Dhivya SE A
Mani Reddy SE B
Example 2: Consider the below cricket database system hierarchical model scheme.
Here, for each player there is a set of positions (P_POSITION) he plays, a set of places
(P_PLACE), and a set of birthdates (P_BDATE). In the figure, each node represents a logical
record type and is displayed by a list of its fields. A child node represents a set of records
that are connected to each record of the parent type, which reflects a many-to-one relationship
from child to parent. The root node PLAYER states that for every player there will be a set of
positions, a set of places (containing only one place), and a set of birthdates (containing only
one birthdate).
Advantages of the hierarchical model :
As the database is based on this architecture, the relationships between the various layers are
logically simple, so it has a very simple hierarchical database structure.
It supports data sharing: all data are held in a common database, so sharing of
data becomes practical.
It offers data security; this model was the first database model that offered data security.
There’s also data integrity, as it is based on the parent-child relationship and there’s
always a link between the parent and the child segments.
Network Model :
This model was formalized by the Database Task Group (DBTG) in the 1960s. It is a
generalization of the hierarchical model. A segment can have multiple parent segments, and
the segments are grouped into levels, but a logical association can exist between
segments belonging to any level. Mostly, there is a many-to-many logical association
between any two segments. The logical associations between the segments form a graph.
This model therefore replaces the hierarchical tree with a graph-like structure, which
allows more general connections among different nodes. It can have M:N (many-to-many)
relations, which allow a record to have more than one parent segment.
Here, a relationship is called a set, and each set is made up of at least 2 types of record, which
are given below:
An owner record, which plays the same role as the parent in the hierarchical model.
A member record, which plays the same role as the child in the hierarchical model.
Structure of a Network Model :
In a network model, a one-to-many (1:N) relationship is a link between two record
types. In the figure, SALES-MAN, CUSTOMER, PRODUCT, INVOICE, PAYMENT, and
INVOICE-LINE are the record types for the sales of a company. INVOICE-LINE is owned by
PRODUCT and INVOICE, and INVOICE also has two owners, SALES-MAN and
CUSTOMER.
Let’s see another example, in which we have two segments, Faculty and Student. Say that
student John takes courses in both the CS and EE departments. How many parent instances
will there be?
In this example, a student instance can have at least 2 parent instances, so there
exist relations between the instances of the Student and Faculty segments. The model can
become very complex if we add other segments, say Courses, and logical associations like
Student-Enroll and Faculty-course. So, in this model, a student can be logically associated
with various instances of Faculties and Courses.
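The owner and member idea can be sketched in plain Python. The class name Network, the set names, and the record labels are hypothetical, and the sketch follows the simplified description in these notes; real CODASYL-style systems impose further rules that it ignores.

```python
from collections import defaultdict

class Network:
    """Records linked by named sets; a member record may have several owners."""

    def __init__(self):
        # set name -> owner -> list of members
        self.sets = defaultdict(lambda: defaultdict(list))

    def connect(self, set_name, owner, member):
        self.sets[set_name][owner].append(member)

    def members(self, set_name, owner):
        return list(self.sets[set_name][owner])

    def owners(self, member):
        """Every (set, owner) pair that this member record is linked to."""
        return sorted(
            (set_name, owner)
            for set_name, by_owner in self.sets.items()
            for owner, linked in by_owner.items()
            if member in linked
        )

db = Network()
# INVOICE has two owners, SALES-MAN and CUSTOMER, as in the sales example.
db.connect("SALESMAN-INVOICE", "salesman:S1", "invoice:I1")
db.connect("CUSTOMER-INVOICE", "customer:C1", "invoice:I1")
# Student John takes courses in both departments, so he has two owners.
db.connect("FACULTY-STUDENT", "faculty:CS", "student:John")
db.connect("FACULTY-STUDENT", "faculty:EE", "student:John")

invoice_owners = db.owners("invoice:I1")
john_owners = db.owners("student:John")
```

Unlike a tree, nothing here forces a record to have a single parent, so John is stored once and reached from either department.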
Advantages of Network Model :
This model is very simple and easy to design like the hierarchical data model.
This model is capable of handling multiple types of relationships, for example 1:1, 1:M,
and M:N relationships, which can help in modeling real-life applications.
In this model, we can access the data easily, and the application can access both the
owner’s and the member’s records within a set.
This model does not allow a member to exist without an owner, which supports data
integrity.
Like a hierarchical model, this model also does not have any database standard,
Relational Model
E.F. Codd proposed the relational model to model data in the form of relations or tables. After
designing the conceptual model of the database using an ER diagram, we need to convert the
conceptual model into a relational model, which can be implemented using
any RDBMS such as Oracle or MySQL.
The relational model represents how data is stored in Relational Databases. A relational
database consists of a collection of tables, each of which is assigned a unique name. Consider
a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE
shown in the table.
Table Student
ROLL_NO NAME ADDRESS PHONE AGE
Important Terminologies
Attribute: Attributes are the properties that define an entity, e.g., ROLL_NO, NAME,
ADDRESS.
Relation Schema: A relation schema defines the structure of the relation and represents the
name of the relation with its attributes, e.g., STUDENT (ROLL_NO, NAME, ADDRESS,
PHONE, AGE) is the relation schema for STUDENT. If a schema has more than 1 relation,
it is called a Relational Schema.
Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples, one
of which is shown as:
NULL Values: A value that is not known or unavailable is called a NULL value. It is
represented by a blank space, e.g., PHONE of STUDENT having ROLL_NO 4 is NULL.
Relation Key: These are the keys that are used to identify rows uniquely and to link
tables. They are of the following types:
Primary Key
Candidate Key
Super Key
Foreign Key
Alternate Key
Composite Key
Constraints in Relational Model
While designing the relational model, we define some conditions that must hold for the data
present in the database; these are called constraints. These constraints are checked before
performing any operation (insertion, deletion, or update) on the database. If any of the
constraints is violated, the operation fails.
Domain Constraints
These are attribute-level constraints. An attribute can only take values that lie inside its domain
range, e.g., if a constraint AGE>0 is applied to the STUDENT relation, inserting a negative value
of AGE will result in failure.
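The AGE>0 example can be tried out with a CHECK constraint. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the student names and ages are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO INTEGER PRIMARY KEY,
        NAME    TEXT,
        AGE     INTEGER CHECK (AGE > 0)   -- the domain constraint from the text
    )
""")

def try_insert(roll_no, name, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?)", (roll_no, name, age))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, "RAM", 18)       # AGE inside the domain
rejected = try_insert(2, "SURESH", -5)    # negative AGE violates the CHECK
stored = conn.execute("SELECT ROLL_NO, NAME, AGE FROM STUDENT").fetchall()
```

Only the first row is stored; the second insert fails before any data is written.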
Key Integrity
Every relation in the database should have at least one set of attributes that defines a tuple
uniquely. Such a set of attributes is called a key, e.g., ROLL_NO in STUDENT is a key. No two
students can have the same roll number. So a key has two properties:
It should be unique for all tuples.
It can’t have NULL values.
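The two key properties can be checked directly on a set of rows. The following is a small
plain-Python sketch; the function name key_violations and the sample rows are hypothetical.

```python
def key_violations(rows, key):
    """Return (reason, row) pairs for rows that break uniqueness or non-NULL on `key`."""
    problems = []
    seen = set()
    for row in rows:
        value = tuple(row[attr] for attr in key)
        if any(v is None for v in value):
            problems.append(("NULL in key", row))
        elif value in seen:
            problems.append(("duplicate key", row))
        else:
            seen.add(value)
    return problems

# Hypothetical sample data with two deliberate violations.
students = [
    {"ROLL_NO": 1, "NAME": "RAM"},
    {"ROLL_NO": 2, "NAME": "RAMESH"},
    {"ROLL_NO": 2, "NAME": "SUJIT"},       # repeats ROLL_NO 2
    {"ROLL_NO": None, "NAME": "SURESH"},   # key value is missing
]

for reason, row in key_violations(students, ["ROLL_NO"]):
    print(reason, row)
```

A DBMS performs the equivalent check automatically for every declared primary key.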
Referential Integrity
When an attribute of a relation can only take values that appear in another attribute of the
same relation or of another relation, it is called referential integrity. Suppose we have the
following two relations:
Table Student
ROLL_NO NAME ADDRESS PHONE AGE BRANCH_CODE
Table Branch
BRANCH_CODE    BRANCH_NAME
CS             COMPUTER SCIENCE
IT             INFORMATION TECHNOLOGY
CV             CIVIL ENGINEERING
BRANCH_CODE of STUDENT can only take values that are present in BRANCH_CODE of BRANCH; this
is called a referential integrity constraint. The relation that references another relation is
called the REFERENCING RELATION (STUDENT in this case), and the relation being referred to is
called the REFERENCED RELATION (BRANCH in this case).
Anomalies in the Relational Model
An anomaly is an irregularity or something which deviates from the expected or normal state.
When designing databases, we identify three types of anomalies: Insert, Update, and Delete.
Insertion Anomaly in Referencing Relation
We can’t insert a row in the REFERENCING RELATION if the value of its referencing attribute is
not present among the values of the referenced attribute. For example, inserting a student with
BRANCH_CODE ‘ME’ into the STUDENT relation will result in an error because ‘ME’ is not present
in BRANCH_CODE of BRANCH.
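The rejected insertion can be reproduced with a foreign key in sqlite3. This is a sketch under
assumptions: STUDENT is reduced to three columns, and insert_student is a hypothetical helper.
Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("""
    CREATE TABLE BRANCH (
        BRANCH_CODE TEXT PRIMARY KEY,
        BRANCH_NAME TEXT
    )
""")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INTEGER PRIMARY KEY,
        NAME        TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH (BRANCH_CODE)
    )
""")
conn.executemany("INSERT INTO BRANCH VALUES (?, ?)", [
    ("CS", "COMPUTER SCIENCE"),
    ("IT", "INFORMATION TECHNOLOGY"),
    ("CV", "CIVIL ENGINEERING"),
])

def insert_student(roll_no, name, branch_code):
    """Return True if the row is inserted, False if the foreign key rejects it."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?)",
                     (roll_no, name, branch_code))
        return True
    except sqlite3.IntegrityError:
        return False

print(insert_student(1, "RAM", "CS"))     # 'CS' exists in BRANCH
print(insert_student(2, "RAMESH", "ME"))  # 'ME' does not exist in BRANCH
```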
Deletion/Updation Anomaly in Referenced Relation
We can’t delete or update a row of the REFERENCED RELATION if the value of its REFERENCED
ATTRIBUTE is used as a value of the REFERENCING ATTRIBUTE. For example, if we try to delete the
tuple with BRANCH_CODE ‘CS’ from BRANCH, it will result in an error because ‘CS’ is referenced
by BRANCH_CODE of STUDENT. If we try to delete the row with BRANCH_CODE ‘CV’ from BRANCH, it
will be deleted because that value is not used by the referencing relation. This anomaly can be
handled by the following methods:
On Delete Cascade
It deletes the tuples from the REFERENCING RELATION whose REFERENCING ATTRIBUTE value is
deleted from the REFERENCED RELATION. For example, if we delete the row with BRANCH_CODE ‘CS’
from BRANCH, the rows in the STUDENT relation with BRANCH_CODE ‘CS’ (ROLL_NO 1 and 2 in this
case) will also be deleted.
On Update Cascade
It updates the REFERENCING ATTRIBUTE in the REFERENCING RELATION when the value it refers to
is updated in the REFERENCED RELATION. For example, if we change BRANCH_CODE ‘CS’ to ‘CSE’ in
BRANCH, the rows in the STUDENT relation with BRANCH_CODE ‘CS’ (ROLL_NO 1 and 2 in this case)
will be updated to BRANCH_CODE ‘CSE’.
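The behaviour with and without the cascade options can be compared in sqlite3. This is a
sketch, not a definitive setup: the helper names build and students are hypothetical, STUDENT is
reduced to two columns, and the student with ROLL_NO 3 in branch IT is assumed sample data.

```python
import sqlite3

def build(fk_action=""):
    """Create BRANCH and STUDENT; fk_action is appended to the foreign key clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite
    conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
    conn.execute(f"""
        CREATE TABLE STUDENT (
            ROLL_NO     INTEGER PRIMARY KEY,
            BRANCH_CODE TEXT REFERENCES BRANCH (BRANCH_CODE) {fk_action}
        )
    """)
    conn.executemany("INSERT INTO BRANCH VALUES (?, ?)", [
        ("CS", "COMPUTER SCIENCE"),
        ("IT", "INFORMATION TECHNOLOGY"),
        ("CV", "CIVIL ENGINEERING"),
    ])
    # ROLL_NO 1 and 2 are in CS, as in the text; ROLL_NO 3 is an assumed extra row.
    conn.executemany("INSERT INTO STUDENT VALUES (?, ?)", [(1, "CS"), (2, "CS"), (3, "IT")])
    return conn

def students(conn):
    """Return all (ROLL_NO, BRANCH_CODE) pairs in roll number order."""
    return conn.execute(
        "SELECT ROLL_NO, BRANCH_CODE FROM STUDENT ORDER BY ROLL_NO").fetchall()

# 1. No cascade option: deleting a referenced branch is rejected.
plain = build()
try:
    plain.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CS'")
    delete_cs_allowed = True
except sqlite3.IntegrityError:
    delete_cs_allowed = False
plain.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CV'")  # unreferenced, so allowed

# 2. With cascade options: changes in BRANCH propagate to STUDENT.
cascade = build("ON DELETE CASCADE ON UPDATE CASCADE")
cascade.execute("UPDATE BRANCH SET BRANCH_CODE = 'CSE' WHERE BRANCH_CODE = 'CS'")
after_update = students(cascade)
cascade.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CSE'")
after_delete = students(cascade)

print(delete_cs_allowed)
print(after_update)
print(after_delete)
```

Cascading deletes remove data silently, so whether to use them is a design decision that depends
on whether the referencing rows have any meaning without the referenced row.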
Super Keys
Any set of attributes that allows us to identify unique rows (tuples) in a given relation is
known as a super key. A super key that is minimal, meaning that no proper subset of it is itself
a super key, is known as a candidate key. One of the candidate keys is chosen as the primary
key, and the remaining candidate keys are called alternate keys. If a key consists of two or
more attributes, we call it a composite key.
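The relationship between super keys and candidate keys can be illustrated by enumerating
attribute sets over a small relation instance. The rows below are hypothetical, and the check is
only a sketch: a key is a property of the schema, so uniqueness in one instance can rule a key
out but cannot prove that it holds for all possible data.

```python
from itertools import combinations

ATTRS = ("ROLL_NO", "NAME", "PHONE")

# Hypothetical instance: NAME repeats, ROLL_NO and PHONE do not.
rows = [
    (1, "RAM", "9455123451"),
    (2, "RAMESH", "9652431543"),
    (3, "SUJIT", "9156253131"),
    (4, "RAM", "9156768971"),
]

def is_superkey(attrs):
    """True if no two rows agree on every attribute in attrs."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

# Every non-empty attribute set that identifies rows uniquely in this instance.
super_keys = [
    set(combo)
    for size in range(1, len(ATTRS) + 1)
    for combo in combinations(ATTRS, size)
    if is_superkey(combo)
]

# Candidate keys are the minimal super keys: no proper subset is also a super key.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]

print(len(super_keys))
print(candidate_keys)
```

In this instance every attribute set containing ROLL_NO or PHONE is a super key, but only the
single-attribute sets are minimal.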
Codd Rules in Relational Model
Edgar F. Codd proposed the relational database model and stated a set of rules, now known as
Codd’s Rules, that a database system should satisfy in order to be considered fully relational.
DBTG
The DBTG (Data Base Task Group) model, also known as the CODASYL (Conference on
Data Systems Languages) model, was a database management system (DBMS) model
proposed in the late 1960s and early 1970s. It was an early attempt to standardize database
systems and provided a blueprint for the development of network database systems.
Records and Sets: Data in the DBTG model is organized into records, which are collections of
related data items. Records are linked through sets, each of which represents a one-to-many
relationship between an owner record type and its member record types.
Pointers: The network structure is facilitated by using pointers, which are references between
records. Pointers provide navigation paths to traverse the network from one record to another,
enabling efficient access to related data.
Hierarchical and Non-Hierarchical Relationships: The DBTG model allows both hierarchical
and non-hierarchical (network) relationships between records. This flexibility allows for
complex data relationships to be represented.
Data Manipulation Language (DML): The DBTG model defined a DML whose commands are embedded in
a host programming language, typically COBOL. The DML provides commands for navigating through
the network and for inserting, updating, and deleting records.
Data Definition Language (DDL): The DBTG model included a DDL for defining the schema
of the database, specifying record structures, sets, and relationships.
Record Type and Record Occurrence: The model distinguished between record types (entity
types) and record occurrences (actual data instances). Record types define the structure of
records, while record occurrences represent individual data instances.
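The idea of records chained into sets by pointers can be sketched in a few lines of Python. The
ring layout used here (the owner points to its first member, and the last member points back to
the owner) is one common way to implement a set and is an assumption, as are the BRANCH and
STUDENT record types and the set name BRANCH_STUDENT.

```python
class Record:
    """A record occurrence: a record type name, field values, and set pointers."""
    def __init__(self, record_type, **fields):
        self.record_type = record_type
        self.fields = fields
        self.next_in_set = {}  # set name -> next record in that set occurrence

def connect(set_name, owner, member):
    """Append member to the end of owner's ring for the given set."""
    last = owner
    while last.next_in_set.get(set_name, owner) is not owner:
        last = last.next_in_set[set_name]
    last.next_in_set[set_name] = member
    member.next_in_set[set_name] = owner  # last member points back to the owner

def members(set_name, owner):
    """Follow the pointers from the owner and yield each member in turn."""
    record = owner.next_in_set.get(set_name, owner)
    while record is not owner:
        yield record
        record = record.next_in_set[set_name]

# One set occurrence: a BRANCH owner with two STUDENT members (hypothetical data).
cs = Record("BRANCH", BRANCH_CODE="CS", BRANCH_NAME="COMPUTER SCIENCE")
connect("BRANCH_STUDENT", cs, Record("STUDENT", ROLL_NO=1, NAME="RAM"))
connect("BRANCH_STUDENT", cs, Record("STUDENT", ROLL_NO=2, NAME="RAMESH"))

names = [m.fields["NAME"] for m in members("BRANCH_STUDENT", cs)]
print(names)
```

Reaching the students of a branch means following pointers one record at a time, which is what
distinguishes this navigational style from a declarative query.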
The DBTG/CODASYL model was widely adopted in the 1970s and early 1980s, especially for
large-scale and complex database applications in areas such as government, finance, and
scientific research. However, as the demand for more flexible and user-friendly database
systems grew, and the relational database model emerged, the DBTG model lost popularity.
The relational model, with its simplicity and declarative querying language (SQL), eventually
became the dominant database model, and most modern databases are based on the relational
paradigm. Nonetheless, the DBTG/CODASYL model played a significant role in the evolution
of database management systems and laid the groundwork for subsequent database models and
technologies.
In the DBTG (CODASYL DBTG) model, data retrieval, processing, and update facilities are
navigational: a program moves through the network one record at a time. The model includes
"Find," "Get," and "Set" processing for accessing and manipulating data. Furthermore, the DBTG
model allows the network to be mapped to physical files on a storage medium. Let's delve into
each of these aspects:
Data Retrieval: Data retrieval in the DBTG model involves traversing the network by following
the relationships between records using pointers or links. A program locates a starting record,
for example the owner of a set, and then navigates from owner to member records (or from a
member back to its owner) to reach the desired data. Search criteria can be specified to select
the records of interest.
Find Processing: The FIND command locates a record that matches the specified criteria, such as
particular attribute values, and makes it the current record. It does not itself deliver the
data to the program; it only establishes the position from which later commands operate.
Get Processing: The GET command copies the current record, as located by a preceding FIND, into
the program's work area, where its fields can be read. FIND and GET are therefore normally used
as a pair.
Set Processing: Set processing maintains the membership of records in sets. The CONNECT,
DISCONNECT, and RECONNECT commands insert a record into a set occurrence, remove it from one,
and move it from one set occurrence to another.
Update Facility: The DBTG model provides an update facility to handle data modifications. In
the DML, STORE creates a new record, MODIFY changes the fields of the current record, and ERASE
deletes it.
Mapping Network to Files: In the DBTG model, the network is mapped to physical files on a
storage medium (e.g., disk). In a simple mapping, each record type is stored in a file, and each
set occurrence is implemented as a chain of pointers that links the owner record to its member
records.
The physical file layout is essential for efficient data retrieval and storage, because
navigating from one record to another means following these pointers to physical file
addresses.
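The navigational style of access described above can be imitated with a toy Python class. This
is an illustration only, not the DBTG DML: the class name RunUnit, its method names, and the
sample records are all hypothetical. The point it shows is that a find step positions a
"current record" pointer and the other operations then act on that position.

```python
class RunUnit:
    """Toy navigational session: a list of records plus one 'current record' pointer."""

    def __init__(self, records):
        self.records = [dict(r) for r in records]  # stands in for the stored database
        self.current = None                        # index of the current record

    def _require_current(self):
        if self.current is None:
            raise LookupError("no current record; call find() first")

    def find(self, **criteria):
        """Make the first record matching all criteria current; report success."""
        for position, record in enumerate(self.records):
            if all(record.get(k) == v for k, v in criteria.items()):
                self.current = position
                return True
        return False

    def get(self):
        """Copy the current record into the caller's work area."""
        self._require_current()
        return dict(self.records[self.current])

    def store(self, **fields):
        """Add a new record and make it current."""
        self.records.append(dict(fields))
        self.current = len(self.records) - 1

    def modify(self, **changes):
        """Change fields of the current record."""
        self._require_current()
        self.records[self.current].update(changes)

    def erase(self):
        """Delete the current record; afterwards no record is current."""
        self._require_current()
        del self.records[self.current]
        self.current = None

db = RunUnit([
    {"ROLL_NO": 1, "NAME": "RAM", "BRANCH_CODE": "CS"},
    {"ROLL_NO": 2, "NAME": "RAMESH", "BRANCH_CODE": "CS"},
])
db.find(ROLL_NO=2)             # position on a record
work_area = db.get()           # copy it out
db.modify(BRANCH_CODE="IT")    # update the record at the current position
db.store(ROLL_NO=3, NAME="SUJIT", BRANCH_CODE="IT")
print(work_area)
```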
It's important to note that while the DBTG model was an important milestone in database
management history, it has largely been superseded by the relational model and modern
database management systems, which offer more flexible data models and standardized query
languages like SQL. The relational model's ability to handle complex relationships and its
simplicity in data retrieval and manipulation led to its widespread adoption in modern database
systems.
A car rental company wants to create a database system to manage its car rental operations.
The company offers a variety of cars for rent to customers. Each car has unique attributes such
as the License Plate Number, Make, Model, Year, and Daily Rental Price. The company also
maintains customer information and keeps track of rental transactions. Design an ER model for
the car rental company's database system based on the following requirements:
Cars are identified by their unique License Plate Number. Each car has attributes such as Make,
Model, Year, and Daily Rental Price.
Customers are identified by their unique Customer ID, and their information includes Name,
Email, and Contact Number.
Each customer can rent multiple cars over time, and each car can be rented by multiple
customers.
Rental transactions need to be recorded, including the Rental ID, Rental Date, Return Date, and
Total Rental Cost.
Each rental transaction involves one or more cars rented by a single customer.
The company also wants to keep track of any damages or issues reported for each car during the
rental period.
Draw an ER diagram to represent the entities and their relationships for the car rental
company's database system.
ER Model:
Based on the scenario, the following entities and their relationships can be identified:
Entities:
Car
License Plate Number (Primary Key)
Make
Model
Year
Daily Rental Price
Customer
Customer ID (Primary Key)
Name
Email
Contact Number
Rental
Rental ID (Primary Key)
Customer ID (Foreign Key referencing Customer)
Rental Date
Return Date
Total Rental Cost
Rental Details
Rental Detail ID (Primary Key)
Rental ID (Foreign Key referencing Rental)
License Plate Number (Foreign Key referencing Car)
Issue Description (to record any damages or issues reported for the car)
Relationships:
Customer and Rental: One-to-Many relationship (Each customer can have multiple rental
transactions, and each rental transaction belongs to a single customer).
Rental and Rental Details: One-to-Many relationship (Each rental transaction can have multiple
rental details, one for each car rented).
Car and Rental Details: One-to-Many relationship (Each car can appear in multiple rental
details, recorded for different rental transactions).
Together, the last two relationships resolve the Many-to-Many relationship between Car and
Rental through the Rental Details entity.
With this ER model, the car rental company can efficiently manage its car inventory, customer
information, rental transactions, and any issues reported during the rental period in their
database system.
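One possible translation of this ER model into tables is sketched below with sqlite3. The table
and column names, the column types, and the sample rows are assumptions made for illustration.
In particular, the customer_id column in rental is an assumption needed to realize the
one-to-many relationship between Customer and Rental.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE car (
        license_plate      TEXT PRIMARY KEY,
        make               TEXT,
        model              TEXT,
        year               INTEGER,
        daily_rental_price REAL
    );
    CREATE TABLE customer (
        customer_id    INTEGER PRIMARY KEY,
        name           TEXT,
        email          TEXT,
        contact_number TEXT
    );
    CREATE TABLE rental (
        rental_id         INTEGER PRIMARY KEY,
        customer_id       INTEGER NOT NULL REFERENCES customer (customer_id),
        rental_date       TEXT,
        return_date       TEXT,
        total_rental_cost REAL
    );
    CREATE TABLE rental_detail (
        rental_detail_id  INTEGER PRIMARY KEY,
        rental_id         INTEGER NOT NULL REFERENCES rental (rental_id),
        license_plate     TEXT NOT NULL REFERENCES car (license_plate),
        issue_description TEXT
    );
""")

# Hypothetical sample data: one customer rents two cars in a single transaction.
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'asha@example.com', '555-0100')")
conn.executemany("INSERT INTO car VALUES (?, ?, ?, ?, ?)", [
    ("ABC123", "Toyota", "Corolla", 2020, 40.0),
    ("XYZ789", "Honda", "Civic", 2021, 45.0),
])
conn.execute("INSERT INTO rental VALUES (10, 1, '2024-01-05', '2024-01-07', 170.0)")
conn.executemany(
    "INSERT INTO rental_detail (rental_id, license_plate, issue_description) "
    "VALUES (?, ?, ?)",
    [(10, "ABC123", None), (10, "XYZ789", "Scratch on rear bumper")],
)

# Which cars were part of rental 10, who rented them, and were issues reported?
cars_in_rental = conn.execute("""
    SELECT c.name, d.license_plate, d.issue_description
    FROM customer AS c
    JOIN rental AS r ON r.customer_id = c.customer_id
    JOIN rental_detail AS d ON d.rental_id = r.rental_id
    WHERE r.rental_id = 10
    ORDER BY d.license_plate
""").fetchall()
print(cars_in_rental)
```

Keeping the issue description on the rental detail row ties each reported issue to one car in
one rental, which is the level at which the requirements ask for it.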