File Systems
Introduction to Databases
Examples of Database
Applications
• Purchases from the supermarket
• Purchases using your credit card
• Booking a holiday at the travel agents
• Using the local library
• Taking out insurance
• Using the Internet
• Studying at university
File-Based Systems
• Collection of application programs that
perform services for the end users (e.g.
reports).
• Each program defines and manages its
own data.
File-Based Processing
Limitations of File-Based
Approach
• Separation and isolation of data
– Each program maintains its own set of data.
– Users of one program may be unaware of
potentially useful data held by other programs.
• Duplication of data
– Same data is held by different programs.
– Wasted space and potentially different values
and/or different formats for the same item.
• Data dependence
– File structure is defined in the program code.
• Incompatible file formats
– Programs are written in different languages, and so
cannot easily access each other’s files.
• Fixed Queries/Proliferation of
application programs
– Programs are written to satisfy particular
functions.
– Any new requirement needs a new program.
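Data dependence and incompatible file formats can be seen in a small sketch. It assumes two hypothetical programs that each hard-code their own fixed-width record layout; the layouts, field sizes, and function names are invented for illustration.

```python
import struct

# Program A hard-codes its record layout in the code (hypothetical layout):
# 4-byte customer id, 20-byte name, 8-byte balance.
RECORD_A = struct.Struct("<i20sd")

def write_customer(cust_id, name, balance):
    """Pack one customer record using program A's layout."""
    return RECORD_A.pack(cust_id, name.encode("ascii"), balance)

def read_customer(raw):
    """Unpack one record; only works if raw follows program A's layout."""
    cust_id, name, balance = RECORD_A.unpack(raw)
    return cust_id, name.rstrip(b"\x00").decode("ascii"), balance

record = write_customer(7, "Ann Beech", 120.5)
print(read_customer(record))

# Program B stores the same customer, but with a 30-byte name field.
RECORD_B = struct.Struct("<i30sd")
other = RECORD_B.pack(7, b"Ann Beech", 120.5)

try:
    read_customer(other)
except struct.error as exc:
    # Program A cannot read program B's file without a code change.
    print("cannot read the other program's file:", exc)
```

Because the layout lives in the program, any change to the file structure means editing and redeploying every program that reads the file.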
Database Approach
• Arose because:
– Definition of data was embedded in application
programs, rather than being stored separately and
independently.
– No control over access and manipulation of data
beyond that imposed by application programs.
• Result:
– the database and Database Management System
(DBMS).
Database
• Shared collection of logically related data (and
a description of this data), designed to meet the
information needs of an organization.
• System catalog (metadata) provides description
of data to enable program–data independence.
• Logically related data comprises entities,
attributes, and relationships of an
organization’s information.
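As an illustration of the system catalog, the sketch below uses SQLite through Python's sqlite3 module. The staff table and its columns are hypothetical; sqlite_master and PRAGMA table_info are SQLite's own way of exposing metadata, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " staff_no TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL)"
)

# The description of the data is itself stored in the database.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# Each table_info row is (cid, name, type, notnull, default, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")
]

print(tables)
print(columns)
```

A program can discover the structure of the data by asking the catalog, rather than carrying the structure in its own code.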
Database Management System
(DBMS)
• A software system that enables users to
define, create, and maintain the database
and that provides controlled access to
this database.
Database Approach
• Data definition language (DDL).
– Permits specification of data types, structures and
any data constraints.
– All specifications are stored in the database.
• Data manipulation language (DML).
– General enquiry facility (query language) of the
data.
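A minimal sketch of DDL and DML, assuming SQLite via Python's sqlite3 module. The staff table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: data types, structure, and constraints are declared once.
conn.execute("""
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   REAL CHECK (salary >= 0)
    )
""")

# DML: insert, update, and query the data.
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann Beech", 12000.0), ("S2", "David Ford", 18000.0)],
)
conn.execute("UPDATE staff SET salary = salary * 1.1 WHERE staff_no = 'S1'")

well_paid = conn.execute(
    "SELECT name FROM staff WHERE salary > ? ORDER BY name", (15000,)
).fetchall()
print(well_paid)

# The constraint declared in the DDL is enforced by the DBMS, not the program.
try:
    conn.execute("INSERT INTO staff VALUES ('S3', 'Bad Row', -1)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```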
• Controlled access to database may
include:
– A security system.
– An integrity system.
– A concurrency control system.
– A recovery control system.
– A user-accessible catalog.
• A view mechanism.
– Provides users with only the data they want or need
to use.
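The integrity and recovery facilities listed above can be sketched with a transaction, assuming SQLite via Python's sqlite3 module. The account table and the transfer function are hypothetical; security and concurrency control are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; all or nothing."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit above was rolled back too.
        return False

ok = transfer(conn, 1, 2, 30.0)
rejected = transfer(conn, 1, 2, 500.0)
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, rejected, balances)
```

The second transfer would overdraw account 1, so the DBMS rejects it and undoes the partial update, leaving the data in its last consistent state.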
Views
• Allows each user to have his or her own
view of the database.
• A view is essentially some subset of the
database.
• Benefits include:
– Reduce complexity;
– Provide a level of security;
– Provide a mechanism to customize the appearance
of the database;
– Present a consistent, unchanging picture of the
structure of the database, even if the underlying
database is changed.
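A small sketch of a view, assuming SQLite via Python's sqlite3 module. The staff table, the staff_directory view, and the sample rows are hypothetical. The view hides the salary column and keeps the same shape after the underlying table gains a column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " staff_no TEXT PRIMARY KEY, name TEXT, branch TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [("S1", "Ann Beech", "B3", 12000.0), ("S2", "David Ford", "B5", 18000.0)],
)

# The view exposes a subset of the columns; salary stays hidden.
conn.execute(
    "CREATE VIEW staff_directory AS SELECT staff_no, name, branch FROM staff"
)

def read_directory(conn):
    """Return (column names, rows) as seen through the view."""
    cur = conn.execute("SELECT * FROM staff_directory ORDER BY staff_no")
    return [d[0] for d in cur.description], cur.fetchall()

before = read_directory(conn)

# The underlying table changes, but the view presents the same picture.
conn.execute("ALTER TABLE staff ADD COLUMN email TEXT")
after = read_directory(conn)

print(before)
print(before == after)
```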
Components of DBMS
Environment
• Hardware
– Can range from a PC to a network of
computers.
• Software
– DBMS, operating system, network software (if
necessary) and also the application programs.
• Data
– Used by the organization, together with a description
of this data called the schema.
• Procedures
– Instructions and rules that should be applied to
the design and use of the database and DBMS.
• People
Roles in the Database
Environment
• Data Administrator (DA)
• Database Administrator (DBA)
• Database Designers (Logical and
Physical)
• Application Programmers
• End Users (naive and sophisticated)
History of Database Systems
• First-generation
– Hierarchical and Network
• Second generation
– Relational
• Third generation
– Object Relational
– Object-Oriented
The DBMS Marketplace
• Relational DBMS companies – Oracle, Sybase – are among
the largest software companies in the world.
• IBM offers its relational DB2 system. With IMS, a
nonrelational system, IBM is by some accounts the largest
DBMS vendor in the world.
• Microsoft offers SQL Server, plus Microsoft Access as a
low-cost desktop DBMS, answered by “lite” systems
from other competitors.
• Relational companies are also challenged by “object-oriented
DB” companies.
• They have countered with “object-relational” systems, which retain
the relational core while allowing type extension as in OO
systems.
Hierarchical Database Model
• History:
– North American Rockwell developed
GUAM (Generalized Update Access
Method)
– In the mid-1960s, Rockwell partnered with IBM to
create Information Management System
(IMS)
– IMS DB/DC led the mainframe database
market in the 1970s and early 1980s
– Represents well how components are
decomposed into parts
Hierarchical Database Model
• Logically represented by an upside
down tree
– Each parent can have many children
– Each child has only one parent
Figure 1.8
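The tree structure can be sketched as a small data structure. This is an illustration only, not how IMS stores data; the Segment class and the car example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Segment:
    """One node of the tree: exactly one parent, any number of children."""
    name: str
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    children: List["Segment"] = field(default_factory=list)

    def add_child(self, name):
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk up the single parent chain to the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

# A component decomposed into parts.
car = Segment("car")
engine = car.add_child("engine")
piston = engine.add_child("piston")
body = car.add_child("body")

print(piston.path())
print([child.name for child in car.children])
```

Every record is reached by one path from the root. A part shared by two assemblies would need a second parent, which this structure cannot express.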
Hierarchical Database Model
• Advantages
– Conceptual simplicity
– Database security and integrity
– Data independence
– Efficiency
• Disadvantages
– Complex implementation
– Difficult to manage; lacks standards
– Lacks structural independence
– Complex applications programming and use
– Implementation limitations (no M:N relationship)
Network Database Model
• History:
– CODASYL (Conference on Data Systems
Languages) created a group to work on
standardization of databases: Database
Task Group (DBTG)
– Identified three database components:
• Network schema (database organization)
• Subschema (views of database per user)
• Data management language
Network Database Model
• Each record can have multiple parents
– Composed of sets, each representing a relationship
– Each set has owner record and member record
– Member may have several owners
– A set represents a 1:M relationship between the
owner and the member
Figure 1.10
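The set structure can be sketched as follows. This is an illustration under stated assumptions, not the CODASYL data management language; the NetworkDB class, the set names, and the record names are hypothetical.

```python
from collections import defaultdict

class NetworkDB:
    """Named sets, each a 1:M relationship from an owner to its members."""

    def __init__(self):
        # set name -> owner record -> list of member records
        self.sets = defaultdict(lambda: defaultdict(list))

    def connect(self, set_name, owner, member):
        self.sets[set_name][owner].append(member)

    def members(self, set_name, owner):
        return list(self.sets[set_name].get(owner, []))

    def owners(self, member):
        """All (set, owner) pairs that the member record belongs to."""
        return sorted(
            (set_name, owner)
            for set_name, occurrences in self.sets.items()
            for owner, members in occurrences.items()
            if member in members
        )

db = NetworkDB()
db.connect("CUSTOMER_INVOICE", "Cust-10", "INV-1001")
db.connect("CUSTOMER_INVOICE", "Cust-10", "INV-1002")
db.connect("SALESREP_INVOICE", "Rep-3", "INV-1001")

print(db.members("CUSTOMER_INVOICE", "Cust-10"))
print(db.owners("INV-1001"))
```

The invoice has two owners, one in each set, which a hierarchical tree cannot express directly.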
Network Database Model
• Advantages
– Conceptual simplicity
– Handles more relationship types
– Data access flexibility
– Promotes database integrity
– Data independence
– Conformance to standards
• Disadvantages
– System complexity
– Lack of structural independence
Relational Database Model
• First proposed by E.F. Codd (IBM) in
1970
• Deployed first on mainframe computers
(e.g. DB2), then also on personal computers
• Examples: Oracle, Informix, SQL Server, DB2
Relational Database Model
• Perceived by user as a collection of
tables for data storage
• Tables are a series of row/column
intersections (a row corresponds to a
record, a column to a field)
• Tables related by sharing common
entity characteristic(s)
• Managed by a relational DBMS (RDBMS)
Relational Database Model (cont.)
Figure 1.11
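Tables related through a shared column can be sketched with SQLite via Python's sqlite3 module. The agent and customer tables and their rows are hypothetical; the link is the common agent_code value, not a physical pointer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent (
        agent_code INTEGER PRIMARY KEY,
        agent_name TEXT NOT NULL
    );
    CREATE TABLE customer (
        cust_no    INTEGER PRIMARY KEY,
        cust_name  TEXT NOT NULL,
        agent_code INTEGER REFERENCES agent(agent_code)
    );
    INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn');
    INSERT INTO customer VALUES
        (1, 'Ramas', 502), (2, 'Dunne', 501), (3, 'Smith', 502);
""")

# An ad hoc query joins the two tables on the shared column.
rows = conn.execute("""
    SELECT c.cust_name, a.agent_name
    FROM customer AS c
    JOIN agent AS a ON a.agent_code = c.agent_code
    ORDER BY c.cust_name
""").fetchall()
print(rows)
```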
Relational Database Model
• Advantages
– Structural independence
– Improved conceptual simplicity
– Easier database design, implementation,
management, and use
– Ad hoc query capability with SQL
– Powerful database management system
• Disadvantages
– Substantial hardware and system software
overhead
– Poor design and implementation are made
easy
– May promote “islands of information”
problems
Advantages of DBMSs
• Control of data redundancy
• Data consistency
• More information from the same amount of
data
• Sharing of data
• Improved data integrity
• Improved security
• Enforcement of standards
• Economy of scale
• Balance of conflicting requirements
• Improved data accessibility and
responsiveness
• Increased productivity
• Improved maintenance through data
independence
• Increased concurrency
• Improved backup and recovery services
Disadvantages of DBMSs
• Complexity
• Size
• Cost of DBMS
• Additional hardware costs
• Cost of conversion
• Performance
• Higher impact of a failure
Database Design
• Database design deals with how the database structure is
planned to store and manage the organization's data
• Importance of Good Design
– Poor design results in unwanted data redundancy
– Poor design generates errors leading to bad decisions
• Practical Approach
– Focus on principles and concepts of database design
– Importance of logical design
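How unwanted redundancy leads to errors can be sketched with SQLite via Python's sqlite3 module. Both designs, their table names, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Poor design: the customer's city is repeated on every order row.
conn.executescript("""
    CREATE TABLE order_flat (
        order_no  INTEGER PRIMARY KEY,
        cust_name TEXT,
        cust_city TEXT
    );
    INSERT INTO order_flat VALUES (1, 'Ramas', 'Leeds'), (2, 'Ramas', 'Leeds');
""")
# An update that touches only one copy leaves the data inconsistent.
conn.execute("UPDATE order_flat SET cust_city = 'York' WHERE order_no = 1")
cities_flat = conn.execute(
    "SELECT DISTINCT cust_city FROM order_flat"
    " WHERE cust_name = 'Ramas' ORDER BY cust_city"
).fetchall()

# Better design: the city is stored once and orders refer to the customer.
conn.executescript("""
    CREATE TABLE customer (
        cust_no   INTEGER PRIMARY KEY,
        cust_name TEXT,
        cust_city TEXT
    );
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        cust_no  INTEGER REFERENCES customer(cust_no)
    );
    INSERT INTO customer VALUES (10, 'Ramas', 'Leeds');
    INSERT INTO orders VALUES (1, 10), (2, 10);
""")
conn.execute("UPDATE customer SET cust_city = 'York' WHERE cust_no = 10")
cities_good = conn.execute(
    "SELECT DISTINCT c.cust_city FROM orders AS o"
    " JOIN customer AS c ON c.cust_no = o.cust_no"
).fetchall()

print(cities_flat)
print(cities_good)
```

In the first design the same customer now appears in two cities, so a report built on it could mislead a decision. In the second design one update keeps every order consistent.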