Interest free loans are available to
students from National Bank of
Pakistan.
Last date for applications is March
28th 2011.
For further information contact
Sundal Mufti (DSA) directly.
Chapter 1:
Overview Of Databases
Database
Meta Data
Points To Ponder
• What happens when a customer moves to a new house ?
• Who should have access to what data in this database ?
• What happens if Mr. and Mrs. Rameez both try and withdraw Rs
500 from account 3322 ?
• What happens if the system crashes just as Mr. Ahsan is depositing
his latest paycheck ?
• What data is the customer concerned with ?
What data is a bank manager concerned with ?
• Send a mail to all customers with saving accounts having greater
than RS 20000 balance
• Let all ALB customers know of a new branch location
Introduction
• Purpose of Database Systems
• View of Data
• Database Languages
• Relational Databases
• Database Design
• Object-based and semi-structured databases
• Data Storage and Querying
• Transaction Management
• Database Architecture
• Database Users and Administrators
• Overall Structure
• History of Database Systems
• Reference Model for DBMS Standardization
Database Management System
(DBMS)
• DBMS contains information about a
particular enterprise
– Collection of interrelated data
– Set of programs to access the data
– An environment that is both convenient and
efficient to use
• Database Applications:
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Online retailers: order tracking, customized
recommendations
– Manufacturing: production, inventory, orders, supply
chain
– Human resources: employee records, salaries, tax
deductions
• Databases touch all aspects of our lives
Purpose of Database Systems
• In the early days, database
applications were built directly on top
of file systems
• Drawbacks of using file systems to
store data:
– Data redundancy and inconsistency
• Multiple file formats, duplication of information
in different files
– Difficulty in accessing data
• Need to write a new program to carry out each
new task
Drawbacks of using file systems (cont.)
– Data isolation — multiple files and formats
– Integrity problems
• Integrity constraints (e.g. account balance > 0)
become “buried” in program code rather than being
stated explicitly
• Hard to add new constraints or change existing
ones
Purpose of Database Systems
(Cont.)
• Drawbacks of using file systems (cont.)
– Atomicity of updates
• Failures may leave database in an inconsistent
state with partial updates carried out
• Example: Transfer of funds from one account to
another should either complete or not happen at all
– Concurrent access by multiple users
• Concurrent accessed needed for performance
• Uncontrolled concurrent accesses can lead to
inconsistencies
– Example: Two people reading a balance and updating it at
the same time
– Security problems
• Hard to provide user access to some, but not all,
data
• Database systems offer solutions to all the
above problems
Levels of Abstraction
• Physical level: describes how a record
(e.g., customer) is stored.
• Logical level: describes data stored in
database, and the relationships among the
data.
type customer = record
customer_id : string;
customer_name : string;
customer_street : string;
customer_city : integer;
end;
Levels of Abstraction
• View level: application programs hide
details of data types. Views can also hide
information (such as an employee’s
salary) for security purposes.
View of Data
An architecture for a database system
The three-schema
architecture.
Instances and Schemas
• Similar to types and variables in programming
languages
• Schema – the logical structure of the database
– Example: The database consists of information about
a set of customers and accounts and the relationship
between them)
– Analogous to type information of a variable in a
program
– Physical schema: database design at the physical
level
– Logical schema: database design at the logical level
• Instance – the actual content of the database at
a particular point in time
– Analogous to the value of a variable
• Physical Data Independence – the ability to
modify the physical schema without changing
the logical schema
– Applications depend on the logical schema
– In general, the interfaces between the various levels
and components should be well defined so that
changes in some parts do not seriously influence
others.
Data Models
• A collection of tools for describing
– Data
– Data relationships
– Data semantics
– Data constraints
• Relational model
• Entity-Relationship data model
(mainly for database design)
• Object-based data models (Object-
oriented and Object-relational)
• Semistructured data model (XML)
• Other older models:
– Network model
– Hierarchical model
Data Manipulation Language
(DML)
• Language for accessing and manipulating the data
organized by the appropriate data model
– DML also known as query language
• Two classes of languages
– Procedural – user specifies what data is required and
how to get those data
– Declarative (nonprocedural) – user specifies what
data is required without specifying how to get those
data
• SQL is the most widely used query language
Data Definition Language
(DDL)
• Specification notation for defining the
database schema
Example:
create table
account (
account-number char(10),
balance integer
)
DDL Cont.
• DDL compiler generates a set of tables stored in a data dictionary
• Data dictionary contains metadata (i.e., data about data)
– Database schema
– Data storage and definition language
• Specifies the storage structure and access methods used
– Integrity constraints
• Domain constraints
• Referential integrity (references constraint in SQL)
• Assertions
– Authorization
Relational Model
• Example of tabular data in the relational model
Attributes
A Sample Relational Database
SQL
• SQL: widely used non-procedural language
– Example: Find the name of the customer with customer-
id 192-83-7465
select customer.customer_name
from customer
where customer.customer_id = ‘192-83-7465’
– Example: Find the balances of all accounts held by the
customer with customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer_id = ‘192-83-7465’
and
depositor.account_number = account.account_number
SQL
• Application programs generally access
databases through one of
– Language extensions to allow embedded SQL
– Application program interface (e.g.,
ODBC/JDBC) which allow SQL queries to be
sent to a database
Database Design
The process of designing the general structure of the
database:
Logical Design – Deciding on the database schema. Database design
requires that we find a “good” collection of relation schemas.
– Business decision – What attributes should we record in the database?
– Computer Science decision – What relation schemas should we have
and how should the attributes be distributed among the various relation
schemas?
Physical Design – Deciding on the physical layout of the database
The Entity-Relationship Model
• Models an enterprise as a collection of
entities and relationships
– Entity: a “thing” or “object” in the enterprise
that is distinguishable from other objects
• Described by a set of attributes
– Relationship: an association among several
entities
•Represented diagrammatically by an entity-relationship diagram:
Object-Relational Data Models
• Extend the relational data model by including
object orientation and constructs to deal with
added data types.
• Allow attributes of tuples to have complex
types, including non-atomic values such as
nested relations.
• Preserve relational foundations, in particular
the declarative access to data, while extending
modeling power.
• Provide upward compatibility with existing
relational languages.
XML: Extensible Markup Language
• Defined by the WWW Consortium (W3C)
• Originally intended as a document markup
language not a database language
• The ability to specify new tags, and to create
nested tag structures made XML a great way
to exchange data, not just documents
• XML has become the basis for all new
generation data interchange formats.
• A wide variety of tools is available for
parsing, browsing and querying XML
documents/data
Storage Management
• Storage manager is a program module that provides the
interface between the low-level data stored in the database
and the application programs and queries submitted to the
system.
• The storage manager is responsible to the following tasks:
– Interaction with the file manager
– Efficient storing, retrieving and updating of data
• Issues:
– Storage access
– File organization
– Indexing and hashing
Query Processing
1.Parsing and translation
2.Optimization
3.Evaluation
Query Processing (Cont.)
• Alternative ways of evaluating a given query
– Equivalent expressions
– Different algorithms for each operation
• Cost difference between a good and a bad way of
evaluating a query can be enormous
• Need to estimate the cost of operations
– Depends critically on statistical information about
relations which the database must maintain
– Need to estimate statistics for intermediate results
to compute cost of complex expressions
Transaction Management
• A transaction is a collection of operations that
performs a single logical function in a database
application
• Transaction-management component ensures
that the database remains in a consistent (correct)
state despite system failures (e.g., power failures
and operating system crashes) and transaction
failures.
• Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
Database Architecture
The architecture of a database systems is greatly influenced by
the underlying computer system on which the database is
running:
• Centralized
• Client-server
• Parallel (multi-processor)
• Distributed
A physical centralized architecture.
Logical two-tier client/server architecture.
Physical two-tier client-server architecture.
FIGURE 2.7
Logical three-tier client/server architecture.
Database Users
Users are differentiated by the way they expect to interact with
the system
• Application programmers – interact with system through
DML calls
• Sophisticated users – form requests in a database query
language
• Specialized users – write specialized database applications
that do not fit into the traditional data processing framework
• Naïve users – invoke one of the permanent application
programs that have been written previously
– Examples, people accessing database over the web, bank tellers,
clerical staff
Database Users
• Database systems • End users
programmer – Use the database system
– Writes the database to achieve some goal
software itself • Application developers
• Database Administrator – Write software to allow end
(DBA) users to interface with the
– Designs & manages the database system
database system
Database Administrator
• Coordinates all the activities of the database system; the
database administrator has a good understanding of the
enterprise’s information resources and needs.
• Database administrator's duties include:
– Schema definition
– Storage structure and access method definition
– Schema and physical organization modification
– Granting user authority to access the database
– Specifying integrity constraints
– Acting as liaison with users
– Monitoring performance and responding to changes in
requirements
Overall System Structure
Interaction of DBMS Components
User Query/Program Database
administrator
Query
Processor inter
p DD Compiler
optim rets qu
izes eries
quer
ies
Transaction
concurrency control Data
Manager
recovery control Dictionary
e men t
an ag
Storage p a c em
f iles
Manager
Database
Component modules of a DBMS and their interactions.
A physical centralized architecture.
History of Database Systems
• 1940's, 50's Initial use of computers as calculators. Limited data,
focus on algorithms. Science, military applications.
• 1950s and early 1960s:
Business uses. Organizational data, customer data, sales,
inventory, accounting, etc. File system based, high emphasis on
applications programs to extract and assimilate data. Larger
amounts of data, relatively simple calculations.
– Data processing using magnetic tapes for storage
• Tapes provide only sequential access
– Punched cards for input
• Late 1960s and 1970s:
– Hard disks allow direct access to data
– Network and hierarchical data models in widespread use
– Ted Codd defines the relational data model
• The relational model. Data separated into individual tables.
Related by keys. Initially required heavy system resources.
Examples: Oracle, Sybase, Informix, Digital RDB, IBM DB2
• IBM Research begins System R prototype
• UC Berkeley begins Ingres prototype
– High-performance (for the era) transaction processing
History (cont.)
• 1980s:
– Research relational prototypes evolve into commercial systems
• SQL becomes industrial standard
– Parallel and distributed database systems
– Local area networks. Workgroups sharing resources such as
files, printers, e-mail.
Client/Server Database resides on a central server, applications
programs run on client PCs attached to the server over a LAN.
– Object-oriented database systems
History (cont.)
• 1990s:
– Large decision support and data-mining applications
– Large multi-terabyte data warehouses
– Emergence of Web commerce
– Distributed Database Systems
• 2000s:
– XML and XQuery standards
– Automated database administration
ANSI/SPARC Architecture
• ANSI - American National • A three-level architecture
Standards Institute – Internal level: For systems
• SPARC - Standards designers
– Conceptual level: For
Planning and
database designers and
Requirements Committee administrators
• Proposed a framework for – External level: For
DBs in 1975 database users
Introduction to Database
Systems
ANSI/SPARC Architecture
User 1 User 2 User 3
External External
External Schemas
View 1 View 2
External/Conceptual Mappings
Conceptual DBA
Conceptual Schema View
Conceptual/Internal Mapping
Internal Schema Stored
Data
Introduction to Database
Systems
ASSIGNMENT 1
5th Edition
• 1.10, 1.11, 1.14, 1.15
• 2.18, 2.19 jhhjjhjjhh
Submission Date:
Wednesday, March 2, 2011.
End of Chapter 1 & 2