File Processing Systems
Billing Program
Purchasing Program
Customer file
Accounts receivable file
Buyer file
Inventory file
Vendor file
Accounts Payable Program
Sales Order Processing Program
Payroll Program
Vendor file
Invoice file
Customer file
Inventory file
Employee file
Database Approach
Order Dept.
Accounting Dept.
Payroll Dept.
Program A
Program B
Program C
Order Filing System
Invoicing System
Payroll System
Back Orders file
Inventory Master file
Customer Master file
Inventory Pricing file
Employee Master file
Database vs. File-based
Miniworld as data source
Universe of Discourse (UOD)
Logically integrated files
Intended users and applications
Shared and self-describing
Compared with file-based approach:
- program-data independence
- multiple views of data
- multi-user transaction processing
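Program-data independence and multiple views can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the student table, its columns, the sample rows, and the student_directory view are hypothetical and do not come from the slides.

```python
import sqlite3

# In-memory database; table, columns, and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, gpa REAL)"
)
conn.execute("INSERT INTO student VALUES (1, 'Ada', 3.9), (2, 'Ben', 2.7)")


def student_names(connection):
    """An 'application program' that names only the columns it needs."""
    rows = connection.execute("SELECT name FROM student ORDER BY student_id")
    return [row[0] for row in rows]


names_before = student_names(conn)

# Program-data independence: the stored structure changes,
# but the program above is not edited and still works.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
names_after = student_names(conn)

# Multiple views: a second user group sees the same data without the gpa column.
conn.execute(
    "CREATE VIEW student_directory AS SELECT student_id, name FROM student"
)
directory = conn.execute(
    "SELECT * FROM student_directory ORDER BY student_id"
).fetchall()

print(names_before, names_after, directory)
```

In a file-based program the record layout is usually compiled into the code, so adding a field would mean changing every program that reads the file.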
Types of Databases and Database Applications
Numeric and Textual Databases (e.g. IRS CADE)
Multimedia Databases (e.g. Cortina)
Geographic Information Systems (GIS)
Data Warehouses
Real-time and Active Databases
Basic Definitions
Database: A collection of related data.
Data: Known facts that can be recorded and have an implicit meaning.
Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.
Database Management System (DBMS): A collection of software to facilitate the creation and maintenance of a DB.
Database System: The DBMS software together with the data. Sometimes, applications are also included.
Database System Environment
Users/Programmers
Application Programs/Queries
DBMS Software
- Software to Process Queries/Programs
- Software to Access Stored Data
Stored DB Definition (Meta-Data)
Stored Database
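The stored database definition (meta-data) can be inspected like any other data. A minimal sketch with Python's built-in sqlite3 module follows. The course table is a hypothetical example, and sqlite_master and PRAGMA table_info are SQLite-specific; other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table, used only to have something in the catalog.
conn.execute(
    "CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT NOT NULL)"
)

# SQLite keeps the database definition in a catalog table named sqlite_master.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(course)")]

print(catalog)
print(columns)
```

Because the definition is stored with the data, a general-purpose program can discover the structure at run time, which is what "self-describing" refers to.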
Why the Database Approach?
Application needs constantly changing
Ad hoc questions need rapid answers
Need to reduce long lead times and high cost in new application development
Lots of data shared throughout the organization
Need to improve data consistency and control access to data
Substantial dedicated programming assistance typically not available
Core DB Technology Trend
Relational Database
Distributed Database
Multi-dimensional Databases
Object Relational Database
Object-Oriented Database
Multimedia Database
Intelligent Database
Data warehousing, data marts, data mining
Web-based Databases
DB Time Line
[Figure: data management capability plotted against time, with axis marks at 1945, 1961, 1970, 1976, 1980, 1985, 1990, 2000]
Labels on the figure, lowest capability to highest: magnetic tape; file management; Hierarchical: IMS; Relational Model: Codd; network model; ER model; PC DBMS; commercial DBMS; SQL Standard; expert, distributed; object-oriented; heterogeneous; multimedia; Client-server; Data Warehousing; Web-based
DBMS
A collection of software that:
- manages different applications for a multi-user database system
- enables users to define, create, and manipulate data
Basic functions:
- multiple user interfaces
- controlled redundancy
- integrity control
- security: authorization & protection
- concurrency & recovery control
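Of the functions listed above, recovery control is the easiest to show in a few lines. The sketch below uses Python's built-in sqlite3 module. The account table, the balances, and the transfer function are hypothetical, and it shows only the rollback of a failed transaction, not concurrency control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: a balance may never go below zero.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(connection, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    # 'with connection' commits on success and rolls back on an exception.
    with connection:
        connection.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )


try:
    transfer(conn, 1, 2, 500)  # account 1 holds only 100, so the debit fails
    failed = False
except sqlite3.IntegrityError:
    failed = True

# The credit to account 2 was undone together with the failed debit.
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(failed, balances)
```

A file-based program would have to write its own logic to undo the first update when the second one fails.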
Example Database (with Conceptual Data Model)
Mini-world for the example: Part of a UNIVERSITY environment.
Some mini-world entities:
- STUDENTs
- COURSEs
- SECTIONs (of COURSEs)
- (academic) DEPARTMENTs
- INSTRUCTORs
Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.
Example Database (with Conceptual Data Model) - 2
Some mini-world relationships:
- SECTIONs are of specific COURSEs
- STUDENTs take SECTIONs
- COURSEs have prerequisite COURSEs
- INSTRUCTORs teach SECTIONs
- COURSEs are offered by DEPARTMENTs
- STUDENTs major in DEPARTMENTs
Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.
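One possible relational rendering of these entities and relationships is sketched below with Python's built-in sqlite3 module. It is not the schema from the slides: every table name, column name, and sample row is an assumption chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Entities become tables; relationships become foreign keys or linking tables.
conn.executescript("""
CREATE TABLE department (dept_code TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    major_dept TEXT REFERENCES department(dept_code)  -- STUDENTs major in DEPARTMENTs
);
CREATE TABLE course (
    course_no TEXT PRIMARY KEY,
    title     TEXT NOT NULL,
    dept_code TEXT NOT NULL REFERENCES department(dept_code)  -- offered by
);
CREATE TABLE prerequisite (  -- COURSEs have prerequisite COURSEs
    course_no TEXT REFERENCES course(course_no),
    prereq_no TEXT REFERENCES course(course_no),
    PRIMARY KEY (course_no, prereq_no)
);
CREATE TABLE section (
    section_id    INTEGER PRIMARY KEY,
    course_no     TEXT NOT NULL REFERENCES course(course_no),     -- SECTION of a COURSE
    instructor_id INTEGER REFERENCES instructor(instructor_id)    -- INSTRUCTORs teach
);
CREATE TABLE enrollment (  -- STUDENTs take SECTIONs
    student_id INTEGER REFERENCES student(student_id),
    section_id INTEGER REFERENCES section(section_id),
    grade      TEXT,
    PRIMARY KEY (student_id, section_id)
);

-- Invented sample rows
INSERT INTO department VALUES ('CS', 'Computer Science');
INSERT INTO instructor VALUES (1, 'King');
INSERT INTO student VALUES (17, 'Smith', 'CS');
INSERT INTO course VALUES ('CS1310', 'Intro to Computer Science', 'CS');
INSERT INTO course VALUES ('CS3320', 'Data Structures', 'CS');
INSERT INTO prerequisite VALUES ('CS3320', 'CS1310');
INSERT INTO section VALUES (85, 'CS3320', 1);
INSERT INTO enrollment VALUES (17, 85, 'A');
""")

# An ad hoc question: which student takes which course, taught by whom?
rows = conn.execute("""
    SELECT s.name, c.title, i.name
    FROM enrollment e
    JOIN student s    ON s.student_id = e.student_id
    JOIN section sec  ON sec.section_id = e.section_id
    JOIN course c     ON c.course_no = sec.course_no
    JOIN instructor i ON i.instructor_id = sec.instructor_id
""").fetchall()
print(rows)
```

The many-to-many relationships (take, prerequisite) get their own tables, while the many-to-one relationships (major in, offered by, teach) fit into a foreign key column.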
Example E-R Model
Relational Logical Schema Example
Example Relational Database Snapshot
Features of the E-R Model
Relationships are just as important as entities
- they are data that need to be stored in the DB
Most relationships are binary, but they may be ternary (or more!) as well
Questions:
- What is the relationship between three binary relationships and a ternary relationship?
- Why are there two relationships between projects and employees?
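One way to explore the first question is to project a ternary relationship onto its three binary relationships and try to rebuild it. The sketch below uses plain Python sets. The supplier/part/project relationship and its values are hypothetical and are not the example from the slides.

```python
# A hypothetical ternary relationship: (supplier, part, project) means
# "this supplier supplies this part to this project".
ternary = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
}

# The three binary relationships obtained by projecting the ternary one.
supplies = {(s, p) for s, p, j in ternary}  # supplier supplies part
used_in = {(p, j) for s, p, j in ternary}   # part is used in project
serves = {(s, j) for s, p, j in ternary}    # supplier serves project

# Rebuild candidate triples by joining the three binary relationships.
rejoined = {
    (s, p, j)
    for (s, p) in supplies
    for (p2, j) in used_in
    if p == p2 and (s, j) in serves
}

# Triples implied by the binary relationships but absent from the original.
spurious = rejoined - ternary
print(spurious)  # {('s1', 'p1', 'j1')}
```

Here s1 supplies p1, p1 is used in j1, and s1 serves j1, yet s1 never supplied p1 to j1. For this data the three binary relationships carry less information than the ternary one.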
Advantages of Using the Database Approach
More information from given data
Ad hoc queries can be performed
Redundancy can be reduced
Inconsistency can be avoided
Security restrictions can be applied
Data independence
More cost-effective: reduced development time, flexibility, economies of scale
Advantages of Using the Database Approach - 2
Controlling redundancy in data storage and in development and maintenance.
Sharing of data among multiple users.
Providing persistent storage for program objects (in object-oriented DBMSs; see Chs. 20-22).
Providing storage structures for efficient query processing.
Advantages of Using the Database Approach - 3
Providing backup and recovery services.
Providing multiple interfaces to different classes of users.
Representing complex relationships among data.
Enforcing integrity constraints on the database.
Drawing inferences and actions using rules.
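Enforcing integrity constraints means the DBMS itself rejects bad data, whichever program submits it. A minimal sketch with Python's built-in sqlite3 module follows. The tables, the 0.0 to 4.0 GPA range, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables with a domain constraint and a referential constraint.
conn.execute("CREATE TABLE department (dept_code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " gpa REAL CHECK (gpa BETWEEN 0.0 AND 4.0),"
    " major_dept TEXT REFERENCES department(dept_code))"
)
conn.execute("INSERT INTO department VALUES ('CS')")
conn.execute("INSERT INTO student VALUES (1, 3.5, 'CS')")  # valid row

rejected = []
bad_rows = [
    (2, 4.7, "CS"),  # gpa outside the allowed range
    (3, 3.0, "XX"),  # major refers to a department that does not exist
]
for bad_row in bad_rows:
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError as exc:
        rejected.append((bad_row[0], str(exc)))

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected)
print(student_count)
```

The rules are declared once in the schema, so no application has to repeat the checks in its own code.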
Disadvantages of Using the Database Approach
Expensive
- hardware, software, personnel, processing overhead, operating cost, etc.
DBMS generality & overhead => performance issues
Increased vulnerability to failure
Recovery is more complex
When should you not use a DBMS?
Additional Implications of the Database Approach
Potential for enforcing standards:
- crucial for the success of database applications in large organizations
- standards refer to data item names, display formats, screens, report structures, meta-data (description of data), etc.
Reduced application development time:
incremental time to add each new application is reduced.
Additional Implications of the Database Approach - 2
Flexibility to change data structures:
- database structure may evolve as new requirements are defined.
Availability of up-to-date information:
- very important for on-line transaction systems such as airline, hotel, car reservations.
Economies of scale:
- by consolidating data and applications across departments, wasteful overlap of resources and personnel can be avoided.
Historical Development of Database Technology
Early Database Applications:
- Hierarchical and Network Models were introduced in the mid-1960s and dominated the 1970s. Much of the worldwide database processing still uses these models.
Relational Model based systems:
- Originally introduced in 1970, this model was heavily researched and experimented with at IBM and universities. Relational DBMS products emerged in the 1980s.
Historical Development of Database Technology - 2
Object-oriented applications:
- OODBMSs were introduced in the late 1980s and early 1990s to cater to the need for complex data processing in CAD and other applications. Their use is not widespread.
Data on the Web and E-commerce Applications:
- The Web contains data in HTML with links among pages. E-commerce is using standards like XML (eXtensible Markup Language).
Extending Database Capabilities
New functionality is being added to DBMSs in the following areas:
- Scientific Applications
- Image Storage and Management
- Audio and Video Data Management
- Data Mining
- Spatial Data Management
- Time Series and Historical Data Management
The above gives rise to new research and development in incorporating new data types, complex data structures, new operations and indexing schemes in database systems.
When NOT to use a DBMS
Main inhibitors (costs) of using a DBMS:
- High initial investment and possible need for additional hardware.
- Overhead for providing generality, security, concurrency control, recovery, and integrity functions.
When a DBMS may be unnecessary:
- If the database and applications are simple, well defined, and not expected to change.
- If there are stringent real-time requirements that may not be met because of DBMS overhead.
- If access to data by multiple users is not required.
When NOT to use a DBMS - 2
When no DBMS may suffice:
- If the database system is not able to handle the complexity of data because of modeling limitations.
- If the database users need special operations not supported by the DBMS.
System Overview
[Figure: layered system overview, top layer first]
OLCP: On-Line Complex Processing
- data mining & knowledge discovery
OLAP: On-Line Analytical Processing
- Data Warehousing
- Data Marts
OLTP: On-Line Transaction Processing
- Operational databases
- Legacy systems
Side labels: EIS, DSS, DP