Unit I
Introduction
Introduction
Basic concepts and definitions
Data dictionary
Database
Database Systems
Data Administrator
Database Administrator
File-Oriented System Versus Database System
Database Language
Introduction: In today's competitive environment, data (or information) and its efficient management is the most critical business objective of an organisation.
A database system is a tool that simplifies the tasks of managing the data and extracting useful information in a timely fashion.
A database system analyses and guides the activities or business purposes of an organisation.
It acts as a central repository, and managers are seeking to use knowledge derived from databases.
Basic concepts and Definitions: With the growing use of computers, organisations are fast migrating from a manual system to a computerised information system, for which the data within the organisation is a basic resource.
Proper organisation and management of data is necessary to run the organisation efficiently.
The efficient use of data for planning, production control, marketing, invoicing, payroll, accounting and other functions in an organisation has a major impact on its competitive edge.
1) Data
2) Information
3) Data Vs. Information
4) Data Warehouse
5) Metadata
6) System Catalog
7) Data Item or fields
8) Records
9) Files
Data: Data may be defined as a known fact that can be recorded and that has implicit meaning.
Data are raw or isolated facts from which the required information is produced.
Data can exist in a variety of forms that have meaning in the user's environment, such as numbers or text on a piece of paper, bits or bytes stored in a computer's memory, or facts stored in a person's mind.
Data can also be objects such as documents, photographic images and even video segments.
The following are examples of data:
In a salesperson's view
Customer-name
Customer-account
Address
Telephone numbers
In an electricity supplier's context
Consumer-name
Consumer-number
Address
Telephone numbers
Unit Consumed
Amount-Payable
In an employer's mind
Employee-Name
Identification-number
Department
Date-of-Birth
Qualification
Skill-Type
Data is the plural of datum, which means a single piece of information.
In practice, data is used as both the singular and the plural form of the word.
The term data is often used to distinguish machine-readable (binary) information from human-readable (textual) information.
Some applications make a distinction between data files and text files.
Data can be represented by numbers, characters or both.
Three-Layer data architecture: To centralise the data throughout the organisation and make them readily available for efficient decision support applications, data is organised in the following layered structure:
1) Operational data
2) Reconciled data
3) Derived data
The following figure shows a three layered
structure that is generally used for data
warehousing applications.
[Figure: Three-layer data architecture. Data layer: Operational data; Reconciled data; Derived data. Alongside: Enterprise (organisation) data model.]
Operational Data are stored in the various operational systems throughout the organisation (both internal and external systems).
Reconciled Data are stored in the organisation's data warehouse and in the operational data store. They are detailed and current data, intended as the single, authoritative source for all decision support applications.
Derived Data are stored in each of the data marts. They are selected, formatted and aggregated for end-user decision support applications.
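The flow through the three layers can be sketched in a few lines of Python. This is only an illustration: the two source systems, their field names and the sample values are hypothetical and do not come from any real warehouse.

```python
from collections import defaultdict

# Operational layer: raw rows as two hypothetical source systems might hold
# them, with different field names, types and spellings.
billing_system = [
    {"cust": "C001", "amount": "1200.50", "region": "north"},
    {"cust": "C002", "amount": "300.00", "region": "South"},
]
sales_system = [
    {"customer_id": "C001", "value": 800.0, "region": "NORTH"},
    {"customer_id": "C003", "value": 150.0, "region": "south"},
]


def reconcile(billing, sales):
    """Reconciled layer: detailed rows with one consistent naming and typing."""
    rows = []
    for r in billing:
        rows.append({"customer_id": r["cust"],
                     "amount": float(r["amount"]),
                     "region": r["region"].lower()})
    for r in sales:
        rows.append({"customer_id": r["customer_id"],
                     "amount": float(r["value"]),
                     "region": r["region"].lower()})
    return rows


def derive_region_totals(reconciled_rows):
    """Derived layer: data selected and aggregated for one decision-support use."""
    totals = defaultdict(float)
    for r in reconciled_rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


reconciled = reconcile(billing_system, sales_system)
region_totals = derive_region_totals(reconciled)
print(region_totals)
```

The reconciled rows stay detailed (one row per source record), while the derived totals are what a data mart for regional reporting might keep.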
Information: Data and information are closely related and are often used interchangeably.
Information is processed, organised or summarised data.
Information is defined as a collection of related data that, when put together, communicates a meaningful and useful message to a recipient, who uses it to make decisions or to interpret the data to get the meaning.
The following diagram shows the process
of Information Cycle.
[Figure: Information cycle, with the stages Input (Data), Processing, Output (Information), Users and Decision.]
Data Versus Information: Today, a database may contain either data or information (or both), according to the organisation's definition and needs.
Data Warehouse: It is a collection of data designed to support management in the decision-making process.
It is a subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes and business intelligence.
Data warehousing is the process whereby organisations extract meaning and inform decision making from their informational assets through the use of data warehouses.
Metadata: Also called data dictionary.
It is the data about the data.
It is also called the system catalog, which is the self-describing nature of the database that provides program-data independence.
Metadata is found in documentation describing source systems.
It is used by developers, who rely on it to help them develop the programs, queries, controls and procedures to manage and manipulate the warehouse data.
It is also used for creating reports and graphs in front-end data access tools, as well as for the management of enterprise-wide data and report changes for the end user.
Types of Metadata: There are three types of metadata.
These metadata are linked to the three-layer data structure as shown in the figure.
The following diagram shows the Metadata
layer.
[Figure: Metadata layer. Data layer: Operational data; Reconciled data; Derived data. Metadata layer: Operational metadata; EDW metadata; Data mart metadata. Alongside: Enterprise (organisation) data model.]
Operational Metadata: It describes the data in the various operational systems that feed the enterprise data warehouse.
Operational metadata typically exist in a number of formats and unfortunately are often of poor quality.
Enterprise Data Warehouse (EDW) Metadata: They are derived from the enterprise data model.
EDW metadata describe the reconciled data layer as well as the rules for transforming operational data to reconciled data.
Data Mart Metadata: They describe the derived data layer and the rules for transforming reconciled data to derived data.
System Catalog: It is a repository of information describing the data in the database, that is, the metadata (or data about data).
It is a system-created database that describes objects, data dictionary information and user access information.
It also describes table-related data such as table name, table creators or owners, column names, data types, data size, foreign keys and primary keys, indexed files, authorized users, user access privileges and so forth.
It is created by the database management system and the information is stored in system files.
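As one concrete illustration, SQLite (used here through Python's standard sqlite3 module, which is a choice made for this sketch and not something these notes prescribe) keeps its catalog in a built-in table called sqlite_master and exposes column details through PRAGMA table_info. The employee table below is a hypothetical example, and underscores replace the hyphens of names such as EMP-NO because hyphens are not valid in unquoted SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " emp_lname TEXT,"
    " emp_fname TEXT,"
    " emp_salary INTEGER)"
)

# sqlite_master is SQLite's built-in catalog table: one row per schema object.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk)
# for every column of the named table.
columns = [
    (row[1], row[2], bool(row[5]))
    for row in conn.execute("PRAGMA table_info(employee)").fetchall()
]
conn.close()

print(catalog_rows)
for name, col_type, is_primary_key in columns:
    print(name, col_type, is_primary_key)
```

The DBMS maintains these catalog entries itself when the table is created; the program only reads them.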
Data Items or Fields: A data item is the smallest unit of data that has meaning to its user.
It is traditionally called a field or data element.
It is represented in the database by a value.
E.g. names, telephone numbers, bill amount, address, basic pay, gross pay, net pay and so on.
Records: A record is a collection of logically related fields or data items, with each field possessing a fixed number of bytes and having a fixed data type.
A record consists of values for each field.
The grouping of data items together forms a record.
These records are retrieved or updated using programs.
Files: A file is a sequence of related records.
All records in a file are of the same record type.
If every record in the file has exactly the same size (in bytes), the file is said to be made up of fixed-length records.
If different records in the file have different sizes, the file is said to be made up of variable-length records.
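The difference between the two record styles can be sketched with Python's struct module. The byte layout is an assumption made for illustration only: 4-byte integers for the employee number and salary, 10 bytes for the last name and 15 bytes for the first name. The delimiter-based format used for the variable-length case is likewise hypothetical.

```python
import struct

# Fixed-length layout (assumed): unsigned 4-byte integer, 10-byte string,
# 15-byte string, unsigned 4-byte integer. "<" means no padding is inserted.
RECORD_FORMAT = "<I10s15sI"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 4 + 10 + 15 + 4 bytes


def pack_fixed(emp_no, lname, fname, salary):
    # struct pads short byte strings with NUL bytes up to the field width.
    return struct.pack(RECORD_FORMAT, emp_no,
                       lname.encode("ascii"), fname.encode("ascii"), salary)


def unpack_fixed(record):
    emp_no, lname, fname, salary = struct.unpack(RECORD_FORMAT, record)
    return (emp_no,
            lname.rstrip(b"\x00").decode("ascii"),
            fname.rstrip(b"\x00").decode("ascii"),
            salary)


def pack_variable(emp_no, lname, fname, salary):
    # Variable-length alternative: fields separated by a delimiter, so each
    # record is only as long as its contents.
    return f"{emp_no}|{lname}|{fname}|{salary}\n".encode("ascii")


employees = [(112233, "SMITH", "JOHN", 4500), (123456, "KUMAR", "RAJEEV", 6000)]
fixed_sizes = [len(pack_fixed(*e)) for e in employees]
variable_sizes = [len(pack_variable(*e)) for e in employees]
print(RECORD_SIZE, fixed_sizes, variable_sizes)
```

With fixed-length records the position of the n-th record can be computed directly as n times RECORD_SIZE, whereas variable-length records have to be scanned or indexed.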
Data Dictionary: Also called an information repository.
Data dictionaries are mini database management systems that manage metadata.
The following diagram shows the general
structure of Data Dictionary.
[Figure: General structure of a data dictionary. Data dictionary inputs; data dictionary maintenance; information about database structure (schema, user views, locations); information about database use (by programs, by users); report generation.]
It contains descriptions of the database structure and database use.
The data in the data dictionary are maintained by several programs, which produce diverse reports on demand.
Most data dictionary systems are stand-alone systems, and their database is maintained independently of the DBMS, thereby enabling inconsistencies between the database and the data dictionary.
Data Dictionary is usually a part of the
system catalog that is generated for each
database. A useful data dictionary system
usually stores and manages the following
types of information:
1) Descriptions of the schema of the
database.
2) Detailed information on physical database design, such as storage structures, access paths and file and record sizes.
3) Description of the database users, their
responsibilities and their access rights.
4) High level descriptions of the database
transactions and applications and of the
relationships of users to transactions.
5) The relationship between database transactions and the data items referenced by them.
6) Usage statistics such as frequencies of
queries and transactions and access counts
to different portions of the database.
Consider the example of a manufacturing company, M/s ABC Motors Ltd., which has decided to computerise its activities related to various departments.
The manufacturing department wants to store details such as model no., model description and so on.
The personnel department wants to keep facts such as employee number, last name, first name and so on.
The following figures illustrate the two data processing (DP) files, namely the Inventory file of the manufacturing department and the Employee file of the personnel department.
INVENTORY
MOD-NO    MOD-NAME    MOD-DESC       UNIT-PRICE
L-800     LEGEND      LUXURY CAR     4000000
M-1000    MAHARAJA    LUXURY CAR     3000000
C-1200    CRUZE       ZIP DRIVE      1200000
P-2000    PANTHERA    SPORTS RIDE    800000
R-121     ROVER       SPORTS RIDE    2000000

EMPLOYEE
EMP-NO    EMP-LNAME    EMP-FNAME    EMP-SALARY
106519    MATHEW       THOMAN       4000
112233    SMITH        JOHN         4500
123456    KUMAR        RAJEEV       6000
123243    MARTIN       JOSE         3500
Though the manufacturing and personnel departments are interested in keeping track of their inventory and employee details respectively, the data processing (DP) department of M/s ABC Motors Ltd. would be interested in tracking and managing entities, that is, the data dictionary.
The following figures show a sample of the data dictionary for the two files (Fields file and Files file).
FIELDS FILE
FIELD-NAME    FIELD-TYPE    FIELD-LENGTH
MOD-NO        CHAR
MOD-NAME      ALPHA         10
MOD-DESC      ALPHA         15
UNIT-PRICE    NUMERIC
EMP-NO        NUMERIC
EMP-LNAME     ALPHA         10
EMP-FNAME     ALPHA         15
EMP-SALARY    NUMERIC

FILES FILE
FILE-NAME     FILE-LENGTH
INVENTORY     2000
EMPLOYEE      3000
In the manufacturing department's INVENTORY file, each row (MOD-NO, MOD-NAME, MOD-DESC, UNIT-PRICE) represents the details of a model of car.
In the personnel department's EMPLOYEE file, each row (EMP-NO, EMP-LNAME, EMP-FNAME, EMP-SALARY) represents the details about an employee.
In the data dictionary, each row of the FIELDS file (FIELD-NAME, FIELD-TYPE, FIELD-LENGTH) represents one of the fields in one of the application data files processed by the data processing department.
Also, each row of the FILES file (FILE-NAME, FILE-LENGTH) represents one of the application files processed by the data processing department.
The data dictionary also keeps track of the relationships among the entities, because how these entities interrelate is important in the data processing environment.
The following figure shows the links (relationships) between fields and files.
These relationships are important for the data processing department.
[Figure: Data dictionary showing relationships. The rows of the FIELDS file (FIELD-NAME, FIELD-TYPE, FIELD-LENGTH) are linked to the rows of the FILES file (INVENTORY, 2000; EMPLOYEE, 3000).]
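The sample data dictionary and its field-to-file links can be modelled as a small Python sketch. The class and function names (FieldEntry, FileEntry, fields_of, file_of, dangling_links) are hypothetical, and a length of None marks the fields for which the sample shows no length.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileEntry:
    name: str
    length: int


@dataclass(frozen=True)
class FieldEntry:
    name: str
    field_type: str
    length: Optional[int]  # None where the sample dictionary shows no length
    file: str              # link to the FILES entry that contains this field


FILES = [FileEntry("INVENTORY", 2000), FileEntry("EMPLOYEE", 3000)]

FIELDS = [
    FieldEntry("MOD-NO", "CHAR", None, "INVENTORY"),
    FieldEntry("MOD-NAME", "ALPHA", 10, "INVENTORY"),
    FieldEntry("MOD-DESC", "ALPHA", 15, "INVENTORY"),
    FieldEntry("UNIT-PRICE", "NUMERIC", None, "INVENTORY"),
    FieldEntry("EMP-NO", "NUMERIC", None, "EMPLOYEE"),
    FieldEntry("EMP-LNAME", "ALPHA", 10, "EMPLOYEE"),
    FieldEntry("EMP-FNAME", "ALPHA", 15, "EMPLOYEE"),
    FieldEntry("EMP-SALARY", "NUMERIC", None, "EMPLOYEE"),
]


def fields_of(file_name):
    """Record construction: which fields appear in which file."""
    return [f.name for f in FIELDS if f.file == file_name]


def file_of(field_name):
    """Follow the link from a field back to its file, or None if unknown."""
    for f in FIELDS:
        if f.name == field_name:
            return f.file
    return None


def dangling_links():
    """Fields whose link points at a file missing from the FILES list."""
    known = {f.name for f in FILES}
    return [f.name for f in FIELDS if f.file not in known]


print(fields_of("EMPLOYEE"))
print(file_of("UNIT-PRICE"))
print(dangling_links())
```

A check like dangling_links is the kind of consistency test a stand-alone dictionary needs, since nothing else keeps its links in step with the real files.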
Components of Data Dictionaries: A data dictionary contains the following components.
Entities
Attributes
Relationships
Key
Entities: An entity is a real physical object or an event.
Any item about which information is stored is called an entity.
Attributes: An attribute is a property or characteristic (field) of an entity.
The following figure shows an example of
entity set and its attributes.
ENTITY SET       ATTRIBUTES
(A) INVENTORY    MOD-NO, MOD-NAME, MOD-DESC, UNIT-PRICE
(B) EMPLOYEE     EMP-NO, EMP-LNAME, EMP-FNAME, EMP-SALARY
Relationships: The associations or the ways that different entities relate to each other are called relationships.
The relationship between any pair of entities of a data dictionary can have value to some part or department of the organisation.
Some examples of common data dictionary relationships are as follows:
Record construction: for example, which field appears in which records.
Security: for example, which user has access to which file.
Impact of change: for example, which programs might be affected by changes to which files.
Physical residence: for example, which files reside on which storage devices or disk packs.
Program data requirement: for example, which programs use which files.
Responsibility: for example, which users are responsible for updating which files.
Relationships could be of the following
types:
1) One-to-One (1:1) relationships
2) One-to-Many(1:m) relationships
3) Many-to-Many (n:m) relationships
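The three types can be told apart by counting how many partners each entity has on either side of a relationship. The sketch below is a hypothetical helper, not part of any data dictionary product, and it classifies only the links it is given; the true type of a relationship is a design rule, not something a sample can prove. It also reports the mirror case m:1 separately.

```python
from collections import defaultdict


def classify_relationship(pairs):
    """Classify (left, right) links as '1:1', '1:m', 'm:1' or 'n:m'."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    left_has_many = any(len(v) > 1 for v in rights_per_left.values())
    right_has_many = any(len(v) > 1 for v in lefts_per_right.values())
    if left_has_many and right_has_many:
        return "n:m"
    if left_has_many:
        return "1:m"
    if right_has_many:
        return "m:1"
    return "1:1"


# Record construction: one file holds many fields, each field is in one file.
file_to_field = [("INVENTORY", "MOD-NO"), ("INVENTORY", "MOD-NAME"),
                 ("EMPLOYEE", "EMP-NO")]
# Security: hypothetical users, each possibly allowed to access several files.
user_to_file = [("user_a", "INVENTORY"), ("user_a", "EMPLOYEE"),
                ("user_b", "EMPLOYEE")]
# Physical residence: hypothetical case of one file per disk pack.
file_to_disk = [("INVENTORY", "disk_1"), ("EMPLOYEE", "disk_2")]

print(classify_relationship(file_to_field))
print(classify_relationship(user_to_file))
print(classify_relationship(file_to_disk))
```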
Key: The data item or field that a computer uses to identify a record in a database system is referred to as a key.
A key is a single attribute or combination of attributes of an entity set that is used to identify one or more instances of the set.
There are various types of keys:
1) Primary key
2) Concatenated key
3) Secondary key
4) Super key
A primary key is used to uniquely identify a record.
It is also called the entity identifier.
e.g. EMP-NO, MOD-NO.
When more than one data item is used to identify a record, it is called a concatenated key.
e.g. EMP-NO and EMP-FNAME, MOD-NO and MOD-TYPE.
A secondary key is used to identify all those records which have a certain property.
It is an attribute or combination of attributes that may not be a concatenated key but that classifies the entity set on a particular characteristic.
A super key includes any number of attributes that possess a uniqueness property.
A primary key is a minimal super key.
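These definitions can be checked mechanically on sample data. The sketch below uses a few rows of the INVENTORY file; the function names are hypothetical. Uniqueness in a sample only shows that an attribute set could serve as a key for those rows, since a real key is declared by the designer for all possible data.

```python
from itertools import combinations

INVENTORY = [
    {"MOD-NO": "L-800", "MOD-NAME": "LEGEND", "MOD-DESC": "LUXURY CAR", "UNIT-PRICE": 4000000},
    {"MOD-NO": "M-1000", "MOD-NAME": "MAHARAJA", "MOD-DESC": "LUXURY CAR", "UNIT-PRICE": 3000000},
    {"MOD-NO": "C-1200", "MOD-NAME": "CRUZE", "MOD-DESC": "ZIP DRIVE", "UNIT-PRICE": 1200000},
    {"MOD-NO": "R-121", "MOD-NAME": "ROVER", "MOD-DESC": "SPORTS RIDE", "UNIT-PRICE": 2000000},
]


def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_minimal_key(rows, attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


def select_by_secondary_key(rows, attr, value):
    """Secondary key use: fetch every record that has a certain property."""
    return [row["MOD-NO"] for row in rows if row[attr] == value]


print(is_super_key(INVENTORY, ("MOD-NO",)))
print(is_super_key(INVENTORY, ("MOD-DESC",)))
print(is_minimal_key(INVENTORY, ("MOD-NO", "MOD-DESC")))
print(select_by_secondary_key(INVENTORY, "MOD-DESC", "LUXURY CAR"))
```

Here MOD-NO alone is a minimal super key and so a candidate for primary key, the pair (MOD-NO, MOD-DESC) is a super key that is not minimal, and MOD-DESC behaves as a secondary key because it selects a group of records and not a single one.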
Active and Passive Data Dictionaries: An active data dictionary, also called an integrated data dictionary, is managed automatically by the data management software.
A passive data dictionary, also called a non-integrated data dictionary, is one used only for documentation purposes.
DATABASE: A database is defined as a collection of logically related data stored together that is designed to meet the information needs of an organisation.
It is basically an electronic filing cabinet, which contains computerized data files.
It can contain one data file (a very small database) or a large number of data files (a large database) depending on organisational needs.
The names, addresses, telephone numbers and so on of people, which we maintain in an address book, store in computer storage (such as a floppy or hard disk) or keep in a Microsoft Excel worksheet, are examples of a database.
Since it is a collection of related data with an implicit meaning, it is a database.
A database is designed, built and populated with data for a specific purpose.
A database can be of any size and of varying complexity.
It may be generated and maintained manually or it may be computerized.
A computerized database may be created and maintained either by a group of application programs written specifically for the task or by a DBMS.
A database consists of the following four
components
1) Data Item
2) Relationships
3) Constraints
4) Schema
[Figure: Components of database. Data items; Relationships; Constraints; Schema; Physical database.]
A data item is a distinct piece of information.
A relationship represents a correspondence (or communication) between various data elements.
Constraints are predicates that define correct database states.
A schema describes the organisation of data and relationships within the database.
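The four components can be seen together in a small sketch using SQLite through Python's sqlite3 module, chosen here only for illustration. The department table and its values are hypothetical and do not appear in the ABC Motors example; underscores replace hyphens because hyphens are not valid in unquoted SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Schema: the organisation of the data items and their relationships.
conn.executescript(
    """
    CREATE TABLE department (
        dept_no   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_no     INTEGER PRIMARY KEY,
        emp_lname  TEXT NOT NULL,
        -- Constraint: a predicate that every stored salary must satisfy.
        emp_salary INTEGER CHECK (emp_salary >= 0),
        -- Relationship: each employee row refers to one department row.
        dept_no    INTEGER REFERENCES department (dept_no)
    );
    """
)

conn.execute("INSERT INTO department VALUES (10, 'PERSONNEL')")
conn.execute("INSERT INTO employee VALUES (112233, 'SMITH', 4500, 10)")


def try_insert(sql):
    """Report whether the DBMS accepts a change or rejects it as incorrect."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


negative_salary = try_insert("INSERT INTO employee VALUES (123456, 'KUMAR', -1, 10)")
unknown_dept = try_insert("INSERT INTO employee VALUES (123243, 'MARTIN', 3500, 99)")
valid_row = try_insert("INSERT INTO employee VALUES (106519, 'MATHEW', 4000, 10)")
print(negative_salary, unknown_dept, valid_row)
```

The two rejected inserts would have moved the database into a state that violates a declared predicate, which is exactly what a constraint is meant to prevent.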
DATABASE SYSTEM: A database system, also called a database management system (DBMS), is a generalized software system for manipulating databases.
It is basically a computerized record-keeping system, which stores information and allows users to add, delete, change, retrieve and update that information on demand.
A DBMS is also a collection of programs that enables users to create and maintain a database.
It is a general-purpose software system that facilitates the process of defining and manipulating databases for various applications.
Typically a DBMS has three basic components, as shown in the following figure.
[Figure: Database system. Users/programmers; application programs/queries; DBMS components: data description language (DDL), software to process queries/programs (DML/SQL), software for controlled access of stored data; stored DB definition (metadata); stored physical database.]
The components shown in the above figure provide the following facilities.
The data description language (DDL) allows users to define the database and to specify the data types, the data structures and the constraints on the data to be stored in the database, usually through a data definition language.
The data manipulation language (DML) and query facility allow users to insert, update, delete and retrieve data from the database.
The software for controlled access provides controlled access to the database.
The database and the DBMS software together are called a database system.
A database system overcomes the limitations of the traditional file-oriented system.
Operations performed on Database Systems
1) Inserting new data into existing data
files
2) Adding new files to the database
3) Retrieving data from existing files
4) Changing data in existing files
5) Deleting data from existing files
6) Removing existing files from the database
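The six operations map naturally onto DDL and DML statements. The sketch below uses SQLite through Python's sqlite3 module purely as an illustration and treats a table as the counterpart of a "file", which is an assumption of this example. The changed price of 1250000 is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 2) Adding a new file (table) to the database: a DDL statement.
conn.execute(
    "CREATE TABLE inventory ("
    " mod_no TEXT PRIMARY KEY,"
    " mod_name TEXT,"
    " mod_desc TEXT,"
    " unit_price INTEGER)"
)

# 1) Inserting new data into the existing file: DML.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        ("L-800", "LEGEND", "LUXURY CAR", 4000000),
        ("C-1200", "CRUZE", "ZIP DRIVE", 1200000),
        ("R-121", "ROVER", "SPORTS RIDE", 2000000),
    ],
)

# 3) Retrieving data from the existing file.
luxury = conn.execute(
    "SELECT mod_no FROM inventory WHERE mod_desc = ?", ("LUXURY CAR",)
).fetchall()

# 4) Changing data in the existing file.
conn.execute(
    "UPDATE inventory SET unit_price = ? WHERE mod_no = ?", (1250000, "C-1200")
)
new_price = conn.execute(
    "SELECT unit_price FROM inventory WHERE mod_no = ?", ("C-1200",)
).fetchall()[0][0]

# 5) Deleting data from the existing file.
conn.execute("DELETE FROM inventory WHERE mod_no = ?", ("R-121",))
remaining = conn.execute("SELECT COUNT(*) FROM inventory").fetchall()[0][0]

# 6) Removing the existing file from the database: DDL again.
conn.execute("DROP TABLE inventory")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
conn.close()

print(luxury, new_price, remaining, tables_left)
```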
Data Administrator (DA): A DA is an identified individual in the organisation who has central responsibility for controlling data.
Data are important assets of an organisation.
A DA is the senior-level person in the organisation whose job is to decide what data should be stored in the database and to establish policies for maintaining and dealing with that data.
A DA decides the content of the database at an abstract level.
This process performed by the DA is known as logical or conceptual database design.
DAs are managers and need not be technical persons; however, knowledge of information technology helps them in an overall understanding and appreciation of the system.
DATABASE ADMINISTRATOR (DBA): A DBA is an individual person or group of persons with an overview of one or more databases who controls the design and the use of these databases.
The DBA provides the necessary technical support for implementing policy decisions about databases.
The DBA is responsible for the overall control of the system at a technical level and, unlike a DA, is an IT professional.
FUNCTIONS AND RESPONSIBILITIES OF DBAs
1) Defining conceptual schema and database schema.
2) Storage structure and access-method definition.
3) Granting authorization to the users
4) Physical organisation modification.
5) Routine maintenance
6) Job monitoring
FILE ORIENTED SYSTEM VS. DATABASE SYSTEM