Understanding Data Management Concepts

The document discusses different data concepts and database structures. It defines data elements like characters, fields, records, files and databases. It also describes common database structures such as hierarchical, network, relational, object-oriented and multidimensional. Database management systems use these structures to organize data and define relationships. Database administrators develop databases using tools like data definition language and data dictionaries.

Uploaded by

Rohit Chand

UNIT 5

MANAGING DATA RESOURCES


5.1 Fundamental Data Concepts
• A conceptual framework of several levels of
data has been devised that differentiates among
different groupings, or elements, of data.
• Thus, data may be logically organized into
characters, fields, records, files, and databases,
just as writing can be organized into letters,
words, sentences, paragraphs, and documents.
Character
• The most basic logical data element is the
character, which consists of a single
alphabetic, numeric, or other symbol.
• A character is the most basic element of data
that can be observed and manipulated.
Field
• The next higher level of data is the field, or data item.
• A field consists of a grouping of related characters.
• For example, the grouping of alphabetic characters in
a person’s name may form a name field (or typically,
last name, first name, and middle initial fields), and the
grouping of numbers in a sales amount forms a sales
amount field. Specifically, a data field represents an
attribute (a characteristic or quality) of some entity
(object, person, place, or event).
Record
• All of the fields used to describe the attributes
of an entity are grouped to form a record. Thus,
a record represents a collection of attributes
that describe a single instance of an entity.
• An example is a person’s payroll record, which
consists of data fields describing attributes such
as the person’s name, Social Security number,
and rate of pay.
File
• A group of related records is a data file
(sometimes referred to as a table or flat file).
• When it is independent of any other files
related to it, a single table may be referred to
as a flat file.
• Files are also classified by their permanence, for
example, a payroll master file versus a payroll weekly
transaction file.
• A transaction file, therefore, would contain records of
all transactions occurring during a period and might
be used periodically to update the permanent records
contained in a master file.
• A history file is an outdated transaction or master file
retained for backup purposes or for long-term
historical storage, called archival storage.
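The levels above can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the `PayrollRecord` class, the employee IDs, and the hours are all made-up assumptions. Each attribute is a field, each object is a record, the dictionary plays the role of a master file, and the list plays the role of a weekly transaction file used to update it.

```python
from dataclasses import dataclass


@dataclass
class PayrollRecord:
    # Each attribute is a field; one object is one record.
    employee_id: str
    name: str
    rate_of_pay: float
    hours_to_date: float = 0.0


# Master file: the permanent records, keyed by employee_id (hypothetical data).
master_file = {
    "E1": PayrollRecord("E1", "Ada Perez", 20.0),
    "E2": PayrollRecord("E2", "Lin Zhou", 25.0),
}

# Transaction file: hours worked during one period (hypothetical data).
weekly_transactions = [("E1", 40.0), ("E2", 38.5), ("E1", 2.0)]


def apply_transactions(master, transactions):
    """Update the permanent master records from one period's transactions."""
    for employee_id, hours in transactions:
        master[employee_id].hours_to_date += hours


apply_transactions(master_file, weekly_transactions)
print(master_file["E1"].hours_to_date)  # 42.0
```

Once the transactions have been applied, the transaction list could be archived as a history file.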
Database
• A database is an integrated collection of
logically related data elements.
• A database consolidates records previously
stored in separate files into a common pool of
data elements that provides data for many
applications.
• The data stored in a database are independent
of the application programs using them and of
the type of storage devices on which they are
stored.
• Thus, databases contain data elements
describing entities and relationships among
entities.
• All the data we use are stored in some type of
database.
5.2 Database Structures
• The relationships among the many individual data
elements stored in databases are based on one of
several logical data structures, or models.
• Database management system (DBMS) packages are
designed to use a specific data structure to provide
end users with quick, easy access to information
stored in databases.
• Five fundamental database structures are the
hierarchical, network, relational, object-oriented,
and multidimensional models.
Hierarchical Structure
• Hierarchical structure is the traditional structure.
• Early mainframe DBMS packages used the
hierarchical structure, in which the relationships
between records form a hierarchy or treelike
structure.
• In this model, all records are dependent and
arranged in multilevel structures, consisting of
one root record and any number of subordinate
levels.
• Thus, all of the relationships among records are
one-to-many because each data element is
related to only one element above it.
• The data element or record at the highest level
of the hierarchy is called the root element.
• Any data element can be accessed by moving
progressively downward from a root and along
the branches of the tree until the desired record
is located.
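As a rough sketch of this top-down access, the Python below builds a small tree and walks downward from the root until it finds the record it wants. The `Node` class, the `find_path` helper, and the department/project/employee names are assumptions chosen for illustration; real hierarchical DBMS packages use their own storage and navigation commands.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # one parent, many children


def find_path(node, target, path=()):
    """Search downward from a node; return the path of names to target, or None."""
    path = path + (node.name,)
    if node.name == target:
        return path
    for child in node.children:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None


# The root record with two subordinate levels (hypothetical data).
root = Node("Department", [
    Node("Project A", [Node("Employee 1"), Node("Employee 2")]),
    Node("Project B", [Node("Employee 3")]),
])

print(find_path(root, "Employee 3"))
# ('Department', 'Project B', 'Employee 3')
```

Because every record has exactly one parent, there is only one path from the root to any record.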
Network Structure
• The network structure can represent more complex logical
relationships and is still used by some mainframe DBMS
packages.
• It allows many-to-many relationships among records; that
is, the network model can access a data element by
following one of several paths because any data element
or record can be related to any number of other data
elements.
• It should be noted that neither the hierarchical nor the
network data structure is commonly found in
modern organizations.
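To contrast this with the hierarchical tree, here is a minimal Python sketch of many-to-many links. The employee and project names and the two lookup tables are assumptions made for the example, not part of any particular network DBMS.

```python
from collections import defaultdict

# Each pair links one employee to one project (hypothetical data).
assignments = [
    ("Employee 1", "Project A"),
    ("Employee 1", "Project B"),
    ("Employee 2", "Project A"),
]

projects_of = defaultdict(set)  # employee -> projects
members_of = defaultdict(set)   # project  -> employees
for employee, project in assignments:
    projects_of[employee].add(project)
    members_of[project].add(employee)

# Employee 1 belongs to two projects, so it can be reached along either path.
print(sorted(projects_of["Employee 1"]))  # ['Project A', 'Project B']
print(sorted(members_of["Project A"]))    # ['Employee 1', 'Employee 2']
```

A record with two "parents", such as Employee 1 here, is exactly what the hierarchical structure cannot represent directly.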
Relational Structure
• The relational model is the most widely used of
these database structures.
• It is used by most microcomputer DBMS
packages, as well as by most midrange and
mainframe systems.
• In the relational model, all data elements within
the database are viewed as being stored in the
form of simple two-dimensional tables,
sometimes referred to as relations.
• The tables in a relational database are flat files
that have rows and columns.
• Each row represents a single record in the file,
and each column represents a field.
• The major difference between a flat file and a
database is that a flat file can only have data
attributes specified for one file.
• In contrast, a database can specify data
attributes for multiple files simultaneously and
can relate the various data elements in one
file to those in one or more other files.
• Database management system packages
based on the relational model can link data
elements from various tables to provide
information to users.
• The relational model can relate data in any one
file with data in another file if both files share a
common data element or field.
• Because of this, information can be created by
retrieving data from multiple files even if they
are not all stored in the same physical location.
• Because of the widespread use of relational
models, an abundance of commercial products
exists to create and manage them.
• Some examples of relational database
management systems (RDBMSs) are Oracle,
DB2, SQL Server, and Microsoft Access.
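The idea of relating two tables through a common field can be tried with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; the `department` and `employee` tables and their contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
    INSERT INTO employee VALUES (10, 'Ada', 1), (11, 'Lin', 2), (12, 'Sam', 1);
""")

# dept_id is the common field that lets rows in one table be related to the other.
rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(rows)  # [('Ada', 'Sales'), ('Lin', 'Finance'), ('Sam', 'Sales')]
```

Each row is a record and each column is a field; the join produces information that neither table holds on its own.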
Multidimensional Structure

• The multidimensional model is a variation of
the relational model that uses multidimensional
structures to organize data and express the
relationships between data.
• You can visualize multidimensional structures
as cubes of data and cubes within cubes of
data.
• Each side of the cube is considered a
dimension of the data.
• A major benefit of multidimensional databases is
that they provide a compact and easy-to-understand
way to visualize and manipulate data
elements that have many interrelationships.
• So multidimensional databases have become the
most popular database structure for the analytical
databases that support online analytical
processing (OLAP) applications, in which fast
answers to complex business queries are expected.
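One way to picture a data cube is as a set of cells, each addressed by one value per dimension. The Python sketch below is a toy model under that assumption: the three dimensions (product, region, month), the sales figures, and the `roll_up` helper are invented for illustration, and real OLAP servers store and index cubes very differently.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "month")

# Each cell of the cube is addressed by (product, region, month).
sales_cube = {
    ("Pens", "East", "Jan"): 100,
    ("Pens", "West", "Jan"): 80,
    ("Ink", "East", "Jan"): 40,
    ("Pens", "East", "Feb"): 120,
}


def roll_up(cube, keep):
    """Sum the cells over every dimension except the one named in keep."""
    axis = DIMENSIONS.index(keep)
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)


print(roll_up(sales_cube, "product"))  # {'Pens': 300, 'Ink': 40}
print(roll_up(sales_cube, "month"))    # {'Jan': 220, 'Feb': 120}
```

Looking at the same cells by product, by region, or by month corresponds to viewing the cube from a different side.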
Object-oriented Structure

• The object-oriented model uses the concept of
objects.
• An object consists of data values describing the
attributes of an entity, plus the operations that
can be performed upon the data.
• This encapsulation capability allows the object-
oriented model to handle complex types of data
(graphics, pictures, voice, and text) more easily
than other database structures.
• The object-oriented model also supports
inheritance; that is, new objects can be
automatically created by replicating some or
all of the characteristics of one or more parent
objects.
• An object-oriented DBMS can work with
complex data types such as document and
graphic images, video clips, audio segments,
and other subsets of Web pages much more
efficiently than relational database
management systems.
• However, major relational DBMS vendors have
countered by adding object-oriented modules
to their relational software.
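Encapsulation and inheritance are easiest to see in an object-oriented programming language. The Python classes below are a sketch only; `MediaObject`, `VideoClip`, and their attributes are hypothetical names, and an object-oriented DBMS would additionally handle storing and retrieving such objects.

```python
class MediaObject:
    """An object bundles attribute values with the operations allowed on them."""

    def __init__(self, title, size_bytes):
        self.title = title
        self.size_bytes = size_bytes

    def describe(self):
        return f"{self.title} ({self.size_bytes} bytes)"


class VideoClip(MediaObject):
    """Inherits title, size_bytes, and describe() from the parent, then extends them."""

    def __init__(self, title, size_bytes, seconds):
        super().__init__(title, size_bytes)
        self.seconds = seconds

    def describe(self):
        # Reuse the parent's operation and add the video-specific attribute.
        return f"{super().describe()}, {self.seconds} s"


clip = VideoClip("Product demo", 2048, 30)
print(clip.describe())  # Product demo (2048 bytes), 30 s
```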
5.3 Database Development

• Database management systems like Oracle,
SQL Server, and Microsoft Access
allow database developers to build
databases.
• We store data in a database to improve
the integrity and security of the data.
• Database developers use the data definition
language (DDL) in database management systems to
develop and specify the data contents, relationships,
and structure of each database, as well as to modify
these database specifications when necessary.
• Such information is cataloged and stored in a
database of data definitions and specifications called
a data dictionary, or metadata repository, which is
managed by the database management software and
maintained by the database administrator.
• A data dictionary is a database management
catalog or directory containing metadata (i.e.,
data about data).
• A data dictionary relies on a specialized
database software component to manage a
database of data definitions, which is
metadata about the structure, data elements,
and other characteristics of an organization’s
databases.
• The database administrator can query data
dictionaries to report the status of any aspect
of a firm’s metadata.
• The administrator can then make changes to
the definitions of selected data elements.
• The database administrator controls the
overall operation of the database at the technical
level.
• Developing a large database of complex data types
can be a complicated task.
• Database administrators and database design
analysts work with end users and systems analysts to
model business processes and the data they require.
• Then they determine (1) what data definitions
should be included in the database and (2) what
structures or relationships should exist among the
data elements.
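The DDL and data dictionary ideas in this section can be tried out with SQLite through Python's built-in `sqlite3` module. This is a small sketch under stated assumptions: the `customer` table is made up, and SQLite's `sqlite_master` catalog and `PRAGMA table_info` stand in for the much richer data dictionaries of enterprise DBMS packages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL statements define the structure of the database, then modify it.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("ALTER TABLE customer ADD COLUMN city TEXT")

# The catalog is metadata: data about the tables rather than the data in them.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

print(tables)   # ['customer']
print(columns)  # [('customer_id', 'INTEGER'), ('name', 'TEXT'), ('city', 'TEXT')]
```

Querying the catalog this way is a small-scale version of an administrator reporting on the definitions of selected data elements.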
Data Resource Management

• Data, a vital organizational resource, need to
be managed like other important business
assets.
• Today’s business enterprises cannot survive or
succeed without quality data about their
internal operations and external environment.
• That’s why organizations and their managers
need to practice data resource management, a
managerial activity that applies information
systems technologies like database
management, data warehousing, and other
data management tools to the task of
managing an organization’s data resources to
meet the information needs of their business
stakeholders.
5.4 Types of Databases

• Continuing developments in information
technology and its business applications have
resulted in the evolution of several major
types of databases.
Operational Database
• Operational databases store detailed data needed
to support the business processes and operations of
a company.
• They are also called subject area databases (SADB),
transaction databases, and production databases.
• Examples are a customer database, human
resource database, inventory database, and other
databases containing data generated by business
operations.
Distributed Database

• Many organizations replicate and distribute
copies or parts of databases to network
servers at a variety of sites.
• These distributed databases can reside on
network servers on the World Wide Web, on
corporate intranets or extranets, or on other
company networks.
• Distributed databases may be copies of operational
or analytical databases, hypermedia or discussion
databases, or any other type of database.
• Replication and distribution of databases improve
database performance at end-user worksites.
• Ensuring that the data in an organization’s
distributed databases are consistently and
concurrently updated is a major challenge of
distributed database management.
Advantages of Distributed Databases
• One primary advantage of a distributed
database lies with the protection of valuable
data. By having databases distributed in
multiple locations, the negative impact of any
catastrophic event like a fire or damage to the
media holding the data can be minimized.
• Another advantage of distributed databases is
found in their storage requirements. Often, a large
database system may be distributed into smaller
databases based on some logical relationship
between the data and the location. Because
multiple databases in a distributed system can be
joined together, each location has control of its
local data while all other locations can access any
database in the company if so desired.
Challenges of Distributed Databases
• The primary challenge is the maintenance of
data accuracy. If a company distributes its
database to multiple locations, any change to
the data in one location must somehow be
updated in all other locations.
• One additional challenge associated with
distributed databases is the extra computing
power and bandwidth necessary to access
multiple databases in multiple locations.
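The accuracy challenge can be illustrated with a toy Python model in which each site holds its own copy of a small price table. The site names, the helper functions, and the "copy the new value everywhere" strategy are assumptions made for the sketch; production systems rely on far more careful replication protocols.

```python
# Three hypothetical sites, each holding a copy of the same price table.
replicas = {site: {"pen": 1.50} for site in ("site_a", "site_b", "site_c")}


def update_local(replicas, site, key, value):
    """Change the data at one location only."""
    replicas[site][key] = value


def propagate(replicas, source, key):
    """Push the source site's value to every other location."""
    for site, copy in replicas.items():
        if site != source:
            copy[key] = replicas[source][key]


def is_consistent(replicas, key):
    """True when every location holds the same value for key."""
    return len({copy[key] for copy in replicas.values()}) == 1


update_local(replicas, "site_a", "pen", 1.75)
consistent_before = is_consistent(replicas, "pen")  # False: other sites are stale
propagate(replicas, "site_a", "pen")
consistent_after = is_consistent(replicas, "pen")   # True: all copies agree
print(consistent_before, consistent_after)  # False True
```

Every call to `propagate` touches all other sites, which hints at the extra computing power and bandwidth that distribution requires.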
External Database

• Access to a wealth of information from
external databases is available for a fee from
commercial online services and with or
without charge from many sources on the
World Wide Web.
• Web sites provide an endless variety of
hyperlinked pages of multimedia documents
in hypermedia databases for you to access.
• Data are available in the form of statistics on
economic and demographic activity from statistical
databanks, or you can view or download abstracts or
complete copies of hundreds of newspapers,
magazines, newsletters, research papers, and other
published material and periodicals from bibliographic
and full-text databases.
• Whenever you use a search engine like Google or
Yahoo to look up something on the Internet, you are
using an external database – a very, very large one.
Hypermedia Database
• The rapid growth of Web sites on the Internet
and corporate intranets and extranets has
dramatically increased the use of databases of
hypertext and hypermedia documents.
• A Web site stores such information in a
hypermedia database consisting of hyperlinked
pages of multimedia (text, graphic and
photographic images, video clips, audio
segments, and so on).
• That is, from a database management point of
view, the set of interconnected multimedia
pages on a Web site is a database of
interrelated hypermedia page elements,
rather than interrelated data records.
5.5 Data Warehousing and Data Mining

• A data warehouse stores data extracted from the
various operational, external, and other
databases of an organization.
• It is a central source of the data that have been
cleaned, transformed, and cataloged so that they
can be used by managers and other business
professionals for data mining, online analytical
processing, and other forms of business analysis,
market research, and decision support.
• Data warehouses may be subdivided into data
marts, which hold subsets of data from the
data warehouse that focus on specific aspects
of a company, such as a department or a
business process.
• Data stored in a data warehouse are nonvolatile,
subject-oriented, and time-variant.
The figure below shows the components of a complete
data warehouse system.
• Data from various operational and external
databases are captured, cleaned, and
transformed into data that can be better used
for analysis.
• This acquisition process might include activities
like consolidating data from several sources,
filtering out unwanted data, correcting incorrect
data, converting data to new data elements, or
aggregating data into new data subsets.
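The acquisition activities listed above can be sketched as a tiny pipeline in Python. The two source lists, the field names, and the cleaning rules are all assumptions invented for the example; a real warehouse load would use dedicated extraction and transformation tools.

```python
from collections import defaultdict

# Two hypothetical operational sources with slightly inconsistent data.
store_sales = [{"region": "east", "amount": "100.0"},
               {"region": "West", "amount": "80.5"}]
web_sales = [{"region": "East", "amount": "19.5"},
             {"region": "", "amount": "5.0"}]


def acquire(*sources):
    """Consolidate, filter, correct, convert, and aggregate the source rows."""
    consolidated = [row for source in sources for row in source]  # consolidate
    kept = [row for row in consolidated if row["region"]]         # filter out rows with no region
    totals = defaultdict(float)
    for row in kept:
        region = row["region"].title()           # correct inconsistent capitalization
        totals[region] += float(row["amount"])   # convert text to a number, then aggregate
    return dict(totals)


warehouse_totals = acquire(store_sales, web_sales)
print(warehouse_totals)  # {'East': 119.5, 'West': 80.5}
```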
Data Mining

• Data mining is a major use of data warehouses.
In data mining, the data in a data warehouse
are analyzed to reveal hidden patterns and
trends in historical business activity.
• This analysis can be used to help managers
make decisions about strategic changes in
business operations to gain competitive
advantages in the marketplace.
• Data mining can discover new correlations,
patterns, and trends in vast amounts of
business data (frequently several terabytes of
data) stored in data warehouses.
• Data mining software uses advanced pattern
recognition algorithms, as well as a variety of
mathematical and statistical techniques, to sift
through mountains of data to extract previously
unknown strategic business information.
For example, many companies use data mining to:
• Perform market-basket analysis to identify new
product bundles.
• Find root causes of quality or manufacturing
problems.
• Prevent customer attrition and acquire new
customers.
• Cross-sell to existing customers.
• Profile customers with more accuracy.
• In data mining, patterns and rules are used to
guide decision making and forecast the effects
of those decisions.
• E.g., finding patterns in customer data for
one-to-one marketing.
• Types of information obtainable from data mining:
➢ Association: occurrences linked to a single
event, e.g., pen and ink
➢ Sequence: events linked over time, e.g., an iPhone,
then a cover
➢ Classification
➢ Clustering
➢ Forecasting
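As a small illustration of the association type (and of market-basket analysis), the Python below counts how often pairs of items appear in the same purchase. The baskets are made-up data and the `support` measure is one common, simple way to score an association; commercial data mining software applies more advanced algorithms to far larger data sets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
baskets = [
    {"pen", "ink", "paper"},
    {"pen", "ink"},
    {"pen", "paper"},
    {"ink", "stapler"},
]

# Count every pair of items that occurs together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1


def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(baskets)


print(support("pen", "ink"))      # 0.5
print(support("pen", "stapler"))  # 0.0
```

A pair with high support, such as pen and ink here, is a candidate for a product bundle or a cross-selling offer.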
Related Terminology
1. Predictive analysis: uses data mining techniques,
historical data, and assumptions about future
conditions to predict the outcome of events
2. Text mining: extracts key elements from large
unstructured data sets (e.g., from e-mail)
3. Web mining: discovery and analysis of useful
patterns and information from the World Wide Web, e.g., to
understand customer behaviour or evaluate the
effectiveness of a website
• Web content mining: knowledge extracted
from content of web pages
• Web structure mining: e.g. links to and from
web page
• Web usage mining: user interaction data
recorded by web server
5.6 The Database Management Approach

a. DBMS
b. Database interrogation
c. Database maintenance
d. Database development

Assignment
