CH01
Data vs Information:
Data consists of raw facts
- Not yet processed to reveal meaning to the end user
- Building blocks of information
Information results from processing raw data to reveal meaning
- Requires context
- Bedrock of knowledge
- Should be accurate, relevant and timely
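A minimal sketch of the distinction, assuming a hypothetical list of survey scores on a 1 to 5 scale (the data and the summarize function are illustrative, not from the notes). The raw list is data; the summary, which adds context, is information:

```python
from statistics import mean

# Raw data: bare facts with no context (hypothetical survey scores, 1 to 5).
raw_scores = [4, 5, 3, 5, 4, 2, 5]

def summarize(scores, scale_max=5):
    """Process raw scores into information by adding context and meaning."""
    return {
        "responses": len(scores),
        "average": round(mean(scores), 2),
        "share_top_score": round(scores.count(scale_max) / len(scores), 2),
    }

info = summarize(raw_scores)
print(info)
```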
Roles and advantages of a DBMS:
DBMS: Intermediary between the user and the database
- Enables data to be shared
- Presents the end user with an integrated view of data
- Provides more efficient and effective data management
- Improves sharing, security, integration, access, decision-making,
productivity, etc…
Types of databases:
Single user database: Supports one user at a time
- Desktop database: single user database on a personal computer
Multiuser database: Supports multiple users at the same time
- Workgroup database: supports a small number of users or a specific
department
- Enterprise database: supports many users across many departments
Classification by location:
- Centralized database: data located at a single site
- Distributed database: data distributed across different sites
- Cloud database: created and maintained using cloud data services that
provide defined performance measures for the database
Classification by data type:
- General purpose database: contains a wide variety of data used in
multiple disciplines
- Discipline specific database: contains data focused on a specific subject
area
- Operational database: designed to support a company’s day to day
operations
Analytical database: stores historical data and business metrics used exclusively
for tactical or strategic decision making
- Data warehouse: stores data in a format optimized for decision support
- Online analytical processing (OLAP): tools for retrieving, processing, and
modeling data from the data warehouse
- Business intelligence: captures and processes business data to generate
information that supports decision making
Why is database design so important?
Focuses on design of database structure that will be used to store and manage
end-user data
- Well designed database: facilitates data management and generates
accurate and valuable information
- Poorly designed database: causes difficult-to-trace errors that may lead to
poor decision making
Data Redundancy:
Unnecessarily storing the same data at different places
- Islands of information ( i.e., scattered data locations )
- Increases the probability of having different versions of the same data
Possible results of uncontrolled data redundancy:
- Poor data security
- Data inconsistency
- Data entry errors
- Data integrity problems
Structural and data dependence:
Structural dependence
- Access to a file is dependent on its own structure
- All file system programs must be modified to conform to a new file structure
Structural independence
- File structure is changed without affecting the application’s ability to
access its own data
Data dependence
- Data access changes when data storage characteristics change
Data independence
- Data storage characteristics are changed without affecting the program’s
ability to access the data
Practical significance of data dependence is the difference between logical and
physical format
Data Anomalies:
Develop when not all of the required changes in the redundant data are made
successfully
- Update anomalies
- Insertion anomalies
- Deletion anomalies
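A minimal sketch of how an update anomaly produces inconsistent data, assuming a hypothetical file-style layout in which an agent's phone number is repeated in every customer record (all names and numbers are made up for illustration):

```python
# Hypothetical redundant records: the agent's phone is stored once per customer.
customers = [
    {"customer": "Ramas", "agent": "Leah Hahn", "agent_phone": "615-882-1244"},
    {"customer": "Dunne", "agent": "Leah Hahn", "agent_phone": "615-882-1244"},
    {"customer": "Smith", "agent": "Alex Alby", "agent_phone": "713-228-1249"},
]

def phones_per_agent(rows):
    """Collect every distinct phone number stored for each agent."""
    phones = {}
    for row in rows:
        phones.setdefault(row["agent"], set()).add(row["agent_phone"])
    return phones

def inconsistent_agents(rows):
    """Agents stored with more than one phone number (data inconsistency)."""
    return sorted(a for a, p in phones_per_agent(rows).items() if len(p) > 1)

before = inconsistent_agents(customers)

# Update anomaly: the new number is written to only one of the redundant copies.
customers[0]["agent_phone"] = "615-555-0100"

after = inconsistent_agents(customers)
print(before, after)
```

Storing the phone number once, in a separate agent record, would leave only one place to update.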
CH02
Data modeling:
Data modeling: creating a specific data model for a determined problem domain
- Data model: simple representation of complex real world data structures
- Useful for supporting a specific problem domain
- Model: Abstraction of a more complex real world object or event
Importance of data models:
The importance of data models cannot be overstated
- Facilitates communication
- Gives various views of the database
- Organizes data for various users
- Provides an abstraction for the creation of a good database
Business rules:
Brief, precise, and unambiguous description of a policy, procedure, or principle
- Create and enforce actions within that organization’s environment
- Establish entities, relationships, and constraints
Sources of business rules:
- Company managers
- Policy makers
- Department managers
- Written documentation
- Direct interviews with end users
Reasons for identifying and documenting business rules:
- Standardize company’s view on data
- Serve as a communication tool between users and designers
- Assist designers
- Understand the nature, role, and scope of data and business
processes
- Develop appropriate relationship participation rules and constraints
- Create an accurate data model
Translating business rules into data model components:
Business rules set the stage for the proper identification of entities,
attributes, relationships, and constraints
- Nouns translate into entities
- Verbs translate into relationships among entities
Relationships are bidirectional
- Questions to identify the relationship type:
- How many instances of B are related to one instance of A?
- How many instances of A are related to one instance of B?
The Relational model
Described as producing an "automatic transmission" database that replaced the
"standard transmission" databases that preceded it
- Based on a relation: matrix composed of intersecting tuples ( rows ) and
attributes ( columns )
Describes a precise set of data manipulation constructs
Relational database management system ( RDBMS )
- Performs basic functions provided by the hierarchical and network DBMS
systems
- Makes the relational model easier to understand and implement
- Hides the complexities of the relational model from the users
SQL based relational database application
End user interface:
- Allows end users to interact with the data
Collection of tables stored in the database:
- Each table is independent from another
- Rows in different tables are related based on common values in
common attributes
SQL engine:
- Executes all queries
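A minimal sketch of the three parts working together, using Python's built-in sqlite3 module as the SQL engine and a short script standing in for the end user interface. The AGENT and CUSTOMER tables and their contents are hypothetical; the point is that rows in independent tables are related only through common values in the shared AGENT_CODE attribute:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE AGENT (AGENT_CODE INTEGER PRIMARY KEY, AGENT_NAME TEXT);
    CREATE TABLE CUSTOMER (
        CUS_CODE INTEGER PRIMARY KEY,
        CUS_NAME TEXT,
        AGENT_CODE INTEGER
    );
    INSERT INTO AGENT VALUES (501, 'Alby'), (502, 'Hahn');
    INSERT INTO CUSTOMER VALUES
        (10010, 'Ramas', 502), (10011, 'Dunne', 501), (10012, 'Smith', 502);
""")

# The SQL engine matches rows on the common AGENT_CODE values.
rows = conn.execute("""
    SELECT CUS_NAME, AGENT_NAME
    FROM CUSTOMER JOIN AGENT ON CUSTOMER.AGENT_CODE = AGENT.AGENT_CODE
    ORDER BY CUS_NAME
""").fetchall()
print(rows)
conn.close()
```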
The Entity Relationship Model
Graphical representation of entities and their relationships in a database
structure
- Entity relationship diagram ( ERD ): uses graphic representations to
model database components
- Entity instance or entity occurrence: rows in the relational table
- Attributes: describe particular characteristics
- Connectivity: term used to label the relationship types
NoSQL
Advantages
- High scalability, availability, and fault tolerance are provided
- Uses low cost commodity hardware
- Supports big data
- Key value model improves storage efficiency
Disadvantages
- Complex programming is required
- There is no relationship support
- There is no transaction integrity support
- In terms of data consistency, it provides an eventually consistent model
CH03
Keys
A key consists of one or more attributes that determine other attributes
- Ensure that each row in a table is uniquely identifiable
- Establish relationships among tables and ensure the integrity of the
data
Primary key: attribute or combination of attributes that uniquely identifies a row
Dependencies
Determination
State in which knowing the value of one attribute makes it
possible to determine the value of another
- Establishes the role of a key
- Based on the relationships among the attributes
Functional dependence: value of one or more attributes determines
the value of one or more other attributes
- Determinant: attribute whose value determines another
- Dependent: attribute whose value is determined by the other
attribute
Fully functional dependence: Entire collection of attributes in the
determinant is necessary for the relationship
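A minimal sketch of checking these dependencies against sample rows. The function names and the student rows are hypothetical. Note that sample data can only refute a dependency; it cannot prove one holds for all possible rows:

```python
from itertools import combinations

def determines(rows, determinant, dependent):
    """True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True

def fully_determines(rows, determinant, dependent):
    """True if the dependency holds and no proper subset of the determinant suffices."""
    if not determines(rows, determinant, dependent):
        return False
    for size in range(1, len(determinant)):
        for subset in combinations(determinant, size):
            if determines(rows, subset, dependent):
                return False
    return True

students = [
    {"STU_NUM": 1, "STU_LNAME": "Bowser", "CLASS": "DB", "GRADE": "A"},
    {"STU_NUM": 1, "STU_LNAME": "Bowser", "CLASS": "OS", "GRADE": "B"},
    {"STU_NUM": 2, "STU_LNAME": "Smith", "CLASS": "DB", "GRADE": "B"},
]

print(determines(students, ["STU_NUM"], ["STU_LNAME"]))
print(determines(students, ["STU_NUM"], ["GRADE"]))
print(fully_determines(students, ["STU_NUM", "CLASS"], ["GRADE"]))
print(fully_determines(students, ["STU_NUM", "CLASS"], ["STU_LNAME"]))
```

In this sample, STU_NUM and CLASS together are needed to determine GRADE, so that dependence is full; STU_LNAME depends on STU_NUM alone, so its dependence on the pair is not full.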
Types of keys
Several types of keys are used in the relational model
- Composite key: key that is composed of more than one attribute
- Key attribute: attribute that is part of a key
- Superkey: key that can uniquely identify any row in the table
- Candidate key: minimal superkey
- Foreign key: primary key of one table that has been placed into
another table to create a common attribute
- Secondary key: used strictly for data retrieval purposes
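A minimal sketch of the superkey and candidate key definitions, assuming a hypothetical three-row student table. Keys are properly a design decision based on business rules; this only shows which attribute sets are unique in the sample rows:

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)

def candidate_keys(rows):
    """Minimal superkeys: superkeys that contain no smaller superkey."""
    attributes = list(rows[0])
    found = []
    # Smaller attribute sets are tried first, so any superkey that contains
    # an already-found key is not minimal and is skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if is_superkey(rows, attrs) and not any(set(k) <= set(attrs) for k in found):
                found.append(attrs)
    return found

students = [
    {"STU_NUM": 1, "STU_EMAIL": "a@example.edu", "STU_LNAME": "Smith"},
    {"STU_NUM": 2, "STU_EMAIL": "b@example.edu", "STU_LNAME": "Smith"},
    {"STU_NUM": 3, "STU_EMAIL": "c@example.edu", "STU_LNAME": "Jones"},
]

print(is_superkey(students, ("STU_NUM", "STU_LNAME")))  # superkey, but not minimal
print(is_superkey(students, ("STU_LNAME",)))
print(candidate_keys(students))
```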
Entity integrity: condition in which each row in the table has its own unique
identity
- All of the values in the primary key must be unique
- No key attribute in the primary key can contain a null
Null: absence of any data value
- Unknown attribute value, known but missing attribute value or
inapplicable condition
Referential integrity: every reference to an entity instance by another entity
instance is valid
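A minimal sketch of both integrity rules being enforced by a DBMS, using Python's built-in sqlite3 module. The tables and values are hypothetical. Two SQLite-specific details are assumptions of this sketch: foreign keys are only enforced after PRAGMA foreign_keys = ON, and the primary keys are declared TEXT NOT NULL because SQLite would otherwise auto-generate a value for a null INTEGER PRIMARY KEY:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when on
conn.executescript("""
    CREATE TABLE AGENT (AGENT_CODE TEXT NOT NULL PRIMARY KEY, AGENT_NAME TEXT);
    CREATE TABLE CUSTOMER (
        CUS_CODE TEXT NOT NULL PRIMARY KEY,
        CUS_NAME TEXT,
        AGENT_CODE TEXT REFERENCES AGENT (AGENT_CODE)
    );
    INSERT INTO AGENT VALUES ('501', 'Alby');
    INSERT INTO CUSTOMER VALUES ('10010', 'Ramas', '501');
""")

def try_insert(sql, params):
    """Attempt one insert; report whether the DBMS accepted or rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = {
    # Entity integrity: primary key values must be unique and not null.
    "duplicate_pk": try_insert("INSERT INTO AGENT VALUES (?, ?)", ("501", "Other")),
    "null_pk": try_insert("INSERT INTO AGENT VALUES (?, ?)", (None, "Nobody")),
    # Referential integrity: a foreign key must match an existing row or be null.
    "dangling_fk": try_insert("INSERT INTO CUSTOMER VALUES (?, ?, ?)", ("10011", "Dunne", "999")),
    "null_fk": try_insert("INSERT INTO CUSTOMER VALUES (?, ?, ?)", ("10012", "Smith", None)),
}
print(results)
conn.close()
```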
Relationships within the relational database
One to many ( 1:M )
- Norm for relational databases
One to one ( 1:1 )
- One entity can be related to only one other entity and vice versa
Many to many ( M:N )
- Implemented by creating a new entity in 1:M relationships with the original
entities
- Composite entity: helps avoid problems inherent to M:N relationships
- Includes the primary keys of tables to be linked
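A minimal sketch of an M:N relationship implemented through a composite entity, using Python's built-in sqlite3 module. The STUDENT, CLASS, and ENROLL tables and their rows are hypothetical. ENROLL sits in a 1:M relationship with each original table, and its primary key combines their primary keys:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_LNAME TEXT);
    CREATE TABLE CLASS (CLASS_CODE INTEGER PRIMARY KEY, CRS_CODE TEXT);
    -- Composite (bridge) entity linking STUDENT and CLASS.
    CREATE TABLE ENROLL (
        STU_NUM INTEGER NOT NULL REFERENCES STUDENT (STU_NUM),
        CLASS_CODE INTEGER NOT NULL REFERENCES CLASS (CLASS_CODE),
        ENROLL_GRADE TEXT,
        PRIMARY KEY (STU_NUM, CLASS_CODE)
    );
    INSERT INTO STUDENT VALUES (321452, 'Bowser'), (324257, 'Smithson');
    INSERT INTO CLASS VALUES (10014, 'ACCT-211'), (10018, 'CIS-220');
    INSERT INTO ENROLL VALUES
        (321452, 10014, 'C'), (321452, 10018, 'A'), (324257, 10018, 'B');
""")

# One student can appear in many ENROLL rows ...
classes_per_student = conn.execute("""
    SELECT STU_LNAME, COUNT(*)
    FROM STUDENT JOIN ENROLL USING (STU_NUM)
    GROUP BY STU_NUM, STU_LNAME ORDER BY STU_LNAME
""").fetchall()

# ... and one class can appear in many ENROLL rows.
students_per_class = conn.execute("""
    SELECT CRS_CODE, COUNT(*)
    FROM CLASS JOIN ENROLL USING (CLASS_CODE)
    GROUP BY CLASS_CODE, CRS_CODE ORDER BY CRS_CODE
""").fetchall()

print(classes_per_student, students_per_class)
conn.close()
```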
CH04
Entity relationship model
Forms the basis of an ERD
- Conceptual database as viewed by the end user
Database’s main components
- Entities
- Attributes
- Relationships
Entities
Object of interest to the end user
- Refers to the entity set and not to a single entity occurrence
An entity in the ERM corresponds to a table ( not a row ) in the relational environment
- ERM refers to a table row as an entity instance or entity occurrence
In Chen, Crow’s Foot, and UML notations, an entity is represented by a
rectangle that contains the entity’s name
- The entity name, a noun, is usually written in all capital letters
Attributes
Characteristics of entities:
- Required attribute: must have a value and cannot be left empty
- Optional attribute: does not require a value and can be left empty
- Domain: a set of possible values for a given attribute
- Identifier: one or more attributes that uniquely identify each entity
instance
- Composite identifier: primary key composed of more than one
attribute
- Composite attribute: attribute that can be subdivided to yield
additional attributes
- Simple attribute: attribute that cannot be subdivided
- Single valued attribute: attribute that has only a single value
- Multivalued attributes: attributes that have many values
Options for implementing multivalued attributes:
- Create several new attributes, one for each component of the
original multivalued attribute
- Develop a new entity composed of the original multivalued
attribute’s components
Derived attribute: attribute whose value is calculated from other attributes
- Derived using an algorithm
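A minimal sketch of a derived attribute, assuming a hypothetical Employee entity whose age is calculated from the stored date of birth instead of being stored itself:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Employee:
    emp_num: int
    emp_dob: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: calculated from emp_dob, never stored."""
        had_birthday = (today.month, today.day) >= (self.emp_dob.month, self.emp_dob.day)
        return today.year - self.emp_dob.year - (0 if had_birthday else 1)

emp = Employee(emp_num=101, emp_dob=date(1990, 6, 15))
print(emp.age(date(2024, 6, 14)))  # day before the birthday: 33
print(emp.age(date(2024, 6, 15)))  # on the birthday: 34
```

Computing the value on demand keeps it current at the cost of some processing on every read; storing it would save the computation but require updates to keep it correct.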
Relationships, Connectivity, Cardinality
Association between entities that always operates in both directions
- Participants: entities that participate in a relationship
Connectivity: describes the relationship classification
- Includes 1:1, 1:M, and M:N relationships
Cardinality: expresses the minimum and maximum number of entity occurrences
associated with one occurrence of related entity
- In the ERD, cardinality is indicated by placing the appropriate numbers
beside the entities, using the format ( X, Y )
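A minimal sketch of reading a ( min, max ) cardinality off sample data and comparing it with a business rule. The professor and class rows, the function names, and the rule limits are all hypothetical:

```python
from collections import Counter

def observed_cardinality(parent_keys, child_rows, fk):
    """Return the (min, max) number of child rows found per parent key."""
    counts = Counter(row[fk] for row in child_rows)
    per_parent = [counts[key] for key in parent_keys]  # Counter gives 0 for absent keys
    return min(per_parent), max(per_parent)

def within(observed, allowed):
    """True if the observed (min, max) falls inside the allowed (min, max)."""
    return allowed[0] <= observed[0] and observed[1] <= allowed[1]

professors = [101, 102, 103]
classes = [
    {"CLASS_CODE": 1, "PROF_NUM": 101},
    {"CLASS_CODE": 2, "PROF_NUM": 101},
    {"CLASS_CODE": 3, "PROF_NUM": 101},
    {"CLASS_CODE": 4, "PROF_NUM": 102},
]

observed = observed_cardinality(professors, classes, "PROF_NUM")
print(observed)
print(within(observed, (0, 4)))  # rule: a professor teaches at most four classes
print(within(observed, (1, 4)))  # rule: every professor teaches at least one class
```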
Existence dependence
Existence dependence
- Entity exists in the database only when it is associated with other related
entity occurrences
Existence independence
- Entity exists apart from all of its related entities
- Referred to as a strong entity or regular entity
Relationship strength
Weak ( Non-identifying ) relationship
- Primary key of the related entity does not contain a primary key
component of the parent entity
Strong ( identifying ) relationship
- Primary key of the related entity contains a primary key component of the
parent entity
Conditions of a weak entity:
- Existence dependent
- Has a primary key that is partially or totally derived from parent entity in
the relationship
Database designer determines whether an entity is weak
- Based on business rules
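A minimal sketch of a weak entity in a strong ( identifying ) relationship, using Python's built-in sqlite3 module. The EMPLOYEE and DEPENDENT tables are hypothetical. DEPENDENT's primary key contains the parent's primary key, and the ON DELETE CASCADE clause is one assumed way of expressing existence dependence; a designer could choose a different delete rule:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when on
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT);
    -- Weak entity: its primary key is partially derived from EMPLOYEE.
    CREATE TABLE DEPENDENT (
        EMP_NUM INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM) ON DELETE CASCADE,
        DEP_NUM INTEGER NOT NULL,
        DEP_FNAME TEXT,
        PRIMARY KEY (EMP_NUM, DEP_NUM)
    );
    INSERT INTO EMPLOYEE VALUES (1001, 'Callifante');
    INSERT INTO DEPENDENT VALUES (1001, 1, 'Annelise'), (1001, 2, 'Jorge');
""")

def orphan_rejected():
    """A dependent cannot be stored without an existing parent employee."""
    try:
        with conn:
            conn.execute("INSERT INTO DEPENDENT VALUES (9999, 1, 'Nobody')")
        return False
    except sqlite3.IntegrityError:
        return True

rejected = orphan_rejected()

# Removing the parent removes its existence-dependent rows as well.
with conn:
    conn.execute("DELETE FROM EMPLOYEE WHERE EMP_NUM = 1001")
remaining = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]

print(rejected, remaining)
conn.close()
```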
Developing an ERD
Activities involved in building an ERD:
- Create a detailed narrative of the organization’s description of operations
- Identify business rules based on descriptions
- Identify main entities and relationships from the business rules
- Develop the initial ERD
- Identify the attributes and primary keys that adequately describe entities
- Revise and review ERD