
Data Modeling

Hello! Instructor introduction


1. Instructor:
About the course
• The Data Modeling course teaches you how to design and structure data so that it is easy to store, retrieve, and manage in databases. It is a key skill in fields like data science, software development, and database administration.
Course Objective
• The objective of this course is to cover:
• Basics of Data Modeling
• Types of Data Models - Conceptual, Logical, Physical
• Entity-Relationship Diagrams (ERDs)
• Practical Modeling Tools
Course Schedule

Day 1 – Introduction to Logical Data Modeling
Day 2 – Project Context and Drivers
Day 3 – Conceptual Data Modelling
Day 4 – Advanced Relationships & Advanced Data Modeling Techniques
Day 5 – Data Model Normalization, Verification and Validation


Day 1 – Agenda

• Introduction to Logical Data Modeling

• Importance of logical data modeling in requirements
• When to use logical data models
• Relationship between logical and physical data models
• Elements of a logical data model
• Read a high-level data model
• Data model prerequisites
• Data model sources of information
• Developing a logical data model
Day 2 – Agenda

• Project Context and Drivers

• Importance of well-defined solution scope
• Functional decomposition diagram
• Context-level data flow diagram
• Sources of requirements
• Data flow diagrams
• Use case models
• Workflow models
• Business rules
• State diagrams
• Class diagrams
• Types of modeling projects
  o Transactional business systems
  o Business intelligence and data warehouse systems
  o Integration and consolidation of existing systems
  o Maintenance of existing systems
  o Enterprise analysis
  o Commercial off-the-shelf applications
Day 3 – Agenda

• Conceptual Data Modelling

• Discovering entities
• Defining entities
• Documenting an entity
• Identifying attributes
• Distinguishing between entities and attributes
• Model fundamental relationships
• Cardinality of relationships: one-to-one, one-to-many, many-to-many
• Is the relationship mandatory or optional?
• Naming the relationships
• Discover attributes for the subject area
• Assign attributes to the appropriate entity
• Name attributes using established naming conventions
• Documenting attributes

• Advanced Relationships

• Modeling many-to-many relationships
• Model multiple relationships between the same two entities
• Model self-referencing relationships
• Model ternary relationships
• Identify redundant relationships
Day 4 – Agenda

• Advanced Relationships

• Modeling many-to-many relationships
• Model multiple relationships between the same two entities
• Model self-referencing relationships
• Model ternary relationships
• Identify redundant relationships
Day 5 – Agenda

• Completing the Logical Data Model

• Use supertypes and subtypes to manage complexity
• Use supertypes and subtypes to represent rules and constraints

• Data Integrity Through Normalization

• Normalize a logical data model
• First normal form, second normal form, third normal form
• Reasons for denormalization
• Transactional vs. business intelligence applications

• Verification and Validation

• Verify the technical accuracy of a logical data model
• Use CASE tools to assist in verification
• Verify the logical data model using other models
  o Data flow diagram
  o CRUD matrix
Day-1
What is Data Modeling?

• Data modelling in analysis is the process of creating a visual representation, or abstraction, of data structures, relationships, and rules within a system or organization. Data modelling also refers to the process of determining and analyzing the data requirements needed to support business activities within the bounds of the related information systems in an organization.

• The main objective of data modelling is to provide a precise and well-organized framework for organizing and representing data, since this enables efficient analysis and decision-making. By building models, analysts can discover trends, understand the connections between various data items, and make sure that data is stored efficiently and accurately.
What is Data Model?

• Data models are visual representations of an enterprise's data elements and the connections between them. Models help define and arrange data in the context of key business processes, thereby facilitating the creation of successful information systems. They let business and technical personnel collaborate on how data will be stored, accessed, shared, updated, and utilized within an organization.
Types of Data Models
• Conceptual Data Model: A conceptual data model is an abstract, high-level representation of business concepts and structures. Conceptual models are most commonly employed when working through high-level concepts and preliminary needs at the start of a new project, typically as preludes to the logical data models that come later. The main purpose of this data model is to organize and define business problems, rules, and concepts. For instance, it helps business people to view data like market data, customer data, and purchase data.

• Logical Data Model: The logical data model expands on the conceptual model by offering a thorough representation of the data at a logical level. It outlines the tables, columns, connections, and constraints that make up the data structure. Although logical data models are not dependent on any particular database management system (DBMS), they are closer to how data would be implemented in a database. The physical design of databases is based on this model.

• Physical Data Model: The physical data model describes the implementation with reference to a particular database system. It outlines every part and service needed to construct a database, and it is expressed with queries and the database language. Every table, column, and constraint (such as primary key, foreign key, NOT NULL, etc.) is represented in the physical data model.
Importance of logical data modeling in
requirements
• Logical data modeling gives businesses a clear, flexible
foundation for building reliable databases. It improves
communication, reduces costly errors, and ensures
data structures align with actual business needs.

• Essentially, a logical data model provides the


foundations necessary for productive database design.
Without a logical data model, designers can only really
figure out a new application's requirements as they go.
When to use logical data models
Logical data models can be used for impact analysis, since every business process and rule is connected within the model. Because objects in the logical data model carry textual definitions in business language, the model makes it easier to maintain and access system documentation.
Relationship between logical and physical data
model
The relationship between logical and physical data
models is that the logical model serves as a blueprint for
the physical model. The logical model defines the
structure and relationships of data in a way that's
independent of any specific database implementation.
The physical model, on the other hand, takes this logical
model and translates it into a concrete structure within a
particular database system, including details like tables,
columns, data types, and indexes.
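The following is a minimal sketch of this translation (the CUSTOMER/ORDER entities, column names, and data types are illustrative, not taken from the course): a logical model stating "CUSTOMER places ORDER" might become this physical DDL.

-- Logical model: entity CUSTOMER (customer id, name) places ORDER (order id, date).
-- Physical model: concrete tables, data types, keys, and an index for a specific DBMS.
CREATE TABLE customer (
    customer_id   INTEGER      NOT NULL,
    customer_name VARCHAR(100) NOT NULL,
    CONSTRAINT pk_customer PRIMARY KEY (customer_id)
);

CREATE TABLE customer_order (
    order_id    INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    order_date  DATE    NOT NULL,
    CONSTRAINT pk_customer_order PRIMARY KEY (order_id),
    CONSTRAINT fk_customer_order_customer
        FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
);

-- A physical-only detail: an index to speed up lookups of a customer's orders.
CREATE INDEX ix_customer_order_customer_id ON customer_order (customer_id);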
Introduction to Logical Data Modeling
Elements of a logical data model
Read a high-level data model
A high-level data model, like a conceptual model, provides a
general overview of how data is organized, focusing on entities,
attributes, and relationships without specific database
implementation details. It's a broad view of the data landscape,
often used for communication and understanding before diving into
the specifics.

Data model prerequisites


Accuracy and completeness ensure data integrity and reliability for business intelligence. A well-designed logical data model supports data warehouses and enables efficient query processing. Flexibility allows the model to adapt to changing business processes and data types.

Data model sources of information

A data model's sources of information can be diverse, ranging from relational databases and flat files to web data, IoT devices, and even data from social media. The specific sources used depend on the purpose and scope of the model.

Developing a logical data model


Developing a logical data model involves creating a structured representation of data requirements, including entities, attributes, and relationships, without considering the specific technology used. It's a crucial step in database design, ensuring a clear understanding of data structure and facilitating communication between developers, business analysts, and database administrators.
Day-2
Importance of well-defined solution scope
Functional decomposition Principle & diagram

A functional decomposition contains the whole function or project along with all of the necessary
sub-tasks needed to complete it. Functional decomposition is a problem-solving tool used in
several contexts, from business and industry to computer programming and AI.
Context-level data flow diagram

A context diagram focuses on how external entities interact with your system. It is the most basic form of a data flow diagram, providing a broad view of the system and external entities in an easily digestible way. Because of its simplicity, it is sometimes called a level 0 data flow diagram.
Sources of requirements

In data modeling, requirements can be gathered from various sources, including business needs,
existing systems, and user requirements. These sources help define the data model's structure,
ensuring it accurately reflects the organization's data landscape and supports its processes.

Here's a look at different requirement sources:

• Business Requirements
• Existing Systems
• User Requirements
• Other Sources
• KPIs
• Reporting needs
Data flow diagrams

A data flow diagram (DFD) maps out the flow of information for any process or system. It uses
defined symbols like rectangles, circles and arrows, plus short text labels, to show data inputs,
outputs, storage points and the routes between each destination.
Use case models

A use-case model is a model of how different types of users interact with the system to solve a
problem. As such, it describes the goals of the users, the interactions between the users and the
system, and the required behavior of the system in satisfying these goals.
Workflow models

Workflow models allow for the standardization of organizational processes. They define a
structured approach to executing tasks, ensuring consistency and adherence to predefined
guidelines. This consistency is crucial for maintaining quality, compliance, and achieving desired
outcomes.
Business rules

Business rules in data modeling are constraints, policies, and logic that define how data behaves and relates
within a database. They ensure data quality, consistency, and alignment with business goals. These rules are
typically derived from a detailed description of the organization's operations. By incorporating business rules,
data models can reflect real-world data environments accurately and lead to better database designs.
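As a small illustration (the table and the rules are invented for this example, not taken from the course), business rules often surface in a physical model as declarative constraints:

-- Rule 1: every order must have a status from an approved list.
-- Rule 2: an order's discount can never exceed 50 percent.
CREATE TABLE sales_order (
    order_id     INTEGER       NOT NULL PRIMARY KEY,
    status       VARCHAR(20)   NOT NULL,
    discount_pct DECIMAL(5, 2) NOT NULL DEFAULT 0,
    CONSTRAINT ck_sales_order_status
        CHECK (status IN ('NEW', 'SHIPPED', 'CANCELLED')),
    CONSTRAINT ck_sales_order_discount
        CHECK (discount_pct BETWEEN 0 AND 50)
);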
State diagrams

State diagrams provide an abstract description of a system's behavior. This behavior is analyzed and represented by a series of events that can occur in one or more possible states. Each diagram usually represents objects of a single class and tracks the different states of its objects through the system.
Class diagrams

The class diagram is the main building block of object-oriented modeling. It is used for general conceptual modeling of the structure of the application, and for detailed modeling, translating the models into programming code.
Types of modeling projects

Transactional business system

A transactional business system, also known as an Online Transaction Processing (OLTP) system, handles the
recording and processing of a company's daily transactions. These systems are designed to manage a high
volume of transactions efficiently and reliably, focusing on data integrity and accuracy. Examples include
systems for online banking, e-commerce, and inventory management.

Business intelligence and data warehousing systems

Business Intelligence (BI) and data warehousing are related but distinct concepts. While data warehousing
focuses on storing and organizing large amounts of data for analysis, BI encompasses the processes and
technologies used to analyze that data and extract actionable insights. Essentially, a data warehouse provides
the foundation for BI by serving as a centralized repository of data, while BI tools and techniques are used to
query, analyze, and visualize that data to drive decision-making.

Integration and consolidation of existing systems

Integrating and consolidating existing systems involves combining different, often disparate, systems to
create a unified whole. This can include merging data from multiple sources into a single repository
(consolidation) or connecting them for real-time data exchange (integration). The goal is to streamline
operations, improve data accessibility, and enhance overall efficiency.
Maintenance of existing systems

Maintaining existing systems involves ongoing activities to keep them operational and meet evolving user
needs. This includes fixing bugs, enhancing functionality, and adapting to changes in the environment.
Effective maintenance is crucial for extending the life of a system and reducing its long-term costs.

Enterprise analysis

Enterprise analysis is a comprehensive approach to understanding an organization's needs, goals, and


direction to identify opportunities for strategic initiatives and project investment. It involves assessing the
current state, defining future goals, and mapping out the path to achieve them, considering the entire
organization's value chain. This process helps prioritize projects, manage change, and maximize return on
investment, ensuring initiatives align with the overall business strategy.

Commercial off-the-shelf application

A commercial off-the-shelf (COTS) application is a pre-packaged software program that's available for
purchase and use without extensive customization. COTS applications are also known as off-the-shelf
software.
Day-3
Conceptual Data Modeling
Conceptual data modeling is a high-level representation of an organization's data needs, focusing on the core
business concepts and their relationships, rather than specific technical details. It serves as a blueprint for
developing more detailed logical and physical data models, ensuring that the database structure aligns with
the business requirements. Essentially, it's a way to understand and document the essential data elements
and their interactions in a business context.

Discovering entities

Discovering entities in data modeling involves identifying the core objects or concepts for which data is
collected. These entities represent the building blocks of a data model, often corresponding to tables in a
relational database. The process typically involves analyzing data sources and business requirements, and can
be facilitated by tools like Entity-Relationship Diagrams (ERDs).

Defining entities

In data modeling, defining entities involves identifying the core objects or concepts about which data will be
stored and managed within a database or system. These entities represent real-world objects, people, places,
concepts, or events that are of interest to the application or domain.
Documenting an entity

Documenting an entity in data modeling involves capturing details about an object of interest, its attributes,
and relationships with other entities. This documentation is crucial for understanding the data structure and
ensures consistency and accuracy.
Identifying attributes

In data modeling, identifying attributes involves recognizing and defining the characteristics or properties of
entities, which are the core objects or concepts being modeled. These attributes describe the entity and are
the most fundamental building blocks of a data model. They represent the data that is stored for each entity.

Distinguishing between entities and attributes

In data modeling, an entity represents a real-world object or concept, like a "Student" or a "Course."
Attributes, on the other hand, are characteristics or properties that describe an entity, such as a "Student's
Name" or a "Course's Credit Hours". Entities are the fundamental building blocks of a data model, and
attributes provide details about those entities.
Model fundamental relationships

Modeling fundamental relationships involves representing the connections between entities in a structured way. These relationship models help analysts understand and refine how instances of one entity interact with instances of another, and they form the backbone of the data model.

Cardinality of relationships

Cardinality, in the context of relationships in databases and data modeling, describes the numerical relationship
between entities or tables. It essentially defines how many instances of one entity can be related to instances of
another entity.

There are three main types of cardinality:

• One-to-one (1:1)
• One-to-many (1:N)
• Many-to-many (N:M)
Types of Cardinality:
One-to-One (1:1):
Each instance of one entity is related to exactly one instance of another entity. For example, a person might have
exactly one passport.
One-to-Many (1:N):
Each instance of one entity can be related to multiple instances of another entity. For instance, one
customer can place many orders.

Many-to-Many (N:M):
Each instance of one entity can be related to multiple instances of another entity, and vice versa.
For example, many students can enroll in many courses.
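A minimal SQL sketch of how these three cardinalities typically land in a physical schema (all names are illustrative): a UNIQUE foreign key gives 1:1, a plain foreign key gives 1:N, and a junction table gives N:M.

-- One-to-one: each person has at most one passport (UNIQUE foreign key).
CREATE TABLE person   (person_id INTEGER PRIMARY KEY);
CREATE TABLE passport (
    passport_id INTEGER PRIMARY KEY,
    person_id   INTEGER NOT NULL UNIQUE REFERENCES person (person_id)
);

-- One-to-many: one customer places many orders (plain foreign key).
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
);

-- Many-to-many: students enroll in courses (junction table, composite key).
CREATE TABLE student (student_id INTEGER PRIMARY KEY);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    course_id  INTEGER NOT NULL REFERENCES course (course_id),
    PRIMARY KEY (student_id, course_id)
);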
Is the relationship mandatory or optional?

In the context of relationships between entities in data modeling, a relationship can be either mandatory or optional.
A mandatory relationship requires that an entity instance must participate in the relationship with another entity. In
contrast, an optional relationship allows an entity instance to participate in the relationship with another entity, but
this participation is not compulsory.
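In a physical schema this usually comes down to the nullability of the foreign key column, as in this hedged sketch (table and column names invented):

CREATE TABLE department    (department_id INTEGER PRIMARY KEY);
CREATE TABLE parking_space (space_id      INTEGER PRIMARY KEY);

CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    -- Mandatory relationship: every employee must belong to a department.
    department_id INTEGER NOT NULL REFERENCES department (department_id),
    -- Optional relationship: an employee may or may not have a parking space.
    space_id      INTEGER NULL REFERENCES parking_space (space_id)
);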

Naming the relationship

In a data model, relationships between entities should be named clearly and consistently to ensure understanding
and facilitate data manipulation. A good naming convention uses verbs to describe the relationship between two
entities. For example, "CUSTOMER places ORDER" describes the relationship between a customer and an order.
Active verbs are generally preferred, and inverse names (e.g., "ORDER placed by CUSTOMER") can be helpful for
readability.

Discover attributes for the subject area

In data modeling, subject areas are groups of related entities and their attributes that represent specific business
functions or domains within an organization. To discover attributes for a subject area, you need to identify the key
characteristics and properties of the entities within that area. This involves understanding the business needs,
processes, and data requirements related to the subject area.
Assign attributes to the appropriate entity

To assign attributes to an entity, you generally use tools or methods specific to the data modeling or database
system being used. This typically involves selecting the entity, then adding attributes and specifying their data types
and properties.

General Steps:
Identify the Entity: Determine the specific entity (e.g., a table, class, or object) you want to modify.
Open the Entity Editor or Tool: Use the appropriate tool (e.g., Attribute Editor, Data Modeler, or database
management system) to access the entity's attributes.
Add Attributes: Use the tool's interface to add new attributes to the entity.
Define Attribute Properties: Specify the data type (e.g., string, integer, date), name, and other relevant properties for
each attribute.
Save Changes: Confirm the changes to the entity's attributes.

Specific Tools/Methods:

Erwin/Power Designer/Any Data Modeling Tool


Name attributes using established naming conventions

Look at the database model below. I went a bit overboard and removed as many traces of a naming convention as I
could. This proves my first point: a naming convention is an important part of a well-built data model. The model is
not very fun to look at, to try to understand, or to code around.

Planning A Naming Convention

Decide what you will name, covering:

Tables
Views
Columns
Keys – including the primary key, alternate keys, and foreign keys
Schemas

The case of the name. You can choose between:


UPPERCASE names
lowercase names
camelCase names – the name starts with a lowercase letter, but new words start with an uppercase letter
PascalCaseNames (also known as upper camel) – similar to camelCase, but the name starts with an uppercase letter,
as do all additional words
How to separate words in names:
you can separate them by case (starting each new word with an uppercase letter)
you can separate them with an underscore (like_this)
you can separate them with spaces, though that is very uncommon
For primary key (PK) columns: Are artificial PK columns called id, ID, table_name_id? Do you use artificial PK columns
at all?
For foreign key columns: Are they called book_id, bookID, etc?

Common patterns for naming constraints, indexes, and triggers include:

PK_TableName for primary key constraints
FK_TableName_ReferencedTableName[_n] for foreign key constraints
UQ_TableName_ColumnName[_ColumnName2...] for unique constraints
CK_TableName_ColumnName (or CK_TableName_n) for check constraints
IX_TableName_ColumnName for indexes
Table_Name_BIS_TRG for triggers
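As a short sketch of such a convention in use (PascalCase names plus the constraint prefixes above; the tables themselves are invented):

CREATE TABLE Author (
    AuthorId   INTEGER      NOT NULL,
    AuthorName VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Author PRIMARY KEY (AuthorId)
);

CREATE TABLE Book (
    BookId   INTEGER       NOT NULL,
    AuthorId INTEGER       NOT NULL,
    Isbn     VARCHAR(20)   NOT NULL,
    Price    DECIMAL(8, 2) NOT NULL,
    CONSTRAINT PK_Book PRIMARY KEY (BookId),
    CONSTRAINT FK_Book_Author FOREIGN KEY (AuthorId) REFERENCES Author (AuthorId),
    CONSTRAINT UQ_Book_Isbn UNIQUE (Isbn),
    CONSTRAINT CK_Book_Price CHECK (Price >= 0)
);

CREATE INDEX IX_Book_AuthorId ON Book (AuthorId);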
Day-4
Advanced Relationships

Model self-referencing relationships

Self-referencing relationships, also known as recursive relationships, occur when a record in a table references
another record within the same table. This is common when modeling hierarchical or self-associative structures, such
as an employee reporting to another employee, or a category having sub-categories.
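A minimal sketch of the employee-manager case (column names assumed): the hierarchy is a nullable foreign key pointing back at the same table.

CREATE TABLE employee (
    employee_id INTEGER      PRIMARY KEY,
    full_name   VARCHAR(100) NOT NULL,
    -- Self-referencing FK: NULL marks the top of the hierarchy (e.g., the CEO).
    manager_id  INTEGER      NULL REFERENCES employee (employee_id)
);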
Model ternary relationships

An association between 3 entities is called a ternary association. A typical example is an association between an
employee, the project they are working on, and their role in that project. If the role is a complex object, you might
decide to model this as 3 entity classes.
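If the role is a simple lookup rather than a complex object, one common physical sketch (all names invented) is a single associative table whose key combines all three participants:

CREATE TABLE employee (employee_id INTEGER PRIMARY KEY);
CREATE TABLE project  (project_id  INTEGER PRIMARY KEY);
CREATE TABLE role     (role_id     INTEGER PRIMARY KEY);

-- The ternary relationship: which employee plays which role on which project.
CREATE TABLE project_assignment (
    employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
    project_id  INTEGER NOT NULL REFERENCES project (project_id),
    role_id     INTEGER NOT NULL REFERENCES role (role_id),
    PRIMARY KEY (employee_id, project_id, role_id)
);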

Identify redundant relationships


Redundant relationships in data models, particularly in databases, occur when multiple
relationships represent the same concept or information, leading to unnecessary complexity and
potential data inconsistencies. To identify them, you can analyze your data model or ER diagram,
looking for duplicated entries or relationships that could be derived from others.
Advanced Data Modeling Techniques
 Dimension Table
 Fact Table
 Slowly Changing Dimension
 Type 0, Type 1, Type 2, Type 3
 Types of Dimension
 Conformed Dimension, Junk Dimension, Role Playing Dimension
 Additive Fact, Semi Additive Fact, Non Additive Fact

All of the above topics will be covered practically during the workshop.
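As a hedged preview of one of these techniques (all names and values invented): a Type 2 slowly changing dimension keeps history by closing the current row and inserting a new version, rather than updating in place.

-- Type 2 SCD: a customer's change of city creates a new row instead of an update.
CREATE TABLE dim_customer (
    customer_key  INTEGER      PRIMARY KEY,   -- surrogate key, one per version
    customer_id   INTEGER      NOT NULL,      -- natural/business key
    customer_name VARCHAR(100) NOT NULL,
    city          VARCHAR(50)  NOT NULL,
    valid_from    DATE         NOT NULL,
    valid_to      DATE         NULL,          -- NULL while the row is current
    is_current    CHAR(1)      NOT NULL
);

INSERT INTO dim_customer
VALUES (1, 101, 'A. Sharma', 'Pune', DATE '2020-01-01', NULL, 'Y');

-- The customer moves to Mumbai: close the old version, insert the new one.
UPDATE dim_customer
SET valid_to = DATE '2024-06-30', is_current = 'N'
WHERE customer_id = 101 AND is_current = 'Y';

INSERT INTO dim_customer
VALUES (2, 101, 'A. Sharma', 'Mumbai', DATE '2024-07-01', NULL, 'Y');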


Day-5
Completing the Logical Data Model

Supertypes and Subtypes

In data modeling, supertypes and subtypes are used to represent hierarchical relationships between entities, facilitating the organization of data by capturing both commonalities and differences among related entities. A supertype is a generalized entity, and subtypes are specialized entities that inherit attributes from the supertype while also having their own unique attributes.

Supertypes:
A supertype is a general entity that encompasses a broader category or concept. It serves as a parent entity, and its
attributes are shared by all of its subtypes. For example, in a vehicle database, "Vehicle" could be a supertype, as it
encompasses various types of vehicles like cars, trucks, and motorcycles.
Subtypes:

Subtypes are specialized versions of the supertype. They inherit the attributes of the supertype but also have
additional, specific attributes unique to themselves. For instance, "Car", "Truck," and "Motorcycle" could be subtypes
of the "Vehicle" supertype, each with its own attributes specific to its type (e.g., "Car" might have attributes like
"numberOfDoors", while "Truck" might have "payloadCapacity"
Constraints in supertype subtype relationship

Constraints on generalization define the rules governing the relationship between supertypes and subtypes. They
specify which entities can belong to specific subtypes, whether an entity can belong to multiple subtypes, and
whether a supertype entity must belong to at least one subtype

When you write constraint statements on a subtype table, you can refer to all of the following without joining to another table:

All supertype columns.


All columns of the subtype table.
All subtype indicators of the constellation, since these are attributes of the supertype.
In the following example you only need a single table name in your statement.

Example 1

Suppose Persons can be Guides or Office Staff (subtype set Occupation), and they can be Male or Female (subtype
set Gender).

Maternity leave is not possible for guides, only for office staff. This business rule can be enforced by a restrictive
constraint with the following statement:
-- Restrictive constraint: if this query returns any row, the business rule is violated.
SELECT ' '
FROM female
WHERE maternity_leave = 'Y'
AND guide = 'Y'
In this example, Female is a subtype table, Maternity_leave is a column of this subtype table, and Guide is a subtype
indicator of the constellation.

In the following example, you need to join tables only because the business rule refers to subtype columns in
different subtype tables:

Example 2

Office staff can only get maternity leave if a number of conditions are met. These conditions refer to office staff
attributes such as the person's hire date.

This business rule can be enforced by a constraint with the following statement:

-- Join the two subtype tables on the shared primary key, then test the rule.
SELECT ' '
FROM female f
   , staff s
WHERE f.primary_key = s.primary_key
AND f.maternity_leave = 'Y'
AND s.hire_date = condition
AND ...
Normalization

Database normalization is a process of organizing data in a relational database to minimize data


redundancy and improve data integrity. It involves breaking down large tables into smaller, more
manageable tables with relationships between them. This process reduces the likelihood of data
anomalies, such as insertion, deletion, and update anomalies, by ensuring that data is stored in a
consistent and efficient manner.

Key Concepts:
Redundancy: Storing the same information multiple times in a database.
Data Integrity: Ensuring the accuracy and consistency of data within the database.
Anomalies: Issues that can arise from inconsistencies or redundancies in data, like inserting, deleting,
or updating data that affects other parts of the database.
Normal Forms (1NF, 2NF, 3NF, etc.): A set of rules or levels that define how well data is structured and
normalized, with each level building upon the previous one.
Functional Dependencies: Relationships between attributes in a table, where one attribute (or set of
attributes) determines the value of another.
1. First Normal Form (1NF)

For a table to be in the First Normal Form, it should follow the following 4 rules:

It should only have single (atomic) valued attributes/columns.


Values stored in a column should be of the same domain.
All the columns in a table should have unique names.
And the order in which data is stored should not matter.
If we have an Employee table in which we store employee information along with the employee skillset (say, columns emp_id, emp_name, emp_mobile, and emp_skills), then:

All the columns have different names.

All the columns hold values of the same type: emp_name has all the names, emp_mobile has all the contact numbers, etc.

The order in which we save data doesn't matter.

But the emp_skills column holds multiple comma-separated values, while as per the First Normal Form, each column should have a single value.

Hence the table fails to pass the First Normal Form.
So how do you fix the above table?

You can simply add multiple rows, one per skill. This will lead to repetition of the data, but that can be handled as you further normalize your data using the Second Normal Form and the Third Normal Form.
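A hedged sketch of that fix (emp_id and the column types are assumptions, since the original table is not shown here): one row per skill, with the key widened to cover the skill.

-- Before 1NF: emp_skills held comma-separated values such as 'SQL,Python'.
-- After 1NF: one row per skill; every column holds a single, atomic value.
CREATE TABLE employee (
    emp_id     INTEGER     NOT NULL,
    emp_name   VARCHAR(50) NOT NULL,
    emp_mobile VARCHAR(15) NOT NULL,
    emp_skill  VARCHAR(30) NOT NULL,
    PRIMARY KEY (emp_id, emp_skill)
);

INSERT INTO employee VALUES (1, 'Asha', '9876543210', 'SQL');
INSERT INTO employee VALUES (1, 'Asha', '9876543210', 'Python');  -- repeated name/mobile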
Second Normal Form (2NF)
For a table to be in the Second Normal Form,

It should be in the First Normal form.

And, it should not have Partial Dependency.

What is Partial Dependency?

When a table has a primary key that is made up of two or more columns, all the columns not included in the primary key should depend on the entire primary key and not on just a part of it. If any non-key column depends on only part of the primary key, we have a partial dependency in the table.

Suppose we have two tables, Students and Subjects, to store student information and subject information, and another table, Score, to store the marks scored by students in each subject.

In the Score table, the primary key is student_id + subject_id, because both values are required to identify a row of data.

But the Score table also has a column teacher_name, which depends only on the subject (just subject_id), so we should not keep that information in the Score table.

The column teacher_name should be in the Subjects table. Then the entire system will be normalized as per the Second Normal Form.
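A minimal sketch of the 2NF fix (column types assumed): teacher_name lives in Subjects, and every non-key column in Score now depends on the whole composite key.

CREATE TABLE students (student_id INTEGER PRIMARY KEY);

CREATE TABLE subjects (
    subject_id   INTEGER     PRIMARY KEY,
    subject_name VARCHAR(50) NOT NULL,
    teacher_name VARCHAR(50) NOT NULL  -- depends only on the subject, so it lives here
);

CREATE TABLE score (
    student_id INTEGER NOT NULL REFERENCES students (student_id),
    subject_id INTEGER NOT NULL REFERENCES subjects (subject_id),
    marks      INTEGER NOT NULL,      -- depends on the WHOLE key (student + subject)
    PRIMARY KEY (student_id, subject_id)
);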
Third Normal Form (3NF)

A table is said to be in the Third Normal Form when,

It satisfies the First Normal Form and the Second Normal form.

And, it doesn't have Transitive Dependency.

What is Transitive Dependency?

In a table, one column (or set of columns) acts as the primary key, and the other columns depend on it. But what if a column that is not part of the primary key depends on another column that is also not part of the primary key? Then we have a transitive dependency in our table.

Let's take an example. We had the Score table in the Second Normal Form above. Suppose we have to store some extra information in it:

exam_type

total_marks

These store the type of exam and the total marks for that exam, so that we can later calculate the percentage of marks scored by each student. The Score table will then have these two extra columns.

In the table above, the column exam_type depends on both student_id and subject_id, because,

a student can be in the CSE branch or the Mechanical branch,

and based on that they may have different exam types for different subjects.

The CSE students may have both Practical and Theory for Compiler Design,

whereas Mechanical branch students may only have Theory exams for Compiler Design.

But the column total_marks depends only on the exam_type column, and the exam_type column is not a part of the primary key (the primary key is student_id + subject_id). Hence we have a transitive dependency here.
We can create a separate table for ExamType and use it in the Score table.

We have created a new table, ExamType, and added more related information to it, like duration (duration of the exam in minutes), and now we can use exam_type_id in the Score table.
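A hedged sketch of that 3NF fix (column types assumed): total_marks and duration move into the new ExamType table, so no non-key column in Score depends on another non-key column.

CREATE TABLE exam_type (
    exam_type_id INTEGER     PRIMARY KEY,
    exam_name    VARCHAR(30) NOT NULL,  -- e.g., 'Theory', 'Practical'
    total_marks  INTEGER     NOT NULL,  -- depends only on the exam type
    duration     INTEGER     NOT NULL   -- duration of the exam in minutes
);

CREATE TABLE score (
    student_id   INTEGER NOT NULL,
    subject_id   INTEGER NOT NULL,
    exam_type_id INTEGER NOT NULL REFERENCES exam_type (exam_type_id),
    marks        INTEGER NOT NULL,
    PRIMARY KEY (student_id, subject_id)
);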
Denormalization

Denormalization is the process of adding precomputed redundant data to an otherwise normalized


relational database to improve read performance. With denormalization, the database administrator
selectively adds back specific instances of redundant data after the data structure has been
normalized. A denormalized database should not be confused with a database that has never been
normalized.

Normalization vs. denormalization


Denormalization helps to address a fundamental issue in databases: slow read and join operations.

In a fully normalized database, each piece of data is stored only once, generally in separate tables,
with a relation to one another. To become usable, the information must be queried and read out from
the individual tables, and then joined together to provide the query response. If this process involves
large amounts of data or needs to be done many times a second, it can quickly overwhelm the
database hardware, reduce its performance, and even cause it to crash.
Denormalization pros and cons
Denormalization on databases has both pros and cons:

Pros
• Faster reads for denormalized data.
• Simpler queries for application developers.
• Less compute on read operations.

Cons
• Slower write operations.
• Increases database complexity.
• Potential for data inconsistency.
• Additional storage required for redundant tables.
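A small sketch of the trade-off (schema invented): a precomputed order_total column avoids re-joining and re-summing the line items on every read, at the cost of keeping the copy in sync on writes.

CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
CREATE TABLE order_line (
    order_id   INTEGER        NOT NULL REFERENCES orders (order_id),
    line_no    INTEGER        NOT NULL,
    quantity   INTEGER        NOT NULL,
    unit_price DECIMAL(10, 2) NOT NULL,
    PRIMARY KEY (order_id, line_no)
);

-- Normalized read: the total must be computed on every query.
SELECT o.order_id, SUM(li.quantity * li.unit_price) AS order_total
FROM orders o
JOIN order_line li ON li.order_id = o.order_id
GROUP BY o.order_id;

-- Denormalized: store a redundant, precomputed total for fast reads.
ALTER TABLE orders ADD COLUMN order_total DECIMAL(12, 2);

UPDATE orders
SET order_total = (SELECT SUM(li.quantity * li.unit_price)
                   FROM order_line li
                   WHERE li.order_id = orders.order_id);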
CRUD matrix
A CRUD matrix is a tool used to map out and visualize the relationships between data entities and the operations that can be performed on them (Create, Read, Update, Delete). It helps in understanding and defining user permissions and data access within a system, especially in the context of business processes and software development.

Purpose:

The CRUD matrix helps determine which data entities are affected by various business activities
and how those entities are manipulated (CRUD operations).
Structure:
It's typically represented as a table or matrix where columns represent the CRUD operations
(Create, Read, Update, Delete) and rows represent data entities or use cases.

Usage:
It's a valuable tool for business analysts, software developers, and database administrators to:
Map out data operations and their relationships.
Define user permissions and access control.
Identify potential data conflicts or inconsistencies.
Document the system's data flow and interactions.
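A small illustrative example (the entities and activities are invented), with data entities as rows and the CRUD operations as columns:

Entity     | Create         | Read          | Update        | Delete
-----------+----------------+---------------+---------------+---------------
Customer   | Register       | Place order   | Edit profile  | Close account
Order      | Place order    | Track order   | Amend order   | Cancel order
Product    | Add to catalog | Browse, order | Reprice       | Retire product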
Identifying & Non Identifying Relationsip

Identifying Relationship:
Dependency: The child entity cannot exist without the parent entity.
Primary Key: The child's primary key includes the parent's primary key.
Example: A book cannot exist without an author, so a "Book" entity's primary key would include the
"Author" entity's primary key.

Non-Identifying Relationship:
Independence: The child entity can exist independently of the parent entity.
Primary Key: The child entity has its own primary key, and the parent's primary key is included as a
foreign key in the child's table, but not as part of its primary key.
Example: A city can exist independently of a country, so a "City" entity can have its own primary key,
with the "Country" entity's primary key included as a foreign key.
Workshop
 Demonstrate the Data Modeling Tools Erwin/Power Designer
 Translate the OLTP Model to OLAP

Q&A Session
