DATABASE ADMINISTRATION
Module 1: Introduction to Database Systems
This module provides a fundamental understanding of database
systems, their importance, and how they differ from other data
storage methods. It also covers the different types of databases
and the role of database administrators (DBAs) in managing
these systems.
Overview of Databases
- What is a Database?
- A database is an organized collection of data, stored and
accessed electronically. It enables efficient data management
and retrieval.
- Examples: Customer information systems, financial records,
product inventories, etc.
- Why Databases Matter
- They offer structured data storage, faster access, and easy
manipulation of large volumes of information.
- Databases support business processes, decision-making, and
data-driven applications.
- Types of Data Stored in Databases
- Structured (tables, rows, and columns) vs. Unstructured data
(images, documents, logs).
- Relational vs. Non-relational (NoSQL) databases.
Database Management System
- What is a DBMS?
- A DBMS is software that allows users to create, retrieve,
update, and manage databases efficiently. It ensures data
integrity, security, and performance.
- Key Features of a DBMS:
- Data Storage: Allows data to be stored in structured formats (e.g., tables).
- Data Retrieval: Supports SQL and other query languages for data extraction.
- Data Manipulation: Helps in adding, updating, or deleting data.
- Data Security: Ensures only authorized users can access or modify data.
- Concurrency Control: Manages multiple users accessing the database
simultaneously.
- Popular DBMS Platforms:
- Relational DBMS (RDBMS): MySQL, Oracle, SQL Server,
PostgreSQL.
- NoSQL DBMS: MongoDB, Cassandra, Couchbase (used for
unstructured or semi-structured data).
Types of Databases
- Relational Databases (RDBMS):
- Organize data in tables (rows and columns).
- Use SQL (Structured Query Language) to manage data.
- Ensure data integrity and support relationships between tables
using primary and foreign keys.
- Examples: MySQL, Oracle, PostgreSQL.
- Non-relational Databases (NoSQL):
- Handle unstructured, distributed, or large-scale data.
- Data is stored in formats such as documents, key-value pairs,
graphs, or columns.
- More flexible schema designs than RDBMS, often used in big
data applications.
- Examples: MongoDB, Cassandra, Redis.
Database vs. Spreadsheets and Flat Files
- Spreadsheets (e.g., Excel):
- Good for small-scale, personal data management.
- Limited capabilities for querying, data integrity, and handling
large datasets.
- Flat Files (e.g., CSV, text files):
- Store data in a simple, line-by-line format.
- Lack indexing and relationships between data, which makes
them inefficient as the dataset grows.
- Advantages of Databases over Spreadsheets/Flat Files:
- Better for large data volumes, multi-user access, and complex
queries.
- Offer high scalability, security, and structured data
relationships.
The Role of a Database Administrator (DBA)
- Who is a DBA?
- A Database Administrator is responsible for the installation,
configuration, maintenance, and security of a database.
What are the key responsibilities of a database
administrator?
- Key Responsibilities:
- Database Design: Defining the structure and relationships
within the database.
- Data Security: Managing access permissions and ensuring
data is protected from unauthorized access.
- Backup and Recovery: Implementing backups and ensuring
data can be restored in case of loss or corruption.
- Performance Tuning: Optimizing the database for efficient data
retrieval and transaction processing.
- Monitoring and Maintenance: Regularly checking the health of
the database to ensure smooth operations.
- DBA Skills and Tools:
- Proficiency in SQL and database management tools.
- Knowledge of operating systems (Linux, Windows), network
configuration, and storage management.
- Problem-solving and performance optimization skills.
Database Use Cases and Applications
- Business and E-Commerce:
- Databases power online transaction processing (OLTP) systems for
retail, inventory, and customer relationship management (CRM).
- Healthcare:
- Store and manage patient records, medical histories, and treatment
data in structured formats.
- Finance:
- Manage banking systems, financial transactions, and auditing
processes with secure databases.
- Education:
- Universities and institutions store student records, course
enrollments, and academic performance data.
- Cloud Databases:
- Increasing trend toward cloud-based databases (AWS RDS,
Google Cloud SQL) for scalability and cost-effectiveness.
- Big Data and Analytics:
- Growth of NoSQL databases for handling big data, machine
learning, and real-time analytics.
- Automation and AI in DBMS:
- Automated database tuning and AI-driven query optimizations
to improve performance and reduce manual intervention.
Module 2: Database Design and Modeling
This module focuses on how to design and model databases to
ensure data integrity, efficiency, and scalability. It covers
important concepts such as Entity-Relationship (ER) modeling,
normalization, and database schema design.
Introduction to Database Design
- What is Database Design?
- The process of structuring and organizing data into a logical
and efficient format that meets the needs of users and
applications.
- Good database design minimizes redundancy, maintains data
integrity, and improves performance.
- Importance of Database Design
- Reduces data anomalies and improves query performance.
- Ensures scalability, security, and maintainability of the
database system.
- Steps in Database Design:
1. Requirement Gathering: Identify what data needs to be stored and how it will be
accessed.
2. Conceptual Design: Define the entities and relationships without considering
technical constraints.
3. Logical Design: Translate the conceptual model into a detailed database schema,
focusing on data organization and normalization.
4. Physical Design: Implement the schema in a DBMS, optimizing for performance,
storage, and indexing.
Entity-Relationship (ER) Modeling
- What is ER Modeling?
- ER modeling is a graphical representation of the entities
(objects) in a system and their relationships to each other.
- It provides a high-level view of the database structure,
focusing on what data needs to be stored rather than how it is
stored.
- Key Components of an ER Diagram:
- Entity: Represents a real-world object or concept (e.g.,
Customer, Product, Employee).
- Attribute: Characteristics or properties of an entity (e.g., name,
address, ID).
- Relationship: Describes how entities are related to each other
(e.g., a customer places an order).
- Types of Relationships:
- One-to-One (1:1): Each entity in one table is related to only one entity in
another table (e.g., one person has one passport).
- One-to-Many (1:N): One entity in one table can relate to multiple entities
in another (e.g., one customer can place multiple orders).
- Many-to-Many (M:N): Each entity in one table can relate to multiple
entities in the other, and vice versa (e.g., students enroll in many courses,
and each course has many students).
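A many-to-many relationship is usually implemented with a third, linking table. The following is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names (students, courses, enrollments) and the sample rows are assumptions chosen to match the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Linking table: one row per (student, course) pair.
    -- The two columns together form a composite primary key.
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(student_id),
        course_id  INTEGER NOT NULL REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO courses  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_for_ana = [row[0] for row in conn.execute("""
    SELECT c.title
    FROM courses c
    JOIN enrollments e ON e.course_id = c.course_id
    WHERE e.student_id = 1
    ORDER BY c.title
""")]

# One course, many students.
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.name
    FROM students s
    JOIN enrollments e ON e.student_id = s.student_id
    WHERE e.course_id = 10
    ORDER BY s.name
""")]

print(courses_for_ana)
print(students_in_databases)
```

The same linking table answers both questions, which is why M:N relationships are stored this way rather than as a list of IDs inside one column.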
- ER Diagram Notations:
- Rectangles represent entities.
- Ellipses represent attributes.
- Diamonds represent relationships.
- Lines connect entities to relationships and attributes to
entities.
Identifying Keys in Database Design
- Primary Key (PK):
- A unique identifier for each record in a table.
- Every table must have a primary key to uniquely identify its
rows (e.g., Employee ID, Order Number).
- Foreign Key (FK):
- A field (or collection of fields) in one table that refers to the primary
key of another table.
- Used to establish relationships between tables and maintain
referential integrity.
- Composite Key:
- A combination of two or more fields used together as a primary key.
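How primary and foreign keys protect data can be seen in a small experiment. This is a sketch only, using Python's built-in sqlite3 module. The departments and employees tables, the sample rows, and the `try_insert` helper are hypothetical names introduced for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on FK enforcement
conn.executescript("""
    CREATE TABLE departments (
        department_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    );
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES departments(department_id)
    );
    INSERT INTO departments VALUES (1, 'Sales');
    INSERT INTO employees   VALUES (100, 'Ana', 1);
""")


def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same employee_id as an existing row: violates the primary key.
duplicate_pk = try_insert("INSERT INTO employees VALUES (100, 'Ben', 1)")
# department_id 99 does not exist: violates the foreign key.
missing_parent = try_insert("INSERT INTO employees VALUES (101, 'Ben', 99)")
# Unique ID and an existing department: accepted.
valid_row = try_insert("INSERT INTO employees VALUES (101, 'Ben', 1)")

print(duplicate_pk)
print(missing_parent)
print(valid_row)
```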
Normalization and Denormalization
- What is Normalization?
- The process of organizing data in a database to reduce
redundancy and improve data integrity.
- Divides large tables into smaller, related tables and links them
using relationships (foreign keys).
- Normalization Forms:
- First Normal Form (1NF): Eliminate repeating groups by ensuring
each column holds atomic values (one value per field).
- Second Normal Form (2NF): Eliminate partial dependencies by
ensuring all non-key attributes depend on the entire primary key.
- Third Normal Form (3NF): Eliminate transitive dependencies by
ensuring non-key attributes depend only on the primary key, not on
other non-key attributes.
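The effect of removing a transitive dependency can be illustrated with a small before-and-after comparison. The sketch below uses Python's built-in sqlite3 module. The tables (orders_flat, customers, orders), their columns, and the sample data are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: customer_city depends on customer_id, not on the key order_id
    -- (a transitive dependency), so the city is repeated on every order.
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_id   INTEGER,
        customer_city TEXT,
        amount        REAL
    );
    INSERT INTO orders_flat VALUES
        (1, 7, 'Lagos', 50.0),
        (2, 7, 'Lagos', 20.0),
        (3, 8, 'Accra', 75.0);

    -- After (3NF): each fact about a customer is stored exactly once.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        city        TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
    INSERT INTO customers SELECT DISTINCT customer_id, customer_city FROM orders_flat;
    INSERT INTO orders    SELECT order_id, customer_id, amount FROM orders_flat;
""")

# Changing a customer's city touches one row in the normalized design...
cur = conn.execute("UPDATE customers SET city = 'Abuja' WHERE customer_id = 7")
rows_changed_normalized = cur.rowcount

# ...but one row per order in the flat design, which invites inconsistencies.
cur = conn.execute("UPDATE orders_flat SET customer_city = 'Abuja' WHERE customer_id = 7")
rows_changed_flat = cur.rowcount

print(rows_changed_normalized, rows_changed_flat)
```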
- Advantages of Normalization:
- Reduces data duplication.
- Ensures data consistency and integrity.
- Makes maintenance easier by separating related data into
different tables.
- What is Denormalization?
- The process of combining tables to improve read performance by reducing
the number of joins in queries.
- Denormalization is often used in data warehouses and reporting systems for
performance optimization.
- Advantages of Denormalization:
- Improves query performance, especially in read-heavy applications.
- Reduces the complexity of queries by minimizing joins.
Logical vs. Physical Database Design
- Logical Database Design:
- Focuses on the high-level structure of the database without considering the
physical aspects of how data will be stored.
- Includes the creation of an ER diagram, defining entities, attributes, and
relationships, and ensuring normalization.
- Physical Database Design:
- Involves implementing the logical design in a specific DBMS.
- Decisions about indexing, storage formats, data types, partitioning, and file
organization are made at this stage.
- Optimization Considerations for Physical Design:
- Indexes: Speed up query performance by allowing faster access
to data.
- Data Types: Choose appropriate data types to minimize storage
and improve performance (e.g., integers vs. text).
- Partitioning: Divides large tables into smaller, more manageable
pieces for performance and scalability.
Database Schema
- What is a Database Schema?
- A schema is the blueprint or architecture of a database, defining how data is
organized and how relationships are enforced.
- Types of Schemas:
- Logical Schema: Describes the logical design of the database, focusing on entities,
attributes, and relationships.
- Physical Schema: Describes how data will be physically stored on disk (tables,
indexes, partitions).
- Schema Design Best Practices:
- Ensure that tables are normalized to avoid data redundancy.
- Choose appropriate data types and constraints to enforce data
integrity.
- Use foreign keys to maintain relationships and referential
integrity.
ER to Relational Schema Mapping
- Steps for Mapping ER Model to Relational Schema:
1. Mapping Entities to Tables: Convert each entity in the ER diagram to a table.
2. Mapping Relationships to Foreign Keys: Convert relationships to foreign keys,
ensuring the correct enforcement of referential integrity.
3. Mapping Attributes to Columns: Each attribute of an entity becomes a column
in the corresponding table.
4. Mapping Keys: Define primary and foreign keys to ensure unique identification
and maintain relationships between tables.
- Example:
- ER Model: A "Customer" entity is related to an "Order" entity
via a one-to-many relationship (a customer can place multiple
orders).
- Relational Schema: Create a "Customers" table with a primary
key (CustomerID) and an "Orders" table with a foreign key
(CustomerID) to link them.
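The Customer/Order mapping above could look roughly like this in practice. This is a sketch using Python's built-in sqlite3 module. The Customers and Orders tables and the CustomerID column come from the example; the remaining columns (Name, OrderID, Total) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on FK enforcement
conn.executescript("""
    -- Entity "Customer" becomes a table; its identifier becomes the primary key.
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    -- The one-to-many relationship becomes a foreign key on the "many" side.
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
        Total      REAL NOT NULL
    );
    INSERT INTO Customers VALUES (1, 'John Doe'), (2, 'Jane Roe');
    INSERT INTO Orders VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.0);
""")

# Follow the foreign key to count each customer's orders.
orders_per_customer = conn.execute("""
    SELECT c.Name, COUNT(o.OrderID)
    FROM Customers c
    LEFT JOIN Orders o ON o.CustomerID = c.CustomerID
    GROUP BY c.CustomerID, c.Name
    ORDER BY c.CustomerID
""").fetchall()

print(orders_per_customer)
```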
Module 3: Structured Query Language (SQL)
Structured Query Language (SQL) is the standard language for
interacting with relational databases. This module will introduce
you to the basics of SQL, followed by more advanced querying
techniques, and cover essential aspects such as database
manipulation, data retrieval, and database integrity.
Introduction to SQL
- What is SQL?
- SQL is a standardized language used to query, manipulate,
and manage data in relational databases. It allows users to
create, update, delete, and retrieve data stored in databases.
- History and Standardization
- SQL was developed at IBM in the 1970s for its relational
database system and was later adopted as an ANSI standard (1986)
and an ISO standard (1987).
- SQL in Different DBMS:
- SQL syntax can vary slightly between different database
management systems (DBMS) like MySQL, SQL Server, Oracle,
and PostgreSQL, but the core functionality remains similar.
Basic SQL Commands (CRUD Operations)
- Data Manipulation Language (DML)
- DML is used to retrieve, insert, update, and delete data in a database.
1. SELECT – Retrieves data from one or more tables.
sql
SELECT column1, column2 FROM table_name WHERE condition;
- Example:
sql
SELECT first_name, last_name FROM employees WHERE department = 'Sales';
2. INSERT INTO – Adds new records to a table.
sql
INSERT INTO table_name (column1, column2) VALUES (value1, value2);
- Example:
sql
INSERT INTO customers (customer_id, name, address) VALUES (1,
'John Doe', '123 Elm St');
3. UPDATE – Modifies existing data in a table.
sql
UPDATE table_name SET column1 = value1, column2 = value2 WHERE
condition;
- Example:
sql
UPDATE employees SET salary = 60000 WHERE employee_id = 5;
4. DELETE – Removes records from a table.
sql
DELETE FROM table_name WHERE condition;
- Example:
sql
DELETE FROM customers WHERE customer_id = 10;
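The four CRUD statements can be tried end to end from application code. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The customers table follows the examples above; the second customer and the new address are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)

# INSERT: values are passed separately through ? placeholders instead of being
# pasted into the SQL string, which protects against SQL injection.
insert_sql = "INSERT INTO customers (customer_id, name, address) VALUES (?, ?, ?)"
conn.execute(insert_sql, (1, "John Doe", "123 Elm St"))
conn.execute(insert_sql, (10, "Jane Roe", "9 Oak Ave"))

# SELECT: read the rows back.
after_insert = conn.execute(
    "SELECT customer_id, name FROM customers ORDER BY customer_id"
).fetchall()

# UPDATE: change one column of one row.
conn.execute("UPDATE customers SET address = ? WHERE customer_id = ?", ("45 Pine Rd", 1))
updated_address = conn.execute(
    "SELECT address FROM customers WHERE customer_id = 1"
).fetchone()[0]

# DELETE: remove one row. Without a WHERE clause, every row would be removed.
conn.execute("DELETE FROM customers WHERE customer_id = ?", (10,))
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

print(after_insert, updated_address, remaining)
```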
Retrieving Data with SQL (SELECT Statement)
- Selecting Specific Columns:
- You can retrieve specific columns by specifying their names in
the `SELECT` statement.
sql
SELECT column1, column2 FROM table_name;
- Filtering with WHERE Clause:
- The `WHERE` clause is used to filter rows based on specific conditions.
sql
SELECT * FROM table_name WHERE condition;
- Example:
sql
SELECT * FROM products WHERE price > 100;
- Using Logical Operators (AND, OR, NOT):
- Combine multiple conditions using logical operators.
sql
SELECT * FROM employees WHERE department = 'HR' AND
salary > 50000;
- Limiting Results (LIMIT / TOP):
- You can limit the number of rows returned using `LIMIT`
(MySQL/PostgreSQL) or `TOP` (SQL Server).
sql
SELECT * FROM customers LIMIT 10;
- Alias for Column Names:
- The `AS` keyword gives a column a temporary name in the query result.
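Filtering, logical operators, row limits, and column aliases can be combined in a single query. The following sketch uses Python's built-in sqlite3 module. The sample employees, the alias names (name, annual_salary), and the `ORDER BY` clause are assumptions added for the example. `LIMIT` is the SQLite/MySQL/PostgreSQL form; SQL Server would use `TOP` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (first_name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana',  'HR',    62000),
        ('Ben',  'HR',    48000),
        ('Cleo', 'Sales', 70000),
        ('Dev',  'HR',    55000);
""")

cur = conn.execute("""
    SELECT first_name AS name, salary AS annual_salary  -- column aliases
    FROM employees
    WHERE department = 'HR' AND salary > 50000          -- both conditions must hold
    ORDER BY salary DESC                                -- highest salary first
    LIMIT 1                                             -- keep only the first row
""")

# The aliases, not the original column names, appear in the result metadata.
column_names = [col[0] for col in cur.description]
top_hr_earner = cur.fetchall()

print(column_names)
print(top_hr_earner)
```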
SQL Joins
- What are Joins?
- Joins are used to retrieve data from multiple tables by
combining rows based on a related column.
- Types of Joins:
1. INNER JOIN:
- Retrieves records that have matching values in both tables.
sql
SELECT columns FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name;
- Example:
sql
SELECT employees.first_name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id =
departments.department_id;
2. LEFT (OUTER) JOIN:
- Retrieves all records from the left table and the matched records from
the right table. If there is no match, NULL values are returned for columns
from the right table.
sql
SELECT columns FROM table1
LEFT JOIN table2 ON table1.column_name = table2.column_name;
3. RIGHT (OUTER) JOIN:
- Retrieves all records from the right table and the matched
records from the left table.
sql
SELECT columns FROM table1
RIGHT JOIN table2 ON table1.column_name = table2.column_name;
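4. FULL (OUTER) JOIN:
- Retrieves all records from both tables. Rows are matched where
possible; where there is no match, NULL values are returned for the
columns of the other table.
sql
SELECT columns FROM table1
FULL JOIN table2 ON table1.column_name = table2.column_name;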
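The difference between join types is easiest to see when one row has no match. The sketch below uses Python's built-in sqlite3 module and compares `INNER JOIN` with `LEFT JOIN`. The sample rows, including an employee with no department, are assumptions. `RIGHT JOIN` and `FULL JOIN` are left out because SQLite supports them only from version 3.39.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, department_name TEXT);
    CREATE TABLE employees   (first_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'HR'), (2, 'Sales');
    -- Cleo has no department, so she has no matching row in departments.
    INSERT INTO employees VALUES ('Ana', 1), ('Ben', 2), ('Cleo', NULL);
""")

# INNER JOIN keeps only employees that have a matching department.
inner_rows = conn.execute("""
    SELECT e.first_name, d.department_name
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.first_name
""").fetchall()

# LEFT JOIN keeps every employee; missing department columns come back as NULL
# (None in Python).
left_rows = conn.execute("""
    SELECT e.first_name, d.department_name
    FROM employees e
    LEFT JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.first_name
""").fetchall()

print(inner_rows)
print(left_rows)
```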
Aggregate Functions
- What are Aggregate Functions?
- Aggregate functions perform a calculation on a set of values
and return a single value.
- Common Aggregate Functions:
- COUNT() – Returns the number of rows that match a specified condition.
sql
SELECT COUNT(*) FROM employees WHERE department = 'HR';
- SUM() – Adds up the values of a specified column.
sql
SELECT SUM(salary) FROM employees WHERE department = 'HR';
- AVG() – Returns the average value of a numeric column.
sql
SELECT AVG(salary) FROM employees WHERE department = 'HR';
- MIN() and MAX() – Return the minimum and maximum values in a column.
sql
SELECT MIN(salary), MAX(salary) FROM employees;
- GROUP BY Clause:
- The `GROUP BY` clause is used to group rows that have the same values in specified columns and perform aggregate functions on them.
sql
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
- HAVING Clause:
- The `HAVING` clause is used to filter results after grouping (similar to `WHERE` but for aggregated data).
sql
SELECT department, COUNT(*)
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;
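`GROUP BY` and `HAVING` can be checked against a handful of rows. This sketch uses Python's built-in sqlite3 module. The sample salaries are made up, and the `HAVING` threshold is lowered to 1 so the small data set produces a visible result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana',  'HR',    40000),
        ('Ben',  'HR',    60000),
        ('Cleo', 'Sales', 50000),
        ('Dev',  'Sales', 70000),
        ('Eli',  'Sales', 90000),
        ('Fay',  'IT',    80000);
""")

# One output row per department, each with its own average.
avg_by_department = conn.execute("""
    SELECT department, AVG(salary)
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

# HAVING filters the groups after aggregation; WHERE could not refer to COUNT(*).
large_departments = conn.execute("""
    SELECT department, COUNT(*)
    FROM employees
    GROUP BY department
    HAVING COUNT(*) > 1
    ORDER BY department
""").fetchall()

print(avg_by_department)
print(large_departments)
```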
Subqueries and Nested Queries
- What are Subqueries?
- A subquery is a query nested within another SQL query. It can be used in the `SELECT`,
`FROM`, `WHERE`, or `HAVING` clauses.
- Example: Subquery in WHERE clause:
sql
SELECT employee_name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
- Correlated Subqueries:
- A correlated subquery is a subquery that refers to a column from the outer
query.
sql
SELECT e1.first_name, e1.salary
FROM employees e1
WHERE e1.salary > (SELECT AVG(e2.salary) FROM employees e2 WHERE
e2.department_id = e1.department_id);
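A correlated subquery is re-evaluated for each row of the outer query. The sketch below uses Python's built-in sqlite3 module to find employees who earn more than the average of their own department. The column names and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (first_name TEXT, department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana',  1, 60000),   -- department 1 average: 50000
        ('Ben',  1, 40000),
        ('Cleo', 2, 90000),   -- department 2 average: 70000
        ('Dev',  2, 70000),
        ('Eli',  2, 50000);
""")

# The inner query refers to e1.department_id from the outer query, so the
# average is computed for the department of the row currently being tested.
above_department_average = conn.execute("""
    SELECT e1.first_name, e1.salary
    FROM employees e1
    WHERE e1.salary > (
        SELECT AVG(e2.salary)
        FROM employees e2
        WHERE e2.department_id = e1.department_id
    )
    ORDER BY e1.first_name
""").fetchall()

print(above_department_average)
```

Dev earns exactly the department 2 average, so the strict `>` comparison excludes that row.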
Indexes and Constraints
- Indexes:
- Indexes are used to speed up data retrieval. They allow the
database to find rows faster without scanning the entire table.
sql
CREATE INDEX index_name ON table_name (column1,
column2);
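Whether a query actually uses an index can be checked by asking the database for its query plan. The sketch below uses Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN` statement. The table contents, the index name idx_employees_department, and the `query_plan` helper are hypothetical. The exact plan wording varies between SQLite versions, and other DBMSs have their own `EXPLAIN` variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (department, salary) VALUES (?, ?)",
    [("HR" if i % 2 else "Sales", 40000 + i) for i in range(1000)],
)


def query_plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the detail text


query = "SELECT * FROM employees WHERE department = 'HR'"

# Without an index, SQLite has to scan every row of the table.
plan_before = query_plan(query)

conn.execute("CREATE INDEX idx_employees_department ON employees (department)")

# With the index, SQLite can search for the matching rows directly.
plan_after = query_plan(query)

print(plan_before)
print(plan_after)
```

Indexes speed up reads but must be updated on every insert, update, and delete, so they are usually added only for columns that are searched or joined on frequently.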
- Constraints:
- Constraints are rules applied to table columns that restrict the
values they can hold.
- Common Constraints:
- Primary Key: Uniquely identifies each row; its values must be unique and not NULL.
- Foreign Key: Enforces relationships between tables.
- Unique: Ensures that all values in a column are unique.
- Check: Ensures that values in a column meet a specific
condition.
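The Unique and Check constraints listed above can be observed rejecting bad data. This is a sketch using Python's built-in sqlite3 module. The email and salary columns, the sample rows, and the `violates` helper are assumptions introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        email       TEXT UNIQUE,                 -- no two rows may share an email
        salary      INTEGER CHECK (salary > 0)   -- salary must be positive
    )
""")
conn.execute("INSERT INTO employees VALUES (1, 'ana@example.com', 50000)")


def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Same email as employee 1: rejected by UNIQUE.
duplicate_email = violates("INSERT INTO employees VALUES (2, 'ana@example.com', 40000)")
# Negative salary: rejected by CHECK.
negative_salary = violates("INSERT INTO employees VALUES (3, 'ben@example.com', -5)")
# New email and positive salary: accepted.
valid_row_rejected = violates("INSERT INTO employees VALUES (4, 'cleo@example.com', 45000)")

print(duplicate_email, negative_salary, valid_row_rejected)
```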
sql
ALTER TABLE employees ADD CONSTRAINT fk_department