MODULE-3
SQL
SQL Data Definition and Data Types
The relational model defines relation, tuple, and
attribute as core concepts: a set-based structure
where a relation (table) contains tuples (rows), each
described by attributes (columns).
SQL adopts more familiar terms (table instead of
relation, row instead of tuple, and column instead of
attribute) to map these concepts onto usable
database objects.
SQL Data Definition: CREATE
The DDL toolkit in SQL uses CREATE to define
structures such as schemas, tables, types, domains,
views, assertions, and triggers.
This enables declaring the shape and rules of your
database.
An SQL schema is identified by a schema name and
includes an authorization identifier to indicate the
user or account who owns the schema, as well as
descriptors for each element in the schema.
Schema elements include tables, types, constraints,
views, domains, and other constructs.
A schema is created via the CREATE SCHEMA
statement, which can include all the schema
elements’ definitions.
Example:
CREATE SCHEMA COMPANY AUTHORIZATION 'Jsmith';
Schema Elements
Schemas can hold a variety of database objects:
Tables (relations)
Views
Domains and Types
Constraints (e.g., primary/foreign keys)
Triggers, indexes, functions, procedures, etc.
SQL Catalog
An SQL catalog is a named collection of schemas. It
provides a way to organize and manage multiple
schemas in a database.
The CREATE TABLE Command in SQL
The CREATE TABLE command is used to create a new relation
(table) in a database. It specifies the table's name, attributes, and
initial constraints.
Syntax:
CREATE TABLE table_name (
    attribute1 data_type [constraint],
    attribute2 data_type [constraint],
    ...
    [table_constraints]
);
Example:
CREATE TABLE EMPLOYEE (
    SSN CHAR(9) PRIMARY KEY,
    Name VARCHAR(255) NOT NULL,
    Department VARCHAR(255),
    Salary DECIMAL(10, 2),
    FOREIGN KEY (Department) REFERENCES DEPARTMENT (DepartmentName)
);
1. Table Name: The name of the table being created (e.g.,
EMPLOYEE).
2. Attributes: Each attribute is defined with a name and data
type (e.g., SSN, Name, Department, Salary).
3. Data Types: Specify the type of data that can be stored in
each attribute (e.g., CHAR, VARCHAR, DECIMAL).
4. Attribute Constraints: Optional rules that restrict the data in
each attribute (e.g., PRIMARY KEY, NOT NULL).
5. Table Constraints: Optional rules that apply to the entire
table (e.g., FOREIGN KEY).
Constraints
Constraints are used to enforce data integrity and consistency.
Common constraints include:
1. PRIMARY KEY: Uniquely identifies each row in the table.
2. FOREIGN KEY: References the primary key (or a unique key)
of another table.
3. NOT NULL: Ensures that an attribute cannot contain
null values.
4. UNIQUE: Ensures that all values in an attribute are
unique.
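The effect of these constraints can be observed by trying to insert rows that break them. The following is a minimal sketch, assuming SQLite accessed through Python's standard sqlite3 module (not a system the notes prescribe); the Email column and the sample rows are hypothetical and exist only to exercise UNIQUE.

```python
import sqlite3

# In-memory SQLite database; the table follows the EMPLOYEE example above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        SSN   CHAR(9)      PRIMARY KEY,
        Name  VARCHAR(255) NOT NULL,
        Email VARCHAR(100) UNIQUE      -- hypothetical column, added to show UNIQUE
    )
""")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'John Smith', 'john@example.com')")

def try_insert(ssn, name, email):
    """Return 'ok' if the row is accepted, otherwise the database's error message."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (ssn, name, email))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

duplicate_key = try_insert("123456789", "Jane Doe", "jane@example.com")   # repeats the primary key
missing_name = try_insert("987654321", None, "x@example.com")             # NULL in a NOT NULL column
duplicate_mail = try_insert("555555555", "Bob Ray", "john@example.com")   # repeats a UNIQUE value
valid_row = try_insert("999887777", "Alicia Zelaya", "alicia@example.com")

for label, outcome in [("duplicate key", duplicate_key), ("missing name", missing_name),
                       ("duplicate email", duplicate_mail), ("valid row", valid_row)]:
    print(label, "->", outcome)
```

Only the last row is accepted; the other three are rejected by the PRIMARY KEY, NOT NULL, and UNIQUE rules respectively.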
Adding Constraints Later
Constraints can be added or modified later using the
ALTER TABLE command.
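ALTER TABLE EMPLOYEE
ADD CONSTRAINT emp_ssn_unique UNIQUE (SSN);
This command adds a unique constraint, here given the
illustrative name emp_ssn_unique, to the SSN
attribute in the EMPLOYEE table. The ADD CONSTRAINT
form requires a constraint name before the constraint type.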
By using the CREATE TABLE command effectively,
you can design and create robust and scalable database
tables with appropriate constraints to ensure data integrity
and consistency.
Attribute Data Types and Domains in SQL
SQL supports various data types to store different types of data.
Here are some common data types:
Numeric Data Types
1. Integer Types:
- INTEGER or INT: Whole numbers (e.g., 1, 2, 3).
- SMALLINT: Small whole numbers (typically -32,768 to
32,767).
- BIGINT: Very large integers (e.g., 9000000000, which
exceeds the usual INT range).
2. Floating-Point Types:
- FLOAT or REAL: Floating-point numbers (e.g., 3.14).
- DOUBLE PRECISION: Double-precision floating-point
numbers.
3. Decimal Types:
- DECIMAL(p,s) or DEC(p,s) or NUMERIC(p,s): Formatted
numbers with precision p (total number of decimal digits) and
scale s (digits after the decimal point); e.g., 123.45 fits
DECIMAL(5,2).
Character-String Data Types
1. Fixed-Length Strings:
- CHAR(n) or CHARACTER(n): Fixed-length character
strings of n characters; shorter values are padded with blanks
(e.g., 'hello' stored in CHAR(10)).
2. Variable-Length Strings:
- VARCHAR(n) or CHAR VARYING(n) or
CHARACTER VARYING(n): Variable-length character strings
of at most n characters (e.g., 'hello world').
Bit-String Data Types
1. Fixed-Length Bit Strings:
- BIT(n): Fixed-length bit strings (e.g., B'101010').
2. Variable-Length Bit Strings:
- BIT VARYING(n): Variable-length bit strings (e.g.,
B'10101011').
- BINARY LARGE OBJECT or BLOB: Large binary values
such as images.
Boolean Data Type
1. Boolean Values:
- BOOLEAN: Takes the values TRUE or FALSE. Because of
NULL values, a third value, UNKNOWN, is also possible.
Date and Time Data Types
1. Date:
- DATE: Dates (e.g., '2022-01-01').
2. Time:
- TIME: Times (e.g., '12:00:00').
3. Timestamp:
- TIMESTAMP: Dates and times (e.g., '2022-01-01 12:00:00').
4. Interval:
- INTERVAL: Relative values that can be used to
increment or decrement absolute values of dates, times, or
timestamps.
Example Table Using Several Data Types:
CREATE TABLE Employee (
    EmpID INT,
    Name VARCHAR(50),
    Salary DECIMAL(8,2),
    JoinDate DATE,
    IsActive BOOLEAN
);
Specifying Constraints in SQL
The basic constraints that can be specified as part of table
creation include key and referential integrity constraints,
restrictions on attribute domains and NULLs, and
constraints on individual tuples within a relation using
the CHECK clause.
Specifying Attribute Constraints and Attribute
Defaults
A. Attribute Constraints (Column-Level Constraints):
These are rules defined directly within a column’s definition to
restrict what values it can hold:
NOT NULL
Prevents NULL values in a column. Ideal for
primary keys or required attributes.
Example:
CREATE TABLE EMPLOYEE (
    Id INT NOT NULL,
    Name VARCHAR(10) NOT NULL
);
CHECK
Enforces a logical condition at the column level.
For example, to limit department numbers to 1 to 20:
Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21)
Any insert or update that violates the condition is
rejected.
UNIQUE
Ensures all non-NULL values in the column are
distinct. (Allows multiple NULLs.)
Email VARCHAR(100) UNIQUE
These are part of column declaration and
immediately applied to that column.
B. Attribute Defaults
The DEFAULT clause specifies a fallback value
when no explicit value is provided on INSERT:
Dno INT NOT NULL DEFAULT 1
ManagerID INT DEFAULT 100
When a new row is inserted without values for these columns,
the defaults apply. If there is no DEFAULT clause
and the column allows NULL, the default is
NULL.
Example:
CREATE TABLE DEPARTMENT (
    Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21),
    ManagerID INT DEFAULT 100
);
CREATE TABLE EMPLOYEE (
    Ssn CHAR(9) NOT NULL PRIMARY KEY,
    Name VARCHAR(100),
    Dno INT NOT NULL DEFAULT 1 -- default department
);
Dnumber: must be between 1 and 20 and cannot be NULL.
ManagerID: defaults to 100 if not provided.
Dno: defaults to department 1 for each new employee if not
specified.
Any other attribute that has no DEFAULT clause and allows NULL
defaults to NULL when it is omitted.
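The behavior of DEFAULT and CHECK described above can be tried out directly. This is a small sketch, assuming SQLite through Python's sqlite3 module as the test system; the inserted values 5 and 25 are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DEPARTMENT (
        Dnumber   INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21),
        ManagerID INT DEFAULT 100
    )
""")

# ManagerID is omitted, so the DEFAULT clause supplies 100.
conn.execute("INSERT INTO DEPARTMENT (Dnumber) VALUES (5)")
stored = conn.execute("SELECT Dnumber, ManagerID FROM DEPARTMENT").fetchall()

# Dnumber = 25 violates the CHECK condition, so the insert is rejected.
try:
    conn.execute("INSERT INTO DEPARTMENT (Dnumber) VALUES (25)")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True

print(stored)          # the one accepted row, with the default filled in
print(check_rejected)  # whether the out-of-range row was refused
```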
Specifying Key and Referential Integrity
Constraints
1. Primary Key & Unique Constraints
A PRIMARY KEY must contain unique values and
cannot contain NULL values. A table can have only
one primary key, which can consist of a
single column or multiple columns.
In effect, PRIMARY KEY = NOT NULL + UNIQUE.
CREATE TABLE EMPLOYEE (
    Id INT NOT NULL,
    Name VARCHAR(10) NOT NULL,
    Address VARCHAR(20),
    PRIMARY KEY (Id)
);
UNIQUE defines alternate (candidate) keys. It also
ensures uniqueness but, unlike PRIMARY KEY, permits NULLs
(most systems allow several NULLs in a UNIQUE column; a few
allow only one).
2. Foreign Key & Referential Integrity
A FOREIGN KEY ensures that values in the
child table correspond to existing primary (or
unique) key values in the parent table, or are
NULL. This preserves referential integrity:
no orphaned references are allowed.
Default Behavior (Restrict/No Action)
When no action is specified, SQL applies a
restrictive default that rejects any operation that
would break integrity. The SQL standard calls this
default NO ACTION; RESTRICT behaves similarly.
3. Referential Triggered Actions
You can customize the behavior of a foreign
key when the parent key is updated or deleted:
Action             | ON DELETE behavior                               | ON UPDATE behavior
CASCADE            | Deletes related rows in the child table          | Updates child key values to match the new parent key
SET NULL           | Sets the child foreign key to NULL               | Sets the child foreign key to NULL if the parent key is changed
SET DEFAULT        | Sets the child key to its declared default value | Sets the child key to its declared default value
NO ACTION/RESTRICT | Prevents the operation if children exist         | Prevents the operation if dependent records exist
Example of cascading on delete:
FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber)
    ON DELETE CASCADE
If a department (a row in DEPARTMENT) is deleted, then all
employees (rows in the child table that reference that department
through Dno) are also deleted automatically.
Example set-null on delete:
FOREIGN KEY (MgrID) REFERENCES
EMPLOYEE(Ssn)
ON DELETE SET NULL
SET NULL requires MgrID to be nullable.
SET DEFAULT uses the column's default value, so
make sure a default is declared.
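The referential actions above can be observed on a small pair of tables. The sketch below assumes SQLite through Python's sqlite3 module; note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is specific to that system. The sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on foreign key enforcement
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INT PRIMARY KEY, Dname VARCHAR(15));
    CREATE TABLE EMPLOYEE (
        Ssn  CHAR(9) PRIMARY KEY,
        Name VARCHAR(100),
        Dno  INT,
        FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber) ON DELETE CASCADE
    );
    INSERT INTO DEPARTMENT VALUES (4, 'Administration'), (5, 'Research');
    INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith', 5), ('987654321', 'Wallace', 4);
""")

# An employee cannot reference a department that does not exist.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES ('111111111', 'Ghost', 9)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting department 5 cascades to the employees that reference it.
conn.execute("DELETE FROM DEPARTMENT WHERE Dnumber = 5")
remaining = conn.execute("SELECT Name FROM EMPLOYEE ORDER BY Name").fetchall()

print(orphan_rejected)
print(remaining)
```

Replacing ON DELETE CASCADE with ON DELETE SET NULL in this sketch would keep the employee row and set its Dno to NULL instead.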
Putting It All Together
CREATE TABLE DEPARTMENT (
    Dnumber INT PRIMARY KEY,
    Dname VARCHAR(15) UNIQUE,
    MgrID INT
        DEFAULT 100             -- if no value is given for MgrID, 100 is used
        REFERENCES EMPLOYEE(Ssn)
        ON DELETE SET NULL      -- if the referenced employee (manager) is deleted, set MgrID to NULL
        ON UPDATE CASCADE       -- if the manager's Ssn changes, it is updated here too
);
CREATE TABLE EMPLOYEE (
    Ssn CHAR(9) PRIMARY KEY,
    Name VARCHAR(100),
    Dno INT
        DEFAULT 1
        NOT NULL
        REFERENCES DEPARTMENT(Dnumber)
        ON DELETE RESTRICT      -- prevent deletion of a department if employees exist in it
        ON UPDATE NO ACTION     -- a department number change is not propagated; it fails if it would break referential integrity
);
Note: a foreign key column should have the same data type as the
column it references, so in practice MgrID would be declared CHAR(9)
to match EMPLOYEE.Ssn. Because the two tables reference each other,
one of the foreign keys usually has to be added later with ALTER TABLE.
PRIMARY KEY ⇒ Dnumber, Ssn
UNIQUE ⇒ Dname
FOREIGN KEY constraints:
'MgrID': set to NULL if referenced manager
is deleted
'Dno': restricts deletion or update of
referenced department
Giving Names to Constraints
In SQL, constraints can be given names to identify
them uniquely. This is useful for referencing and
managing constraints.
Syntax: The general syntax for naming a
constraint is:
CONSTRAINT constraint_name constraint_type
(constraint_definition)
Example:
CREATE TABLE DEPARTMENT (
    Dept_name VARCHAR(255),
    Dept_create_date DATE,
    Mgr_start_date DATE,
    Mgr_ssn CHAR(9),
    CONSTRAINT valid_dates CHECK (Dept_create_date <= Mgr_start_date)
);
In this example, the CHECK constraint is named valid_dates.
Benefits
1. Unique Identification: Constraint names must be unique
within a schema, making it easier to identify and reference
specific constraints.
2. Easy Management: Named constraints can be easily dropped
and replaced with new constraints using the ALTER TABLE
statement.
3. Improved Readability: Naming constraints makes the SQL
code more readable and self-explanatory.
Dropping Constraints
When dropping a constraint, you can use the ALTER TABLE
statement with the DROP CONSTRAINT clause, specifying
the constraint name:
ALTER TABLE DEPARTMENT
DROP CONSTRAINT valid_dates;
By giving names to constraints, you can improve the
manageability and readability of your SQL code.
Specifying Constraints on Tuples Using
CHECK
SQL allows you to specify constraints on tuples (rows) using the
CHECK clause. These constraints are applied to each row
individually and are checked whenever a row is inserted or
modified.
Row-Based Constraints
Row-based constraints are used to enforce rules that apply to
each row in a table. They can be specified using the CHECK
clause at the end of a CREATE TABLE statement.
Example
Suppose we have a DEPARTMENT table with the
following attributes:
- Dept_create_date: The date when the department was created.
- Mgr_start_date: The start date of the department manager.
We can specify a CHECK constraint to ensure that a manager's start date is
not earlier than the department creation date:
CREATE TABLE DEPARTMENT (
    Dept_name VARCHAR(255),
    Dept_create_date DATE,
    Mgr_start_date DATE,
    Mgr_ssn CHAR(9),
    CHECK (Dept_create_date <= Mgr_start_date)
);
In this example, the CHECK constraint ensures that the Mgr_start_date is
later than or equal to the Dept_create_date for each row in the
DEPARTMENT table.
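A row-level CHECK of this kind can be exercised as follows. This is a hedged sketch using SQLite through Python's sqlite3 module; it assumes dates are supplied as 'YYYY-MM-DD' strings, which SQLite stores as text and compares in chronological order for that format. The department name, dates, and Ssn are illustrative values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DEPARTMENT (
        Dept_name        VARCHAR(255),
        Dept_create_date DATE,
        Mgr_start_date   DATE,
        Mgr_ssn          CHAR(9),
        CONSTRAINT valid_dates CHECK (Dept_create_date <= Mgr_start_date)
    )
""")

def accepts(create_date, mgr_start):
    """Return True if a row with these two dates passes the CHECK constraint."""
    try:
        conn.execute(
            "INSERT INTO DEPARTMENT VALUES ('Research', ?, ?, '333445555')",
            (create_date, mgr_start),
        )
        return True
    except sqlite3.IntegrityError:
        return False

ok_row = accepts("2020-01-01", "2021-06-15")    # manager starts after the department exists
same_day = accepts("2020-01-01", "2020-01-01")  # equal dates satisfy <=
bad_row = accepts("2020-01-01", "2019-12-31")   # manager starts before the department exists

print(ok_row, same_day, bad_row)
```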
Benefits
1. Data Integrity: Row-based constraints help ensure that the data in each
row is consistent and accurate.
2. Improved Data Quality: By specifying CHECK constraints, you can
prevent invalid or inconsistent data from being inserted or updated.
3. Reduced Errors: CHECK constraints can reduce errors by enforcing rules
at the row level.
By specifying constraints on tuples using CHECK, you can improve the
integrity and quality of the data in your tables.
Basic Retrieval Queries in SQL
SQL has one basic statement for retrieving information from a
database: the SELECT statement.
The SELECT-FROM-WHERE Structure of Basic
SQL Queries
SELECT <attribute list>
FROM <table list>
WHERE <condition>;
where
■<attribute list> is a list of attribute names whose values are to be retrieved
by the query.
■<table list> is a list of the relation names required to process the query.
■<condition> is a conditional (Boolean) expression that identifies the tuples
to be retrieved by the query.
In SQL, the basic logical comparison operators for comparing attribute
values with one another and with literal constants are =, <, <=, >, >=,
and <>.
Query: Retrieve the birth date and address of the employee(s)
whose name is 'John B. Smith'.
SQL Query:
SELECT Bdate, Address
FROM EMPLOYEE
WHERE Fname = 'John' AND Minit = 'B' AND Lname = 'Smith';
Relational Algebra Equivalent:
The SELECT clause corresponds to the projection operation in
relational algebra.
The WHERE clause corresponds to the selection operation in
relational algebra.
Query 1: Retrieve the name and address of all employees who
work for the 'Research' department.
SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME = 'Research' AND DNUMBER = DNO;
FROM EMPLOYEE, DEPARTMENT: forms the Cartesian product of the
EMPLOYEE and DEPARTMENT tables.
WHERE DNAME = 'Research': filters to rows where the
department name is 'Research'.
AND DNUMBER = DNO: keeps only the combinations in which the
employee's department number (DNO) matches the department's number
(DNUMBER).
What it does:
DNAME = 'Research' is a selection condition (filters rows by department
name).
DNUMBER = DNO is a join condition (links employees to their
department).
Combining a selection condition, a join condition, and a list of output
attributes in one SELECT statement is commonly called a
select-project-join query.
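To see how the selection and join conditions cut down the Cartesian product, the query can be run on a tiny data set. This is a sketch assuming SQLite via Python's sqlite3 module; the tables contain only the columns the query needs, and the two employees and two departments are illustrative rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (DNAME VARCHAR(15), DNUMBER INT PRIMARY KEY);
    CREATE TABLE EMPLOYEE (FNAME VARCHAR(15), LNAME VARCHAR(15),
                           ADDRESS VARCHAR(30), DNO INT);
    INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
    INSERT INTO EMPLOYEE VALUES
        ('John', 'Smith', '731 Fondren, Houston, TX', 5),
        ('Jennifer', 'Wallace', '291 Berry, Bellaire, TX', 4);
""")

# No WHERE clause: every employee is paired with every department (2 x 2 rows).
cross_product = conn.execute("SELECT * FROM EMPLOYEE, DEPARTMENT").fetchall()

# Selection (DNAME = 'Research') plus join condition (DNUMBER = DNO).
research_staff = conn.execute("""
    SELECT FNAME, LNAME, ADDRESS
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNAME = 'Research' AND DNUMBER = DNO
""").fetchall()

print(len(cross_product))
print(research_staff)
```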
Query 3: For every project located in 'Stafford', list the project
number, the controlling department number, and the department
manager's last name, address, and birth date.
SELECT Pnumber, Dnum, Lname, Address, Bdate
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Plocation='Stafford';
Ambiguous Attribute Names, Aliasing, Renaming,
and Tuple Variables
1. Ambiguous Attribute Names
SQL allows the same column name to exist in multiple
tables (e.g., Name in both EMPLOYEE and
DEPARTMENT).
When referencing such columns in a multi-table query,
you must disambiguate them using the fully qualified
syntax:
TableName.ColumnName
Example:
SELECT Fname, EMPLOYEE.Name, Address
FROM EMPLOYEE, DEPARTMENT
WHERE DEPARTMENT.Name = 'Research' AND
DEPARTMENT.Dnumber = EMPLOYEE.Dnumber;
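Here, EMPLOYEE.Name and DEPARTMENT.Name are prefixed with their
table names so that it is clear which Name column is meant; the same
applies to Dnumber.
2. Aliasing and Renaming: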
Aliasing and renaming are techniques used to assign temporary names
to relations or attributes, making it easier to refer to them in queries.
This is particularly useful when dealing with complex queries
involving multiple relations.
Tuple Variables:
Tuple variables are used to represent individual rows in a relation.
They can be used to simplify queries and improve readability.
Example:
SELECT E.Fname, E.Lname, S.Fname, S.Lname
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn = S.Ssn;
E refers to the employee.
S refers to that employee’s supervisor.
This approach enables a self-join to retrieve both the employee's and
the supervisor's names.
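The self-join with tuple variables E and S can be checked on a three-person hierarchy. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the rows are illustrative, and the ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        Fname     VARCHAR(15),
        Lname     VARCHAR(15),
        Ssn       CHAR(9) PRIMARY KEY,
        Super_ssn CHAR(9)              -- Ssn of the supervisor, NULL for the top manager
    );
    INSERT INTO EMPLOYEE VALUES
        ('James',    'Borg',  '888665555', NULL),
        ('Franklin', 'Wong',  '333445555', '888665555'),
        ('John',     'Smith', '123456789', '333445555');
""")

# E ranges over employees, S over the same table in the role of supervisor.
pairs = conn.execute("""
    SELECT E.Fname, E.Lname, S.Fname, S.Lname
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.Super_ssn = S.Ssn
    ORDER BY E.Lname
""").fetchall()

for row in pairs:
    print(row)
```

Borg does not appear as an employee in the result because his Super_ssn is NULL, and a NULL never satisfies the equality in the WHERE clause.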
3. Unspecified WHERE Clause and Use of the Asterisk
Unspecified WHERE Clause:
- A missing WHERE clause in a SQL query indicates no condition
on tuple selection.
- All tuples of the relation specified in the FROM clause qualify
and are selected for the query result.
- If more than one relation is specified with no WHERE clause,
then the cross product of these relations is selected.
Use of Asterisk (*):
- To retrieve all attribute values of selected tuples, you don't have
to list the attribute names explicitly in SQL.
- Instead, you can specify an asterisk (*), which stands for all the
attributes.
Example:
SELECT * FROM EMPLOYEE
WHERE Dno=5;
This retrieves all the attribute values of any employee who
works in department number 5.
Select all combinations of EMPLOYEE SSN and
DEPARTMENT DNAME in the database:
SELECT SSN, DNAME
FROM EMPLOYEE, DEPARTMENT;
Select the CROSS PRODUCT of the EMPLOYEE
and DEPARTMENT relations
SELECT *
FROM EMPLOYEE, DEPARTMENT;
4. Tables as Sets in SQL
SQL usually treats a table not as a set but rather as a multiset.
This means that duplicate tuples can appear more than once in a
table and in the result of a query.
SQL does not automatically eliminate duplicate tuples in the
results of queries, for the following reasons:
1. Duplicate elimination is an expensive operation. One way
to implement it is to sort the tuples first and then
eliminate duplicates.
2. The user may want to see duplicate tuples in the result of
a query.
3. When an aggregate function is applied to tuples, in most
cases we do not want to eliminate duplicates.
To eliminate duplicate tuples from the result of an SQL query,
the keyword DISTINCT is used in the SELECT clause. This
ensures that only distinct tuples remain in the result.
SELECT DISTINCT vs. SELECT ALL:
- A query with SELECT DISTINCT eliminates duplicates.
- A query with SELECT ALL does not eliminate duplicates.
- If neither ALL nor DISTINCT is specified, the default is
SELECT ALL.
Example:
Retrieve the salary of every employee; if several employees
have the same salary, that salary value appears once for
each of them in the result of the query.
SELECT ALL Salary FROM EMPLOYEE;
Retrieve all distinct salary values, so that each value
appears only once:
SELECT DISTINCT Salary FROM EMPLOYEE;
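The difference between the multiset result of SELECT ALL and the set result of SELECT DISTINCT shows up as soon as two employees share a salary. A small sketch, assuming SQLite via Python's sqlite3 module, with made-up salary figures and an ORDER BY added only for a stable output order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname VARCHAR(15), Salary DECIMAL(10, 2));
    INSERT INTO EMPLOYEE VALUES
        ('John', 30000), ('Franklin', 40000), ('Ramesh', 30000), ('Joyce', 25000);
""")

# SELECT ALL (the default) keeps one result row per employee.
all_salaries = [row[0] for row in
                conn.execute("SELECT ALL Salary FROM EMPLOYEE ORDER BY Salary")]

# SELECT DISTINCT keeps each salary value once.
distinct_salaries = [row[0] for row in
                     conn.execute("SELECT DISTINCT Salary FROM EMPLOYEE ORDER BY Salary")]

print(all_salaries)
print(distinct_salaries)
```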
SQL also provides the following set operations:
Set Union (UNION): Combines the results of two
queries into a single result set, eliminating
duplicates.
Set Difference (EXCEPT): Returns the tuples that
are present in the first query but not in the second
query, eliminating duplicates.
Set Intersection (INTERSECT): Returns the tuples
that are common to both queries, eliminating
duplicates.
The relations resulting from these set operations are
sets of tuples, meaning that duplicate tuples are
eliminated from the result.
SQL also supports multiset operations, which are
followed by the keyword ALL (UNION ALL,
EXCEPT ALL, INTERSECT ALL). Their results are
multisets, meaning that duplicates are not eliminated.
Example
Example of using the UNION operator to combine the
results of two queries:
(SELECT DISTINCT Pnumber
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND
Lname='Smith')
UNION
(SELECT DISTINCT Pnumber
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE Pnumber=Pno AND Essn=Ssn AND
Lname='Smith');
This query retrieves the project numbers for projects
that involve an employee whose last name is 'Smith',
either as a worker or as a manager of the department
that controls the project. The first SELECT finds the
projects controlled by a department that Smith manages;
the second finds the projects Smith works on. The UNION
operator combines the results of the two queries, eliminating
duplicates.
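The set versus multiset behavior of UNION and UNION ALL can be compared on this query. The sketch below assumes SQLite via Python's sqlite3 module and writes the compound query without the outer parentheses, because SQLite does not accept parenthesized SELECTs around UNION. The tables hold only the columns the query uses, and all rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE   (Ssn CHAR(9) PRIMARY KEY, Lname VARCHAR(15));
    CREATE TABLE DEPARTMENT (Dnumber INT PRIMARY KEY, Mgr_ssn CHAR(9));
    CREATE TABLE PROJECT    (Pnumber INT PRIMARY KEY, Dnum INT);
    CREATE TABLE WORKS_ON   (Essn CHAR(9), Pno INT);
    INSERT INTO EMPLOYEE   VALUES ('1', 'Smith'), ('2', 'Wong');
    INSERT INTO DEPARTMENT VALUES (5, '1'), (4, '2');          -- Smith manages department 5
    INSERT INTO PROJECT    VALUES (1, 5), (2, 5), (3, 4), (4, 4);
    INSERT INTO WORKS_ON   VALUES ('1', 2), ('1', 3), ('2', 4); -- Smith works on projects 2 and 3
""")

as_manager = """
    SELECT DISTINCT Pnumber
    FROM PROJECT, DEPARTMENT, EMPLOYEE
    WHERE Dnum = Dnumber AND Mgr_ssn = Ssn AND Lname = 'Smith'
"""
as_worker = """
    SELECT DISTINCT Pnumber
    FROM PROJECT, WORKS_ON, EMPLOYEE
    WHERE Pnumber = Pno AND Essn = Ssn AND Lname = 'Smith'
"""

# UNION removes the duplicate project 2; UNION ALL keeps both copies.
union_result = sorted(row[0] for row in conn.execute(as_manager + " UNION " + as_worker))
union_all_result = sorted(row[0] for row in conn.execute(as_manager + " UNION ALL " + as_worker))

print(union_result)
print(union_all_result)
```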
5. Substring Pattern Matching and Arithmetic
Operators
The LIKE operator is used in a WHERE clause to search for a
specified pattern in a column.
There are two wildcards used in conjunction with the LIKE
operator:
1. % The percent sign represents zero, one, or multiple
characters
2. _ The underscore represents a single character.
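Query:
Retrieve all employees whose address is in Houston,
Texas.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Address LIKE '%Houston, TX%';
This assumes that addresses record the city and state as
'Houston, TX'.
Query:
Find all employees who were born during the 1950s.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Bdate LIKE '__5_______';
The pattern has ten characters, with 5 in the third position
(the decade digit of the year).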
This query assumes that the Bdate column is stored as
a string in the format 'YYYY-MM-DD'.
Escaping Special Characters:
When using the LIKE operator, the characters
_ and % have special meanings. To match them
as literal characters, precede each one with an escape
character, which is declared with the ESCAPE clause.
For example, to match the string 'AB_CD%EF', the
pattern would be 'AB\_CD\%EF' ESCAPE '\'.
Apostrophes in Strings: When including apostrophes
(single quotation marks) in a string, you need to
represent them as two consecutive apostrophes ('').
For example, to match the string "John's car", you
would use the string 'John''s car'.
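The wildcard and escape rules can be verified with a few sample strings. This sketch assumes SQLite via Python's sqlite3 module; the CODES table is hypothetical and exists only to hold strings that contain _ and %, and the employee rows are illustrative. Bdate is stored as a 'YYYY-MM-DD' string, as the 1950s query requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname VARCHAR(15), Address VARCHAR(30), Bdate CHAR(10));
    INSERT INTO EMPLOYEE VALUES
        ('John',     '731 Fondren, Houston, TX', '1965-01-09'),
        ('Franklin', '638 Voss, Houston, TX',    '1955-12-08'),
        ('Jennifer', '291 Berry, Bellaire, TX',  '1941-06-20');
    CREATE TABLE CODES (Code VARCHAR(20));   -- hypothetical table for the escape example
    INSERT INTO CODES VALUES ('AB_CD%EF'), ('ABXCDYYEF');
""")

def names(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# % matches any run of characters on either side of 'Houston, TX'.
houston = names("SELECT Fname FROM EMPLOYEE WHERE Address LIKE '%Houston, TX%'")

# Ten-character pattern: each _ matches exactly one character, 5 is the decade digit.
fifties = names("SELECT Fname FROM EMPLOYEE WHERE Bdate LIKE '__5_______'")

# Without ESCAPE, _ and % act as wildcards, so both codes match.
unescaped = names("SELECT Code FROM CODES WHERE Code LIKE 'AB_CD%EF'")

# With ESCAPE '\', the escaped _ and % are literal, so only the exact string matches.
escaped = names(r"SELECT Code FROM CODES WHERE Code LIKE 'AB\_CD\%EF' ESCAPE '\'")

print(houston, fifties, unescaped, escaped)
```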
Arithmetic Operators: SQL supports standard
arithmetic operators for numeric values or attributes
with numeric domains:
+ (addition)
- (subtraction)
* (multiplication)
/ (division)
Query: Show the resulting salaries if every employee working
on the 'ProductX' project is given a 10 percent raise.
SELECT E.Fname, E.Lname, 1.1 * E.Salary AS Increased_sal
FROM EMPLOYEE AS E, WORKS_ON AS W, PROJECT AS P
WHERE E.Ssn=W.Essn AND W.Pno=P.Pnumber AND P.Pname='ProductX';
- The 1.1 * E.Salary expression calculates the increased salary by
multiplying the current salary by 1.1 (100% + 10%).
- The AS keyword is used to alias the calculated column as
Increased_sal.
Query : Retrieve Employees with Salary Between $30,000 and
$40,000
This query retrieves all employees in department 5 whose salary is
between $30,000 and $40,000:
SELECT *
FROM EMPLOYEE
WHERE (Salary BETWEEN 30000 AND 40000) AND Dno = 5;
- The BETWEEN operator is used to specify a range of values
(inclusive).
- This is equivalent to the condition ((Salary >= 30000) AND
(Salary <= 40000)).
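The equivalence between BETWEEN and the pair of comparisons, as well as an arithmetic expression in the SELECT list, can be checked on a few rows. This is a sketch assuming SQLite via Python's sqlite3 module; the employees and salaries are illustrative, and SQLite evaluates 1.1 * Salary in floating point, so the raised salary is compared with a small tolerance rather than exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname VARCHAR(15), Salary DECIMAL(10, 2), Dno INT);
    INSERT INTO EMPLOYEE VALUES
        ('John', 30000, 5), ('Franklin', 40000, 5),
        ('Joyce', 25000, 5), ('Jennifer', 43000, 4);
""")

# BETWEEN is inclusive: both 30000 and 40000 qualify.
with_between = conn.execute("""
    SELECT Fname FROM EMPLOYEE
    WHERE (Salary BETWEEN 30000 AND 40000) AND Dno = 5
    ORDER BY Fname
""").fetchall()

# The same condition written as two comparisons joined by AND.
with_comparisons = conn.execute("""
    SELECT Fname FROM EMPLOYEE
    WHERE Salary >= 30000 AND Salary <= 40000 AND Dno = 5
    ORDER BY Fname
""").fetchall()

# Arithmetic in the SELECT list: a 10 percent raise, renamed with AS.
raised = conn.execute("""
    SELECT Fname, 1.1 * Salary AS Increased_sal
    FROM EMPLOYEE
    WHERE Fname = 'John'
""").fetchone()

print(with_between)
print(with_comparisons)
print(raised)
```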