Imperial College London - SQL Data Definition

The document discusses SQL's Data Definition Language (DDL) which is used for schema creation and modification. It covers creating tables with attributes, primary keys, unique constraints, not null constraints, check constraints and foreign keys. Modifying schemas using ALTER TABLE to add or drop attributes is also discussed. Assertions, which define constraints across multiple relations, are introduced. Examples are provided for each concept to illustrate SQL DDL syntax.

SQL Data Definition

Naranker Dulay

https://www.doc.ic.ac.uk/~nd/databases

Data definition

SQL’s Data Definition Language (DDL) is concerned with schema creation and
modification, as well as the specification of constraints and performance options such
as materialised views and indexing.

We’ll mainly look at base relations (stored tables) and briefly at derived relations
(computed views).

Constraints in SQL include: domain (type) constraints, primary key constraints,
foreign key constraints, unique constraints, not null constraints, check constraints,
and assertions.

N. Dulay Databases : 08 : SQL Data Definition 2


Creating a Relation

The create table statement is used to create a new named relation and declare its
schema. The relation is persistent and is stored on disk in specially organised files.

Example: movie(title, year, length, genre)

create table movie (
    title varchar(120),
    year int default 2011,
    length int default 0,
    genre char(20),
    primary key (title, year)
)

Note: The attributes belonging to the primary key are usually underlined in
textbooks.

Note: Attribute types/domains constrain the values that can be assigned to the
attribute. For examples of attribute types, see Databases 06.

The order in which attributes are defined is sometimes used by other SQL statements,
for example by select * and when inserting values into a relation with no attribute list.

We can remove relations with drop table, e.g. drop table movie
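
As a small sketch of how declaration order and defaults come into play, the statements below assume the movie relation declared above and start from an empty relation. The inserted values are illustrative and are not taken from the slides.

```sql
-- No attribute list: values are matched to attributes in declaration order
-- (title, year, length, genre).
insert into movie values ('Up', 2009, 96, 'cartoon');

-- With an attribute list, omitted attributes take their declared default
-- (length = 0 here) or null if no default was declared (genre here).
insert into movie (title, year) values ('Avatar', 2009);

-- select * returns the attributes in declaration order.
select * from movie;
```

Listing the attributes explicitly in insert and select makes statements independent of the declaration order, which helps if the schema is modified later.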
Modifying the Schema

The alter table statement is used to add or remove attributes for a relation.

Example: movie(title, year, length, genre)

alter table movie add studio char(16) default '';
alter table movie drop length;

After these two modifications the schema becomes

movie(title, year, genre, studio)

and each tuple of movie will now have a studio attribute set to the empty string.
We can set different studio values by following alter table with one or more
update statements.
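
A minimal sketch of such a follow-up update, assuming the modified schema movie(title, year, genre, studio) and that a tuple for the named movie exists; the values are illustrative.

```sql
-- Replace the empty-string default with a real studio for one movie.
update movie
set studio = 'Pixar'
where title = 'Up' and year = 2009;
```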

SQL includes several statements and clauses for dynamically modifying
schemas and constraints. We won’t attempt to cover them here.



Primary Key

Each relation should have a primary key (a candidate key) that determines the other
attributes in the relation and is used to uniquely identify each tuple of the relation.

The primary key is also used to enforce foreign key constraints on the relation
(see later)

Only one primary key is permitted for a relation.

If there is a choice of candidate keys, then choose one where the key will never be
updated or only very rarely updated; otherwise choose one with few (and small-size)
attributes.

Other candidate keys can be defined using unique constraints (see later).



Primary key constraints

There are two constraints enforced by SQL over primary keys:

1. No nulls are permitted in a primary key.

2. Primary key values (as a whole) must be unique, i.e. no two tuples can have the
same value for every attribute of the primary key.

If any database operation attempts to violate one of these constraints then the
operation will fail, for example if we attempt to assign a null value to a primary
key attribute, or to insert a tuple when a tuple with the same primary key value
already exists. If year is declared without a default value, the insertion

insert into movie(title, genre) values ('Up', 'cartoon')

will fail since insert will attempt to assign null to year, and year is an attribute of
the primary key. (With the earlier declaration year int default 2011, the insertion
would instead assign 2011 to year.)
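
The two rules can be sketched with a few insertions. This assumes movie(title, year, length, genre) with primary key (title, year) and an initially empty relation; the values are illustrative.

```sql
-- Accepted: no tuple with this key exists yet.
insert into movie (title, year, genre) values ('Up', 2009, 'cartoon');

-- Rejected: a tuple with primary key ('Up', 2009) already exists.
insert into movie (title, year, genre) values ('Up', 2009, 'comedy');

-- Rejected: null may not be assigned to a primary key attribute.
insert into movie (title, year, genre) values ('Up', null, 'cartoon');

-- Accepted: the title repeats, but the key as a whole differs.
insert into movie (title, year, genre) values ('Up', 2010, 'drama');
```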



Unique constraints

Additional candidate keys (or superkeys or any attributes for that matter) can be
declared using unique constraints, which ensure that no two tuples can have the
same combination of values for the attributes listed in unique (as a whole).

Example: movie(title, year, length, genre, ISAN)

create table movie (
    title varchar(120),
    year int default 2011,
    length int default 0,
    genre char(20),
    ISAN char(24),
    primary key (title, year),
    unique (ISAN)
)

Note: Unlike primary key attributes, unique attributes can be null, unless we
declare not null constraints for the attributes.
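
A short sketch of how unique behaves, assuming the movie relation above and an initially empty relation. The ISAN strings are made-up placeholders, not real identifiers, and the other values are illustrative.

```sql
-- Accepted: no other tuple has this ISAN.
insert into movie (title, year, ISAN) values ('Up', 2009, 'ISAN-PLACEHOLDER-1');

-- Rejected: another tuple already has this ISAN value.
insert into movie (title, year, ISAN) values ('Avatar', 2009, 'ISAN-PLACEHOLDER-1');

-- ISAN is omitted, so it is null. Whether several null ISANs may coexist
-- varies between RDBMSs; PostgreSQL and SQLite accept both of these.
insert into movie (title, year) values ('Avatar', 2009);
insert into movie (title, year) values ('Inception', 2010);
```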
Not null constraints

We can add a not null constraint to any attribute to have the database prohibit
assignment of null to that attribute.

Example: movie(title, year, length, genre, ISAN)

create table movie (
    title varchar(120),
    year int default 2011,
    length int default 0 not null,
    genre char(20) not null,
    ISAN char(24) not null,
    primary key (title, year),
    unique (ISAN)
)

Note: not null is not needed for primary key attributes since they are implicitly
not null.

A primary-key constraint is just a combination of a unique constraint and a not-null
constraint.
Exercise

Q. Declare the SQL schema for the following relation - don’t worry about getting all the
details right - make ‘educated’ guesses.

staff(id, opened, openedby, updated, updatedby, validfrom, validto,
login, email, lastname, firstname, telephone, room)



Check constraints

Attribute types and not null constraints allow us to limit the values that we store.
check constraints allow us to define predicates that must be satisfied when a tuple is
inserted or updated.

Example: movie(title, year, length, genre, ISAN)

create table movie (
    title varchar(120),
    year int,
    length int,
    genre char(20),
    primary key (title, year),
    check(year between 1900 and 2020),
    check(genre in ('sf', 'comedy', 'drama', 'western'))
)

Note: check is satisfied if the result of the predicate is true or unknown (from a
null result).
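
The effect of the two check constraints, including the treatment of null, can be sketched as follows. This assumes the movie relation above and an initially empty relation; the values are illustrative.

```sql
-- Accepted: both predicates are true.
insert into movie values ('Up', 2009, 96, 'comedy');

-- Rejected: 1895 is outside the range 1900 to 2020.
insert into movie values ('Old Film', 1895, 1, 'drama');

-- Rejected: 'cartoon' is not in the list of permitted genres.
insert into movie values ('Up', 2010, 96, 'cartoon');

-- Accepted: genre is null, so the genre predicate evaluates to unknown,
-- which counts as satisfied.
insert into movie values ('Avatar', 2009, 162, null);
```

Adding not null to genre would close the last case if a genre is always required.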



Check constraints

Although check constraints are typically simple checks on a single attribute they can
be arbitrary expressions involving several attributes and/or a query.

Example: movie(title, year, length, genre, ISAN)

create table movie (
    title varchar(120),
    year int,
    length int,
    genre char(20),
    ISAN char(24),
    primary key (title, year),
    check(ISAN in (select no from ISANcatalog))
)

Note: changing or deleting a no in ISANcatalog could result in a violation of the
movie check condition!
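
Many RDBMSs do not accept a query inside check (PostgreSQL, for example, rejects subqueries there). Where the intent is simply that every ISAN must appear in ISANcatalog, a foreign key (covered later) is a commonly available substitute, and it also handles the changes to ISANcatalog mentioned in the note. The sketch below is a swapped-in technique rather than the slide's method, and it assumes that no is the primary key of ISANcatalog and that ISAN has type char(24).

```sql
-- Assumed shape of the catalog; only the attribute name "no" comes from the slide.
create table ISANcatalog (
    no char(24) primary key
);

create table movie (
    title varchar(120),
    year int,
    length int,
    genre char(20),
    ISAN char(24),
    primary key (title, year),
    -- Foreign key used in place of check(ISAN in (select no from ISANcatalog)).
    foreign key (ISAN) references ISANcatalog (no)
);
```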



Exercise

Q. For the staff relation add a check constraint for validfrom and validto and one for
room given the following additional relation:

building(id, ..., room, area, desks, occupancy)

create table staff (
    id int,
    ...
    updated timestamp not null default current_timestamp,
    validfrom date not null default current_date,
    validto date not null default date '2020-12-31',
    ...
    room char(10) not null default '',
    primary key (id), unique(login), unique(email),



Assertions

It’s also possible to declare assertions, which are check constraints over data in
several relations.

Example: movieboss(name, address, networth)
         studio(name, address, boss)

create assertion nopoorbosses check (
    not exists (
        select s.name
        from studio s join movieboss m on (s.boss=m.name)
        where m.networth < 100000000
    )
)

Although assertions are a very powerful feature of SQL, they are hard to implement
efficiently. Triggers are an alternative, more powerful and more operational approach
for letting the programmer deal with constraint checking when data is modified.
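
As a rough sketch of the trigger alternative, the nopoorbosses condition could be re-checked after every modification that might violate it. Trigger syntax differs considerably between RDBMSs; the version below uses PostgreSQL's spelling (plpgsql, version 11 or later), and the function and trigger names are hypothetical.

```sql
-- Raises an error if any studio is run by a boss worth less than 100000000.
create function check_nopoorbosses() returns trigger as $$
begin
    if exists (
        select 1
        from studio s join movieboss m on (s.boss = m.name)
        where m.networth < 100000000
    ) then
        raise exception 'assertion nopoorbosses violated';
    end if;
    return null;  -- the return value of an after trigger is ignored
end;
$$ language plpgsql;

-- Deletes cannot create a poor boss, so only inserts and updates are checked.
create trigger nopoorbosses_studio
    after insert or update on studio
    for each statement
    execute function check_nopoorbosses();

create trigger nopoorbosses_movieboss
    after insert or update on movieboss
    for each statement
    execute function check_nopoorbosses();
```

Unlike the assertion, the programmer has to work out which relations and operations can break the condition, which is the more operational style mentioned above.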
Naming constraints

It’s good practice to name constraints. This not only allows us to drop them using
alter table but also clarifies error messages when a constraint violation is reported.

Example: movie(title, year, length, genre, ISAN)

create table movie (
    title varchar(120),
    year int,
    length int,
    genre char(20),
    ISAN char(24),
    primary key (title, year),
    constraint uniqueISAN unique (ISAN),
    constraint yearCheck check(year between 1900 and 2020),
    constraint genreCheck check(genre in ('sf','comedy'))
)
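
A sketch of how a named constraint can later be dropped and replaced, assuming the movie relation above. The new upper bound is illustrative.

```sql
-- Remove the year restriction; the other constraints stay in place.
alter table movie drop constraint yearCheck;

-- Add a replacement under the same name with a wider range.
alter table movie add constraint yearCheck check (year between 1900 and 2030);
```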



Foreign Key Constraints

Foreign key constraints specify that the value of one or more attributes in a relation
must match (reference) values of a primary key or unique constraint (candidate key)
in another (referenced) relation. This is an example of referential integrity.

Example: movie(title, year, length, genre, ISAN)
         actor(title, year, name)

create table actor (
    title varchar(120),
    year int,
    name varchar(60),
    foreign key (title, year) references movie (title,year)
)

Note: If referenced attributes are omitted, the primary key is assumed.
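
The effect of the constraint can be sketched as follows, assuming movie and actor as declared above with both relations initially empty. The values are illustrative.

```sql
insert into movie (title, year) values ('Up', 2009);

-- Accepted: a movie with key ('Up', 2009) exists.
insert into actor values ('Up', 2009, 'Ed Asner');

-- Rejected: there is no movie with key ('Up', 1999).
insert into actor values ('Up', 1999, 'Ed Asner');

-- Rejected: an actor tuple still references this movie.
delete from movie where title = 'Up' and year = 2009;
```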



Exercise

Q. Write foreign key constraints for some of the following relations:

staff(login, email, lastname, firstname, telephone, room, deptrole, department)
student(login, email, lastname, status, entryyear, externaldept)
course(code, title, syllabus, term, classes, popestimate)
class(degreeid, yr, degree, degreeyr, major, majoryr, letter, letteryr)
degree(title, code, major, grp, letter, years)
xcourseclass(courseid, classid, required, examcode)
xcoursestaff(courseid, staffid, staffhours, role, term)
xstudentclass(studentid, classid)
xstudentstaff(studentid, staffid, role, grp, projecttitle)



Maintaining Referential Integrity

The default policy when referential integrity is violated in SQL is to reject the
modification. However, there are other policies (cascade, set null and set default)
that can be defined for deletes and updates.

Cascade Policy - With this policy, any update to the referenced attribute(s) is
cascaded back to the foreign key. Similarly deleting a referenced tuple will result
in the referencing tuple being deleted as well (which might cascade again!)

Example: movie(title, year, length, genre, ISAN)
         actor(title, year, name)

create table actor (
    title varchar(120),
    year int,
    name varchar(60),
    foreign key (title, year) references movie (title,year)
        on update cascade on delete cascade
)

Note: Updating the title (or year) of a movie will change the title (or year) in the
actor tuples of all actors in the movie. Deleting a movie will delete the actor tuples
of all actors who appeared in the movie.
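
A sketch of the cascade policy in action, assuming actor is declared with on update cascade on delete cascade as above and that both relations start empty. The values are illustrative.

```sql
insert into movie (title, year) values ('Up', 2009);
insert into actor values ('Up', 2009, 'Ed Asner');

-- The actor tuple is rewritten to ('Up!', 2009, 'Ed Asner').
update movie set title = 'Up!' where title = 'Up' and year = 2009;

-- The actor tuple is deleted along with the movie.
delete from movie where title = 'Up!' and year = 2009;
```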
Maintaining Referential Integrity

Rather than cascading updates or deletes to the referencing relation, we can, instead,
set the value of the foreign key to null using a set null policy or to the default
value using a set default policy. This will lead to unmatched tuples however! ☹

Example: movie(title, year, length, genre, ISAN)
         actor(title, year, name)

create table actor (
    title varchar(120) default '',
    year int default 2011,
    name varchar(60),
    foreign key (title, year) references movie (title,year)
        on delete set null on update set default
)

Constraints are normally checked immediately after a tuple action occurs. It’s possible
however, to defer constraint checking until the end of a transaction. This can be useful
where we need to violate a constraint temporarily before satisfying it again.
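
A minimal sketch of deferred checking, using the standard deferrable initially deferred clause (supported by PostgreSQL, among others). It assumes movie(title, year, length, genre) with primary key (title, year) and no check constraints; the values are illustrative.

```sql
create table actor (
    title varchar(120),
    year int,
    name varchar(60),
    foreign key (title, year) references movie (title, year)
        deferrable initially deferred
);

start transaction;
-- Temporarily violates the foreign key: the movie does not exist yet.
insert into actor values ('Up', 2009, 'Ed Asner');
-- Restores referential integrity before the transaction ends.
insert into movie (title, year, length, genre) values ('Up', 2009, 96, 'cartoon');
-- The foreign key is checked here, and the check succeeds.
commit;
```

With immediate checking the first insert would be rejected, so the two inserts would have to be issued in the opposite order.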
Exercise

Q. Rewrite some of the foreign key constraints maintaining referential integrity.

staff(login, email, lastname, firstname, telephone, room, deptrole, department)
student(login, email, lastname, status, entryyear, externaldept)
course(code, title, syllabus, term, classes, popestimate)
class(degreeid, yr, degree, degreeyr, major, majoryr, letter, letteryr)
degree(title, code, major, grp, letter, years)
xcourseclass(courseid, classid, required, examcode)
xcoursestaff(courseid, staffid, staffhours, role, term)
xstudentclass(studentid, classid)
xstudentstaff(studentid, staffid, role, grp, projecttitle)



Views

Views are relations that are defined using a query (a select). Views are not
physically stored on disk unless they are materialised (see later).

Example: movie(title, year, length, genre)
         actor(title, year, name)

create view comedies as
    select title, year from movie where genre='comedy';

create view actorgenre(actorname, moviegenre) as
    select distinct name, genre from movie join actor
    using (title,year);

Note: Changes in the underlying relations are reflected in the view. The list
(actorname, moviegenre) gives the view’s attribute names.

Views can be queried just like stored relations (tables). Views in a query are like
subqueries.

select * from comedies where year=2010;
select * from actorgenre where moviegenre='comedy';
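
A sketch of both points, assuming movie and the view comedies as declared above. The inserted values are illustrative.

```sql
insert into movie (title, year, genre) values ('Airplane!', 1980, 'comedy');

-- The new tuple appears without redefining the view.
select * from comedies;

-- A query on a view behaves like the same query with the view's definition
-- substituted as a subquery.
select *
from (select title, year from movie where genre = 'comedy') as comedies
where year = 1980;
```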
Exercise

Q. Write a view on staff for all staff who are currently here (i.e. currently valid).

staff(id, opened, openedby, updated, updatedby, validfrom, validto,
login, email, lastname, firstname, telephone, room)

Q. Write a view on staff who were here in the academic year 2008 to 2009 (October to
September). Tricky.



Uses for Views

Views allow us to:

1. Declare commonly used subqueries (relational expressions). Views can also be
defined in terms of other views (nested views)

2. Declare a relation over several relations using joins, products etc.

3. Declare a relation over calculated (expressions) and aggregated data
(sums, averages, mins, counts) etc

4. Partition data using selection e.g. into years.

5. Restrict access to a relation by providing access to a view not the whole relation.

...



Materialised views

Views are normally recomputed each time they are needed. If a view is used
sufficiently often then it might be more efficient to materialise (store) the view at the
cost of extra storage and extra time to keep the view up-to-date when the underlying
relations are changed by insertions, updates, and deletions.

Materialised view maintenance can be expensive however, for example when there are
lots of changes to the underlying relations but few queries on the view. The decision
is a tradeoff between extra storage and view maintenance costs and the faster speed of
querying a materialised view.

Rather than keeping a materialised view eagerly up-to-date, some RDBMSs allow a
materialised view to be brought up-to-date only when the view is accessed (lazy
maintenance). Others allow the view to become “stale” and only update it
periodically e.g. when database activity is low (overnight). Others create/remove a
materialised view transparently as a query optimisation.

Materialised views are a non-standard extension.
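
Because the feature is non-standard, the syntax varies. As one example, PostgreSQL spells it as below; the view name actorgenre_mat is hypothetical, and movie and actor are assumed to be declared as in the Views example.

```sql
-- Compute the query once and store the result.
create materialized view actorgenre_mat as
    select distinct name as actorname, genre as moviegenre
    from movie join actor using (title, year);

-- In PostgreSQL the stored result does not follow changes to movie or actor
-- by itself; it is brought up to date explicitly.
refresh materialized view actorgenre_mat;
```

This corresponds to the periodic-update style described above, with the refresh scheduled by the user.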


Updateable views

Views are normally read-only, used to retrieve data. One could consider updateable
views - views that allow inserts, deletes, updates directly on the view.

In general, updateable views often don’t make sense, e.g. deleting a tuple in
actorgenre.

Even in simple cases, e.g. inserting a tuple into comedies, there is the issue of
missing attributes - attributes in the underlying relations that are not in the view.
In this case we could set them to null or their default.

SQL has a complex set of rules for defining an updateable view including:

1. Only one relation in the from clause
2. Only attributes in the projection list, no expressions, aggregates, distinct etc.
3. Any attribute not in the projection list can be set to null
4. No group by and having clauses.
5. No subqueries.
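
The missing-attribute issue can be sketched with the comedies view, which satisfies the rules listed and which PostgreSQL, for example, treats as updateable. This assumes movie and comedies as declared earlier; the values and the name comedies_checked are illustrative.

```sql
-- Inserts into the underlying movie relation. genre is not in the view,
-- so it cannot be supplied and is set to null (no default was declared).
insert into comedies values ('Airplane!', 1980);

-- The tuple exists in movie with a null genre ...
select * from movie where title = 'Airplane!';

-- ... but it is not in the view, because null is not equal to 'comedy'.
select * from comedies where title = 'Airplane!';

-- Declaring a view with check option makes the RDBMS reject inserts
-- whose result would not be visible through the view.
create view comedies_checked as
    select title, year from movie where genre = 'comedy'
    with check option;
```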
Indexes

When relations have many tuples it can be very slow to scan the relation tuple-by-
tuple in order to satisfy a query.

Example: select * from movie where year=2010 and genre='comedy'

If there were 10,000 movies then reading and testing all 10,000 tuples might be a
little slow. Imagine if there were 100 million tuples in the relation.

Indexes are copies of an attribute’s data that are automatically maintained by the
RDBMS and organised so that they can be searched very quickly. For the example, we
could create an index on both year and genre together, or separate indexes on one or
both which might be more flexible:

create index yearindex on movie(year);
create index genreindex on movie(genre);

Note: Ideally we shouldn’t have to create indexes at all.

Like materialised views, there is a tradeoff between the space needed for indexes and
the cost of maintaining them vs the greater speed of access to the indexed data.
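
For comparison, the combined index mentioned above could be declared as follows. The index name is hypothetical, and how a particular RDBMS uses each index depends on its query optimiser.

```sql
-- A single index on both attributes together.
create index yeargenreindex on movie(year, genre);

-- This query can use yeargenreindex directly.
select * from movie where year = 2010 and genre = 'comedy';

-- A query on genre alone is typically served poorly by (year, genre),
-- which is one reason separate indexes can be more flexible.
select * from movie where genre = 'comedy';
```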