What's A Database System?

A database system facilitates the creation, maintenance, and use of an organized collection of related data or information. It provides efficient and safe querying and updating of large amounts of persistent data through a data model that structures the data conceptually and a query language for users to ask questions about the data. The database system handles query processing, optimization, and transaction processing to ensure data integrity and efficient access.

Uploaded by

Sangeeta Upadhyay

Review of Basic Database Concepts

CPS 296.1 Topics in Database Systems

What's a database system?

According to the Oxford Dictionary:
Database: an organized body of related information
Database system, DataBase Management System, or DBMS: a software system that facilitates the creation, maintenance, and use of an electronic database

More precisely, a DBMS should support
Efficient and convenient querying and updating of large amounts of persistent data
Safe, multi-user access

Two important questions

What is the right API for a DBMS?
Data model
  How is the data structured conceptually?
Query language
  How do users ask queries about the data?

How does the DBMS support the API?
Query processing and optimization
  What is the most efficient way to answer a query?
Transaction processing
  How are atomicity, consistency, isolation, and durability of transactions ensured?

Entity-relationship (E/R) diagram

Entities: students and courses
Relationships: students enroll in courses
[Diagram: entity Student (SID, name, age, GPA) linked through relationship Enroll to entity Course (CID, title)]
Widely used for database design by humans
DBMS does not need a graphical data model

Before the relational revolution

Hierarchical and network data models
  Relationships are modeled as pointers
  Queries require explicit pointer following

Example: a simplified CODASYL query
  Student.GPA := 4.0
  FIND Student RECORD BY CALC-KEY
  FIND OWNER OF CURRENT Student-Course SET
  IF Course.CID = 'CPS 296' THEN PRINT Student.name

Assume that we can quickly find student records by GPA
Assume there is a pointer from students to courses
How about navigating from courses to students?

Physical data independence

Problems with hierarchical and network data models
  Access to data is not declarative
  Whenever data is reorganized, applications must be reprogrammed!

→ Physical data independence
  Applications should not need to worry about how data is physically structured and stored
  Applications should work with a logical data model and declarative query language
  Leave the implementation details and optimization to the DBMS
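The navigation question above can be made concrete with a small sketch. This is not CODASYL; it is a Python analogy, and the class and variable names are assumptions made for illustration. Students hold pointers to their courses, so going from a student to courses is direct, while going from a course to students has no pointer to follow and falls back to scanning every student.

```python
from dataclasses import dataclass, field

@dataclass
class Course:
    cid: str
    title: str

@dataclass
class Student:
    name: str
    gpa: float
    courses: list = field(default_factory=list)  # pointers from a student to courses

cps296 = Course("CPS 296", "Topics in Database Systems")
cps216 = Course("CPS 216", "Advanced Database Systems")
students = [
    Student("Bart", 2.3, [cps296, cps216]),
    Student("Milhouse", 3.1, [cps296]),
]

# Student -> course: follow the stored pointers directly.
bart_courses = [c.cid for c in students[0].courses]

# Course -> student: there is no pointer in this direction,
# so the application has to scan every student record.
in_296 = [s.name for s in students if any(c is cps296 for c in s.courses)]

print(bart_courses, in_296)
```

If the data were reorganized so that courses pointed to students instead, both access paths in this code would have to be rewritten, which is the reprogramming problem the slide describes.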

Relational data model

A database is a collection of relations (or tables)
Each relation has a list of attributes (or columns)
Each relation contains a set of tuples (or rows)
  Duplicates not allowed

Student
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
857  Lisa      8    4.3
456  Ralph     8    2.3
...  ...       ...  ...

Course
CID      title
CPS 296  Topics in Database Systems
CPS 216  Advanced Database Systems
CPS 116  Intro. to Database Systems
...      ...

Enroll
SID  CID
142  CPS 296
142  CPS 216
123  CPS 296
857  CPS 296
857  CPS 116
456  CPS 116
...  ...

Schema versus instance

Schema (metadata)
  Structure and constraints over data
    Student (SID integer, name string, age integer, GPA float)
    Course (CID string, title string)
    Enroll (SID integer, CID string)
    Student.SID is a key, Enroll.SID is a foreign key referencing Student.SID, etc.
  Changes infrequently

Instance
  Actual contents that conform to the schema
    { <142, Bart, 10, 2.3>, <123, Milhouse, 10, 3.1>, ...}
    { <CPS 296, Topics in Database Systems>, ...}
    { <142, CPS 296>, <142, CPS 216>, ...}
  Changes frequently
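As a rough illustration of the schema/instance split, the sketch below declares the three tables in SQLite through Python's sqlite3 module and then loads a few rows. The SQL types and the key on Course.CID, the foreign key on Enroll.CID, and the composite key on Enroll are assumptions that fill in the "etc." of the slide; they are not stated in the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Schema (metadata): structure and constraints over data.
conn.executescript("""
CREATE TABLE Student (SID INTEGER PRIMARY KEY, name TEXT, age INTEGER, GPA REAL);
CREATE TABLE Course  (CID TEXT PRIMARY KEY, title TEXT);
CREATE TABLE Enroll  (
    SID INTEGER REFERENCES Student(SID),
    CID TEXT    REFERENCES Course(CID),   -- assumed constraint
    PRIMARY KEY (SID, CID)                -- assumed constraint
);
""")

# Instance: actual contents that conform to the schema.
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                 [(142, "Bart", 10, 2.3), (123, "Milhouse", 10, 3.1)])
conn.execute("INSERT INTO Course VALUES ('CPS 296', 'Topics in Database Systems')")
conn.execute("INSERT INTO Enroll VALUES (142, 'CPS 296')")

# A row that violates the foreign key on SID is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Enroll VALUES (999, 'CPS 296')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The schema is written once, while the INSERT statements can run many times, which mirrors the "changes infrequently" versus "changes frequently" contrast.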

Relational algebra

Core set of operators:
  Selection, projection, cross product, union, difference, and renaming
Additional, derived operators:
  Join, etc.
[Diagram: two boxes labeled Operator]

Selection

Notation: σ_p(R)
  p is called a selection condition/predicate
Output: only rows that satisfy p
Example: Students with GPA higher than 3.0
  σ_{GPA > 3.0}(Student)

Input (Student):
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
857  Lisa      8    4.3
456  Ralph     8    2.3

Output:
SID  name      age  GPA
123  Milhouse  10   3.1
857  Lisa      8    4.3
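A minimal sketch of selection, assuming a relation is modeled as a Python list of dictionaries and the predicate as a function on one row. The helper name select is an illustrative choice, not something from the slides.

```python
# Student relation from the slides, one dict per row.
Student = [
    {"SID": 142, "name": "Bart", "age": 10, "GPA": 2.3},
    {"SID": 123, "name": "Milhouse", "age": 10, "GPA": 3.1},
    {"SID": 857, "name": "Lisa", "age": 8, "GPA": 4.3},
    {"SID": 456, "name": "Ralph", "age": 8, "GPA": 2.3},
]

def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

# Students with GPA higher than 3.0.
high_gpa = select(Student, lambda row: row["GPA"] > 3.0)
print([row["name"] for row in high_gpa])
```

The output rows keep every column of the input; only the number of rows changes.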

Projection

Notation: π_L(R)
  L is a list of columns in R
Output: only the columns in L
  Duplicate rows are removed
Example: age distribution of students
  π_age(Student)

Input (Student):
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
857  Lisa      8    4.3
456  Ralph     8    2.3

Output:
age
10
8

Cross product

Notation: R × S
Output: for each row r in R and each row s in S, output a row rs (concatenation of r and s)
Example: Student × Enroll

Student
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
...  ...       ...  ...

Enroll
SID  CID
142  CPS 296
142  CPS 216
123  CPS 296
...  ...

Student × Enroll
SID  name      age  GPA  SID  CID
142  Bart      10   2.3  142  CPS 296
142  Bart      10   2.3  142  CPS 216
142  Bart      10   2.3  123  CPS 296
123  Milhouse  10   3.1  142  CPS 296
123  Milhouse  10   3.1  142  CPS 216
123  Milhouse  10   3.1  123  CPS 296
...  ...       ...  ...  ...  ...
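Projection and cross product can be sketched in the same list-of-dictionaries style. The helper names project and cross are assumptions, and prefixing column names with the relation name is one possible way to keep the two SID columns apart.

```python
from itertools import product

Student = [
    {"SID": 142, "name": "Bart", "age": 10, "GPA": 2.3},
    {"SID": 123, "name": "Milhouse", "age": 10, "GPA": 3.1},
    {"SID": 857, "name": "Lisa", "age": 8, "GPA": 4.3},
    {"SID": 456, "name": "Ralph", "age": 8, "GPA": 2.3},
]
Enroll = [
    {"SID": 142, "CID": "CPS 296"},
    {"SID": 142, "CID": "CPS 216"},
    {"SID": 123, "CID": "CPS 296"},
]

def project(relation, columns):
    """Projection: keep only the listed columns and remove duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

def cross(r_name, R, s_name, S):
    """Cross product: concatenate every row of R with every row of S.
    Columns are prefixed with the relation name to keep them distinct."""
    return [
        {**{f"{r_name}.{k}": v for k, v in r.items()},
         **{f"{s_name}.{k}": v for k, v in s.items()}}
        for r, s in product(R, S)
    ]

ages = project(Student, ["age"])
pairs = cross("Student", Student, "Enroll", Enroll)
print(ages, len(pairs))
```

With four students and three enrollments the cross product has 4 × 3 = 12 rows, most of which pair a student with someone else's enrollment.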

Derived operator: join

Notation: R ⋈_p S (shorthand for σ_p(R × S))
  p is called a join condition/predicate
Example: students and CIDs of their courses
  Student ⋈_{Student.SID = Enroll.SID} Enroll

Student
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
...  ...       ...  ...

Enroll
SID  CID
142  CPS 296
142  CPS 216
123  CPS 296
...  ...

Result (rows of Student × Enroll where Student.SID = Enroll.SID):
SID  name      age  GPA  SID  CID
142  Bart      10   2.3  142  CPS 296
142  Bart      10   2.3  142  CPS 216
123  Milhouse  10   3.1  123  CPS 296
...  ...       ...  ...  ...  ...

Union and difference

Union
  Notation: R ∪ S
    R and S must have identical schema
  Output:
    Same schema as R and S
    Contains all rows in R and all rows in S, with duplicates eliminated

Difference
  Notation: R − S
    R and S must have identical schema
  Output:
    Same schema as R and S
    Contains all rows in R that are not found in S
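The definition of join as a selection over a cross product translates almost directly into code. In this sketch rows are tuples, theta_join is an illustrative name, and the two small relations R and S used for union and difference are hypothetical data chosen to share a schema.

```python
from itertools import product

Student = [(142, "Bart", 10, 2.3), (123, "Milhouse", 10, 3.1)]
Enroll = [(142, "CPS 296"), (142, "CPS 216"), (123, "CPS 296")]

def theta_join(R, S, p):
    """Join as selection over the cross product: keep r + s where p(r, s) holds."""
    return [r + s for r, s in product(R, S) if p(r, s)]

# Student.SID = Enroll.SID (SID is position 0 in both relations).
joined = theta_join(Student, Enroll, lambda r, s: r[0] == s[0])

# Union and difference require identical schemas; sets of tuples model
# relations without duplicates. R and S below are hypothetical data.
R = {(142, "CPS 296"), (123, "CPS 296")}
S = {(123, "CPS 296"), (857, "CPS 116")}
union = R | S
difference = R - S

print(len(joined), len(union), difference)
```

Of the six rows in the cross product, only the three with matching SIDs survive the join condition.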

Renaming

Notation: ρ_S(R), or ρ_{S(A1, A2, ...)}(R)
Purpose: rename a table and/or its columns
  No real processing involved
  Used to avoid confusion caused by identical column names
Example: all pairs of (different) students
  ρ_{Student1(SID1, name1, age1, GPA1)}(Student) ⋈_{SID1 <> SID2} ρ_{Student2(SID2, name2, age2, GPA2)}(Student)

Relational algebra example

Names of students in CPS 296 with 4.0 GPA
  π_name(σ_{GPA = 4.0}(Student) ⋈_{Student.SID = Enroll.SID} σ_{CID = 'CPS 296'}(Enroll))

Compare this query to the CODASYL version!
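The renaming example (all pairs of different students) can be sketched as a self-join in which each copy of Student gets its own column names. The helpers rename and join are illustrative, and the age and GPA columns are left out only to keep the sketch short.

```python
from itertools import product

Student = [
    {"SID": 142, "name": "Bart"},
    {"SID": 123, "name": "Milhouse"},
    {"SID": 857, "name": "Lisa"},
]

def rename(relation, mapping):
    """Renaming: same rows, new column names (no real processing)."""
    return [{mapping[k]: v for k, v in row.items()} for row in relation]

def join(R, S, p):
    """Join: rows of the cross product that satisfy the join condition p."""
    return [{**r, **s} for r, s in product(R, S) if p({**r, **s})]

# Two renamed copies of Student, so the column names no longer collide.
Student1 = rename(Student, {"SID": "SID1", "name": "name1"})
Student2 = rename(Student, {"SID": "SID2", "name": "name2"})

pairs = join(Student1, Student2, lambda row: row["SID1"] != row["SID2"])
print(len(pairs))
```

Without the renaming step, merging the two rows would overwrite SID and name, which is the confusion the slide says renaming avoids.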

SQL

SQL (Structured Query Language)
  Pronounced "S-Q-L" or "sequel"
  The query language of every commercial DBMS

Simplest form: SELECT A1, A2, ..., An FROM R1, R2, ..., Rm WHERE condition;
  Also called an SPJ (select-project-join) query
  Equivalent (more or less) to the relational algebra query
    π_{A1, A2, ..., An}(σ_condition(R1 × R2 × ... × Rm))
  Unlike relational algebra, SQL preserves duplicates by default

SQL example

Names of students in CPS 296 with 4.0 GPA
  SELECT Student.name
  FROM Student, Enroll
  WHERE Enroll.CID = 'CPS 296'
    AND Enroll.SID = Student.SID
    AND Student.GPA = 4.0;

Compare this query to the CODASYL version!
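The point that SQL preserves duplicates by default is easy to check. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for a DBMS; the column types are assumptions, and the rows are the Student rows from the earlier slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (SID INTEGER, name TEXT, age INTEGER, GPA REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(142, "Bart", 10, 2.3), (123, "Milhouse", 10, 3.1),
     (857, "Lisa", 8, 4.3), (456, "Ralph", 8, 2.3)],
)

# A plain SELECT keeps one output row per input row, duplicates included.
ages = [row[0] for row in conn.execute("SELECT age FROM Student")]

# DISTINCT removes duplicates, matching the set semantics of projection.
distinct_ages = [row[0] for row in conn.execute("SELECT DISTINCT age FROM Student")]

print(sorted(ages), sorted(distinct_ages))
```

The results are sorted in Python because, without ORDER BY, SQL does not promise any particular row order.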

More SQL features

SELECT [DISTINCT] list_of_output_columns
FROM list_of_tables
WHERE where_condition
GROUP BY list_of_group_by_columns
HAVING having_condition
ORDER BY list_of_order_by_columns;

Operational semantics
  FROM: take the cross product of list_of_tables
  WHERE: apply where_condition
  GROUP BY: group result tuples according to list_of_group_by_columns
  HAVING: apply having_condition to the groups
  SELECT: apply list_of_output_columns (preserve duplicates)
  DISTINCT: eliminate duplicates
  ORDER BY: sort the result by list_of_order_by_columns

SQL example with aggregation

Find the average GPA for each age group with at least three students
  SELECT age, AVG(GPA)
  FROM Student
  GROUP BY age
  HAVING COUNT(*) >= 3;

Student
SID  name      age  GPA
142  Bart      10   2.3
857  Lisa      8    4.3
123  Milhouse  10   3.1
456  Ralph     8    2.3
789  Jessica   10   4.2

GROUP BY (rows grouped by age)
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
789  Jessica   10   4.2

857  Lisa      8    4.3
456  Ralph     8    2.3

HAVING (only the age 10 group has at least three rows)
SID  name      age  GPA
142  Bart      10   2.3
123  Milhouse  10   3.1
789  Jessica   10   4.2

SELECT
age  AVG(GPA)
10   3.2
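The operational semantics of GROUP BY, HAVING, and SELECT can be traced step by step in plain Python. This is a sketch of the semantics for the aggregation example only, not of how a DBMS evaluates the query internally; variable names are illustrative.

```python
from collections import defaultdict

# Student rows from the slide: (SID, name, age, GPA).
Student = [
    (142, "Bart", 10, 2.3),
    (857, "Lisa", 8, 4.3),
    (123, "Milhouse", 10, 3.1),
    (456, "Ralph", 8, 2.3),
    (789, "Jessica", 10, 4.2),
]

# GROUP BY age: collect the rows of each age group.
groups = defaultdict(list)
for row in Student:
    groups[row[2]].append(row)

# HAVING COUNT(*) >= 3: keep only groups with at least three rows.
kept = {age: rows for age, rows in groups.items() if len(rows) >= 3}

# SELECT age, AVG(GPA): produce one output row per remaining group.
# The average is rounded to one decimal to match the slide's display.
result = [(age, round(sum(r[3] for r in rows) / len(rows), 1))
          for age, rows in kept.items()]
print(result)
```

The age 8 group is dropped at the HAVING step because it has only two rows, so a single output row remains.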

Summary: relational query languages

Not your general-purpose programming language
  Not expected to be Turing-complete
  Not intended to be used for complex calculations
  Amenable to much optimization

More declarative than languages for hierarchical and network data models
  No explicit pointer following
    Replaced by joins that can be easily reordered

Next: How do we support relational query languages efficiently?

Access paths

Store data in ways to speed up queries
  Heap file: unordered set of records
  B+-tree index: disk-based balanced search tree with logarithmic lookup and update
  Linear/extensible hashing: disk-based hash tables that can grow dynamically
  Bitmap indexes: potentially much more compact
  And many more

One table may have multiple access paths
  One primary index that stores records directly
  Multiple secondary indexes that store pointers to records
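The difference between a primary index and a secondary index can be sketched with Python dictionaries. This is an in-memory analogy only: real access paths are disk-based structures such as B+-trees, and the names primary, by_age, and lookup_by_age are assumptions made for illustration.

```python
from collections import defaultdict

# Primary index on SID: stores the records themselves.
primary = {
    142: {"SID": 142, "name": "Bart", "age": 10, "GPA": 2.3},
    123: {"SID": 123, "name": "Milhouse", "age": 10, "GPA": 3.1},
    857: {"SID": 857, "name": "Lisa", "age": 8, "GPA": 4.3},
    456: {"SID": 456, "name": "Ralph", "age": 8, "GPA": 2.3},
}

# Secondary index on age: stores pointers (here, SIDs) to records,
# not the records themselves.
by_age = defaultdict(list)
for sid, record in primary.items():
    by_age[record["age"]].append(sid)

def lookup_by_age(age):
    """Follow the secondary-index pointers into the primary index."""
    return [primary[sid] for sid in by_age.get(age, [])]

print([r["name"] for r in lookup_by_age(8)])
```

Each record is stored once, in the primary index, while any number of secondary indexes can point at it.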

Query processing methods

The same query operator can be implemented in many different ways
Example: R ⋈_{R.A = S.B} S
  Nested-loop join: for each tuple of R, and for each tuple of S, join
  Index nested-loop join: for each tuple of R, use the index on S.B to find joining S tuples
  Sort-merge join: sort R by R.A, sort S by S.B, and merge-join
  Hash join: partition R and S by hashing R.A and S.B, and join corresponding partitions
  And many more

Motivation for query optimization

The same query can have many different execution plans
Example:
  SELECT Student.name
  FROM Student, Enroll
  WHERE Enroll.CID = 'CPS 296'
    AND Enroll.SID = Student.SID
    AND Student.GPA = 4.0;

  Plan 1: evaluate σ_{GPA = 4.0}(Student); for each result SID, find the Enroll tuples with this SID and check if CID is CPS 296
  Plan 2: evaluate σ_{CID = 'CPS 296'}(Enroll); for each result SID, find the Student tuple with this SID and check if GPA is 4.0
  Plan 3: evaluate both σ_{GPA = 4.0}(Student) and σ_{CID = 'CPS 296'}(Enroll), and join them on SID
  And many more
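To make two of the join methods listed above concrete, here is a sketch that computes R ⋈_{R.A = S.B} S both ways. The tuples are hypothetical, and the hash join shown is a simplified in-memory build-and-probe variant rather than the partitioned, disk-based algorithm the slide describes.

```python
from collections import defaultdict

# Hypothetical tuples; R.A and S.B are both stored at position 0.
R = [(1, "r1"), (2, "r2"), (2, "r3")]
S = [(2, "s1"), (3, "s2"), (2, "s3")]

def nested_loop_join(R, S):
    """For each tuple of R, and for each tuple of S, join if R.A = S.B."""
    return [r + s for r in R for s in S if r[0] == s[0]]

def hash_join(R, S):
    """Build a hash table on S.B, then probe it once per tuple of R."""
    buckets = defaultdict(list)
    for s in S:
        buckets[s[0]].append(s)
    return [r + s for r in R for s in buckets.get(r[0], [])]

nl = nested_loop_join(R, S)
hj = hash_join(R, S)
print(sorted(nl) == sorted(hj), len(nl))
```

Both functions return the same rows; they differ in how many comparisons they perform, which is the kind of difference a query optimizer weighs when choosing a plan.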

Query optimization

A huge number of possible execution plans
  With different access methods, join order, join methods, etc.

Query optimizer's job
  Enumerate candidate plans
    Query rewrite: transform queries or query plans into equivalent ones
  Estimate costs of plans
    Use statistics such as histograms
  Pick a plan with reasonably low cost
    Dynamic programming
    Randomized search

Optimizing for I/O

Location   Cycles   Location         Time
Registers  1        My head          1 min.
Memory     100      Washington D.C.  1.5 hr.
Disk       10^6     Pluto            2 yr.
(source: AlphaSort paper, 1995)

→ I/O costs dominate database operations
  DBMS typically optimizes the number of I/Os

Example: Which of the following is a more efficient way to process SELECT * FROM R ORDER BY R.A;?
  Use an available secondary B+-tree index on R.A: follow leaf pointers, which are already ordered by R.A
  Just sort the table
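One way to reason about the ORDER BY question is a back-of-the-envelope I/O count. The sketch below uses a common textbook cost model, not anything stated in the slides: external merge sort reads and writes every page once per pass, and a secondary index scan may cost one I/O per record in the worst case because consecutive index entries can point to different pages. All sizes (1,000 pages, 100 records per page, 100 buffer pages, 250 leaf pages) are hypothetical.

```python
import math

def external_sort_ios(pages, buffer_pages):
    """I/Os for external merge sort: each pass reads and writes every page."""
    runs = math.ceil(pages / buffer_pages)   # pass 0 creates sorted runs
    passes = 1
    while runs > 1:                          # each merge pass merges (buffer_pages - 1) runs at a time
        runs = math.ceil(runs / (buffer_pages - 1))
        passes += 1
    return 2 * pages * passes

def secondary_index_scan_ios(records, leaf_pages):
    """Worst case for a secondary index: read the leaves, then one I/O per record."""
    return leaf_pages + records

# Hypothetical sizes, chosen only for illustration.
pages, records_per_page, buffer_pages, leaf_pages = 1000, 100, 100, 250

sort_cost = external_sort_ios(pages, buffer_pages)
index_cost = secondary_index_scan_ios(pages * records_per_page, leaf_pages)
print(sort_cost, index_cost)
```

Under these assumptions sorting the table costs far fewer I/Os than following the secondary index, even though the index is already ordered by R.A. The comparison can flip if the table is small, mostly cached, or only a few rows are needed.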
