Midterm Solution
Exam
1
– Personal notes
– Textbook
– Printed lecture notes
– Phone
• The exam is 90 minutes long
• Multiple choice questions are graded as follows: you get points for correct answers and lose points
for wrong answers. The minimum number of points for each question is 0. For example, assume a
multiple choice question has 6 answers, each of which may be correct or incorrect, and each answer is
worth 1 point. If you answer 3 correctly and 3 incorrectly, you get 0 points. If you answer 4
correctly and 2 incorrectly, you get 2 points. ...
• For your convenience, the number of points for each part and question is shown in parentheses.
• There are 5 parts in this exam
1. SQL (32)
2. Relational Algebra (26)
3. Index Structures (24)
4. I/O Estimation (18)
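The multiple choice grading rule described above can be written as a one-line formula. The following is a minimal sketch; the function name and the assumption that a wrong answer costs exactly as many points as a correct answer earns are illustrative, chosen to match the two examples given.

```python
def multiple_choice_score(correct: int, incorrect: int, points_per_answer: int = 1) -> int:
    """Points for correct answers minus points for wrong answers, never below 0."""
    return max(0, (correct - incorrect) * points_per_answer)


print(multiple_choice_score(3, 3))  # 3 correct, 3 incorrect
print(multiple_choice_score(4, 2))  # 4 correct, 2 incorrect
```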
location
lName          | city     | owner  | sizeSf
Windsor Castle | Windsor  | Queen  | 40,000
Big Ben        | London   | Public | 3,500
Stonehedge     | Amesbury | Public | 14,000

account
witness | time  | suspect | crimeId
Bob     | 10:30 | Peter   | 1
Peter   | 10:30 | Bob     | 1
Queen   | 11:00 | Bob     | 2

crime
id | location       | time  | type   | victim
1  | Big Ben        | 10:30 | murder | Alice
2  | Windsor Castle | 11:00 | theft  | Queen
Hints:
• When writing queries, take only the schema into account and not the example data given here. That
is, your queries should return correct results for all potential instances of this schema.
• Attributes with black background form the primary key of a relation. For example, lName is the primary
key of relation location.
• The attribute crimeId of relation account is a foreign key to the attribute id of relation crime.
Solution
SELECT DISTINCT suspect
FROM account a, crime c, location l
WHERE a.crimeId = c.id
  AND c.type = 'murder'
  AND l.owner = 'Queen'
  AND l.lName = c.location;
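To try this query, one option is to load the example instance into an in-memory SQLite database. The sketch below is only an illustration: the column types are assumptions, and because the example data contains no murder at a location owned by the Queen, a hypothetical crime 3 with two hypothetical accounts is added so that the result is not empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (lName TEXT PRIMARY KEY, city TEXT, owner TEXT, sizeSf INTEGER);
CREATE TABLE crime (id INTEGER, location TEXT, time TEXT, type TEXT, victim TEXT);
CREATE TABLE account (witness TEXT, time TEXT, suspect TEXT, crimeId INTEGER);

INSERT INTO location VALUES ('Windsor Castle', 'Windsor', 'Queen', 40000),
                            ('Big Ben', 'London', 'Public', 3500),
                            ('Stonehedge', 'Amesbury', 'Public', 14000);
INSERT INTO crime VALUES (1, 'Big Ben', '10:30', 'murder', 'Alice'),
                         (2, 'Windsor Castle', '11:00', 'theft', 'Queen'),
                         (3, 'Windsor Castle', '12:00', 'murder', 'Dave');  -- hypothetical row
INSERT INTO account VALUES ('Bob', '10:30', 'Peter', 1),
                           ('Peter', '10:30', 'Bob', 1),
                           ('Queen', '11:00', 'Bob', 2),
                           ('Carol', '12:00', 'Peter', 3),  -- hypothetical row
                           ('Queen', '12:00', 'Peter', 3);  -- hypothetical row
""")

# Suspects of murders that happened at locations owned by the Queen.
QUERY = """
SELECT DISTINCT suspect
FROM account a, crime c, location l
WHERE a.crimeId = c.id
  AND c.type = 'murder'
  AND l.owner = 'Queen'
  AND l.lName = c.location
"""
suspects = [row[0] for row in conn.execute(QUERY)]
print(suspects)
```

With the hypothetical rows, Peter is accused twice for crime 3 but is returned only once because of DISTINCT; the murder at Big Ben is filtered out because that location is owned by the public.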
Solution
Solution
SELECT DISTINCT l.witness, l.suspect
FROM account l, account r
WHERE l.crimeId = r.crimeId
  AND l.witness = r.suspect
  AND l.suspect = r.witness
  AND l.witness < l.suspect;
DISTINCT is needed here, because there may be multiple crimes that two persons are accusing each other of.
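The effect of DISTINCT can be checked on a small instance. The sketch below is an illustration only: it uses the example account rows plus two hypothetical rows for a second mutual accusation (crime 3), and it uses the tie-breaker l.witness < l.suspect so that each pair is reported in one order only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (witness TEXT, time TEXT, suspect TEXT, crimeId INTEGER);
INSERT INTO account VALUES ('Bob', '10:30', 'Peter', 1),
                           ('Peter', '10:30', 'Bob', 1),
                           ('Queen', '11:00', 'Bob', 2),
                           ('Bob', '12:00', 'Peter', 3),   -- hypothetical row
                           ('Peter', '12:00', 'Bob', 3);   -- hypothetical row
""")

# Self-join of account: two accounts of the same crime that accuse each other.
PAIRS = """
SELECT {distinct} l.witness, l.suspect
FROM account l, account r
WHERE l.crimeId = r.crimeId
  AND l.witness = r.suspect
  AND l.suspect = r.witness
  AND l.witness < l.suspect
"""
with_distinct = conn.execute(PAIRS.format(distinct="DISTINCT")).fetchall()
without_distinct = conn.execute(PAIRS.format(distinct="")).fetchall()
print(with_distinct)     # one row per pair of persons
print(without_distinct)  # one row per pair of persons and crime
```

Without DISTINCT, the pair (Bob, Peter) appears once for crime 1 and once for the hypothetical crime 3.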
Solution
WITH solvedCrimes AS (
  SELECT city, type, crimeId
  FROM location l, account a, crime c
  WHERE l.lName = c.location AND c.id = a.crimeId
  GROUP BY city, type, crimeId
  HAVING count(DISTINCT suspect) = 1
),
numSolved AS (
  SELECT city, type, count(*) AS numSolved
  FROM solvedCrimes
  GROUP BY city, type
),
numCrimes AS (
  SELECT city, type, count(*)
  FROM location l, crime c
  WHERE l.lName = c.location
  GROUP BY city, type
)
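The final SELECT that consumes these three common table expressions is not shown here. As a sketch only, the block below runs them in SQLite on the example instance and adds one possible final step. The final SELECT, the alias numCrimes for the count, the output column solvedFraction, and the reading that a crime counts as solved when all accounts name the same suspect are assumptions, not part of the original solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (lName TEXT PRIMARY KEY, city TEXT, owner TEXT, sizeSf INTEGER);
CREATE TABLE crime (id INTEGER, location TEXT, time TEXT, type TEXT, victim TEXT);
CREATE TABLE account (witness TEXT, time TEXT, suspect TEXT, crimeId INTEGER);
INSERT INTO location VALUES ('Windsor Castle', 'Windsor', 'Queen', 40000),
                            ('Big Ben', 'London', 'Public', 3500),
                            ('Stonehedge', 'Amesbury', 'Public', 14000);
INSERT INTO crime VALUES (1, 'Big Ben', '10:30', 'murder', 'Alice'),
                         (2, 'Windsor Castle', '11:00', 'theft', 'Queen');
INSERT INTO account VALUES ('Bob', '10:30', 'Peter', 1),
                           ('Peter', '10:30', 'Bob', 1),
                           ('Queen', '11:00', 'Bob', 2);
""")

QUERY = """
WITH solvedCrimes AS (
  SELECT city, type, crimeId
  FROM location l, account a, crime c
  WHERE l.lName = c.location AND c.id = a.crimeId
  GROUP BY city, type, crimeId
  HAVING count(DISTINCT suspect) = 1
),
numSolved AS (
  SELECT city, type, count(*) AS numSolved
  FROM solvedCrimes
  GROUP BY city, type
),
numCrimes AS (
  SELECT city, type, count(*) AS numCrimes  -- alias is an assumption
  FROM location l, crime c
  WHERE l.lName = c.location
  GROUP BY city, type
)
-- Assumed final step: share of solved crimes per city and crime type.
SELECT n.city, n.type,
       COALESCE(s.numSolved, 0) * 1.0 / n.numCrimes AS solvedFraction
FROM numCrimes n LEFT JOIN numSolved s
     ON n.city = s.city AND n.type = s.type
ORDER BY n.city, n.type
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps city and type combinations that have no solved crime; in the example data the London murder has two different suspects and therefore does not appear in solvedCrimes.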
Solution
Solution
Solution
insert(6),insert(4),insert(3),delete(40)
• Leaf Split: In case a leaf node needs to be split, the left node should get the extra key if the keys cannot
be split evenly.
• Non-Leaf Split: In case a non-leaf node is split evenly, the “middle” value should be taken from the
right node.
• Node Underflow: In case of a node underflow you should first try to redistribute and only if this fails
merge. Both approaches should prefer the left sibling.
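The two split rules above can be sketched as small functions on sorted key lists. This is an illustration only: the function names are hypothetical, child pointers are omitted, and copying the smallest key of the new right leaf up as the separator is assumed (the usual B+-tree convention).

```python
def split_leaf(keys):
    """Split an overflowing leaf; the left node gets the extra key on an uneven split."""
    mid = (len(keys) + 1) // 2
    left, right = keys[:mid], keys[mid:]
    # The smallest key of the right leaf is copied up as the separator.
    return left, right[0], right


def split_non_leaf(keys):
    """Split an overflowing non-leaf node; the middle key moves up to the parent."""
    # For an even number of keys, the key that moves up comes from the right half.
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


print(split_leaf([1, 3, 4, 5]))        # even split of a leaf
print(split_leaf([1, 2, 3, 4, 5]))     # uneven split: left leaf gets 3 keys
print(split_non_leaf([4, 6, 10, 40]))  # even split: 10 is taken from the right half
```

Unlike a leaf split, a non-leaf split moves the middle key up instead of copying it, so the key no longer appears in either half.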
root:   [10 40]
leaves: [1 5 9] [15 22] [40 41]
Solution
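insert(6)
root:   [6 10 40]
leaves: [1 5] [6 9] [15 22] [40 41]
insert(4)
root:   [6 10 40]
leaves: [1 4 5] [6 9] [15 22] [40 41]
insert(3)
root:     [10]
non-leaf: [4 6] [40]
leaves:   [1 3] [4 5] [6 9] [15 22] [40 41]
delete(40)
root:   [4 6 10]
leaves: [1 3] [4 5] [6 9] [15 22 41]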
Solution
Block Nested-loop:
Use smaller table R as the inner. We only have one chunk of size 100 = B(R). Thus, we get 1 × (B(R) + B(S))
= 2,100 I/Os.
Merge-join:
Relation R can be sorted in memory, which costs 2 × B(R) = 200 I/Os. Relation S requires one merge phase
that merges 20 runs: 2 × 2 × B(S) = 8,000 I/Os. The last merge phase of relation S cannot be combined with
sorting R (121 blocks of memory would be required). However, the merge join can be executed during this
merge phase, avoiding one read of relation S. Without optimizations we get 8,200 + B(R) + B(S) = 10,300 I/Os.
If we execute the merge join during the last merge phase for S, we get 8,200 + B(R) = 8,300 I/Os.
Hash-join:
Relation R fits into memory. Thus, the hash-join requires B(R) + B(S) = 2,100 I/Os.
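The three estimates can be replayed with a few lines of arithmetic. The block counts are not restated in this solution; B(R) = 100 and B(S) = 2,000 are inferred from the totals above (one chunk of size 100 = B(R), and 2 × 2 × B(S) = 8,000), so treat them as assumptions. The variable names are illustrative.

```python
B_R = 100    # blocks of relation R (inferred from the solution text)
B_S = 2_000  # blocks of relation S (inferred from 2 * 2 * B(S) = 8,000)

# Block nested-loop: R fits into a single chunk, so S is scanned once.
chunks = 1
block_nested_loop = chunks * (B_R + B_S)

# Merge-join: R is sorted in memory (one read, one write);
# S needs run generation plus one merge phase (two reads, two writes).
sort_r = 2 * B_R
sort_s = 2 * 2 * B_S
merge_join_plain = sort_r + sort_s + B_R + B_S
# Joining during the last merge phase of S avoids one read of S.
merge_join_optimized = sort_r + sort_s + B_R

# Hash-join: R fits into memory, so each relation is read once.
hash_join = B_R + B_S

print(block_nested_loop, merge_join_plain, merge_join_optimized, hash_join)
```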
Solution