Concepts of Database Management Seventh Edition
Chapter 7 DBMS Functions
Objectives
Introduce the functions, or services, provided by a DBMS
Describe how a DBMS handles updating and retrieving data
Examine the catalog feature of a DBMS
Illustrate the concurrent update problem and describe how a DBMS handles this problem
Explain the data recovery process in a database environment
Objectives (continued)
Describe the security services provided by a DBMS
Examine the data integrity features provided by a DBMS
Discuss the extent to which a DBMS achieves data independence
Define and describe data replication
Present the utility services provided by a DBMS
Introduction
Some services provided by a DBMS
Data access: update and retrieve data
Support data independence
Data integrity features
Concurrent update
Catalog
Security
Support data replication
Data recovery
Utility services
Access to Data: Client/Server
User (or application) issues SQL commands to the DB server
Client / Application (e.g., SSMS or a C++ app)
Database Server
Database Storage
Web Access to DB
Web server issues SQL commands to the DB server
Browser is more or less a terminal to the web server application.
Web Client (e.g., Browser)
Web Server / application
Database Server
Database Storage
Update and Retrieve Data
Fundamental capability of a DBMS
Users or applications don't need to know how data is stored or how to manipulate it
User (or program) makes a request that describes what needs to be accomplished
The DBMS interprets the request and does all the work to retrieve, add, update, or delete data
Making DB Requests
SQL access (a non-procedural language)
QBE: easy-to-use menu-driven interface
Access by non-DBMS software (usually implemented with procedural languages):
Intermixing SQL with procedural statements and using a pre-compiler to translate them into the appropriate procedure calls
Using an API (application program interface), usually implemented as an object-oriented set of classes that provide procedural access to the database, incorporating SQL statements as strings as required
Microsoft Database APIs
Taken from Wikipedia MDAC
Supported by many platforms
ADO Example Code
http://www.timesheetsmts.com/adotutorial.htm
Private Sub Command1_Click()
    ' Declare ADO objects
    Dim conConnection As New ADODB.Connection
    Dim cmdCommand As New ADODB.Command
    Dim rstRecordSet As New ADODB.Recordset

    ' Open a connection
    conConnection.ConnectionString = _
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & _
        App.Path & "\" & "database.mdb;Mode=Read|Write"
    conConnection.CursorLocation = adUseClient
    conConnection.Open

    ' Create a command (prepare the SQL command)
    With cmdCommand
        .ActiveConnection = conConnection
        .CommandText = "SELECT * FROM tabTestTable;"
        .CommandType = adCmdText
    End With

    ' Use the command to establish a record set (use SQL to get a cursor)
    With rstRecordSet
        .CursorType = adOpenStatic
        .CursorLocation = adUseClient
        .LockType = adLockOptimistic
        .Open cmdCommand
    End With

    If rstRecordSet.EOF = False Then
        ' Use the cursor to see what's in the table
        rstRecordSet.MoveFirst
        Do
            MsgBox "Record " & rstRecordSet.AbsolutePosition & " " & _
                rstRecordSet.Fields(0).Name & "=" & rstRecordSet.Fields(0) & " " & _
                rstRecordSet.Fields(1).Name & "=" & rstRecordSet.Fields(1)
            rstRecordSet.MoveNext
        Loop Until rstRecordSet.EOF = True

        ' Add a row
        With rstRecordSet
            .AddNew
            .Fields(0) = "New"
            .Fields(1) = "Record"
            .Update
        End With

        ' Delete a row
        rstRecordSet.MoveFirst
        rstRecordSet.Delete
        rstRecordSet.Update
        rstRecordSet.Close
    Else
        MsgBox "No records were returned using the query " & _
            cmdCommand.CommandText
    End If

    ' Reset objects
    conConnection.Close
    Set conConnection = Nothing
    Set cmdCommand = Nothing
    Set rstRecordSet = Nothing
End Sub
Support Data Independence
Data independence: can change database structure without needing to change programs that access the database
Types of changes:
Adding a field
Changing a field property (such as length)
Creating an index
Adding or changing a relationship
Adding a Field
Only programs that will use the new field need to change
Note: a SQL SELECT * FROM command will return the extra field
Solution:
Many application program interfaces (APIs) make this transparent; for those that do not, list the required fields in every SQL SELECT command instead of using *. This is considered a best practice.
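The effect of adding a field can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in DBMS; the customer table, its columns, and the sample row are assumptions made up for the example.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_num INTEGER PRIMARY KEY, customer_name TEXT)"
)
conn.execute("INSERT INTO customer VALUES (148, 'Sample Customer')")


def read_star(conn):
    # Fragile: the shape of the row depends on the current table structure.
    return conn.execute("SELECT * FROM customer").fetchone()


def read_listed(conn):
    # Robust: the shape of the row is fixed by the query itself.
    return conn.execute(
        "SELECT customer_num, customer_name FROM customer"
    ).fetchone()


before_star = read_star(conn)
before_listed = read_listed(conn)

# Structural change: add a field.
conn.execute("ALTER TABLE customer ADD COLUMN credit_limit REAL")

after_star = read_star(conn)
after_listed = read_listed(conn)

print(len(before_star), len(after_star))  # 2 3
print(before_listed == after_listed)      # True
```

A program that unpacks the SELECT * row into exactly two variables would fail after the change, while the query with listed fields is unaffected.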
Changing the Length of a Field
Generally, programs don't need to change unless a portion of a screen or report (or a variable used in the program) that is set aside for the field is no longer large enough to hold the new, larger data values.
Creating an Index
To create an index, enter a simple SQL command or select a few options
DBMSs use the new index transparently, as appropriate, for best performance
For some DBMSs, minor changes to existing programs may be needed in order to take advantage of new indexes
Adding or Changing a Relationship
May need to modify applications
A better solution is to provide for this sort of change in advance by creating views that incorporate the tables whose business requirements can change. Then change the view at the same time you change the relationship; the change is transparent to any programs that use the view.
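As a rough sketch of the view approach, the example below (Python with sqlite3) has a "program" that reads only from a view. The relationship between customers and reps then changes from one-to-many to many-to-many, and the view is redefined in the same step. All table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_num INTEGER PRIMARY KEY, rep_num TEXT);
    INSERT INTO customer VALUES (148, '20');
    CREATE VIEW customer_rep AS
        SELECT customer_num, rep_num FROM customer;
""")


def reps_for(conn, customer_num):
    # The "program": it knows only the view, not the underlying tables.
    rows = conn.execute(
        "SELECT rep_num FROM customer_rep WHERE customer_num = ? ORDER BY rep_num",
        (customer_num,),
    ).fetchall()
    return [row[0] for row in rows]


before = reps_for(conn, 148)

# The relationship changes: a customer may now have several reps.
# The view is redefined at the same time as the structure changes.
conn.executescript("""
    CREATE TABLE assignment (customer_num INTEGER, rep_num TEXT);
    INSERT INTO assignment SELECT customer_num, rep_num FROM customer;
    INSERT INTO assignment VALUES (148, '35');
    DROP VIEW customer_rep;
    CREATE VIEW customer_rep AS
        SELECT customer_num, rep_num FROM assignment;
""")

after = reps_for(conn, 148)
print(before, after)  # ['20'] ['20', '35']
```

The function reps_for is never edited; only the view definition moves with the relationship.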
Provide Data Integrity Features
Business requirements dictate constraints related to data accuracy and consistency
Constraints may relate to tables (e.g., a column must be unique, or may contain only certain values), or
They may relate to relationships (e.g., a tuple in one table cannot exist without a corresponding tuple in another), or
They may relate to the contents of several tables, possibly not even in a relationship (e.g., the inventory table must be updated whenever an order is generated)
A good DBMS enforces a wide variety of constraints, but programming may be required for complex constraints.
Provide Data Integrity Features (review)
Constraints enforced by good databases include:
Key integrity: primary key, foreign key
Data integrity: legal values, data type, domain / format (check), null allowed, default, uniqueness
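The constraint types listed above can be declared once and left to the DBMS to enforce. The following is a small sketch using Python's sqlite3 module; the rep and customer tables, the allowed credit limits, and the try_insert helper are all assumptions for illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE rep (rep_num TEXT PRIMARY KEY);
    CREATE TABLE customer (
        customer_num  INTEGER PRIMARY KEY,                 -- primary key
        customer_name TEXT NOT NULL UNIQUE,                -- null not allowed, uniqueness
        credit_limit  REAL DEFAULT 5000                    -- default
                      CHECK (credit_limit IN (5000, 7500, 10000, 15000)),  -- legal values
        rep_num       TEXT REFERENCES rep(rep_num)         -- foreign key
    );
    INSERT INTO rep VALUES ('20');
""")


def try_insert(sql, params=()):
    """Hypothetical helper: run one INSERT and report whether the DBMS accepted it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


insert = "INSERT INTO customer (customer_num, customer_name, credit_limit, rep_num) VALUES (?, ?, ?, ?)"
results = {
    "valid row": try_insert(insert, (148, "First Customer", 7500, "20")),
    "duplicate key": try_insert(insert, (148, "Second Customer", 7500, "20")),
    "illegal value": try_insert(insert, (149, "Third Customer", 1, "20")),
    "missing name": try_insert(insert, (150, None, 7500, "20")),
    "unknown rep": try_insert(insert, (151, "Fourth Customer", 7500, "99")),
}
for case, outcome in results.items():
    print(case, "->", outcome)
```

Only the first insert is accepted; the application code contains no validation logic of its own.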
Provide Data Integrity Features
Four ways to handle integrity constraints:
1. Ignore it
2. Place responsibility on users
3. Place responsibility on programmers
4. Place responsibility on DBMS
Number 4 is preferred; number 3 has issues, but is better than not enforcing at all (essentially, numbers 1 and 2)
Support Concurrent Update
Concurrent update: multiple users make updates to the same database at the same time
DBMS manages complex update scenarios
Ensures accuracy when several users update the database at the same time
The Concurrent Update Problem
FIGURE 7-4: Scenario 1: Ryan's and Elena's updates don't overlap (OK)
The Concurrent Update Problem (continued)
Ryan is done with his update
FIGURE 7-5: Scenario 1 (continued): no overlap, no problems
The Concurrent Update Problem (continued)
Ryan and Elena are updating at the same time
Ryan adds 100
FIGURE 7-6: Scenario 2: Ryan's and Elena's updates overlap
The Concurrent Update Problem (continued)
Elena subtracts 100
FIGURE 7-6: Scenario 2 (continued): Ryan's and Elena's overlapping updates result in a lost update
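The two scenarios can be replayed in a few lines of Python. This is a deterministic sketch, not real concurrency: the starting value of 1000 and the dictionary standing in for the database are assumptions, while the updates (Ryan adds 100, Elena subtracts 100) come from the figures.

```python
def apply_serial(db, deltas):
    """Scenario 1: each user reads, changes, and writes before the next one starts."""
    for delta in deltas:
        local_copy = db["value"]         # read into the user's work area
        db["value"] = local_copy + delta  # write back


def apply_overlapping(db, ryan_delta, elena_delta):
    """Scenario 2: both users read before either one writes."""
    ryan_copy = db["value"]
    elena_copy = db["value"]
    db["value"] = ryan_copy + ryan_delta    # Ryan writes his result
    db["value"] = elena_copy + elena_delta  # Elena overwrites it from her stale copy


serial_db = {"value": 1000}   # hypothetical starting value
apply_serial(serial_db, [+100, -100])

overlap_db = {"value": 1000}
apply_overlapping(overlap_db, +100, -100)

print(serial_db["value"])   # 1000: both updates applied
print(overlap_db["value"])  # 900: Ryan's update is lost
```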
Avoiding the Lost Update Problem
Need to ensure that updates don't overlap inappropriately.
One approach: batch processing
All updates are done through a special program that is run periodically
Problem: data becomes stale and may cause incorrect decisions regarding further transactions
Does not work in situations that require data to be current
Batch Processing
FIGURE 7-7: Delaying updates to the Premiere Products database to avoid the lost update problem
DB Transaction
Transaction: set of steps completed by a DBMS to accomplish a user task
A sequence of operations (e.g., SQL statements)
Brings the database from one consistent state to another (but partial completion would leave the database in an inconsistent state)
Must be executed completely or undone completely
SQL Server places the statements between BEGIN TRANSACTION and {COMMIT | ROLLBACK} TRANSACTION (Oracle has an implicit BEGIN; you specify only COMMIT or ROLLBACK)
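The all-or-nothing behavior can be sketched with Python's sqlite3 module, where using the connection as a context manager commits if the block succeeds and rolls back if an exception escapes. The account table, the balances, and the transfer function are hypothetical and chosen only to give the transaction two steps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance REAL CHECK (balance >= 0)
    );
    INSERT INTO account VALUES (1, 100), (2, 50);
""")


def transfer(conn, src, dst, amount):
    """Two-step transaction: credit dst, then debit src. Both happen or neither does."""
    try:
        with conn:  # COMMIT on success, ROLLBACK if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]


first = transfer(conn, 1, 2, 30)    # both steps succeed and are committed
after_first = balances(conn)
second = transfer(conn, 1, 2, 500)  # the debit violates the CHECK; the credit is undone
after_second = balances(conn)

print(first, after_first)    # True [70.0, 80.0]
print(second, after_second)  # False [70.0, 80.0]
```

In the failed transfer the first UPDATE had already run, so without the rollback account 2 would have kept a credit with no matching debit.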
Locking
Locking: deny other users access to data while one user's updates are being processed
The DBMS determines what locks are necessary
Generally at least two levels, read and write: a read lock allows other read locks (but not write locks); a write lock allows neither a read nor a write lock
Locks have granularity: database, table, block, record, field
The DBMS chooses the lowest appropriate level of lock, preferring table, block, or record
If a user holds a number of locks at a given level, the DBMS may choose to promote them to a higher level (if that doesn't conflict with other users' locks)
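The read/write compatibility rule can be written down as a tiny lock table. This is a simplified sketch: the LockTable class and its methods are invented for the example, requests are refused rather than queued, and granularity and promotion are ignored.

```python
class LockTable:
    """Toy lock manager: one lock mode and a set of holders per resource."""

    def __init__(self):
        self._locks = {}  # resource -> (mode, set of users holding the lock)

    def acquire(self, user, resource, mode):
        """Return True if the lock is granted. mode is 'read' or 'write'."""
        held = self._locks.get(resource)
        if held is None:
            self._locks[resource] = (mode, {user})
            return True
        held_mode, holders = held
        if holders == {user}:
            # The sole holder may re-acquire, or upgrade from read to write.
            new_mode = "write" if "write" in (mode, held_mode) else "read"
            self._locks[resource] = (new_mode, holders)
            return True
        if mode == "read" and held_mode == "read":
            holders.add(user)  # read locks are compatible with each other
            return True
        return False  # any combination involving a write lock conflicts

    def release(self, user, resource):
        held = self._locks.get(resource)
        if held and user in held[1]:
            held[1].discard(user)
            if not held[1]:
                del self._locks[resource]


locks = LockTable()
r1 = locks.acquire("Ryan", "record 148", "read")    # granted
r2 = locks.acquire("Elena", "record 148", "read")   # granted: read + read
r3 = locks.acquire("Elena", "record 148", "write")  # refused: Ryan still reads
locks.release("Ryan", "record 148")
r4 = locks.acquire("Elena", "record 148", "write")  # granted: Elena is the only holder
r5 = locks.acquire("Ryan", "record 148", "read")    # refused: write lock held
print(r1, r2, r3, r4, r5)  # True True False True False
```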
Locking (continued)
FIGURE 7-8: The DBMS uses a locking scheme to apply Ryan's and Elena's updates to the database
Locking (continued)
FIGURE 7-8: The DBMS uses a locking scheme to apply Ryan's and Elena's updates to the database (continued)
Locking (continued)
Ryan now cannot access the record.
FIGURE 7-8: The DBMS uses a locking scheme to apply Ryan's and Elena's updates to the database (continued)
Two-Phase Locking
Transactions usually require multiple locks
All the locks need to be held until the end of the transaction to assure integrity
The DBMS accomplishes this using two-phase locking:
Growing phase: DBMS requests new locks without releasing those already held
Shrinking phase: DBMS releases locks without acquiring any new ones
Deadlock
Deadlock (or deadly embrace)
Two users each hold a lock and require a lock on a resource that the other already holds; neither can proceed
To minimize occurrence, the DBMS could try to lock records in a consistent order applicable to all users, but it has no way to predict what data the user will access, let alone the order
The DBMS must detect and break deadlocks when they occur
DBMS chooses one user to be the victim
DBMS rolls back the victim's transaction, freeing up its locks for other users' transactions; this usually requires the victim's software to detect the failure and reattempt the transaction
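One common way to detect a deadlock is to look for a cycle in a "waits-for" graph. The sketch below assumes each user waits for at most one other user, and it picks the youngest transaction as the victim; both the simplification and the victim policy are assumptions, since the slides do not say how the victim is chosen.

```python
def find_deadlock(waits_for):
    """waits_for maps a user to the user whose lock they are waiting on.

    Returns the list of users forming a cycle, or None if there is no deadlock.
    """
    for start in waits_for:
        seen = []
        current = start
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            return seen[seen.index(current):]
    return None


def choose_victim(cycle, start_times):
    # Assumed policy: roll back the transaction that started most recently.
    return max(cycle, key=lambda user: start_times[user])


# Ryan waits for a lock Elena holds, and Elena waits for a lock Ryan holds.
waits_for = {"Ryan": "Elena", "Elena": "Ryan"}
start_times = {"Ryan": 1, "Elena": 2}  # hypothetical transaction start order

cycle = find_deadlock(waits_for)
victim = choose_victim(cycle, start_times)
print(cycle, victim)  # ['Ryan', 'Elena'] Elena

# Rolling back the victim releases its locks, so nobody is left waiting on it.
no_cycle = find_deadlock({"Ryan": "Elena"})
print(no_cycle)  # None
```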
Deadlock (continued)
FIGURE 7-9: Two users experiencing deadlock; the DBMS chooses one as the victim
Locking on PC-Based DBMSs
Usually more limited than locking facilities on enterprise DBMSs
Programs can lock an entire table or an individual row within a table, but only one or the other
Programs can release any or all of the locks they currently hold
Programs can inquire whether a given row or table is locked
Timestamping
Timestamping is an alternative to locking
DBMS assigns each database update a unique time (timestamp) when the update starts
When updates conflict, the DBMS compares their timestamps to decide which one is the victim
Advantages
Avoids the need to lock rows
Eliminates the processing time needed to apply and release locks and to detect and resolve deadlocks
Disadvantages
Additional disk and memory space
Extra processing time
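The slides do not spell out the comparison rule. One textbook scheme, basic timestamp ordering, is sketched below as an assumption rather than as the method the chapter has in mind: each record remembers the newest timestamp that read it and wrote it, and an older update that arrives too late is rejected and must restart with a new timestamp. The Record class and the starting value are hypothetical.

```python
class Rejected(Exception):
    """Raised when an update arrives too late and must be restarted."""


class Record:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # newest timestamp that has read this record
        self.write_ts = 0  # newest timestamp that has written this record

    def read(self, ts):
        if ts < self.write_ts:
            raise Rejected("a newer update already wrote this record")
        self.read_ts = max(self.read_ts, ts)
        return self.value

    def write(self, ts, value):
        if ts < self.read_ts or ts < self.write_ts:
            raise Rejected("a newer update already read or wrote this record")
        self.value = value
        self.write_ts = ts


record = Record(1000)            # hypothetical starting value
ryan_ts, elena_ts = 1, 2         # Ryan's update started first

ryan_copy = record.read(ryan_ts)
elena_copy = record.read(elena_ts)

try:
    record.write(ryan_ts, ryan_copy + 100)  # too late: Elena (newer) has read
    ryan_rejected = False
except Rejected:
    ryan_rejected = True

record.write(elena_ts, elena_copy - 100)    # Elena's update goes through

# Ryan's program restarts the update with a fresh timestamp.
retry_ts = 3
record.write(retry_ts, record.read(retry_ts) + 100)

print(ryan_rejected, record.value)  # True 1000
```

No locks are taken and no deadlock can occur, but every record carries two extra timestamps and rejected updates cost a restart, which matches the disadvantages listed above.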
ACID Properties
Properties that guarantee transaction reliability
Atomicity
Transaction is all or nothing
Consistency
Transaction brings DB from one consistent state to another
Isolation
No transaction can interfere with another
Durability
Once committed, it will remain so no matter what
Catalog Services
Metadata: data about data
Catalog stores metadata and makes it accessible to users
A data dictionary is a view of the catalog, providing information about database objects
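Most DBMSs let users query the catalog with ordinary SQL. As an illustration, SQLite exposes its catalog through the sqlite_master table and the table_info pragma; other products use different names. The part table and its index below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_num TEXT PRIMARY KEY, description TEXT, on_hand INTEGER)"
)
conn.execute("CREATE INDEX part_desc ON part (description)")

# Which objects exist? (Internal objects named sqlite_... are filtered out.)
objects = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY type, name"
).fetchall()

# Which columns does a table have, and of what type?
# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(part)")]

print(objects)  # [('index', 'part_desc'), ('table', 'part')]
print(columns)  # [('part_num', 'TEXT'), ('description', 'TEXT'), ('on_hand', 'INTEGER')]
```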
Security Services
Security: prevention of unauthorized access, either intentional or accidental, to a database
Most common security features used by DBMSs:
Authentication
Authorization
Views
Encryption
Authentication
Authentication: techniques for identifying the person attempting to access the DBMS
Password: string of characters assigned by DBA to a user that must be entered for access
Note that more secure options are available; e.g., integration with Windows security is preferred for SQL Server.
Biometrics: identify users by physical characteristics such as fingerprints, voiceprints, handwritten signatures, and facial characteristics
Smart cards: small plastic cards with built-in circuits containing processing logic to identify the cardholder
Authorization
DBA can use authorization rules to specify which users have what type of access to which data
Workgroups: groups (classes) of users defined to simplify the authorization process
Permissions: specify what kind of access the user has to objects in the database (we talked about the GRANT command earlier in the course)
Views
View: snapshot of certain data in the database at a given moment in time
For most users or groups, the DBA grants access to views, rather than the underlying tables, in order to provide appropriate data security
Privacy
Privacy: right of individuals to have certain information about them kept confidential
Laws and regulations dictate some privacy rules
Companies institute additional privacy rules
Privacy is enforced by views
Encryption
Encryption: converts data to a format indecipherable to other programs, and stores it in that form when writing or updating
Decryption: reverses the encryption to get clear data
DBMS uses encryption in several contexts:
Passwords: encryption or one-way hash
Data stored on disk: only the DBMS may decrypt it; this prevents back-door access and is transparent to legitimate users
Data transmitted between user and DBMS: only the endpoints see data in the clear; this prevents snoopers from seeing data
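The one-way hash option for passwords can be sketched with Python's standard library. The stored value cannot be turned back into the password; verification repeats the hash and compares. The function names, the salt length, and the iteration count are assumptions for the example, not recommendations from the slides.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, kept modest for the example


def hash_password(password, salt=None):
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # a random salt makes equal passwords hash differently
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```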
Two Types of Encryption: Symmetric
Symmetric uses a key
Key is relatively short (<= 256 bits)
Key is a binary number but, for desktop DBMSs, may be represented as a password, cutting down considerably on the number of possible values
Same key is used to encrypt and decrypt fixed-size blocks (in some ciphers, the same size as the key)
Relatively fast
Generally used in a chained mode (i.e., the encrypted value of a block depends on both the key and the encrypted value of the previous block)
Two Types of Encryption: Asymmetric (a.k.a. Public Key)
Asymmetric uses a key pair
Two keys: one private, the other public
Public key is published (e.g., in a certificate)
Private key must be protected (often in a separate device that does the actual encryption / decryption)
To encrypt (and send) data, encrypt with the other party's public key; only that party can read it
To sign (and publish) data, encrypt with your private key; anyone can use your public key to decrypt (knowing it's from you)
Relatively slow
Keys / blocks are large (>= 1 Kbit)
Generally used to transmit symmetric keys, or for digital signatures (an encrypted hash of the signed content)
Two Types of Encryption
Symmetric and asymmetric are often used in combination
E.g., when using an HTTPS site, the browser chooses a symmetric key, encrypts it using the server's public key, and sends it to the server (the only party that can decrypt it); data is then encrypted in both directions using the symmetric key
Because symmetric keys are easier to break, the two sides may agree to change the symmetric key from time to time
Some secure transmissions require both sides to have a public key pair; smart cards provide one secure mechanism to support this
Data Replication
Replication: providing redundant copies (replicas) of data
often at different sites for performance (at remote sites) and/or recovery (of central site)
Synchronization: DBMS exchanges all updated data between master database and a replica
Depending on requirements and cost, synchronization can be batch or real time (or in between)
The complexity of synchronization depends on whether data can be updated at more than one site
This is often done by third-party software
Support Data Replication (continued)
FIGURE 7-18: DBMS synchronizes two databases in a replica set
Recover Data
Recovery: returning database to a correct state from an incorrect state
This is usually required after a database failure (e.g., due to power failure, disaster, or malware attack)
Backup / Restore
Simplest recovery involves using backups or replicas
Backup or save: copy of database
A full backup is often made on the order of once a month (depending on the size and activity of the database)
Partial backups are made at least daily; they include changes since the last full backup
Backups are always done, but are used only as a last resort since restoration of a large database may take days (or even weeks)
Recovery
Recovery: returning database to a correct state from an incorrect state
This is required after a database failure (e.g., due to power failure, disaster, or malware attack)
Simplest recovery involves using backups or replicas
Backups are always done, but recovery from backups is slow
Replicas (often at a different site) lag the original, but are not too far behind; they provide a hot standby for faster resumption of service
No matter which is used, there is still a need to do some adjustments to assure integrity and to capture recent transactions.
Logging
Logging (aka Journalling): maintaining a log (journal) of all updates to the database
Logs are usually replicated offsite and available even if the database is destroyed
The DBMS writes to the log before writing changes to data
The DBA must keep transaction logs going back to the most recent backup (or at least to what is known to be replicated at a hot backup site)
Logging
Information is logged for each transaction:
Each log entry includes:
Transaction ID
Date and time of each update
Log events include:
Start of a transaction
Before image (values of relevant data before a change)
After image (values of relevant data after a change)
Successful completion (commit) of a transaction
Checkpoint (i.e., a periodic event where the DBMS stops all activity and writes all information in cache to the database, bringing the log in sync with the database)
Logging
FIGURE 7-10: Four sample transactions
Using Logs
Logs can be used for recovery in two ways:
1. After a power failure, the DBMS uses the log to roll back transactions that were not committed, using before images.
2. After a recovery (from backups or using a replication site), the DBMS uses the log to roll forward transactions that were completed but are not present in the recovered database (e.g., were not on the most recent backup or not yet replicated), using after images.
Recovery often requires a rollback phase followed by a roll-forward phase, since a replicated database may contain partially completed transactions.
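Both uses of the log can be sketched with a list of log entries and a dictionary standing in for the database. The log layout, the keys A and B, and the values are invented for the example; a real DBMS log also carries timestamps and checkpoints.

```python
# Hypothetical log: transaction 1 committed, transaction 2 was still in progress.
log = [
    {"txn": 1, "event": "start"},
    {"txn": 1, "event": "update", "key": "A", "before": 10, "after": 20},
    {"txn": 1, "event": "commit"},
    {"txn": 2, "event": "start"},
    {"txn": 2, "event": "update", "key": "B", "before": 5, "after": 7},
    # failure happens here: no commit entry for transaction 2
]


def committed(log):
    return {entry["txn"] for entry in log if entry["event"] == "commit"}


def roll_back(db, log):
    """Backward recovery: undo uncommitted updates, newest first, using before images."""
    done = committed(log)
    for entry in reversed(log):
        if entry["event"] == "update" and entry["txn"] not in done:
            db[entry["key"]] = entry["before"]
    return db


def roll_forward(db, log):
    """Forward recovery: reapply committed updates, oldest first, using after images."""
    done = committed(log)
    for entry in log:
        if entry["event"] == "update" and entry["txn"] in done:
            db[entry["key"]] = entry["after"]
    return db


# Case 1: power failure. The database on disk holds transaction 2's partial work.
after_crash = roll_back({"A": 20, "B": 7}, log)

# Case 2: restore from a backup taken before either transaction ran.
after_restore = roll_forward({"A": 10, "B": 5}, log)

print(after_crash)    # {'A': 20, 'B': 5}
print(after_restore)  # {'A': 20, 'B': 5}
```

Both paths end in the same consistent state: transaction 1's change is present and transaction 2's is not.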
Forward Recovery (continued)
FIGURE 7-12: Forward recovery
Backward Recovery (continued)
FIGURE 7-13: Backward recovery
Recovery on PC-Based DBMSs
Sophisticated recovery features are not available on PC-based DBMSs
Regularly make backup copies using the DBMS
Use the most recent backup for recovery
Systems with a large number of updates between backups
Recovery features not supplied by the DBMS need to be included in application programs
Utility Services
Utility services assist in general database maintenance
Change database structure
Add new indexes and delete indexes
Use services available from operating system
Export and import data
Support for easy-to-use edit and query capabilities, screen generators, report generators, etc.
Summary
DBMS allows users to update and retrieve data in a database without needing to know how data is structured on disk or manipulated
DBMS must store metadata (data about the data) and make this data accessible to users
DBMS must support concurrent update
Locking denies access by other users to data while the DBMS processes one user's updates
In a deadlock (deadly embrace), two or more users are each waiting for another user to release a lock before they can proceed
Summary (continued)
In timestamping, DBMS processes updates to a database in timestamp order
DBMS must provide methods to recover a database in the event the database is damaged
DBMSs provide facilities for periodically making a backup copy of the database
Enterprise DBMSs maintain a log or journal of all database updates since the last backup; the log is used in the recovery process
Summary (continued)
DBMSs provide security features (encryption, authentication, authorizations, and views) to prevent unauthorized access to a database
DBMS must follow rules or integrity constraints (key integrity constraints and data integrity constraints) so that it updates data accurately and consistently
DBMS must support data independence
DBMS must have a facility to handle data replication
DBMS must provide utility services that assist in general maintenance of a database