A database transaction is a logical unit of database processing. It consists of a series of database access operations: insertion, deletions, selection, updating.
A transaction can be part of a user program or an interactive session.
Transactions can be read-only (selection) or updates (insertion, deletion, update).
Transactions have a clearly defined beginning and end. Example: an ATM transaction. There is an explicit login followed by a series of other transactions (deposit, withdrawl, transfer, balance) followed by an implicit logout. Each transaction must be finished before moving onto the next one.
At the physics level transactions involve a series of reads and writes to and from memory and disk. (We will worry about the physical layer in only a few special cases).
ACID properties. Transactions are:
Database Concurrency is defined by the number of users who may access the database simultaneously:
The following are examples of concurrency problems. A and B are transactions.
Occurs when transactions are interleaved.
A | B |
read(N) N = N - i |
|
read(N) N = N + k |
|
write(N) (this update is lost) | |
write(N) |
A read is done on an updated item before the transaction that updated the item fails.
A | B |
read(N) N = N - i write(N) |
|
read(N) | |
failure rollback(N) |
The N read by B is 'dirty data'
When an aggregate function is applied to a number of records that are undergoing updating by another transaction
A | B |
sum = 0 sum = sum + N1 sum = sum + N2 |
|
read(N5) N5 = N5 - i write(N5) |
... |
sum = sum + N5 sum = sum + N6 |
|
read(N6) N6 = N6 - i write(N6) |
(assume reads of the appropriate values are done before each summation)
A value is changed between two reads, meaning that you do not get the same value twice
A | B |
read(N) display N |
|
read(N) N = n - i write(N) |
|
read(N) N = N + j write(N) |
The update is done to the 'wrong' value of N.
System Failures
Media Failures
Logs are scanned from tail to head to create a list of committed transactions, and then from head to tail to commit those transactions.
Checkpoints reduce the amount of log data that must be scanned after a failure. Tail to head scan goes only until last checkpoint (after system failure).
Concurrency
For performance reasons, a database processing several transactions simultaneously. However, transactions appear to be processed serially.
Database Concurrency is defined by the number of users who may access the database simultaneously: