2020-12-16 07:12

CP363 : Database Recovery

(covers recovery techniques discusses with respect to system logs)

Recovery from non-catastrophic failure

deferred updates: database on disk not updated until after commit - updates are held in buffers. Upon writing, logs are updated first, then data committed to disk. Failed transactions are simply flushed, so no undos are needed. NO-UNDO/REDO algorithm.
immediate update: transaction results are written to disk before commits, but operations are recorded to the log first. If a transaction fails, then it must be rolled back. Called UNDO/REDO.

Single user systems may write transactions to disk even before commits because there are no interleaving transactions.

Transaction Rollback

Data items that have been written to a database may have to be rolled back if a transaction fails. Log entries are used to roll back the database to its state before the transaction fails.

This can lead to cascading rollback, where transactions that read data written by a failed transaction must themselves be rolled back, possibly leading to further rollbacks, and so on. In order to implement a cascading rollback the log must contain a record of read operations in order to determine which transaction read what data and when.

With properly designed schedules (or transaction histories), there is never any need to perform a cascading rollback.

Recovery in Multidatabase Systems

Until now, all recovery discussion has been about single database systems.

Some transactions may require access to multiple databases. (eg. distributed databases)

Need a global recovery manager to coordinate commits using a 2 phase protocol:

Databases signal coordinator that their part is finished. Coordinator sends a 'prepare to commit' message. Databases write log records to disk, and replay 'ready'. Sends 'not OK' if there is a problem. Coordinator assumes 'not OK' if there is a timeout.
If all databases signal 'OK', coordinator sends 'commit'.Since all databases have logs, they can recover. If a database signals a commit failure, then coordinator and ask all databases to roll back.

Thus, either everyone commits, or no-one does.

Catastrophic Database Recovery

Catastrophic failure occurs when events such as a disk crash take place. A disk crash could affect a database, a log, or both. The only guaranteed method of recovering from catastrophic failure is through database backup. Copies of databases and logs are made to other media (tapes, CDs, other drives) at regular intervals. Logs, generally being smaller than the database itself, should be backed up more often than the database. In case of a crash, the last database backup can be loaded onto disk, and the logs from that time until the crash can be rerun. A new log should be started after each database backup in order to minimize the size of the log. Mirroring a log to a device separate from the database can make catastrophic recovery easier as it allows the database administrator access to the most recent transactions. Without a mirror, only transactions made before the most recent backup can be recovered.