In database systems, maintaining data consistency and ensuring quick recovery after a crash are critical. Checkpointing helps achieve both by periodically saving the database state to disk, which reduces recovery time. Combined with logging, checkpointing lets the system restore itself efficiently while minimizing the work needed during recovery.
Let’s explore checkpointing strategies, their significance, and how they interact with recovery processes.
Checkpointing is the process of saving a snapshot of the database’s current state, marking a point from which recovery can begin if a failure occurs. Instead of scanning the entire log file during recovery, the system can start from the most recent checkpoint, which reduces downtime.
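To make the idea concrete, here is a minimal sketch of how recovery might locate its starting point in a log. The log format (tuples of a record kind and a transaction id) and the function name `recovery_start` are assumptions made for illustration, not part of any particular database.

```python
# Hypothetical log format: (kind, txn_id); checkpoint records carry no txn_id.
def recovery_start(log):
    """Return the index of the most recent CHECKPOINT record, or 0 if none exists."""
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "CHECKPOINT":
            return i
    return 0  # no checkpoint: recovery must scan the whole log


log = [
    ("BEGIN", "TA"),
    ("COMMIT", "TA"),
    ("CHECKPOINT", None),
    ("BEGIN", "TB"),
]

start = recovery_start(log)
records_to_scan = log[start:]  # only the tail of the log is examined
```

In this sketch the forward scan covers two records instead of four; on a long-running system the saving is what makes checkpointing worthwhile.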
The image shows a sequence of transactions (T1, T2, T3, and T4) along with a checkpoint and a failure point. Here's what happens:
Transactions T1, T2, and T3:
Checkpoint:
Transaction T4:
Failure:
After a crash, recovery involves the following steps:
Start from the Checkpoint:
Redo Phase:
Undo Phase:
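The three steps above can be sketched as a small in-memory simulation. This is a simplified model under stated assumptions, not a production algorithm: the checkpoint is assumed to have flushed every earlier update to disk, the checkpoint record is assumed to list the transactions active at that moment, and the record layout, the `recover` function, and the transaction names TA, TB, and TC are hypothetical (they are not meant to mirror the figure).

```python
def recover(disk, log):
    """Rebuild database state from the on-disk image and the log.

    Hypothetical record shapes:
      ("BEGIN", txn), ("UPDATE", txn, key, old, new),
      ("COMMIT", txn), ("CHECKPOINT", [active txns])
    """
    db = dict(disk)

    # 1. Start from the checkpoint: find the most recent CHECKPOINT record.
    cp = max((i for i, r in enumerate(log) if r[0] == "CHECKPOINT"), default=-1)
    active = set(log[cp][1]) if cp >= 0 else set()

    # 2. Redo phase: replay every update logged after the checkpoint.
    for rec in log[cp + 1:]:
        if rec[0] == "BEGIN":
            active.add(rec[1])
        elif rec[0] == "UPDATE":
            _, _txn, key, _old, new = rec
            db[key] = new
        elif rec[0] == "COMMIT":
            active.discard(rec[1])

    # 3. Undo phase: roll back transactions that never committed, newest first.
    #    Their updates may predate the checkpoint, so the whole log is walked.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] in active:
            _, _txn, key, old, _new = rec
            db[key] = old
    return db


log = [
    ("BEGIN", "TA"), ("UPDATE", "TA", "a", 0, 1), ("COMMIT", "TA"),
    ("BEGIN", "TC"), ("UPDATE", "TC", "c", 0, 3),
    ("CHECKPOINT", ["TC"]),
    ("BEGIN", "TB"), ("UPDATE", "TB", "d", 0, 4), ("COMMIT", "TB"),
    ("UPDATE", "TC", "c", 3, 30),
]
disk_at_checkpoint = {"a": 1, "c": 3, "d": 0}
recovered = recover(disk_at_checkpoint, log)
```

Here TB committed after the checkpoint, so its update is redone, while TC never committed, so both of its updates are rolled back, including the one made before the checkpoint.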
Reduces Recovery Time: Without checkpoints, the recovery process would need to scan the entire log file, which can be time-consuming for large databases.
Improves Performance: Once a checkpoint is safely on disk, log records that are no longer needed for recovery can be truncated or archived, so the log file does not grow indefinitely and storage and I/O costs stay down.
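Simplifies Recovery: Recovery can skip the work of transactions that completed before the last checkpoint and focus on changes made after it, along with any transactions that were still active when the checkpoint was taken.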
Sharp Checkpoints:
Pros:
Cons:
Fuzzy Checkpoints:
Pros:
Cons:
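The difference between the two styles can be illustrated with a toy buffer pool. This sketch relies on the usual textbook definitions, which is an assumption here: a sharp checkpoint flushes every dirty page before the checkpoint record is written, while a fuzzy checkpoint only records which pages are dirty and lets normal processing continue. The `BufferPool` class, the function names, and the log sequence numbers (LSNs) are all hypothetical.

```python
class BufferPool:
    def __init__(self):
        self.disk = {}    # page_id -> value persisted on disk
        self.cache = {}   # page_id -> latest value in memory
        self.dirty = {}   # page_id -> LSN of the first update that dirtied it

    def write(self, page_id, value, lsn):
        self.cache[page_id] = value
        self.dirty.setdefault(page_id, lsn)

    def flush(self, page_id):
        self.disk[page_id] = self.cache[page_id]
        self.dirty.pop(page_id, None)


def sharp_checkpoint(pool, lsn):
    """Flush everything first; redo can then begin at the checkpoint itself."""
    for page_id in list(pool.dirty):
        pool.flush(page_id)
    return {"type": "sharp", "lsn": lsn, "redo_from": lsn}


def fuzzy_checkpoint(pool, lsn):
    """Record the dirty pages without flushing; redo must begin at the oldest one."""
    snapshot = dict(pool.dirty)
    redo_from = min(snapshot.values(), default=lsn)
    return {"type": "fuzzy", "lsn": lsn, "dirty_pages": snapshot, "redo_from": redo_from}


pool = BufferPool()
pool.write("p1", "v1", lsn=10)
pool.write("p2", "v2", lsn=12)

fuzzy = fuzzy_checkpoint(pool, lsn=20)   # no page I/O, pages remain dirty
sharp = sharp_checkpoint(pool, lsn=21)   # all dirty pages written to disk
```

The trade-off shows up in `redo_from`: the fuzzy checkpoint costs no flush but pushes the redo start back to LSN 10, whereas the sharp checkpoint pays for the flush and lets redo start at its own LSN.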
Distributed checkpointing extends the concept of checkpointing to distributed systems. It involves saving the global state of the system across all nodes to facilitate recovery after a failure.
Consistent Global State:
Coordinated Checkpointing:
Uncoordinated Checkpointing:
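One common way to coordinate checkpoints is a two-phase exchange: every node first takes a tentative checkpoint, and the checkpoints become permanent only if all nodes succeed. The sketch below simulates that in a single process. It is an illustrative simplification: the `Node` class, the `healthy` flag, and the function name are hypothetical, and messages in flight between nodes, which a real protocol must also capture, are ignored.

```python
class Node:
    def __init__(self, name, state, healthy=True):
        self.name = name
        self.state = state
        self.healthy = healthy
        self.tentative = None   # checkpoint taken but not yet agreed on
        self.permanent = None   # last globally agreed checkpoint

    def take_tentative(self):
        if not self.healthy:
            return False
        self.tentative = dict(self.state)
        return True

    def commit(self):
        self.permanent, self.tentative = self.tentative, None

    def abort(self):
        self.tentative = None


def coordinated_checkpoint(nodes):
    """Phase 1: request tentative checkpoints. Phase 2: commit all or abort all."""
    acks = [node.take_tentative() for node in nodes]
    if all(acks):
        for node in nodes:
            node.commit()
        return True
    for node in nodes:
        node.abort()
    return False


cluster = [Node("orders", {"count": 3}), Node("payments", {"count": 2})]
ok = coordinated_checkpoint(cluster)

degraded = [Node("orders", {"count": 4}), Node("payments", {"count": 2}, healthy=False)]
failed = coordinated_checkpoint(degraded)
```

Because the checkpoints are committed together or not at all, the saved states always belong to the same round, which is what makes the resulting global state consistent in this simplified model.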
Consider a distributed e-commerce application where:
If a failure occurs: