18 Chandy-Lamport Snapshot Protocol
Assumptions
- No failures during snapshotting
- FIFO reliable channels: no lost or duplicate messages
- Strongly connected execution graph: each process can reach every other process in the system
- Single initiating process
Goals
- Taking a snapshot does not interfere with processing
- Processing and messages do not stop
- Each process can locally record its own state
- Any process can initiate the algorithm (not a specially designated process)
Initiator Process
- Records its own state
- Sends a marker out on each of its outgoing channels
- Starts recording all data (application) messages it receives on all of its incoming channels
- The marker is a special (control) message that is not itself recorded in the snapshot; it marks the exact point in the FIFO communication between processes where the cut occurs.
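The initiator's three steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the `Initiator` class, the `MARKER` sentinel, and the use of `collections.deque` objects as FIFO channels are all assumptions made for the example.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for the control message


class Initiator:
    """Illustrative initiator; all names here are assumptions, not from the notes."""

    def __init__(self, state, incoming, outgoing):
        self.state = state          # application state, here an integer balance
        self.incoming = incoming    # sender name -> deque used as a FIFO channel
        self.outgoing = outgoing    # receiver name -> deque used as a FIFO channel
        self.recorded_state = None  # local state captured by the snapshot
        self.recording = {}         # sender name -> data messages recorded so far

    def initiate_snapshot(self):
        # Step 1: record own state.
        self.recorded_state = self.state
        # Step 2: send a marker on each outgoing channel.
        for channel in self.outgoing.values():
            channel.append(MARKER)
        # Step 3: start recording data messages on every incoming channel.
        self.recording = {sender: [] for sender in self.incoming}


# Usage: P sends 10 to Q, then initiates a snapshot.
p_to_q, q_to_p = deque(), deque()
p = Initiator(100, incoming={"Q": q_to_p}, outgoing={"Q": p_to_q})
p_to_q.append(10)
p.state -= 10
p.initiate_snapshot()
```

Because the channel is FIFO, the marker sits behind the transfer of 10 in `p_to_q`, so the receiver can tell that this message was sent before the cut.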
Receiving a Marker
When a process receives a marker for the first time, it:
- Records its own state
- Marks the channel that the marker came in on as “empty” (Future messages arriving on this channel will not be part of the snapshot)
- Sends a marker on each of its outgoing channels
- Starts recording incoming messages on all its incoming channels except the one marked as “empty”
Otherwise (not the first time):
- Stops recording on the channel the marker arrived on; the messages recorded on that channel so far become its state in the snapshot
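The two cases above can be sketched as a single message handler. This is a rough Python illustration under stated assumptions: the `Process` class, its field names, the `MARKER` sentinel, and the convention that a data message is an integer amount added to the state are all hypothetical.

```python
MARKER = "MARKER"  # hypothetical sentinel standing in for the control message


class Process:
    """Illustrative non-initiator process; all names here are assumptions."""

    def __init__(self, state, incoming, outgoing):
        self.state = state              # application state, here an integer
        self.incoming = list(incoming)  # names of processes that send to us
        self.outgoing = outgoing        # receiver name -> list used as a FIFO queue
        self.snapshot_taken = False
        self.recorded_state = None
        self.recording = {}             # sender -> messages still being recorded
        self.channel_state = {}         # sender -> final recorded channel state

    def on_message(self, sender, msg):
        if msg != MARKER:
            # Data message: processing continues as normal during the snapshot.
            self.state += msg
            if sender in self.recording:
                self.recording[sender].append(msg)
        elif not self.snapshot_taken:
            # First marker: record state, mark this channel empty,
            # forward markers, and record on the other incoming channels.
            self.snapshot_taken = True
            self.recorded_state = self.state
            self.channel_state[sender] = []
            for queue in self.outgoing.values():
                queue.append(MARKER)
            self.recording = {s: [] for s in self.incoming if s != sender}
        else:
            # Later marker: stop recording on the channel it arrived on.
            self.channel_state[sender] = self.recording.pop(sender)


# Usage: Q has incoming channels from P and R and one outgoing channel to R.
out_r = []
q = Process(50, incoming=["P", "R"], outgoing={"R": out_r})
q.on_message("P", MARKER)  # first marker: record state 50, P->Q is empty
q.on_message("R", 5)       # in flight on R->Q: recorded
q.on_message("P", 7)       # arrives after the marker on P->Q: not recorded
q.on_message("R", MARKER)  # later marker: stop recording R->Q
```

In this run the recorded state of Q is 50, the channel from P is recorded as empty, and the channel from R is recorded as holding the single message 5, while Q's live state has moved on to 62.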
Completing a snapshot
The snapshotting process is complete when all processes:
- Have received a marker and recorded their local state
- Have received markers on all incoming channels and have recorded all channel states
After this point, the locally recorded states can be collected to construct a global snapshot.
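To make the completion condition concrete, here is a small end-to-end sketch in Python with two processes that transfer money to each other. Everything in it is an assumption made for illustration: the `Process` class, the `connect` and `collect` helpers, the `MARKER` sentinel, and the choice of `collections.deque` as a reliable FIFO channel.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for the control message


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.out = {}         # receiver name -> deque (FIFO channel)
        self.inc = {}         # sender name -> deque (FIFO channel)
        self.recorded = None  # recorded local state
        self.recording = {}   # sender -> messages being recorded
        self.closed = {}      # sender -> final recorded channel state

    def send(self, to, amount):
        self.balance -= amount
        self.out[to].append(amount)

    def start_recording(self, empty_channel=None):
        # Used by the initiator (no empty channel) and on the first marker.
        self.recorded = self.balance
        for channel in self.out.values():
            channel.append(MARKER)
        self.recording = {s: [] for s in self.inc if s != empty_channel}
        if empty_channel is not None:
            self.closed[empty_channel] = []

    def receive(self, sender):
        msg = self.inc[sender].popleft()
        if msg != MARKER:
            self.balance += msg
            if sender in self.recording:
                self.recording[sender].append(msg)
        elif self.recorded is None:
            self.start_recording(empty_channel=sender)
        else:
            self.closed[sender] = self.recording.pop(sender)

    def done(self):
        # Local state recorded and a marker seen on every incoming channel.
        return self.recorded is not None and set(self.closed) == set(self.inc)


def connect(a, b):
    channel = deque()
    a.out[b.name] = channel
    b.inc[a.name] = channel


def collect(processes):
    # Only valid once every process has finished its part of the snapshot.
    assert all(p.done() for p in processes)
    states = {p.name: p.recorded for p in processes}
    channels = {(s, p.name): m for p in processes for s, m in p.closed.items()}
    return {"states": states, "channels": channels}


p, q = Process("P", 100), Process("Q", 50)
connect(p, q)
connect(q, p)
p.send("Q", 10)      # P holds 90; 10 is in flight on P->Q
p.start_recording()  # P initiates: records 90, marker follows the 10
q.send("P", 20)      # Q holds 30; 20 is in flight on Q->P
q.receive("P")       # Q receives 10 and holds 40
q.receive("P")       # Q receives the marker: records 40, P->Q is empty
p.receive("Q")       # P receives 20 while recording Q->P
p.receive("Q")       # P receives the marker: Q->P is recorded as [20]
snapshot = collect([p, q])
total = sum(snapshot["states"].values()) + sum(
    sum(msgs) for msgs in snapshot["channels"].values()
)
```

The collected snapshot holds 90 for P, 40 for Q, and 20 in flight from Q to P, which adds up to the 150 the system started with. This conservation check is one way to see that the recorded cut is consistent even though neither process stopped sending or receiving.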