18 Chandy-Lamport Snapshot Protocol

Assumptions

  • No failures during snapshotting
  • FIFO reliable channels: no lost or duplicate messages
  • Strongly connected execution graph: each process can reach every other process in the system
  • Single initiating process

Goals

  • Taking a snapshot does not interfere with processing
  • Processing and messages do not stop
  • Each process can locally record its own state
  • Any process can initiate the algorithm (not a specially designated process)

Initiator Process

  1. Records its own state
  2. Sends a marker out on each of its outgoing channels
  3. Starts recording all data (application) messages it receives on all of its incoming channels
    Note: the marker is a special (control) message. It is not recorded in the snapshot; its position in each FIFO channel marks the exact point where the cut occurs. Messages sent before the marker belong to the snapshot, messages sent after it do not.
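
The initiator's steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `Initiator`, `MARKER`, `start_snapshot`, and `on_message` are hypothetical, channels are modelled as in-memory FIFO queues, and the application state is assumed to be a number that each message adds to.

```python
from collections import deque

# Hypothetical sentinel standing in for the marker (control) message.
MARKER = object()


class Initiator:
    def __init__(self, state, incoming, outgoing):
        self.state = state            # application state (assumed numeric here)
        self.incoming = incoming      # names of incoming channels
        self.outgoing = outgoing      # dict: channel name -> deque used as a FIFO queue
        self.recorded_state = None    # local state captured by the snapshot
        self.recording = {}           # channel name -> messages recorded so far

    def start_snapshot(self):
        # Step 1: record own state.
        self.recorded_state = self.state
        # Step 2: send a marker out on each outgoing channel.
        for queue in self.outgoing.values():
            queue.append(MARKER)
        # Step 3: start recording on all incoming channels.
        self.recording = {name: [] for name in self.incoming}

    def on_message(self, channel, msg):
        # Application messages arriving on a channel that is being recorded
        # are logged as part of that channel's state. Processing continues
        # as usual (assumed behaviour: add the message to the state).
        if channel in self.recording:
            self.recording[channel].append(msg)
        self.state += msg
```

Note that processing is not paused: `on_message` still applies the message to the live state, while `recorded_state` keeps the value captured at the cut.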

Receiving a Marker

On receiving a marker for the first time:

  1. Records its own state
  2. Marks the channel that the marker came in on as “empty”: its recorded state is the empty sequence, and future messages arriving on this channel will not be part of the snapshot
  3. Sends markers to all its outgoing channels
  4. Starts recording incoming messages on all its incoming channels except the one marked as “empty”

Otherwise (not the first time):

  1. Stops recording on the channel it received the marker from; the messages recorded on that channel up to this point are its channel state
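
Both cases of the marker rule can be sketched in one handler. As before, this is only an illustrative sketch under stated assumptions: `Process`, `MARKER`, `receive`, and the field names are hypothetical, channels are in-memory FIFO queues, and the application state is assumed to be a number that each message adds to.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical marker value; assumed never used as application data


class Process:
    def __init__(self, name, state, incoming, outgoing):
        self.name = name
        self.state = state               # application state (assumed numeric)
        self.incoming = list(incoming)   # names of incoming channels
        self.outgoing = outgoing         # dict: channel name -> deque (FIFO queue)
        self.snapshotting = False        # True once the local state is recorded
        self.recorded_state = None
        self.open = {}                   # channels still being recorded -> messages
        self.channel_state = {}          # finished channels -> recorded messages

    def receive(self, channel, msg):
        if msg == MARKER:
            self._on_marker(channel)
            return
        # Application message: record it only if this channel is being recorded.
        if channel in self.open:
            self.open[channel].append(msg)
        self.state += msg                # normal processing continues

    def _on_marker(self, channel):
        if not self.snapshotting:
            # First marker: record state, mark this channel empty,
            # forward markers, and start recording the other channels.
            self.snapshotting = True
            self.recorded_state = self.state
            self.channel_state[channel] = []
            for queue in self.outgoing.values():
                queue.append(MARKER)
            self.open = {c: [] for c in self.incoming if c != channel}
        else:
            # Later marker: stop recording on this channel and keep
            # what was recorded as the channel's state.
            self.channel_state[channel] = self.open.pop(channel)
```

For example, if a process receives the first marker on channel `p->q`, then an application message on `r->q`, and then the marker on `r->q`, that one message ends up as the recorded state of `r->q`, while `p->q` is recorded as empty.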

Completing a snapshot

The snapshotting process is complete when all processes:

  1. Have received a marker and recorded their local state
  2. Have received markers on all incoming channels and have recorded all channel states

After this point, the locally recorded states can be collected and combined into a global snapshot.
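
The completion condition and the collection step can be sketched as below. The notes do not specify how the local records are gathered, so this assumes they are simply available in one place; `LocalSnapshot` and `collect_global_snapshot` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalSnapshot:
    process: str
    incoming: list                    # names of this process's incoming channels
    state_recorded: bool = False      # condition 1: local state has been recorded
    state: object = None
    channel_state: dict = field(default_factory=dict)  # channel -> recorded messages

    def done(self):
        # Condition 2: a marker has arrived on every incoming channel,
        # so every incoming channel has a recorded state.
        return self.state_recorded and set(self.channel_state) == set(self.incoming)


def collect_global_snapshot(local_snapshots):
    # The global snapshot exists only once every process is done.
    if not all(s.done() for s in local_snapshots):
        return None
    return {
        "processes": {s.process: s.state for s in local_snapshots},
        "channels": {
            channel: messages
            for s in local_snapshots
            for channel, messages in s.channel_state.items()
        },
    }
```

The result pairs each process with its recorded state and each channel with the messages that were in flight across the cut.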