Thursday, March 25, 2010

Failure detector in timing model

There are basically 3 timing models.
  1. Synchronous
  2. Asynchronous
  3. Partial Synchronous

Synchronous

Where we know the upper bound for communication time, computation time and the time difference for the physical clock. Which means we know the upper bound of response for a process.

Asynchronous

Where we do not have any idea of time bound and we can not make any assumption about time.

Partial Synchronous

When we assume that after some finite time the system becomes synchronous.

Now let us consider different timing models on the failure detectors. For any failure detector completeness property can be satisfied by any timing model as it is a liveness property and can be satisfied in later execution.
So the property to worry about is the accuracy property. For details see here.

Strong Accuracy

Can only be satisfied in synchronous system. Because strong accuracy does not allow to detect a correct node as failed even once. To check if a node is alive we use heartbeat technique. Which introduces a timeout to check the pulse to determine the node failure. In synchronous model we can specify a time out and be happy with it, because we know the time bound of the system. But for asynchronous system we do not have any time bound and can not assume anything about time. So any kind of timeout is meaningless in asynchronous model. 
So strong accuracy(Perfect failure detector) is only achievable by synchronous system.

Eventual Strong Accuracy

Well it is same as strong accuracy only that it is given a some finite time to get strong accurate. So it is implementable in partial synchronous model. As partial synchronous gets synchronous eventually.

Weak Accuracy

weak accuracy says that at least one correct node will not be suspected by other correct nodes. That weakens strong accuracy by reducing the number of correct nodes that are not suspected. But never the less we have to ensure accuracy for one node which does not allows us to reduce our timing model to less then synchronous model. At least there should be one node with every one is well connected. And for those links the timing synchrony holds.

Eventual Weak Accuracy

Then it turns out that the timing model of Eventual weak Accuracy will be same as Eventual Strong Accuracy.

Monday, March 22, 2010

Failure Detectors and it's properties

Failure detection is essential for many algorithms. And any failure detector is composed of two basic properties.
  1. Completeness
  2. Accuracy
There can be 2 types of completeness property
  1. Strong completeness
  2. weak completeness

Strong completeness

Every crashed node is eventually detected by all correct nodes.

Weak completeness

Every crashed node is eventually detected by some correct nodes.

Actually the weak and the strong completeness is equivalent. Because given a weak completeness we can achieve strong completeness. As some correct nodes has the information about all crashed nodes it can disseminate[using BEB, and as the some nodes are correct so all the correct nodes will get the information] the list of crashed nodes and every one will be aware of the crashed nodes.
Correctness is a liveness property. It can be proved as if some node do not detect the crashed node yet then there is possibility that it will be detected at some point is the future :)

Now let's talk about the accuracy property. It can be classified into 4 as,
  1. Strong accuracy
  2. Eventual Strong accuracy
  3. weak accuracy
  4. Eventual weak accuracy

Strong accuracy

No correct node is ever suspected.

This is really a strong property and this is safety property as well. Because if you detect a correct node as failed then the property is violated for any execution after that.

Eventual Strong accuracy

Eventually no correct node is suspected.

So now this property becomes liveness property because if a correct node is suspected now then there is still a possibility that in future execution no correct node will be detected as failed.

Weak accuracy

There exists a correct node which is never suspected by any node.

So this is a safety property because if none of the node is recognized as correct node then the property can not be satisfied in any future execution.

Eventual Weak accuracy

Eventually there exists a correct node which is never suspected by any node. And this is a liveness property.

So any failure detector is a combination of completeness property and any of the accuracy properties.

So here are the possible failure detectors(considering that strong and weak completeness is equivalent)

Accuracy/CompletenessStrong
StrongPerfect Failure Detector
Eventually StrongEventual Perfect Failure Detector
WeakStrong Failure Detector
Eventually WeakEventual Strong Failure Detector

References