Causal Broadcast
Causal broadcast guarantees all of the FIFO broadcast properties plus Causal Ordering, which means that no message will be delivered to a process before it has already delivered the prerequisite messages.
There is generally no way that causality itself is captured. Instead we can capture potential causality. If the first process broadcast both ÒAÓ and ÒBÓ (and the 2nd process received it) before the second process broadcast Ò1Ó and Ò2Ó, we must assume that messages Ò1Ó and Ò2Ó are causally dependent on messages ÒAÓ and ÒBÓ.
To illustrate the problem, we can cheat and simulate a network delay by simply starting one process late. Here is a sample run that illustrates the problem:
java
FifoBroadcastProcess Tanenenbaum |
java
FifoBroadcastProcess Linus |
|
Linux
sucks. |
|
|
Delivered: Linux sucks. |
Delivered: Linux sucks. |
<We simulate a message delay here by starting up this process
late.> |
|
|
java
FifoBroadcastProcess Jobs |
|
Yeah,
whatever... It is better than
Minix. |
|
Delivered: Yeah, whatever... It is better than Minix. |
Delivered: Yeah, whatever... It is better than Minix. |
Delivered: Yeah,
whatever... It is better than
Minix. |
The ÔJobsÕ process never received the initial message, so it has no idea what ÔitÕ is supposed to be better than Minix. Even if we had used RecoveringFifoBP instead of FifoBroadcastProcess, no error would have been detected, since all of the previous messages coming from Linus would have been received.
There are several approaches to this problem. IÕll discuss the Hadzilacos/Toueg approach first.
Their solution is to have each
process re-broadcast every message received before the process broadcasts its
own message. So, in the example
above, the Linus process would have broadcast ÒLinux SucksÓ before broadcasting
its own ÒYeah, whateverÓ message.
Note on terminology: Re-broadcasting is
not the same as relaying a message.
Relaying just resends on the message to other processes
unchanged. Re-broadcasting the
message means sending out a new message with the same text. So if the Linus process relayed
TanenbaumÕs message, it would still be labeled as ÒTanenbaum 0Ó, but when
it re-broadcasts the message, it would then be ÒLinus 0Ó, and the following
message would be ÒLinus 1Ó.
Then, becaused of the FIFO characteristics, so process will receive
these messages out of the desired causal order.
This approach seems to highlight both the strengths and weaknesses of the authorsÕ layered approach. As a plus, none of the lower level code needs to be changed to accommodate this type of broadcast. The problem is that the number of messages required per broadcast has risen astronomically. And we have another problemÉ
There is no built in mechanism to detect that a message has already been received. (After all, the entire point of this design is that re-broadcast messages are treated as new messages). If we cannot tolerate a message being delivered multiple times, we must now find a way to compare messages. In CausalBroadcastProcess.java, we cheat and just compare the message text.
java
CausalBroadcastProssTanenenbaum |
java
CausalBroadcastPross Linus |
|
Linux
sucks. |
|
|
Delivered: Linux sucks. |
Delivered: Linux sucks. |
<We again simulate a message delay here by starting up this
process late.> |
|
|
java
CausalBroadcastProssJobs |
|
Yeah,
whatever... It is better than
Minix. |
|
Delivered: Yeah, whatever... It is better than Minix. |
Delivered: Yeah, whatever... It is better than Minix. |
Delivered: Linux sucks. Delivered: Yeah, whatever... It is better than Minix. |
This worksÉ sort of. The delayed message is now delivered. However, this means that the same message text can never be sent twice – hardly a desirable feature.
This approach is not dead yetÉ We can add a unique ID to each message (and keep that ID in the re-broadcast messages), or we could allow the message to be tagged with multiple sender IDs. But all of these fixes seem to violate the layered approach – we now have to modify lower levels of our design to accommodate this algorithm.
Given these problems, another solution may be better. My favorite is Vector Timestamps, discussed in chapter 5 of ÒDistrubuted Systems: Principles and ParadigmsÓ by Andrew S. Tanenbaum and Maarten van Steen.