Home > Distribution > Distribution Protocols Specifications

Distribution Protocols Specifications

I recently looked around for the different distribution protocols that I mentioned in my previous post.

Here is what I found :
– The Bittorrent protocol specification,
– The Erlang’s Distribution protocol,
– The BOINC System Architecture,

And I also found out some interesting initiatives, that seems to be pretty inactive at the moment, sadly : Gittorrent and its protocol : GTP
An interesting mix 🙂

Apart from that, after reading a lot of research papers on distribution. I have selected a few that seem to be related to my interests.

One of the problems, in the way I see it, is that when you have one CPU, you can easily say that instruction happened “after” that one. And that becomes natural when you write a line of code after another. Now when you have multiple CPU running multiple processes in different places, with delays in transmissions, it become crucial to be able to determine if an instruction actually happened “before” or “after” ( or “at the same time” ) another instruction.

The most widely used algorithm to do that might be what is called Vector Clocks.
One problem though, is that vector clocks aren’t really dynamic, as you must know from the start how many concurrent processes are running, and you might be quite limited on the number of concurrent process you can have, as it will, with a naive implementation, change the size of your clock.
Second issue is that you need, most of the time, if you care about the size of the data you are sending over the wire, to have a central point to establish and increment that clock. Otherwise you need either :
– to duplicate your clock to actually end up with a Matrix Clock that can be in some context prohibitive because of its size…
– to send all your messages to all processes, so that the clock doesn’t become inconsistent between 2 nodes, and make sure they actually arrive…
This is not a simple issue to work around if you want to use your distribution redundancy as a safety net, for a process to replace another one if something fails for example, or for a “tunnel”  to be setup when a connection in your net falls down.

All this is a little annoying as if you want to optimise your transfer over the network, while using the benefits of the distribution to write safe applications, without any single failure point.

An interesting alternative to vector clock has been proposed : Interval Tree Clocks
It deals with the dynamic problem, and I planned first to try to implement something similar to organize data over my distributed network.

I am not sure I understand all of it already, but from what I can catch so far, I think the size of data you need to send to the different nodes might change with a naive implementation at least… I am not so sure of how to handle that especially if I want to use UDP and not TCP, as I do not really need the total ordering TCP provides, but I will likely enjoy the UDP performance…
The idea to have an ProcessID that is used when you increment the clock seems to be the way to go to handle dynamic creation / failures / termination of processes ; and the way the number of concurrent processes is “compressed” is pretty effective and interesting. However I am not sure if it will actually work in an architecture without “central point” or total broadcast, or some node might wait forever for a message who is not coming…
I now think that I need something closer to the network, at a lower level. Something that can deal with network intricacies (latency, losses … ) if needed. Or maybe I will need to subdivide my feature set, and implement them in separate components… we’ll see.

When you have a distributed, network hungry, application, the best way to reduce messages transmitted from one place to the other is to actually filter the messages you are going to send. Therefore there is in many protocols a “publish/subscribe” system to actually determine what needs to be send to who. And that “selective messaging” while improving network usage, doesnt play nice with ordering algorithms…

What I want to try to get, is a completely distributed architecture, without the need for a central “reference” point at any moment, and with possible “selective sending”. While keeping the partial ordering in a dynamic context that is provided by ITC. I think this might raise a lot of interests in people towards distributed architecture, because of the safety you might gain ( no central failure point ) and the optimisation on the network that appears when you have many processes communicating the same information around…

A lot of other improvements have already been made around vector clocks, more or less publicly, with more or less success, and what I need to likely to be a “mix of those”. Yet I ll have to prove somehow that it s going to work, and that it will not impact the network in a bad way…
The search is far from over…

Advertisements
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: