Comet story (part I)

A few years ago, I wanted to create a remote tutoring website. I even bought a domain name - remoteguru.info. Then, after long thought, I decided against creating the site and chose to concentrate on AI instead (no, I haven't built R2D2 yet). But in the meantime, I had done some homework on this remote tutoring site. In this piece, I'll tell what happened during that time.

The success of any software product depends on whether the users of the software like it or not. Technical excellence is always an important issue, but it can never woo the user community on its own. The software must be highly usable and responsive to get the love of the user. I knew all this and yet, being a programmer, I focused on the technical side! I made it my immediate goal to gain the know-how to build a web-based remote learning site instead of thinking about how to provide the best experience a remote learning site can offer. That was bad :-)

During my homework on the technical side of the site, I determined that such a site would need a wonderful online whiteboard and that the whiteboard must be collaborative. When I say collaborative, I mean every user sees the same content in their browser. Whenever somebody updates their whiteboard, everybody's board is updated. It's like a multiuser chat session where anything typed by any user appears on everybody else's screen instantaneously.

Updating anything in a remote system can only happen when the update has been propagated to the remote system. For a multiuser whiteboard, this will mean propagating the update to all users except the one who has made the update. On the web, this will mean if N users are using a whiteboard from their browsers, the web server will have to propagate the update to N-1 browsers.
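The N-1 rule can be sketched in a few lines of Python. This is only a toy model, not anything from the actual site: the Whiteboard class, its method names and the user names are all made up for illustration, and the "browsers" are plain lists.

```python
class Whiteboard:
    """Toy model of a shared whiteboard (hypothetical, for illustration only)."""

    def __init__(self):
        # One inbox per connected user; stands in for that user's browser.
        self.inboxes = {}

    def join(self, user):
        self.inboxes[user] = []

    def update(self, sender, stroke):
        """Propagate a stroke to everyone except the sender.

        Returns the number of inboxes that received the stroke.
        """
        recipients = [user for user in self.inboxes if user != sender]
        for user in recipients:
            self.inboxes[user].append(stroke)
        return len(recipients)


board = Whiteboard()
for name in ("alice", "bob", "carol"):
    board.join(name)

# With N = 3 users, one update is delivered to N - 1 = 2 inboxes.
delivered = board.update("alice", "line from (0,0) to (5,5)")
print(delivered)
```

The hard part, as the rest of the story shows, is that on the web the server has no direct way to do the append step.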

Now anybody who is anybody in the web development world knows that a web server can send anything to the web browser iff the web browser specifically requests it. This is because the HTTP protocol has been designed that way. It looks as if this one-way communication is not a speciality of HTTP alone. Even an SMTP server does not send anything to the client unless requested. In fact, most protocols do not facilitate two-way communication. This is not surprising given that most servers only serve data which does not involve lengthy computation. Most servers are not built for collaboration either. Notable exceptions are chat and game servers, and we will come back to them later.

Because HTTP servers cannot send data to their clients whenever they want, there is no way to propagate an update to a browser, let alone N-1 browsers. This limitation mostly killed off the prospect of web-based collaborative apps. This was really frustrating. But before giving in to the frustration, people researched the issue and settled on a not-so-great solution: the long-lived HTTP connection. There was really no alternative, so it had to be that way. So the world was saved, but there was still a long way to go to get the girl :-(

The concept of a long-lived HTTP connection is nothing complicated. Once the browser establishes a connection with the web server, instead of closing the connection, the server keeps it open as long as it wants. If the connection is kept open for one hour, the web server can send data to the client whenever it wants in that time span. This reopened the door that was almost closed for webapps that need N-way collaboration.
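To make the idea concrete, here is a minimal sketch using only Python's standard library. It is an illustration under my own assumptions, not how any particular Comet server works: StreamingHandler, fetch_stream, the "update n" payloads and the short sleeps are all invented for the demo. The server answers one request but keeps the response open and writes three updates at its own pace.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StreamingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: keeps the response open and pushes updates."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        for n in range(3):
            time.sleep(0.05)  # pretend we are waiting for a whiteboard update
            self.wfile.write(f"update {n}\n".encode())
            self.wfile.flush()
        # Returning ends the response; with the default HTTP/1.0 handler
        # the connection is closed at this point.

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_stream():
    """Start a throwaway server on a free local port and read its stream."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StreamingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        # Each line arrives as soon as the server decides to send it.
        lines = [line.decode().strip() for line in response]
        conn.close()
        return lines
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_stream())
```

The client made a single request, yet the server chose when each update was sent. A real deployment would hold the connection for minutes or hours rather than a fraction of a second.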

Once the solution was known, people tried to leverage it and stumbled upon two problems. As you can see, it was a torturous journey. There was no end to "problems". But our heroes eventually did make it. Now, the first problem was that different web browsers handle long-lived HTTP connections differently. Enter the browser hell.

The second problem was, popular web servers like Apache are not the best piece of software for maintaining lots of HTTP connections. In an N-way collaboration, the web server will have to keep N connections open for each session. If the scenario is such that 1000 web clients are each seeing 2 whiteboards, the web server will have 2000 connections open. That's a bit too much for Apache and friends. The end result is - web-based collaborative apps have to go through the browser hell only to face the scalability mountain.

The scalability issue was resolved by handing over the task of serving long-lived HTTP connections to a separate web server. This class of web servers is known as Comet servers, and they have no problem scaling up to thousands of long-lived HTTP connections.

Unfortunately, the browser hell was not an easy zone. To make a long story short, I'll only describe the divine solution. The main webpage opens an invisible iframe, and this iframe comes from the Comet server. Next, the iframe opens a long-lived HTTP connection with the Comet server using an Ajax call. This Ajax call remains open as long as the Comet server has nothing to offer. As soon as the Comet server has an update, it sends it to the client and closes the Ajax connection. Having received the update, the Ajax callback hands the update to the parent window (the parent of the iframe), which then applies it. This takes care of only one update. So what happens to the subsequent updates? This part is easy - the receipt of every update triggers a new Ajax call to the Comet server, and this goes on for the length of the session. Wikipedia calls this whole process "Ajax with long polling".
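The request/wait/respond/re-request cycle can be simulated in Python without any browser at all. The sketch below is a stand-in under loud assumptions: CometHub, long_poll, run_client and the update strings are hypothetical names of mine, a thread plays the role of the iframe, and a blocking queue read plays the role of the parked Ajax call.

```python
import queue
import threading


class CometHub:
    """Hypothetical stand-in for a Comet server."""

    def __init__(self):
        self._inboxes = {}
        self._lock = threading.Lock()

    def register(self, client_id):
        with self._lock:
            return self._inboxes.setdefault(client_id, queue.Queue())

    def publish(self, update):
        """Hand an update to every registered client."""
        with self._lock:
            inboxes = list(self._inboxes.values())
        for inbox in inboxes:
            inbox.put(update)

    def long_poll(self, client_id, timeout=1.0):
        """One 'Ajax call': block until an update exists, then return it.

        Returns None if nothing arrived before the timeout.
        """
        try:
            return self.register(client_id).get(timeout=timeout)
        except queue.Empty:
            return None


def run_client(hub, client_id, expected, received):
    """Plays the iframe: every answered poll triggers the next poll."""
    while len(received) < expected:
        update = hub.long_poll(client_id)
        if update is not None:
            received.append(update)  # "hand the update to the parent window"


def demo_long_polling():
    hub = CometHub()
    hub.register("browser-1")  # make sure the inbox exists before publishing
    received = []
    client = threading.Thread(
        target=run_client, args=(hub, "browser-1", 2, received), daemon=True
    )
    client.start()
    hub.publish("line drawn")
    hub.publish("circle drawn")
    client.join(timeout=5)
    return received


if __name__ == "__main__":
    print(demo_long_polling())
```

Each call to long_poll returns exactly one update, just as each Ajax call in the iframe is closed after one update, and the loop in run_client is the "trigger a new Ajax call" step.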

This leaves us with the question - where does the Comet server receive the updates from? There are many possibilities; here are two significant ones -

0. Our main web server (read Apache) can pass the updates.
1. Updates can come from an external source, like a chat server.

That is mostly the summary of Comet and the reason it exists. Having introduced the most interesting obstacle standing between me and my collaborative whiteboard, I'll go back to my main story.

After realizing that my web-based whiteboard would need a Comet server, it was time to pick one. After a very detailed and long search mission, I had two Open Source Comet servers in my lap - Orbited and Meteor. Orbited is written in Python and uses the famous Twisted network programming framework. Meteor is written in Perl. Orbited listens on the HTTP port, serves long-lived HTTP requests itself and passes the other requests on to the main web server. So for normal HTTP connections, Orbited acts like a proxy. And this is something I didn't like. I am not sure why, but I just didn't want to surrender port 80 so easily. This left me with Meteor only.

Meteor does not mess with port 80, but it does need an extra IP address; sounds like a high maintenance girlfriend! Despite this, I decided to give it a go and guess what, my ISP (the good guys at rootbsd.net) gave me a free IP address! With a good omen like this, how can you not go with Meteor? With the new free IP address at my disposal, Meteor was up in no time. Here is how the IP addresses are used:

Server    IP address    Port
Apache    *             80
Meteor    *             6666

And here is how the port redirection works:

Incoming IP    Incoming port    Redirection    Destination IP    Destination port
X.X.X.71       80               No             X.X.X.71          80
X.X.X.81       80               Yes            X.X.X.71          6666

$ dig remoteguru.info
...
;; ANSWER SECTION:
remoteguru.info.        14379   IN      A       X.X.X.71
...

$ dig comet0.remoteguru.info
...
;; ANSWER SECTION:
comet0.remoteguru.info. 14388   IN      A       X.X.X.81
...

So all traffic to http://comet0.remoteguru.info:80 is passed to Meteor, which is actually listening on port 6666, thanks to port redirection. On GNU/Linux, port redirection is set up with IPTables. On FreeBSD, I used IPNat. In both cases, the redirection is handled by the kernel and has a very low overhead.

Meteor also comes with excellent documentation. The documentation not only tells you how to write Comet clients for Meteor, it also explains the theory behind Comet. Everything I now know about Comet, I learned from Meteor's documentation.

But every good thing comes to an end. Being a Drupal developer, I wanted to write Drupal-based Comet apps. That brought up the question of authenticated user sessions, and Meteor does not support authentication! By using some obscure URLs, it is actually possible to get a kind of authentication, but that is not true authentication. Besides, I noticed that quite often, Meteor was falling back to polling instead of Comet. Polling means sending repeated Ajax requests to the web server asking for updates, instead of maintaining a long-lived connection with the web server and letting the server send the updates. I didn't like this at all. I probably could have lived with security through obscurity, but not polling. At this point I had to part with Meteor :-(

Despite this, I am immensely grateful to Meteor because I learned so much while playing with it. At this point I decided to drop this whole Comet thing from my mind and concentrate on other things. Little did I know that Comet was only going into hibernation in my mind, to pop out later at a suitable time. But I will leave that story for the next part :-)