Real-time Data Delivery: HTTP Streaming Versus PubSubHubbub

There are a number of ways of delivering data in real-time, but until recently it looked as though PubSubHubbub, with the backing of Google, was going to be the preferred method. However, the past couple of weeks have seen a couple of interesting developments which could indicate that the developer community may actually prefer HTTP Streaming.

The emergence of the real-time web has seen an increase in the visibility of technologies that facilitate the delivery of data in real-time. Twitter was most probably the catalyst for this, due to the many high-profile cases where Twitter has been able to deliver the news before any traditional news medium; the Hudson River plane crash is probably the best example of this. Some of the real-time technologies include PubSubHubbub, RSSCloud, Comet, XMPP, MQTT, Adobe LiveCycle, Google Wave Protocol, WebHooks, WebSockets and HTTP Streaming, to name but a few.

We've also seen an increase over the past year in the number of real-time services that use these technologies: services such as Beacon, DataSift, Google Buzz, Kwwika (disclosure: author is a founder), PubNub, Pusher, Superfeedr and of course Twitter. You can also find a number of other real-time APIs in our directory.

HTTP Streaming has generally been associated with Ajax in the past. In fact, the Wikipedia entry for HTTP Streaming (under the Push Technology page and listed as HTTP server push) talks only about "sending data from a web server to a web browser." This is out of date, and HTTP Streaming is now much more than this. HTTP Streaming takes advantage of the fact that the Internet infrastructure has been built with HTTP in mind (as does PubSubHubbub). HTTP is fully supported, so as well as using this protocol to distribute your static content such as HTML, images, CSS and JavaScript, why not use it to distribute real-time data as well? The part of the Wikipedia definition for HTTP Streaming that is correct is:

Generally the web server does not terminate a connection after response data has been served to a client. The web server leaves the connection open such that if an event is received, it can immediately be sent to one or multiple clients.

A client in this context doesn't have to be a web browser. It can be another web server, a desktop app, a mobile phone app, an embedded program running on a piece of hardware, a web application; basically any web enabled device capable of making a persistent HTTP connection.
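To make the idea concrete, here is a minimal sketch of a streaming endpoint and a non-browser client, using only the Python standard library. The names (`StreamHandler`, `consume`, `run_demo`), the `/stream` path and the newline-delimited JSON framing are assumptions made for illustration; they are not taken from any of the services mentioned in this article.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StreamHandler(BaseHTTPRequestHandler):
    """Toy streaming endpoint: keeps the response open and writes events as they occur."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()  # no Content-Length: the body ends when the connection closes
        for n in range(3):  # stand-in for "an event is received"
            line = json.dumps({"event": n}) + "\n"
            self.wfile.write(line.encode("utf-8"))
            self.wfile.flush()  # push each event to the client immediately
            time.sleep(0.05)

    def log_message(self, *args):  # keep the demo quiet
        pass


def consume(url):
    """Read the open response line by line, as any HTTP-capable client could."""
    events = []
    with urllib.request.urlopen(url) as response:
        for raw in response:  # one newline-terminated event per iteration
            events.append(json.loads(raw))
    return events


def run_demo():
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), StreamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return consume(f"http://127.0.0.1:{server.server_port}/stream")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The client here is a plain script rather than a browser, and it needs nothing more than the ability to make an HTTP request and keep reading. A real service would keep the connection open indefinitely rather than stopping after three events.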

This might be why services such as Superfeedr, which has consistently championed PubSubHubbub, have introduced support for HTTP Streaming, and why new services like DataSift have provided support from almost day one.

So, why are services starting to offer HTTP Streaming? The first thing you may think is that a persistent HTTP connection might be a faster way of receiving data than PubSubHubbub and its intermittent HTTP Push requests. Surprisingly, this isn't supposed to be the case, since "HTTP 1.1 reuses TCP connections by default", as I recently found out.

One thing that PubSubHubbub does require is that the push notifications have to be made to a web server. This means that PubSubHubbub is highly unlikely to be used for real-time client data delivery, because client applications don't tend to run their own web server. HTTP Streaming is therefore a more accessible real-time data delivery mechanism, since any technology that can make a web request and hold a persistent HTTP connection can receive real-time push notifications. This means that by offering an HTTP Streaming API, a service can be consumed by anything from a hardware embedded system to a mobile application, as long as it is connected to the Internet.
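For comparison, the sketch below shows roughly what a PubSubHubbub subscriber has to run: a publicly reachable web server whose callback URL answers the hub's verification request by echoing `hub.challenge`, and then accepts the notifications the hub POSTs to it. The handler name, the `/callback` path, the challenge value and the `run_demo` function (which plays the part of the hub locally) are all hypothetical, and this is an illustration of the callback requirement rather than a complete subscriber.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

received = []  # notification bodies pushed by the (simulated) hub


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Subscription verification: echo the hub.challenge value back to the hub.
        query = parse_qs(urlparse(self.path).query)
        challenge = query.get("hub.challenge", [""])[0].encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(challenge)))
        self.end_headers()
        self.wfile.write(challenge)

    def do_POST(self):
        # Content notification: the hub POSTs the updated feed to the callback URL.
        length = int(self.headers.get("Content-Length", 0))
        received.append(self.rfile.read(length).decode("utf-8"))
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass


def run_demo():
    """Start the callback server, then act as the hub against it."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}/callback"
    try:
        verify_url = base + "?hub.mode=subscribe&hub.challenge=abc123"
        with urllib.request.urlopen(verify_url) as response:
            echoed = response.read().decode("utf-8")
        push = urllib.request.Request(base, data=b"<feed>update</feed>", method="POST")
        with urllib.request.urlopen(push) as response:
            status = response.status
        return echoed, status, list(received)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The important difference from a streaming client is the direction of the connection: here the consumer must listen on an address the hub can reach, which a phone app or an embedded device behind a firewall typically cannot do.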

The other thing that PubSubHubbub does is define the message format. This can be seen as both a positive and a negative, but since we are seeing JSON continue to win over XML as the preferred data format, it looks like PubSubHubbub will have to evolve away from XML to keep up, as this question on Quora suggests.
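As a rough illustration of the format question, the sketch below parses the same update twice: once as an Atom feed, the kind of XML payload PubSubHubbub is built around, and once as JSON. Both payloads and the JSON shape (`items`, `id`, `title`) are invented for this example and do not represent any particular service's format.

```python
import json
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace, in ElementTree notation

# Hypothetical payloads carrying the same single item.
atom_payload = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>tag:example.org,2010:1</id><title>Hello</title></entry>
</feed>"""

json_payload = '{"items": [{"id": "tag:example.org,2010:1", "title": "Hello"}]}'


def items_from_atom(text):
    """Extract id and title from every Atom entry, handling the XML namespace."""
    root = ET.fromstring(text)
    return [
        {"id": entry.findtext(ATOM + "id"), "title": entry.findtext(ATOM + "title")}
        for entry in root.iter(ATOM + "entry")
    ]


def items_from_json(text):
    """The JSON version maps directly onto native dictionaries and lists."""
    return json.loads(text)["items"]


if __name__ == "__main__":
    print(items_from_atom(atom_payload))
    print(items_from_json(json_payload))
```

Neither version is difficult, but the JSON one needs no namespace handling and yields native data structures directly, which goes some way towards explaining its appeal for client applications.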

This is an exciting trend which will most probably continue and will lead to us seeing truly real-time applications on any web-enabled device. It certainly doesn't signal the end of the road for PubSubHubbub, which has its roots firmly in RSS (and XML), along with so much of the Internet. However, HTTP Streaming could become the de facto standard for client push applications.

Photo via Blake Patterson
