Data Portability and Pushability with Gnip

Gnip today announced a much-needed piece of web services infrastructure - a proxy service that sits between Data Publishers (like Digg, Flickr, and Twitter) and Data Consumers (like Plaxo and MyBlogLog) to make moving structured data between services more efficient, flexible, and scalable.

Until now, the systems consumers use to monitor the activity of publishers have been ad hoc and typically based on polling each publisher's API. This is a challenge for data consumers as well as providers. API publishers face issues such as supporting multiple protocols and data formats, API throttling and management, identity and security, and, not least of all, scaling the service if it is successful.

Gnip hopes to replace that with a standardized model that combines both push and pull elements. This can benefit publishers and consumers by cutting latency from hours to seconds, easing the scaling problem of API load, and removing the hassle of data conversion. From the blog post announcing today's launch:

We built a system that connects Data Consumers to Data Publishers in a low-latency, highly-scalable standards-based way. Data can be pushed or pulled into Gnip (via XMPP, Atom, RSS, REST) and it can be pushed or pulled out of Gnip (currently only via REST, but the rest to follow). This release of Gnip is focused on propagating user generated activity events from point A to point B.
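To make the "publish once, push or pull out" idea concrete, here is a minimal in-memory sketch of such a bridge. It is not Gnip's implementation or API: the class `ActivityBridge`, its methods, and the activity fields are all hypothetical names chosen for illustration, and a real service would sit behind HTTP or XMPP rather than in-process callbacks.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List

Activity = Dict[str, str]  # hypothetical shape, e.g. {"actor": ..., "action": ..., "guid": ...}


class ActivityBridge:
    """Toy stand-in for a proxy between publishers and consumers (illustrative only)."""

    def __init__(self, backlog_size: int = 100) -> None:
        # Push side: callbacks registered per publisher.
        self._subscribers: Dict[str, List[Callable[[Activity], None]]] = defaultdict(list)
        # Pull side: a bounded backlog of recent activities per publisher.
        self._backlog: Dict[str, Deque[Activity]] = defaultdict(
            lambda: deque(maxlen=backlog_size)
        )

    def subscribe(self, publisher: str, callback: Callable[[Activity], None]) -> None:
        """Register a consumer that wants activities pushed to it."""
        self._subscribers[publisher].append(callback)

    def publish(self, publisher: str, activity: Activity) -> None:
        """Publisher posts once; the bridge stores it and fans it out."""
        self._backlog[publisher].append(activity)
        for callback in self._subscribers[publisher]:
            callback(activity)

    def pull(self, publisher: str) -> List[Activity]:
        """Consumers that prefer polling read the backlog from the bridge,
        not from the publisher's own API."""
        return list(self._backlog[publisher])


bridge = ActivityBridge()
received: List[Activity] = []
bridge.subscribe("digg", received.append)  # push-style consumer
bridge.publish("digg", {"actor": "alice", "action": "dugg", "guid": "story-1"})
pulled = bridge.pull("digg")  # pull-style consumer sees the same activity
```

The point of the design is that the publisher handles a single write per activity, while the cost of serving many consumers, by either delivery style, moves to the bridge.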

Gnip can be thought of as an 'activity bridge' for the social web, or as Nik Cubrilovic of TechCrunchIT put it, doing for social activity what pinging services have done for blogs. The difference is that the activities go beyond blog posting to include commenting, rating, sharing, tweeting, purchasing, attending and more. Gnip would like to normalize the names of all online activities - the full list of their initial set is here.


The advantage for publishers is that their activity streams can be posted once to Gnip, which takes over the push and pull servicing of many consumers and provides the data translations. The advantage for consumers is standardized and timely notification. As Joseph Smarr of Plaxo describes it:

For Plaxo users, the benefit is simple: when you digg a story or bookmark a link with, etc. you should see that activity show up in Pulse a lot quicker--often within 60 seconds, whereas before integrating with Gnip, it might have taken an hour or more. Starting today, Digg and should be very quick to update, with Flickr and Twitter hopefully following shortly.

The initial version of Gnip offers the notification service. Future versions scheduled for this year will include some very interesting enhancements:

  • Gnip Notifications: For "Data Consumers" they can poll for new data the moment it exists. Avoid throttling & decrease latency from hours to seconds. For "Data Providers" it lets them reduce API traffic by an order of magnitude while increasing distribution through aggregators.
  • Gnip Polling (coming soon): Offload API and RSS polling to Gnip and receive full content updates via your preferred protocol (REST, XMPP, ATOM, etc).
  • Gnip Transformation (coming soon): Receive standardized cross-service XML markup and turn integrating with new APIs into a plug-and-play experience.
  • Gnip Identification (coming soon): Let Gnip offer suggestions for your users' profiles through a variety of identity discovery mechanisms.

In the initial version of the notification service, Gnip reports to consumers only that an activity has occurred, along with the identification of the source and the GUID of the item that was rated or consumed. In later versions the activity itself will be part of the transmission.
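Since a notification of this kind carries only a pointer, a consumer still has to work out which items to fetch from each publisher. The following is a minimal sketch of that step, assuming a notification holds just a source and a GUID; the `Notification` class, its field names, and `items_to_fetch` are hypothetical and do not reflect Gnip's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Notification:
    publisher: str  # assumed field: the source of the activity, e.g. "digg"
    guid: str       # assumed field: identifier of the item that was acted on


def items_to_fetch(notifications: Iterable[Notification]) -> Dict[str, List[str]]:
    """Group notifications by publisher and drop repeats, keeping arrival order.

    The result tells the consumer which GUIDs to request from each
    publisher's own API, instead of polling everything on a timer.
    """
    seen: Set[Notification] = set()
    pending: Dict[str, List[str]] = {}
    for note in notifications:
        if note in seen:
            continue  # the same item was reported more than once
        seen.add(note)
        pending.setdefault(note.publisher, []).append(note.guid)
    return pending


batch = [
    Notification("digg", "story-1"),
    Notification("flickr", "photo-9"),
    Notification("digg", "story-1"),  # duplicate report of the same item
    Notification("digg", "story-2"),
]
pending = items_to_fetch(batch)
```

Once the activity itself is included in the transmission, this lookup step would no longer be needed for consumers that only want the event data.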


The protocol bridge graphic above shows the different formats that Gnip will support. The Gnip API uses standard REST calls for both producers and consumers, and includes language-specific convenience libraries in Ruby, PHP, Perl, Python, and Java. We have created a new Gnip API profile here.

The company is led by Eric Marcoullier, who founded MyBlogLog and sold it to Yahoo last year. The full list of partners on the Gnip community site includes delicious, Digg, Flickr, Twitter, MyBlogLog, and Six Apart.

For more details on the service see good writeups from Marshall Kirkpatrick and John McCrea.
