Who Curates the Real-Time Web?

SXSW was the source of a flood of real-time information on the web. Attendees used social media tools to share what was being discussed, their thoughts and their experiences. That information was amplified further as it was re-shared (retweeted on Twitter) and as others expressed opinions about all things SXSW. But how do you ensure you don't miss an important piece of information from within your social media connections, or even from outside your normal social media circles? From an earlier post on Cadmus, an algorithmic Twitter feed service, you may be familiar with the idea of curation: filtering content to ensure that you don't miss the most relevant information. But who performs this curation, and what role does technology have in the process?

It is technology that has made it possible for us to produce, share and consume so much information. We are living in what is often referred to as an era of digital overload. It is fitting, then, that the cause of this data overload should also be its solution: technology, in combination with human input, can help us solve these new problems through automated curation of data.

SXSW ran a panel covering this subject: "Humans Versus Robots: Who Curates the Real-Time Web?". The panel consisted of representatives from curation services, including Henry Nothhaft from Trapit, Jim England from Keepstream, Sam Decker from Mass Relevance and Xavier Damman from Storify, with Megan McCarthy from Mediagazer chairing the discussion. Unsurprisingly, there wasn't a definitive answer to the question; the answer really depends on the data being curated. The panel agreed that automated curation (robots) is required where there is simply too much data for humans to curate. The consensus, however, was that once the volume of data has been reduced to a more manageable level, humans provide the best form of curation, since they can apply more context and objectivity to the data. Technology will of course improve, and we'll see the robots getting smarter and smarter.
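The panel's consensus can be sketched as a two-stage pipeline: an algorithmic pass cuts the raw stream down to a manageable size, and a human then curates what remains. The following sketch is purely illustrative; the function names, data shape and thresholds are all invented, not part of any service mentioned above.

```python
def robot_filter(items, keyword, max_items=50):
    """Stage 1 (robots): crude relevance filter to tame the volume."""
    relevant = [item for item in items if keyword.lower() in item["text"].lower()]
    # Keep only the most re-shared items so a human can realistically review them.
    relevant.sort(key=lambda item: item["retweets"], reverse=True)
    return relevant[:max_items]

def human_review_queue(items):
    """Stage 2 (humans): present the reduced set for editorial judgement."""
    return [{"text": item["text"], "approved": None} for item in items]

stream = [
    {"text": "Great SXSW panel on curation", "retweets": 12},
    {"text": "Lunch was amazing", "retweets": 2},
    {"text": "SXSW keynote starting now", "retweets": 40},
]

shortlist = robot_filter(stream, "sxsw")
queue = human_review_queue(shortlist)
print(len(queue))  # the human now reviews a short list, not the full stream
```

The division of labour mirrors the panel's point: the machine handles volume, while the final judgement of what is genuinely relevant stays with a person.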

Curation cannot be solved by any one technology; it will require smooth interaction and integration between numerous technologies. APIs will play a massive part in the curation process. We've already seen this directly through the Cadmus API and within services such as DataSift, which uses third-party services such as Klout, PeerIndex and Salience for influence and sentiment analysis. DataSift provides a delivery and curation service, but it can't do everything on its own, which is why it has outsourced some parts of the process to other services. As more curation techniques and algorithms are identified, we'll see them exposed through additional services and APIs, and in turn these APIs will be integrated into other curation services.
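To make the idea of combining third-party signals concrete, here is a hypothetical sketch that blends an influence score and a sentiment score into a single ranking. The scoring functions, weights and data are invented stand-ins for illustration only; they are not real DataSift, Klout, PeerIndex or Salience API calls.

```python
def influence_score(author):
    """Stand-in for a third-party influence API (Klout-style scale, 0-100)."""
    sample = {"alice": 72, "bob": 35, "carol": 50}  # invented scores
    return sample.get(author, 10)

def sentiment_score(text):
    """Stand-in for a sentiment API: -1.0 (negative) to 1.0 (positive)."""
    positives = {"great", "love", "useful"}
    negatives = {"broken", "hate", "spam"}
    words = text.lower().split()
    score = sum(w in positives for w in words) - sum(w in negatives for w in words)
    return max(-1.0, min(1.0, score / 3.0))

def curate(items, influence_weight=0.7):
    """Rank items by a weighted blend of author influence and sentiment strength."""
    def rank(item):
        influence = influence_score(item["author"]) / 100.0
        strength = abs(sentiment_score(item["text"]))  # strong opinions surface
        return influence_weight * influence + (1 - influence_weight) * strength
    return sorted(items, key=rank, reverse=True)

feed = [
    {"author": "bob", "text": "I love this panel"},
    {"author": "alice", "text": "Curation tools announced today"},
    {"author": "carol", "text": "The wifi is broken again"},
]

for item in curate(feed):
    print(item["author"])  # authors in curated order
```

The point of the sketch is the composition, not the scoring: each signal could come from a different service behind its own API, and the curation layer simply combines them.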

The amount of information being published to the real-time web is only going to increase, and as it does, so will the importance of curation. It might not be long before curation is essential to ensuring we don't all suffer digital overload. How long will it be before Twitter.com and the ever-increasing list of official Twitter clients, which presently only offer manual curation through selective follows and lists, join in and add curation to their services? Will they build it themselves, use a number of the available APIs (unlikely, given the amount of data) or make another acquisition?

Photo via Blake Patterson

Be sure to read the next Real Time article: Spend Less Time Building Real-time Apps With Pusher