How DataSift Survived Twitter's Merciless Business Behavior in the API Economy

This is part four of ProgrammableWeb's series on the End of the API Economy as We Know It. In part three we offered a case study on how Stitch found a way to succeed, despite the uncertainties inherent to the API economy.

In this last installment of our series, "The End of the API Economy As We Know It," we look at the story of DataSift. DataSift enjoyed extraordinary success early on working with Twitter firehose data. It was a strong, special relationship until Twitter terminated the licensing agreement in 2015. The termination could have been a catastrophic event for DataSift, yet the company prevailed. The details of how DataSift coped with the setback provide valuable lessons for any company that depends on third-party APIs to make money on the internet.

Twitter, TweetMeme and Flight 1549

If it were not such a calamity of near-fatal proportions, it would have been perfect cinema. A flock of Canada geese flew head-on into the engines of an Airbus A320-214 two minutes after takeoff from New York's LaGuardia Airport. Both engines shut down, forcing the airplane to make an emergency landing using the Hudson River as a runway.

The jet was a 167,000-pound glider. It had no engine power to control its landing. Yet it came down intact, skimming the waters of the Hudson River before coming to a floating stop. Almost as if by magic, ferries that normally shuttled people between New York and New Jersey appeared on the scene to get the passengers off of the sinking plane and to safety. The timespan from takeoff until the last person was evacuated from the aircraft was a mere 30 minutes. However, in that brief amount of time in January of 2009, things changed for the social media platform Twitter. It went from being a 140-character amusement to a viable news source. Not only were ferry workers rescuing passengers from the aircraft, they were also tweeting about it in real time. As Jack Dorsey, co-founder of Twitter, said, "Suddenly the world turned its attention (to us) because we were the source of news — but it wasn't us, it was this person in the boat, using the service, which was even more amazing."

It seemed as if everybody in New York was tweeting about the landing. Twitter was firehosing data about it and TweetMeme, the data aggregation company that would eventually become DataSift, was capturing every tweet in real time. TweetMeme's analysis algorithms transformed the thousands of captured tweets into a breaking story that made the Miracle on the Hudson the Story of the Day. Twitter had come into its own; so had TweetMeme.

DataSift, née TweetMeme, went on to become a premier provider of data curation and analysis services for data generated from Twitter. Sadly, the mutually beneficial association the companies developed would be short-lived. They parted ways at Twitter's instigation, for reasons that will be revealed below. The parting was an ordeal that took its toll on DataSift.

The DataSift story is a tale worth telling, not only for the inspiration it provides but also because it exposes the risks inherent for any company that depends on third-party APIs for its commercial well-being. Operating in the API economy is a risky undertaking: things can change in a moment, with or without warning. The DataSift story attests to this fact.

Solving the Cold Start Problem

Adding an eyedropper full of water to a 5-gallon pail is unnoticeable. Add 2 gallons and you have an impact. Such is the "cold start" problem. In 2008, 6 million people were tweeting away on Twitter. Many of the tweets were amusing messages and memes passed around the internet as entertainment and provocation. But a growing number of tweets were real news generated by users and by standard news organizations such as Reuters, The Wall Street Journal, ABC News, BBC News, Al Jazeera, and Fox News, to name but a few.

Twitter was proving to be a viable platform for news companies. But there was a significant obstacle: the cold start. Generating content was laborious. Someone or something at the news organization had to type in the characters and create the shortened URL to embed in a tweet. Remember, at the time a tweet was limited to 140 characters, and Twitter didn't allow photo sharing until 2011.

Publishing a newsworthy tweet required effort. The tweets that did get published were but a drop in the bucket compared to all the newsworthy content available on the web. If someone could figure out how to get the unharvested news content into Twitter, it would be like adding the gallons of water needed to make Twitter a news source with impact. Nick Halstead figured it out with the Retweet button.

Today people take the share buttons provided by Facebook, LinkedIn, Twitter, Google+ and the like for granted. But ten years ago share buttons didn't exist. If the average user wanted to share content on Twitter, he or she needed to take the time to copy the URL of the content into a tweet. It was a time-consuming undertaking ripe for optimization. Nick Halstead saw the opportunity and created the Retweet button.

The Retweet button made it possible for web developers to embed a button on a web page that allowed consumers to share the page's URL on Twitter with nothing more than a mouse click. It was a big deal. Within a year of its release, the Retweet button was on 400,000 sites across the Internet.
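Mechanically, a share button of this kind boils down to a link that sends the current page's URL (and optionally a headline) to the platform's sharing endpoint. The sketch below, in Python, shows the idea; the endpoint mirrors Twitter's present-day web intent URL, while TweetMeme's original button routed through its own service, so treat the host and parameters here as illustrative rather than a description of the 2009 implementation.

```python
from urllib.parse import urlencode

def build_share_link(page_url: str, title: str) -> str:
    """Build a one-click share link for a web page.

    The host below is Twitter's modern web intent; TweetMeme's
    Retweet button used its own redirect service, so this is an
    illustrative sketch, not the historical endpoint.
    """
    params = urlencode({"url": page_url, "text": title})
    return f"https://twitter.com/intent/tweet?{params}"

link = build_share_link(
    "https://example.com/miracle-on-the-hudson",
    "Plane down in the Hudson, everyone safe",
)
print(link)
```

A publisher embeds the resulting link behind a button image; the reader clicks once, and the platform pre-fills the tweet. That single click, multiplied across hundreds of thousands of sites, is what turned the button into a firehose of inbound content.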

The Retweet button was pouring content into Twitter and contributing to the platform's growth. In 2009, 18 million people used Twitter; by 2010 the number had grown to 26 million. The Retweet button's contribution to that growth was not coincidental: the button was generating 750 million impressions per day.

The Retweet button was not Halstead's only foray into using the Twitter platform. TweetMeme also implemented a service that analyzed data coming out of Twitter and, upon identifying a tweet as newsworthy, republished it on the TweetMeme website. The emergency landing of US Airways Flight 1549 was one of its early scoops.

TweetMeme was making good money selling advertising on the site. The company was also getting really good at curating data coming off of Twitter. As it turns out, companies began to show more interest in the curation algorithms than in buying advertising. As the current CEO, Tim Barker, describes events at the time, "What was interesting was that the companies did come to us, companies like BBC, Dow Jones, Bloomberg and they said, 'I don't want to advertise on it, I want to know how you're getting this data'".

Twitter realized the power of the Retweet button, so much so that in 2010 the company introduced a share button of its own, the Tweet button. TweetMeme's founder Halstead welcomed the Tweet button. Twitter's button was going to replace the Retweet button, but there was a definite upside for Halstead. Twitter agreed to license some of the TweetMeme technology. Also, as part of the deal, Twitter granted TweetMeme permission to resell its data. The way Halstead saw it, the Retweet button was a nice utility, but one with limited value. And, republishing the news served only a fraction of the commercial market.

The management at TweetMeme understood that the market for consuming data and analysis from social media sites was vastly bigger than the one for advertising on a curated news site. According to Barker, "Why try to compete for 1% of the eyeballs on the media if you could power 100% of those platforms and applications."

There was much more to be had and TweetMeme wanted it. Permission to resell Twitter's data was a significant stepping stone to something much bigger. That next step was DataSift.

Powering The Platforms

Twitter and other social media platforms are a treasure trove of information, provided you can make sense of it all. But in 2010, making sense of the information, which came to be known as Big Data, required expertise well beyond that possessed by the average business analyst. Working with Big Data bordered on rocket science. So companies went out and hired scientists who had the skills required to work through the mountains of data sitting on sites such as Twitter, Google, and Facebook. The data scientists could do the heavy lifting of data analytics, but as far as day-to-day needs went, more was needed, particularly for business analysts working with datasets on the standard social media sites. Halstead understood the benefit of providing operational access to social media data to non-technical personnel. So the company went to work on tools focused on the needs and expertise of business analysts.

The Query Builder tool, which was DataSift's flagship product, provided a web page that allowed a business analyst to query data from the popular social media sites. Using Query Builder was as simple as selecting a platform, Twitter for example, and then applying search parameters such as age, location, date range, and hashtag. DataSift used its backend technology, which intermediated with the social media source, to retrieve and store datasets of interest. The technology could also infer information such as sentiment. Query Builder allowed business analysts to apply the skills they'd developed using mainstream business intelligence tools such as Business Objects, Cognos, and Tableau to social media-based Big Data. Companies loved it and DataSift took off.
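Behind a point-and-click tool like Query Builder sits a filter that tests each incoming record against the analyst's chosen criteria. The Python sketch below shows that idea in miniature; the field names, filter set, and `Tweet` record are hypothetical stand-ins, and DataSift's real backend exposed far richer criteria through its own filtering layer.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Tweet:
    # Hypothetical record shape, for illustration only.
    author_age: int
    location: str
    posted: date
    hashtags: list = field(default_factory=list)
    text: str = ""

def matches(tweet, *, min_age=None, location=None,
            date_range=None, hashtag=None):
    """Return True if a tweet satisfies every supplied filter."""
    if min_age is not None and tweet.author_age < min_age:
        return False
    if location is not None and tweet.location != location:
        return False
    if date_range is not None:
        start, end = date_range
        if not (start <= tweet.posted <= end):
            return False
    if hashtag is not None and hashtag not in tweet.hashtags:
        return False
    return True

tweets = [
    Tweet(34, "New York", date(2009, 1, 15),
          ["flight1549"], "Plane in the Hudson!"),
    Tweet(28, "London", date(2009, 1, 16),
          ["news"], "Unrelated chatter"),
]
hits = [t for t in tweets
        if matches(t, location="New York", hashtag="flight1549")]
print(len(hits))  # one matching tweet
```

A web form only has to translate the analyst's selections into keyword arguments like these, which is what let non-technical users compose queries without writing code.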
