The Search for the Value of Data

A thought-provoking article at O'Reilly's Radar blog is set to change the general view of APIs and the data they expose. Data, or content, is becoming more valuable, and the pressure to access data held by API publishers is only going to intensify, potentially unlocking the true value of that data.

In The Black Market of Data, Jud Valeski clearly lays out why this market came to be. APIs have been the driving force behind thousands of mashups and innovative applications created over the past few years, and the underlying force has been the data exposed by the publisher's API and consumed by the applications that need it. Data is king, and both the API publisher and the API consumer are clearly aware of that. What is interesting to note, however, is that the data the API consumer finds most valuable is not always exposed by the API directly. There is a need that is not being met easily. Think about all your personal data on various social sites, which many companies would love to get a handle on so they can understand you and market the right products to you.

Valeski lists three players in the game:

  • The user, who signs up at a site and agrees to its Terms and Conditions regarding the use of his/her personal data.
  • The publisher of that site, who creates an API and exposes relevant data.
  • Organizations or applications that use that API to access the data exposed.

API publishers are very clear in their terms about not exposing private data to API consumers. However, the information that API consumers need or value most is often not exposed by the API at all, so some resort to data scraping, or turn to companies that have already scraped this data and provide it for a fee.
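To make that gap concrete, here is a minimal, purely hypothetical Python sketch; the site, fields, and HTML are invented for illustration. The publisher's API returns only the fields it chooses to expose, while a scraper pulls an unexposed field (here, an email address) directly out of the profile page's HTML, which is exactly the kind of access most Terms of Service prohibit.

```python
import json
from html.parser import HTMLParser

# What a hypothetical publisher's API exposes: only the "public" fields.
api_response = json.dumps({"user": "alice", "posts": 42})  # note: no email

# The same profile rendered as HTML contains more than the API exposes.
profile_html = (
    '<div class="profile">'
    '<span class="name">alice</span>'
    '<span class="email">alice@example.com</span>'
    '</div>'
)

class EmailScraper(HTMLParser):
    """Collects text inside <span class="email"> elements."""

    def __init__(self):
        super().__init__()
        self.in_email = False
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "email") in attrs:
            self.in_email = True

    def handle_data(self, data):
        if self.in_email:
            self.emails.append(data)

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_email = False

# The API consumer sees no email field at all...
exposed = json.loads(api_response)
print("email" in exposed)  # the field simply is not there

# ...but a scraper recovers it from the page markup.
scraper = EmailScraper()
scraper.feed(profile_html)
print(scraper.emails)
```

The point of the sketch is not the parsing itself but the asymmetry: the valuable field exists on the site, just not in the sanctioned channel, which is what creates the incentive for a black market.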

So what could be a solution to this problem? Obviously, any solution should not compromise the user's privacy; that is clearly laid down in a site's Terms of Service and, in good faith, should be honored. The article suggests that the way to eliminate this black market is to unlock the value of the data residing on various sites. With the appropriate permissions and a price, there could be a middle ground where the publisher exposes the data via the API and API consumers agree to pay for access. It is, as a recent guest post put it, Data as a Service.

There is a very large amount of data already scattered across the many web sites we visit. Currently, there is no consensus on what that data is worth, and it is easy to see that producers and consumers of data do not agree on what should be exposed, creating a black market in which data is scraped from sites in clear violation of Terms of Service. With the necessary controls, the permission of all parties (producers, consumers, and end users), and a fair market for pricing data, interesting times lie ahead.