Do you remember what it was like to figure out what was going on in your city fifteen years ago? You probably called your friends. Less than 20% of U.S. households were online. People learned about local news and events from newspapers, TV and their friends. I remember a lot of nights when we were frustrated that “nothing cool was going on.” Information was scarce, and people still paid for it.
Fast-forward to today, and this is obviously no longer the case. There is an abundance of information, a flood. Now the issue isn’t finding something to do but filtering through everything that you don’t want to do. Instead of us paying for information about what’s going on, businesses now pay for the privilege of telling us.
Just because we, the consumers, aren’t paying for that information doesn’t mean nobody is. Companies like Groupon have hired thousands of people to help them compile and curate local deals. Where do you think that information goes? Into a huge database. They’re also creating a lot of “data exhaust,” such as user buying patterns (users who bought x also bought y), pricing information (a dinner for two at a similar venue costs x in one city and y in another), etc.
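To make the “users who bought x also bought y” idea concrete, here is a minimal sketch of how that kind of byproduct can fall out of ordinary purchase records. The deal names and the `orders` list are made up for illustration, and this is simple pair counting, not a description of how Groupon or anyone else actually does it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase history: each set is one user's purchased deals.
orders = [
    {"sushi dinner", "spa day"},
    {"sushi dinner", "wine tasting"},
    {"sushi dinner", "spa day", "wine tasting"},
    {"spa day"},
]

def co_purchase_counts(orders):
    """Count how many users bought each unordered pair of items."""
    counts = Counter()
    for items in orders:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

def also_bought(item, orders):
    """Return (other_item, count) tuples for items bought alongside `item`."""
    related = Counter()
    for (a, b), n in co_purchase_counts(orders).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return related.most_common()

print(also_bought("sushi dinner", orders))
```

Nobody set out to collect this table; it is a side effect of selling deals, which is exactly what makes it “exhaust.”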
There are a lot of companies like Groupon that generate a lot of data, but whose primary business is selling something else. Companies like MailChimp and OK Cupid are great examples of businesses that understand the value of their data byproduct and frequently blog about it.
At Infochimps, we are building a platform to help companies in the business of selling data, and also those that aren’t, to monetize their data assets. We’re often asked to help determine how much to charge for data, and we usually look at the following factors:
1. What is the scarcity?
Some customers need data in real time, and are willing to pay a premium for it. For instance, Gnip charges big bucks for access to the Twitter firehose, and SaaS companies like Radian6 pay for it to power their social media analytics dashboards. In other cases, customers need historical data, which can be hard to find, collect and compile. A file containing the last 40 years of NYSE open, close, high, low and volume data is very popular on Infochimps. We give that one away as a loss leader. Another key scarcity is trust. If your customer depends on the accuracy of the data, then you can charge for putting your company’s name behind it.
Lesson: Think about how scarce your data is. In the case of data, the scarcity might not only be the data itself, but it could be other factors, like timeliness, accuracy or trust.
2. What are the opportunity costs to the customer?
Companies like Yelp spend a lot of money to keep their local business database updated because businesses open, close and move all the time. Companies like Yelp pay other companies to collect this information because it allows them to focus on finding users, advertisers, and of course, improving their product.
Lesson: The more your prospective buyers value their time over their money, the more you can charge.
3. How difficult is it to store?
Companies like Dropbox help users store and manage all those messy files on their computers. Their “pro” plan only goes up to 100GB. What about files bigger than that? Some of our files are many TB in size (1TB = 1,000GB). As a result, Infochimps offers many data sets through the Infochimps Datasets API. This allows customers to ask for and get only the information they need when they need it, instead of managing their own data store.
Lesson: Customers will pay for someone else storing and managing the data they need.
4. How easy is it to find?
With so much information on the web, the issue isn’t whether or not the data exists, but rather how difficult it is to find. Until recently, a search for “data” on Google returned the Wikipedia entry for the Star Trek character. Companies like Infochimps, Factual, Microsoft and DataMarket are aggregating data just as Orbitz, Travelocity and Hotels.com have done for travel. They’ve done quite well.
Lesson: When setting the price of your data, consider the hoops customers would have to jump through just to get to it.
5. How easy is it to make sense of?
Most people are not data scientists, but anyone can gain insight from data. Companies like Chart.io and Visual.ly are bringing data visualization to the masses. Viz tool providers can extract value directly from end users because they are further up the value chain than the data itself. Look for more interesting tools to emerge in this space in the near future.
Lesson: Is your data full of great insights that are easily uncovered with the right tools? Making your data easily digestible by these tools increases the value of your data.
The Big (Data) Picture
There will never be less information available to us than there is now. Data will only become more abundant as more machines go online. If you don’t believe me, EMC predicts 35ZB of data online for 2020. That’s 35 trillion GB!
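If those units feel abstract, the arithmetic is easy to check. The sketch below assumes decimal (SI) prefixes, the same convention as the 1TB = 1,000GB figure used earlier; the constant and function names are just for illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
GB = 10**9
TB = 10**12
ZB = 10**21

def to_gb(num_bytes):
    """Convert a byte count to whole gigabytes."""
    return num_bytes // GB

print(to_gb(1 * TB))   # 1000
print(to_gb(35 * ZB))  # 35000000000000, i.e. 35 trillion
```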
The first major hurdle is an ever-increasing need to help people filter through that flood of information to find what they’re looking for – easily and effectively. In this landscape, discovery and curation become paramount, recognizing that much of this value may come from the long tail. Then, figuring out appropriate pricing models is the next step.
To price data properly, as we discussed before, it’s key to keep in mind the true scarcities at hand. If your data is timely, the scarcity of freshness is what you can price on. If your data is large, the scarcity of convenience will allow you to charge for the storage or quick access to specific portions of the data, as in an API. If the accuracy of your data is paramount, then the scarcity of trust affords you the ability to charge for putting your name behind the quality of your data.
As overall fluency with data grows, demand will continue to expand this emerging market. We’re excited to provide a platform around which this market can efficiently find the right mechanisms and the right prices for the consumption of data.