Data-warehouse-as-a-service provider BitYota is using APIs to make data analytics in the cloud scalable and elastic. The company already works with marketing customers who need to draw in semi-structured data from multiple source streams in order to create business intelligence. It now has its sights set on city and government use cases, where streaming in multiple data sources is also central to performing any sort of meaningful analysis.
“The genesis of our technology is performance and scalability of a database engine, but with the horizontal scale out of a Hadoop-like system,” says CEO and Founder, Dev Patel.
“Semi-structured data is usually in JSON, XML, or key-value formats. We are able to ingest that as first class objects of the data stream directly into our database. Analysts using BitYota do not need to learn any new tools: they can use SQL directly over these semi-structured data sets. And therefore, you get to analytics quickly. As soon as your data is loaded, you are able to run analytics.”
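To make the idea of running SQL directly over semi-structured records concrete, here is a minimal sketch. It does not use BitYota's engine or API, which this article does not document. Instead it assumes Python's built-in `sqlite3` module as a stand-in database, and the `json_get` function, the `events` table, and the sample records are all hypothetical names invented for illustration.

```python
import json
import sqlite3


def json_get(document, key):
    """Return the top-level scalar stored under `key` in a JSON string."""
    return json.loads(document).get(key)


def build_events_db(raw_events):
    """Load JSON records unchanged into an in-memory table, one document per row."""
    conn = sqlite3.connect(":memory:")
    # Register the hypothetical json_get helper so SQL can read JSON fields.
    conn.create_function("json_get", 2, json_get)
    conn.execute("CREATE TABLE events (doc TEXT)")
    conn.executemany(
        "INSERT INTO events (doc) VALUES (?)",
        [(json.dumps(event),) for event in raw_events],
    )
    return conn


def purchases_per_user(conn):
    """Plain SQL over the stored documents: total purchase amount per user."""
    query = (
        "SELECT json_get(doc, 'user_id') AS user_id, "
        "       SUM(json_get(doc, 'amount')) AS total "
        "FROM events "
        "WHERE json_get(doc, 'action') = 'purchase' "
        "GROUP BY user_id ORDER BY user_id"
    )
    return conn.execute(query).fetchall()


# Made-up usage events; note the records do not all share the same fields.
SAMPLE_EVENTS = [
    {"user_id": "u1", "action": "login", "device": "mobile"},
    {"user_id": "u2", "action": "purchase", "amount": 30},
    {"user_id": "u1", "action": "purchase", "amount": 12},
]

if __name__ == "__main__":
    print(purchases_per_user(build_events_db(SAMPLE_EVENTS)))
```

The point of the sketch is only that the analyst writes ordinary SQL while the records keep their original, loosely structured shape. How BitYota stores and indexes such documents internally is not described here.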
Initial Use Cases: A Marketer’s 360 View
BitYota’s website confirms that the most common use case is the one Patel described to ProgrammableWeb: giving marketers a 360 view of a customer.
Current customers use their application APIs to feed a copy of the usage data from their web and mobile applications into BitYota’s data warehouse storage layer. They integrate it with external APIs that stream in data from sources like web analytics, social media, and loyalty programs. Customers can then use another set of APIs to feed this data into their preferred business analysis and SQL tools to analyze it for user value and churn. Patel confirms that marketers across a number of industry verticals are using this 360-view analytics capability.
“We take all this data in different forms, and bring it into the database. Once the data is loaded in BitYota’s storage layer, then our compute layer is turned on at the time of analysis. The user does the analysis in a compute engine, and when completed, the user shuts down the compute instance. So the compute instance is charged only once.” Pricing is split between the storage and compute layers, and customers can choose either Amazon Web Services or Microsoft Azure for storage. Data can be added to storage via the user’s API as needed, and is run through compute only when analytics are required. A free trial and a developer account are available.
Because the data is fed into BitYota’s storage layer as needed from either APIs directly or from the customer’s cloud data storage, there is no data lock-in for end users. “Our ability to separate storage and compute gives a lot of flexibility to users for only paying when you are computing,” confirms Patel.
When creating the 360 view, each API’s underlying dataset needs a join key that connects each data row with its equivalent record in another dataset (for example, knowing which mobile app user should be connected with which Twitter account). “Join keys can vary, for example, social media authentication keys. Join keys are dependent on the analysis, and we provide a platform on which they can do it. The join key could be activity, time, user ID, location. But the 360 view of the consumer is definitely a very powerful use case,” says Patel, who counts SumAll as one of their customers.
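The role of a join key can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not BitYota's implementation: the function name `join_on_key`, the field names, and the sample records are all hypothetical, and the join key is assumed to be a shared `user_id`.

```python
def join_on_key(left_rows, right_rows, key):
    """Inner-join two lists of dict records on a shared join key."""
    # Index the right-hand dataset by join key for fast lookup.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        # Rows without a match on the other side are dropped (inner join).
        for right in index.get(left[key], []):
            joined.append({**right, **left})
    return joined


# Made-up records from two source streams that share a user_id.
APP_USAGE = [
    {"user_id": "u1", "sessions": 14},
    {"user_id": "u2", "sessions": 3},
]
SOCIAL_ACCOUNTS = [
    {"user_id": "u1", "twitter_handle": "@example_one"},
]

if __name__ == "__main__":
    print(join_on_key(APP_USAGE, SOCIAL_ACCOUNTS, "user_id"))
```

Here only `u1` appears in the combined view, because `u2` has no matching social record. Choosing a different key, such as time or location, changes which rows line up, which is why Patel describes the join key as dependent on the analysis.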
Seeking Out Government, City and Public Data Use Cases
Now, Patel hopes that BitYota can work with government agencies, city authorities, utilities and other public and open data publishers to offer them the warehouse-as-a-service capabilities for both internal departments and their emerging ecosystem of potential external data and API users.
“One of the things we are seeing is that there are all sorts of hidden assets of data in government agencies, so just putting data out there via API isn’t the answer. Analysts may have SQL skills but are not necessarily skilled at writing the code for pulling data from an API.”
Patel sees a natural alliance with open data publishers like Socrata, OpenDataSoft, and governments using the CKAN platform:
“One of the themes here is to partner with organizations and have all of this data available in a data warehouse: the data is available, laid out in an appropriate way, available in a scalable architecture (we have separated storage and compute so storage can scale, and then independently add compute and scale that up and down).
“The advantage is to pull all of this data from government systems (census data, realtime traffic feeds, crime data), then all of it is available in a cost-effective way and people just pay when they want to analyze the data.”
Such data and analysis could be instrumental in city planning. Some potential use cases could be:
- A logistics company could optimize its food distribution deliveries across a city or region, factoring in supermarket locations and realtime or historical traffic congestion data.
- Similarly, food co-ops or community groups could use a mix of data to advocate for strategies that address local food insecurity, and to lobby for community garden plots, local farmers markets, or a local produce store. They could draw on socio-economic demographics from the census, public transport routes, the current availability of supermarkets within walking distance, levels of crime and safety, and so on.
- A planning department or company could analyze the number of car parking spaces needed in new office buildings, or identify optimal locations for major new infrastructure such as sporting facilities, based on population projections, current transport modalities and realtime feeds, available space, and so on.
- Community and resident groups could use the data to advocate for reducing liquor licenses in their area, or to argue for a new medical center, library, sports venue, or childcare service in their neighborhood, by drawing in census, air quality, existing business location, public transport, traffic congestion, and crime and safety data.
Patel is excited by the opportunities BitYota could bring to the sector. “We are working on some proposals on this,” he says, but acknowledges government and city data is just “one of several markets” that BitYota is targeting at the moment. “We focus on opportunities where analysis of data from multiple streams is critical. This is one area where multiple streams of data are available and necessary for analytics.”
The difficulty will be the lead time that cities and governments currently take when considering new IT services or innovating on processes. While there are signs of open innovation in government, the prevailing culture is much more monolithic, with convoluted procurement processes that make it hard for new entrants to explain and demonstrate their business services to government customers. This often deters cloud-based businesses that are seeking to leverage the new distributed application architecture that is emerging. Why pursue civic opportunities when marketing, finance and gaming are ready to exploit new technologies? How BitYota generates civic opportunities for its product, and how it is accepted within government, should be a watching brief for any startup hoping to target this market.