The whole point of investing in Big Data is to drive the development of a new generation of data-driven applications. The challenge most organizations will face is that once they build those applications, they will need a robust set of APIs through which to share the information those applications create.
While it’s still early days in terms of the evolution of Big Data in the enterprise, it’s clear that moving massive amounts of data across networks isn’t going to be a very practical way to share information. Instead, providers of Big Data analytics applications will rely on APIs to, for example, share models with visualization tools based on the insight gained from the patterns identified. Taking that idea one step further, APIs will also be the mechanism through which they will share models with other distributed applications.
A case in point is Alteryx, a provider of Big Data analytics that recently moved to integrate its application with data visualization software from Tableau Software via an API. Over time, Alteryx president and COO George Mathew expects to see a number of API-level integrations between the predictive analytics application developed by Alteryx and a range of other applications that will invoke a service-layer API provided by Alteryx.
Alpine Data Labs, another provider of data analytics applications, is moving down a similar path. Bruno Aziza, chief marketing officer for Alpine Data Labs, says there will be multiple levels of integration between Big Data applications. Applications that share access to the same Hadoop cluster will share data via Hadoop, but more distributed applications will invoke APIs to share data at the presentation layer.
For that reason, companies such as Informatica expect there to be demand for libraries of Big Data APIs. According to Peter Benesh, senior product marketing manager for messaging middleware, such libraries will make it easier for organizations to publish and subscribe to different classes of data. That is why Informatica this week announced both the ability to capture streaming Big Data and a data warehouse alliance with Hadoop distribution provider Cloudera.
Most developers of the actual Big Data applications will naturally prefer to work with raw data. But as these applications evolve, Big Data will become more nuanced. At the bottom of any given data lake will be massive amounts of raw data. But at the top there will be subsets of patterns of data and entire analytics models that can be invoked via an API. And once that happens, a rapid expansion in the number of data-driven applications should inevitably lead to an explosion of activity across the entire API economy.