Hunk Offers Big Data Analysis Muscle for Hadoop Systems

Big data platform Splunk today released an analytics product designed for on-the-fly analysis of data stored in Hadoop. The Splunk-analytics-for-Hadoop product ships under the sub-brand “Hunk”. Using the Splunk API along with a library of SDKs in popular development languages including JavaScript, Ruby and Python, users can point Hunk at any Hadoop-stored data and almost immediately start running search queries and producing analytics visualizations. Results from a Hunk query can be returned in JSON format or transformed into widgets that can be embedded into an enterprise’s own application interfaces.

Sanjay Mehta, Vice President of Product Marketing at Splunk, told ProgrammableWeb:

“We have become experts at poly- and unstructured data. This is one of the most complex, most difficult sectors of big data, but also one of the most valuable.”

Hunk custom dashboard

A key goal of Hunk is to allow the “democratization of data” within an organization, so that data is accessible and useful to a wider range of people than just data scientists. “Hunk opens up access to data to those across the whole organization,” Mehta said.

For developers using APIs for search queries, Hunk removes many of the usual latency frustrations. When running a search query against a large database, a developer may wait five hours for the results, only to realize that a join in the query was wrong and that the search must be repeated after changing a single line of code. With Hunk, search results begin streaming almost immediately, so queries can be edited "on-the-fly" or extended with additional search terms for finer-grained analysis.

Hunk’s entire interface uses a set of APIs against the data engine to allow for ad hoc analytics to be carried out by developers and end users within an enterprise. The results of data queries can be displayed in UI widgets that can be embedded into an enterprise’s own application, or can be returned in formats like JSON for use in other data analytics workflows.
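To illustrate why streamed results matter, here is a minimal Python sketch of consuming JSON results one at a time and keeping a running tally, so that a preview is available before the full search finishes. It does not use the Splunk API or SDKs: the newline-delimited JSON layout, the `host` and `status` fields, and the helper names `iter_results` and `running_counts` are all assumptions made for illustration, and the in-memory stream stands in for a live response.

```python
import io
import json
from collections import Counter
from typing import Iterable, Iterator, TextIO


def iter_results(stream: TextIO) -> Iterator[dict]:
    """Yield one result at a time from a newline-delimited JSON stream.

    The one-object-per-line layout is an assumption for this sketch,
    not a documented Hunk output format.
    """
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def running_counts(results: Iterable[dict], field: str) -> Iterator[Counter]:
    """Yield an updated tally after each result arrives."""
    counts: Counter = Counter()
    for result in results:
        counts[result.get(field, "unknown")] += 1
        # Copy so each snapshot is independent of later updates.
        yield counts.copy()


# Stand-in for a live response body; the field names are hypothetical.
sample = io.StringIO(
    '{"host": "web01", "status": "500"}\n'
    '{"host": "web02", "status": "200"}\n'
    '{"host": "web01", "status": "200"}\n'
)

snapshots = list(running_counts(iter_results(sample), "status"))
for snapshot in snapshots:
    print(dict(snapshot))
```

Because each snapshot is available as soon as its result is parsed, a mistake in the query would show up in the first few results rather than after the whole job completes.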

Hunk visualization report

In an interview with ProgrammableWeb, Clint Sharp, Splunk’s Senior Product Manager, said:

“Hadoop is incredibly powerful but it doesn’t come easy to use. We wanted to get people up and running in minutes. It’s easy to get data into Hadoop but many users are not very successful at getting data out. It doesn’t require pre-building of tables, rows and columns, so you can very easily explore the data in Hadoop and worry about the analytics once you know what is in there.”

Hunk uses a patent-pending virtual indexing technology to allow instant exploration and visualization of Hadoop-stored data. Hadoop is an open-source framework for storing and processing big data across clusters of machines. According to IT industry analyst firm IDC, usage of Hadoop platforms is growing at a rate of around 60% a year, yet the bulk of this is for data storage rather than analytics. Splunk hopes Hunk will fill a market void in which many enterprises are not yet able to mine machine data for insights such as customer acquisition predictors, funnel pipeline analysis or just-in-time inventory management.

Mark Boyd is a ProgrammableWeb writer covering breaking news, API business strategies and models, open data, and smart cities.