Review: HP IDOL OnDemand APIs Help Organizations Make Sense of Data

Romin Irani
Jul. 15 2014, 11:55AM EDT

HP IDOL OnDemand is a suite of APIs focused on overcoming some of the tough challenges of the data boom. Data is being generated at levels never before seen, and the flow shows no signs of stopping or even slowing down. This data comes from many different sources, and, to make use of it, organizations must be able to tap into unstructured text and images effectively. Doing so will enable them to drive a variety of key applications, ranging from business intelligence to analytics to behavior analysis. This is no easy task: The data in question may not be in the format that you want (or need), and processing it involves text and image processing algorithms combined with natural language processing.

Powered by the HP Cloud, HP IDOL OnDemand is composed of APIs that address, among many other things, face detection, sentiment analysis, text extraction and text format conversion.

The OnDemand platform is currently open to all developers in Preview mode, and developers get a complete package when it comes to trying out the APIs. This includes:

HP IDOL OnDemand API List

The HP IDOL OnDemand platform exposes more than 20 APIs in the following categories. (We’ve listed the key APIs in each of the categories.)  

Format Conversion

  • Text Extraction: Supports more than 500 file formats.
  • Expand Container: Enables users to process and extract from PST, ZIP or TAR files. (For example, you can process all the emails in a PST file and then perform sentiment analysis on individual emails using the Sentiment Analysis API.)
  • OCR Document: Extracts text from an image.
  • Store Object: Can be used in a generic way to store the contents of any document and then use the contents as a reference in other APIs.
  • View Document: Renders a document in HTML form and highlights text.

Image Analysis

Indexing

This set of APIs can be used to manage an index (a collection of documents). The APIs include Create Text Index and Delete Text Index. The APIs can also be used to manage the documents within an index (Add to Text Index and Delete from Text Index).
This API set enables users to store objects in an index, which can then be referenced in other APIs. For example, you could have a background process that uploads several photos to this index. Another process would then pick up these references and pass them on to, say, the Face Detection API. Additional APIs in this category include List Indexes and Index Status.

Search

  • Find Related Concepts: Returns a list of the best terms and phrases in query result documents; the documents are returned and matched against specific data sets.
  • Find Similar: Returns documents that are conceptually similar to other texts or documents.
  • Get Content: Can be used to get content from documents present in public data sets or your own indexes.
  • Get Parametric Values: Allows users to perform faceted search--for example, filtering out search results based on specific parameter values.
  • Query Text Index: Allows users to search for content in various public data sets that OnDemand has indexed (for example, in Wikipedia).
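
To make the idea of faceted search concrete, here is a small local sketch in Python. It does not call the Get Parametric Values API: the sample documents and the field names (source, language) are made up for illustration, and the real API's parameters and response shape may well differ.

```python
from collections import Counter

# Hypothetical search results; the field names are illustrative only.
results = [
    {"title": "API News Roundup", "source": "BBC", "language": "en"},
    {"title": "Cloud APIs Explained", "source": "Reuters", "language": "en"},
    {"title": "Les API en pratique", "source": "BBC", "language": "fr"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)


def filter_by(docs, field, value):
    """Keep only the documents whose field equals the requested value."""
    return [doc for doc in docs if doc.get(field) == value]


if __name__ == "__main__":
    print(facet_counts(results, "source"))
    print([doc["title"] for doc in filter_by(results, "language", "en")])
```

The counts are what a search UI would typically show next to each filter option; the filter is what runs when the user picks one.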

Text Analysis

The set of APIs in this category addresses functions such as Text Tokenization, Sentiment Analysis, Language Identification, Entity Extraction and Highlighting Text.

Public Data Sets

Key to many of the APIs that IDOL OnDemand provides is the availability of Public Data Sets. These include Wikipedia across multiple languages and various news sites (starting from May 2, 2014, according to the documentation). The news sites aggregated include The New York Times, The Guardian, BBC and Reuters.  

Signing up

A single developer key provides access to all the HP IDOL OnDemand APIs. The APIs are currently in preview mode. (No information is currently available on pricing and final release of the APIs.) To sign up for a developer key, go to https://www.idolondemand.com/signup.html.  Within minutes, you should have completed the sign-up process and logged into the developer portal at https://www.idolondemand.com/login.html. Once logged in, click on Manage your API Keys and get your API key, which will be used to identify you when making calls to the various APIs.

 


Currently, the service provides two API keys per developer, with the ability to disable/revoke and track API usage per key.

API Explorer

With any API platform, it is important for developers to be able to try before they “buy” to determine the quality of the API and whether it’s a fit for the task at hand. One measure of quality is documentation. Developers can not only try out HP IDOL OnDemand APIs, but they can also fill out the request parameters and view responses right in the browser. Further, the HP IDOL OnDemand documentation is solid and explains each parameter clearly.

The Try option is available for each API page. We tested it out using the OCR Document API on a recent ProgrammableWeb article titled "How to Use DailyMotion API to leverage Video Search."

 
Let’s say you received a screen shot of the article, as shown below:
 

 
We saved the screenshot locally with the name pw-dailymotionapi.png. We then pointed to that file as an input source for the OCR API to extract text from. After clicking on the Try It button, we got the response shown below:

The JSON response gives back the text along with the position of the text block in the document (left, top, width and height).
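
As a rough sketch of how such a response might be consumed in Python: the "text_block" key and the sample values below are assumptions made for illustration. Only the text and the left, top, width and height fields are described in this review, so check an actual response before relying on these names.

```python
import json

# Hypothetical sample response; the "text_block" key and all values are assumed.
sample_response = """
{
  "text_block": [
    {"text": "How to Use DailyMotion API", "left": 40, "top": 25, "width": 600, "height": 48}
  ]
}
"""


def extract_blocks(raw_json):
    """Return (text, box) pairs, where box is (left, top, right, bottom)."""
    data = json.loads(raw_json)
    blocks = []
    for block in data.get("text_block", []):
        left, top = block["left"], block["top"]
        # Convert width/height into the opposite corner of the rectangle.
        box = (left, top, left + block["width"], top + block["height"])
        blocks.append((block["text"], box))
    return blocks


if __name__ == "__main__":
    for text, box in extract_blocks(sample_response):
        print(box, text)
```

Converting width and height into corner coordinates is a convenience for drawing or cropping the region; the original four fields carry the same information.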

You can try out all of the APIs in this manner--in most cases, you need either files uploaded from your local machine or a URL or reference object (in case you have saved the file into one of your own HP IDOL OnDemand indexes).

IDOL OnDemand APIs - Java and Python code

The HP IDOL OnDemand APIs are REST-based and use JSON as the data format.

Each of the APIs is governed by quotas, rate limits, maximum data size and data expiry. Access is governed by the API key you get when you sign up for a developer account. This API key parameter (apikey) will need to be passed with each API request.

The format of the API is as follows:
https://api.idolondemand.com/<platform-version>/api/<synchronous-or-asyn...

  • <platform-version> : OnDemand Platform version
  • <synchronous-or-asynchronous> : Some of the APIs support an asynchronous mode; this value will be either sync or async
  • <api-identifier> : API name--for example, sentimentanalysis, querytextindex
  • <api-version> : API version

With all of this said, the API URL for the OCR Document API is:
https://api.idolondemand.com/1/api/sync/ocrdocument/v1

(Note that when you try out the API, you get the Curl command for the API at the bottom. This is very useful when creating the entire URL in your code.)
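
If you prefer to assemble these URLs in code, a minimal Python 3 sketch could look like the following. The helper name build_api_url and its defaults are our own invention, and the path layout is inferred from the OCR Document URL above rather than taken from the official documentation.

```python
from urllib.parse import urlencode

BASE = "https://api.idolondemand.com"


def build_api_url(api_identifier, mode="sync", platform_version="1",
                  api_version="v1", **params):
    """Assemble an endpoint URL from the four components listed above.

    Extra keyword arguments (for example apikey or text) are URL-encoded
    and appended as query parameters.
    """
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    url = "{}/{}/api/{}/{}/{}".format(
        BASE, platform_version, mode, api_identifier, api_version)
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(build_api_url("ocrdocument"))
    print(build_api_url("analyzesentiment", apikey="<Your_API_Key>", text="not good"))
```

Letting urlencode handle the query string avoids hand-escaping spaces and punctuation in the text you send.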

Next we will look at Java and Python samples that invoke the API. We will be invoking the Sentiment Analysis API to determine the sentiment of the text we provide. Organizations can use this API to track their brands and monitor user feedback. You can integrate this code into a much larger application where, for example, you run a daily job to access the Twitter API, retrieve tweets related to your company, industry, etc., and then perform a sentiment analysis on all the tweets. You can do the same with emails, blog post comments and more.

Python Code

Here is a sample Python snippet used to take an array of tweets and invoke the Sentiment Analysis API on them:

# Python 2 (uses urllib2 and the print statement)
import urllib2
import json

apikey = '<Your_API_Key>'
url = "https://api.idolondemand.com/1/api/sync/analyzesentiment/v1"
tweets = ['ProgrammableWeb is a great resource for API News',
          'The developer experience with XYZ API is not good',
          'Check out ProgrammableWeb for API News']

for tweet in tweets:
    # URL-encode the tweet and pass it, with the API key, as query parameters
    api_url = url + "?apikey=" + apikey + "&text=" + urllib2.quote(tweet)
    response = urllib2.urlopen(api_url).read()
    decoded_data = json.loads(response)
    # The 'aggregate' object carries the overall sentiment and score
    aggregate = decoded_data['aggregate']
    print 'Tweet : ' + tweet
    print ">> Sentiment : " + aggregate['sentiment']
    print ">> Score : " + str(aggregate['score'])

If we execute this code, we get the following response:

Tweet : ProgrammableWeb is a great resource for API News
>> Sentiment : positive
>> Score : 0.87486231395
Tweet : The developer experience with XYZ API is not good
>> Sentiment : negative
>> Score : -0.670031048282
Tweet : Check out ProgrammableWeb for API News
>> Sentiment : neutral
>> Score : 0

The Sentiment Analysis API provides a score and an overall sentiment for each tweet.
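
For readers on Python 3, where urllib2 no longer exists, the same call can be sketched with urllib.request. The function names fetch_aggregate and summarize are our own, and the summary step is just one possible way a daily job might roll up a batch of results. The demo at the bottom reuses the three results printed above, so it runs without a network call or an API key.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

URL = "https://api.idolondemand.com/1/api/sync/analyzesentiment/v1"


def fetch_aggregate(text, apikey):
    """Call the Sentiment Analysis API and return the 'aggregate' object."""
    query = urlencode({"apikey": apikey, "text": text})
    with urlopen(URL + "?" + query) as response:
        return json.loads(response.read().decode("utf-8"))["aggregate"]


def summarize(aggregates):
    """Tally sentiments and compute the mean score over a batch of results."""
    counts = Counter(item["sentiment"] for item in aggregates)
    total = sum(item["score"] for item in aggregates)
    mean = total / len(aggregates) if aggregates else 0.0
    return counts, mean


if __name__ == "__main__":
    # Offline demo: the three results printed above, no network call.
    printed = [
        {"sentiment": "positive", "score": 0.87486231395},
        {"sentiment": "negative", "score": -0.670031048282},
        {"sentiment": "neutral", "score": 0},
    ]
    counts, mean = summarize(printed)
    print(dict(counts), round(mean, 4))
```

In a real job you would build the list by calling fetch_aggregate on each tweet and then pass it to summarize.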

Java Code

The equivalent code in Java is shown below. (Note that you will need to ensure that the JSON libraries and Apache HTTP Client libraries are added to your compile and run-time environment.)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URLEncoder;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.json.JSONObject;

public class SentimentSample {
    public static void main(String[] args) {
        String apikey = "<Your_API_Key>";
        String base_url = "https://api.idolondemand.com/1/api/sync/analyzesentiment/v1";
        String tweets[] = {"ProgrammableWeb is a great resource for API News",
                           "The developer experience with XYZ API is not good",
                           "Check out ProgrammableWeb for API News"};

        try {
            for (int i = 0; i < tweets.length; i++) {
                // Build the request URL with the API key and the URL-encoded tweet
                String url = base_url + "?apikey=" + apikey + "&text="
                        + URLEncoder.encode(tweets[i], "UTF-8");
                HttpClient client = new DefaultHttpClient();
                HttpGet request = new HttpGet(url);
                HttpResponse response = client.execute(request);

                // Get the response
                BufferedReader rd = new BufferedReader(
                        new InputStreamReader(response.getEntity().getContent()));
                StringBuffer responseJSON = new StringBuffer();
                String line = "";
                while ((line = rd.readLine()) != null) {
                    responseJSON.append(line);
                }

                // Parse the JSON response
                String strResponse = responseJSON.toString();
                JSONObject jsonResponse = new JSONObject(strResponse);
                String sentiment =
                        jsonResponse.getJSONObject("aggregate").getString("sentiment");
                double score = jsonResponse.getJSONObject("aggregate").getDouble("score");
                System.out.println(tweets[i]);
                System.out.println(">> Sentiment : " + sentiment);
                System.out.println(">> Score : " + score);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Resource clean up
        }
    }
}

Quotas and API Usage

The OnDemand API Developer site provides users with current information on API usage and quotas. Because the APIs are in preview stage, it is not clear what the quota limits will be when they are released. However, it is good to see a place where you can check out your usage.
The monthly quotas available per developer account are shown below:

A sample screen from my current usage is shown below:

Often while using cloud APIs, developers like to see the last few requests and their parameters. This is possible with HP IDOL OnDemand under the Request Log feature, which provides developers with the request log for their last 100 calls. There is also a Service Status page that gives the current status of all the services, along with any issues from the past few days.

HP IDOL OnDemand's administration console is preliminary compared with other PaaS providers', but it may be just the bones of what we will see when the platform comes out of preview mode and into a paid tier.

What APIs are you using to manage the data deluge at your organization? Let us know in the comments section below.

Romin Irani Google Developer Expert Cloud 2014. Romin loves learning about new technologies and teaching them to others. Follow me on Google+
