BigML API Gets Bigger

We have covered the BigML API before because of its innovative use of machine learning through an API tied to a very easy-to-use interface. Think of it as the Google Prediction API, but without a black-box algorithm, and with enhanced real-time data visualizations, a much better user interface and crisper documentation.

BigML is a REST-style API for creating and managing BigML resources programmatically. Using BigML you can create, retrieve, update and delete Sources, Datasets, Models, and Predictions using standard HTTP methods. However, BigML is currently limited to 1,000 requests per API key per hour. We hope this limit gets bigger, to further justify the "Big" in BigML, but we understand that BigML is a new player and is ramping up its infrastructure steadily and in a stable manner.
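A script that issues many requests may want to stay under the hourly cap on the client side. The following is a minimal sketch of a sliding-window limiter in plain Python; the class name and its methods are hypothetical and are not part of the BigML bindings, and the default numbers simply mirror the 1,000 requests per hour quoted above.

```python
import time
from collections import deque


class HourlyRateLimiter:
    """Sliding-window limiter: at most `max_calls` calls per `period` seconds.

    Hypothetical helper for illustration; not part of the BigML bindings.
    """

    def __init__(self, max_calls=1000, period=3600.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of the calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        # The window is full: wait until the oldest call expires.
        return self.period - (now - self.calls[0])

    def record(self):
        """Register that a request was just sent."""
        self.calls.append(self.clock())
```

Before each API call, a script would sleep for limiter.wait_time() seconds and then call limiter.record().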

The BigML secret sauce is Clojure. Clojure is a compiled language: it compiles directly to JVM bytecode, yet remains completely dynamic.

Bindings for BigML are available in many languages, including Java, Bash and R, but in Python you can just do a "pip install bigml" to get going.

Some recent enhancements have been announced to the Python bindings for BigML:

  • Development Mode - All datasets and models smaller than 1MB can be created for free. Just use one line of code: api = BigML(dev_mode=True)
  • Remote Sources - You can create new “remote” sources using URLs. The URLs can be HTTP or HTTPS, Amazon S3 buckets or Microsoft Azure blobs (e.g., azure://csv/iris.csv?AccountName=bigmlpublic)
  • Local Predictions - Using local models for predictions has a big advantage: they won’t cost you anything. You can now import a BigML model and use it to make predictions locally, generate an IF-THEN rule version of your model, or even generate a Python function that implements the model. This is very helpful if you are using BigML to build the models but, for reasons of confidentiality (say, with financial sector data), you want to score the predictions locally. The white-box nature of the model is further enhanced by the creation of IF-THEN rules.

You can get the rules that implement a BigML predictive model as follows:  >>> local_model.rules()
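To make the idea of an IF-THEN rule version concrete, here is a small sketch of how a decision tree can be flattened into rules and scored locally. It assumes a toy nested-dict tree format invented for this example; it is not BigML's model format and not the code behind local_model.rules(), and the thresholds are illustrative values only.

```python
def tree_to_rules(node, conditions=()):
    """Return one IF-THEN string per leaf of a nested-dict decision tree."""
    if "prediction" in node:  # leaf: emit the path that led here
        premise = " AND ".join(conditions) or "TRUE"
        return ["IF {} THEN {}".format(premise, node["prediction"])]
    field, threshold = node["field"], node["threshold"]
    left = tree_to_rules(
        node["left"], conditions + ("{} <= {}".format(field, threshold),))
    right = tree_to_rules(
        node["right"], conditions + ("{} > {}".format(field, threshold),))
    return left + right


def predict(node, row):
    """Score one row locally by walking the tree down to a leaf."""
    while "prediction" not in node:
        go_left = row[node["field"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["prediction"]


# Hypothetical toy tree; field names and thresholds are made up for the example.
toy_tree = {
    "field": "petal length", "threshold": 2.45,
    "left": {"prediction": "Iris-setosa"},
    "right": {
        "field": "petal width", "threshold": 1.75,
        "left": {"prediction": "Iris-versicolor"},
        "right": {"prediction": "Iris-virginica"},
    },
}

for rule in tree_to_rules(toy_tree):
    print(rule)
```

Because every leaf corresponds to exactly one rule, the rule list and the predict function always agree, which is what makes this kind of model easy to audit.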

  • Asynchronous Uploading - An async flag enables asynchronous creation of new sources.

e.g. source = api.create_source("sales2012.csv", async=True)
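With an asynchronous call, the script presumably has to check back later to see whether the source is ready. A generic polling loop for that could look like the sketch below; fetch_status and is_done are hypothetical callables supplied by the caller, and nothing here is taken from the BigML bindings.

```python
import time


def wait_until_ready(fetch_status, is_done, interval=2.0, max_tries=30,
                     sleep=time.sleep):
    """Poll `fetch_status()` until `is_done(status)` is true, then return it.

    `fetch_status` and `is_done` are hypothetical callables provided by the
    caller; `sleep` is injectable so the loop can be tested without waiting.
    """
    for _ in range(max_tries):
        status = fetch_status()
        if is_done(status):
            return status
        sleep(interval)
    raise TimeoutError("resource not ready after {} checks".format(max_tries))
```

In practice fetch_status would wrap whatever call retrieves the resource, and is_done would inspect the status it reports.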

  • Bigger files streamed with the Python package Poster - The modules in the Python standard library don't provide a way to upload large files via HTTP without having to load the entire file into memory first. Poster provides support for both streaming POST requests and multipart/form-data encoding of string or file parameters.
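The core idea behind streaming an upload is to read the file in fixed-size pieces instead of all at once. The generator below is a minimal sketch of that idea in plain Python; it is not Poster's implementation, and the function name and chunk size are arbitrary choices for illustration.

```python
import io


def iter_chunks(fileobj, chunk_size=64 * 1024):
    """Yield successive chunks of at most `chunk_size` from an open file.

    Only one chunk is held in memory at a time, so file size does not matter.
    """
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # empty read means end of file
            break
        yield chunk


# Demo on an in-memory file standing in for a large CSV.
demo = io.BytesIO(b"x" * 150)
print([len(c) for c in iter_chunks(demo, chunk_size=64)])
```

An HTTP client that accepts an iterable as the request body can then send each chunk as it is read.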

The updates to the Python bindings are expected to be rolled out to other languages, but Pythonistas can clone them from GitHub.

Machine learning classification models: build on the cloud remotely, score predictions locally, just another API call away!
