We have covered the BigML.com API for building machine learning models here before. Recently BigML made some innovative changes, including new data visualizations and new classification ensemble models, all available through the API. Here is an interview with Francisco J Martin, CEO of BigML.com.
PW- What are some usage stats on the BigML API that you can share?
Francisco- BigML's API is a little bit different from other well-known REST APIs in the sense that each resource creation implies an asynchronous task that can take from a few seconds to a few days, depending on the size of the input. So far our API has performed more than 240,000 tasks. Of those, 34,000 have been to build models, with over 50% of those models created in just the last month.
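Because each resource creation is an asynchronous task, a client typically has to poll until the resource is ready. The following is a minimal sketch of that pattern, not BigML's own client code: the `fetch` callable, the `"finished"` and `"faulty"` status labels, and the backoff settings are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict

# Hypothetical status labels, chosen for this sketch only.
FINISHED = "finished"
FAULTY = "faulty"


def wait_for_resource(
    fetch: Callable[[], Dict],
    interval: float = 1.0,
    max_interval: float = 60.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll `fetch` until the resource reports a terminal status.

    `fetch` is any callable returning the current state of the resource
    as a dict with a "status" key (for example, a wrapper around an
    HTTP GET). The wait between polls doubles up to `max_interval`.
    """
    waited = 0.0
    while True:
        resource = fetch()
        status = resource.get("status")
        if status == FINISHED:
            return resource
        if status == FAULTY:
            raise RuntimeError("resource creation failed")
        if waited >= timeout:
            raise TimeoutError("resource not ready after %.0f seconds" % waited)
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)
```

Doubling the wait keeps short tasks responsive while avoiding a flood of requests for tasks that run for hours or days.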
PW- What are some of the changes that have been enabled in the BigML API, and what changes are planned?
Francisco- Recently we made ensembles first-class citizens. That means you can now create, retrieve, update, and delete ensembles in only one API call.
You can also evaluate an ensemble in only one call.
We have also enabled the creation of new datasets using sampling.
PW- Can we link up the BigML API with other APIs (specifically the Quandl API for datasets) to automate data coming into and going out of the system? How can we use BigML to automate the scoring of models?
Francisco- Of course, you can… It's as easy as: bigmler --train "http://www.quandl.com/api/v1/datasets/GOOG/NASDAQ_AAPL.csv?&trim_start=2013-01-01&trim_end=2013-05-13" --objective Volume --name "Predict Apple's Volume"
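To automate this for other datasets or date ranges, the same command can be assembled programmatically. The sketch below uses only the URL pattern and the three flags (`--train`, `--objective`, `--name`) that appear in the command above; the helper names `quandl_csv_url` and `bigmler_command` are hypothetical and exist only for this example.

```python
import shlex
from urllib.parse import urlencode


def quandl_csv_url(dataset_code: str, start: str, end: str) -> str:
    """Build a Quandl CSV URL following the pattern in the command above."""
    base = "http://www.quandl.com/api/v1/datasets/" + dataset_code + ".csv"
    return base + "?" + urlencode({"trim_start": start, "trim_end": end})


def bigmler_command(train_url: str, objective: str, name: str) -> list:
    """Return the bigmler invocation as an argument list.

    Only the flags shown in the interview are used here.
    """
    return ["bigmler", "--train", train_url, "--objective", objective, "--name", name]


if __name__ == "__main__":
    url = quandl_csv_url("GOOG/NASDAQ_AAPL", "2013-01-01", "2013-05-13")
    cmd = bigmler_command(url, "Volume", "Predict Apple's Volume")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually run it (assumes bigmler is installed and configured):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems with the `&` in the URL and the apostrophe in the model name.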
PW- What are the different data visualizations you use at BigML? What are the reasons for choosing the sunburst type of data visualization as a new addition? What are some of the other graphs that you are looking at?
Francisco- In addition to our histograms for descriptive analytics and our interactive tree models, we have recently introduced sunburst visualizations for decision trees. I think these are an innovative way to represent decision trees (sunbursts aren't new, but using them to represent decision trees is) and are very powerful for certain use cases, such as the need to quickly and visually identify rules with very low confidence or support. We have a few more visualizations in the works that will help our users better understand and analyze the output of single models as well as ensembles.
Francisco- We’re not ready to disclose this type of information yet, but we can share with you that we have over 3,500 registered users who span a variety of industries and specialties. We’re seeing a lot of use for marketing analysis, HR, and financial services, as well as for more traditional data science and research. BigML saves companies significant money through increased productivity and lower costs for licenses, machines, configuration, etc. More significantly, BigML will help companies make better decisions and investments based on the analytics that come from our predictive models and ensembles.
Dr. Francisco Martin is co-Founder and CEO of BigML, Inc – the market’s leading hosted machine learning platform for predictive analytics. He can be reached on LinkedIn.
Better ensemble models, anyone? Just another BigML API call away?