ProgrammableWeb recently caught up with Matt Sundquist, co-founder and COO of Plot.ly, a graphing and analytics startup. The Plotly API enables users to analyze and visualize data in one place, and is an important step toward building the infrastructure that will further democratize data science. Sundquist spoke about the Plotly API, how it compares with existing visualization tools, and how visual data science solutions can evolve.
PW-How do the Plotly APIs work? What are their uses for someone already using Tableau, ggplot2 or matplotlib?
Matt Sundquist (MS)-We love and have been inspired by those tools. Plotly is different because it’s web-based. Our goal is for people to have one place to analyze and visualize their data, and do so collaboratively and online. The Plotly APIs allow you to create an interactive, shareable graph online. You can do so from your language of choice with our APIs (Python, R, MATLAB, Julia, Perl, REST and Arduino). You can stream and import data (including from Arduino). Plotly also reads dates and times, and can be used with NumPy, pandas and Datetime. You send your script to Plotly to make a graph, then Plotly returns a URL where you can access the graph (see the code here).
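To make that round trip concrete, here is a minimal Python sketch. It only builds the kind of JSON-serializable payload a script might send (a trace with x and y values plus a layout) and stands in for the server with a hypothetical `fake_plotly_submit` function that returns a made-up URL. The function names, the payload keys and the `example.invalid` address are assumptions for illustration, not Plotly's actual API; the real client libraries handle authentication and the network call.

```python
import json
from datetime import date


def build_payload(x, y, title):
    """Bundle one line trace and a title into a JSON-serializable dict."""
    # Dates are converted to ISO strings so the payload can be serialized.
    x_values = [v.isoformat() if isinstance(v, date) else v for v in x]
    trace = {"type": "scatter", "mode": "lines", "x": x_values, "y": list(y)}
    return {"data": [trace], "layout": {"title": title}}


def fake_plotly_submit(payload, username, filename):
    """Hypothetical stand-in for the server: checks the payload, returns a URL."""
    json.dumps(payload)  # raises TypeError if the payload is not serializable
    return "https://example.invalid/~{}/{}".format(username, filename)


payload = build_payload(
    x=[date(2014, 1, 1), date(2014, 1, 2), date(2014, 1, 3)],
    y=[10, 15, 13],
    title="Daily readings",
)
url = fake_plotly_submit(payload, username="demo_user", filename="daily-readings")
print(url)
```

The point of the sketch is the shape of the exchange: data and styling go up as one description of the graph, and what comes back is a link that can be shared rather than a static image file.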
Plotly gives you total control of the graph, so you can style, color, and format everything from the API or GUI. Plotly has bubble charts, box plots, line charts, scatter plots, histograms, 2D histograms, and heatmaps. We also support log axes, error bars, date axes, multiple axes, subplots and LaTeX. Here is the Plotly gallery, which shows some of the graph types you can make.
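As a rough illustration of what styling from the API can look like, the sketch below describes a scatter trace with error bars and a logarithmic y-axis as a plain Python dictionary. The attribute names (`marker`, `error_y`, `yaxis`) follow the JSON figure format used by Plotly's open-source libraries, but treat them as assumptions and check the current reference before relying on them. The `styled_figure` helper and its sample values are hypothetical.

```python
import json


def styled_figure(x, y, errors, color="crimson"):
    """Describe a scatter plot with error bars and a log-scaled y-axis."""
    if not (len(x) == len(y) == len(errors)):
        raise ValueError("x, y and errors must have the same length")
    trace = {
        "type": "scatter",
        "mode": "markers",
        "x": list(x),
        "y": list(y),
        "marker": {"color": color, "size": 10},
        # Symmetric error bars, one value per point (assumed attribute names).
        "error_y": {"type": "data", "array": list(errors), "visible": True},
    }
    layout = {
        "title": "Signal vs. time",
        "xaxis": {"title": "Time (s)"},
        "yaxis": {"title": "Signal", "type": "log"},  # logarithmic axis
    }
    return {"data": [trace], "layout": layout}


figure = styled_figure(x=[1, 2, 3], y=[10, 100, 1000], errors=[1, 10, 100])
print(json.dumps(figure, indent=2))
```

Because the whole graph is a nested description like this, changing a color or switching an axis to a log scale is a one-line change to the dictionary rather than a redraw.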
Plotly lets you keep your graphs private, share publicly with a URL or share with your team. Here is our documentation on privacy, for Python, and you can see the same for other languages, too. Sharing does a lot:
- Invites others to collaborate with you
- Lets teams analyze together with stats, fits and functions
- Lets others style graphs with you
- Means others can comment on the graph and data right where it lives
That means no more emailing data and spreadsheets around, having to download graphs or take static screenshots and put them in a deck or email you can’t access anymore. You can simply send a Plotly URL to your team or share your project with them so you can edit together.
Plotly lets you add data to an existing graph. So if you already have a graph that you want to add data to, or use as the basis for a new plot, you can simply add onto that file. Normally, if you have a graph already created in a PDF, that would not be possible. But with web-based Plotly files, you can make a copy of your original graph, add new data and style away, creating two graphs. For example:
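A rough sketch of that copy-and-extend workflow, using plain Python dictionaries in place of Plotly files: the `add_trace_to_copy` helper and the sample figure are hypothetical, and the dictionary layout is an assumption about how a graph might be represented, not Plotly's actual API.

```python
import copy


def add_trace_to_copy(figure, x, y, name):
    """Return a new figure with one extra trace; the original is untouched."""
    new_figure = copy.deepcopy(figure)
    new_figure["data"].append(
        {"type": "scatter", "x": list(x), "y": list(y), "name": name}
    )
    return new_figure


original = {
    "data": [{"type": "scatter", "x": [1, 2, 3], "y": [4, 5, 6], "name": "2013"}],
    "layout": {"title": "Yearly totals"},
}
updated = add_trace_to_copy(original, x=[1, 2, 3], y=[6, 7, 9], name="2014")

# The original keeps its single trace; the copy has two.
print(len(original["data"]), len(updated["data"]))
```

The deep copy is what yields two independent graphs: the original stays as it was, while the copy carries the new data and can be restyled separately.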
You can also interplay between the API and GUI. Let’s say you start off by making your graph with R, and code it how you want. Then, you want to explore a different layout, graph type, legend placement, etc. Normally, you would need to code it again. But with Plotly, you can change your graph with the GUI instead of code. It’s fast and easy to find out “how would this look if it were a [graph type]”?
More broadly, Plotly has other features.
- You can do your analysis and graphing in one place, because the Plotly grid offers stats, fits, functions, and more.
- Plotly is interactive. When you share a graph, others can access the data and graph by clicking through to Plotly, zooming, or hovering on the graph. That means you’re viewing the data and graph together, painting a richer picture of your work. Or you can embed a graph in an article, as in the Washington Post, or in an IPython Notebook (click here for gallery).
- Plotly allows you to create, save, and apply custom filters to your data. That means you can, with just one click, apply any filter to your data to create a beautiful, styled graph.
- Plotly gives you a profile, so you can keep all your work in one place where others can access what you’re working on. Here’s one we like.
PW-Do you have any plans to tie up with raw data API providers like Quandl or DataMarket to help with data visualization (since data input is clearly critical to you)?
MS-We would love to work with data providers and are in talks with a few potential partners. Our newest integration is with WebPlotDigitizer, which allows people to pull data from any graph and then import it to Plotly (background and documentation here). A lightweight integration is our Graph in Plotly button, which lets you make files accessible for graphing in Plotly.
PW-What future do you see for statistical analysis, data mining and data visualization on the cloud? What technologies are leading the trend?
MS-Speaking only for myself, I think the future of analysis will revolve around creating accessible, collaborative and powerful tools. Two broader trends that seem to be happening in technology are (1) taking things that happen locally into the cloud; and (2) making things that you normally do alone social. Those trends will harness the real power of people and the cloud by making data and tools:
- Accessible, meaning that anyone can access the power of scientific computing without needing to code.
- Collaborative, meaning I can work with others to manipulate and analyze data, code and graphs at the same time.
- Proactive, as in I am much less likely to “miss” something because my suite of tools is mining data and events.
As for the technologies leading the trend, we love D3.js, because of how much control it gives you to define interactivity and create beautiful graphs.
PW-Describe your journey as a startup. When did you ideate, conceive and execute your funding plans?
MS-Jack Parmer and Alex Johnson were working together on databases for energy and solar companies. Jack studied Engineering Physics at Stanford and was formerly with Alion Energy; Alex did his PhD in physics at Harvard and was formerly with C12 Energy. They had trouble collaborating with colleagues, and grew tired of emailing PowerPoints, data sets and code around. While building databases for collaboration, they realized this was a common problem for everyone doing analytics and graphing.
Over the next few months, the team was formed. We are: Chris Parmer, an EE student at McGill with experience at Gridco Systems; myself (I was at Facebook); Ben Postlethwaite, a geophysicist formerly with Aurora Geosciences; Carmel Dudley, a physics student from MIT and former designer at Inflection; Nolan Browne, former managing director of the Fraunhofer Center for Sustainable Energy Systems; and Christophe Viau, formerly with Datameer, who holds a PhD in information visualization.
PW- What is your product vision?
MS-We want to solve a few problems:
- First, in order to query a database, store and clean data, analyze data, graph results, share the project, and discuss it, you have to use a lot of tools, technologies and file types. Downloading, maintaining, learning, organizing and switching between these wastes time.
- Second, collaborating on analysis and visualization is often impossible because we have to work alone on a desktop or be in the same place looking at the same screen with someone to truly edit the same data and graph. Often to collaborate, you have to email a data set, code, graphs, screenshots and an explanation of your work. Then, based on feedback and advice, you might re-do work.
- On top of that, conversations happen on email, removed from the graph and data. So if you want to refer back to a decision, graph or project, the relevant parts of that project (code, data, graph, discussion) will probably be distributed in different places.
- When you look at a graph, you can’t see or access the data behind it to do your own analysis, add data to it or share it, meaning you can’t validate, reproduce or contribute to an analysis or graph you see.
- There isn’t anywhere I can go if I just want to find raw data and work with people who are analyzing that data.
To solve those problems, we need one platform, where you can collaborate, analyze and visualize data, and control sharing. GitHub did this for code. It is an amazing platform for collaboration and has a dedicated community of awesome programmers. Our dream is to be a GitHub for data, where people can find and access data and graphs, join a community, and analyze and visualize their data.
Matt Sundquist, a co-founder and COO at Plot.ly, previously worked on the Privacy Team at Facebook and on product and privacy at a public records startup. Matt graduated with honors as a philosophy major at Harvard College, and has been a writer for SCOTUSblog.com and a Fulbright Scholar in Argentina. A Student Fellow of the Harvard Law School Program on the Legal Profession, he has published articles on the Supreme Court, privacy, the social contract and sundials. He has been cited in Senate Judiciary Committee Testimony, The New York Times and Yale Law Journal, and is a member of the CA Vital Statistics Advisory Committee. He lives in San Francisco.