Concerns raised by the release of the U.S. Energy Information Administration’s latest API-enabled tool highlight the problems government agencies will increasingly face when building in the open. How the EIA is responding to these valid criticisms shows the kind of engagement that governments at all levels will need to cultivate.
Last week, ProgrammableWeb ran an article on the EIA’s new Excel tool, powered by its API. The tool can also automatically pull in economic data from the Federal Reserve Bank of St. Louis’ API.
Creating API Tools for Proprietary Products
While many were supportive of the new release, it raised some eyebrows because it prioritizes embedding the government-provided API in a proprietary commercial tool (the add-on works only in Microsoft Excel on Windows) while the source code behind the tool has not been released. From a transparency perspective, this makes it difficult for end users and open data advocates to verify that the tool uses the same API endpoints that are available through the public API.
The approach contrasts with that of the U.S. Department of Labor, for example. It has apps for Department of Labor statistics that use Labor’s API to update the latest figures for the consumer price index, unemployment rate, productivity indexes and other labor indicators. The department’s API developer portal links to the open source code for the stats app, which is published on GitHub.
Mark Elbert, director of the Office of Web Management at EIA, confirms that the tool is built entirely on the API, but says the source code needs to remain private because it includes API registration data:
The reason it is not open source is not so much the data, it is the methodology, as the tool requires a registration key for APIs. We need to ensure that we don’t have accidentally abusive usage. For example, it would be very easy to be very aggressive with your API calls, say, a million calls a day from a single user. So we have a single point of registration, so if there are load problems, we can speak directly with the API consumer. So there are API registration keys embedded in the file, and that is the primary reason the source code is encrypted.
Elbert confirms: “Yes, the Excel tool is nothing but a wrapper for the public API.” To assuage concerns, EIA will update its download site to “make that very explicit,” he says.
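The registration-key approach Elbert describes makes it possible to attribute traffic to individual API consumers. As a rough illustration only, and not EIA's actual tooling, the Python sketch below tallies calls per key per day and flags the heavy ones. The log format (one `(api_key, day)` pair per call), the function name and the threshold constant are all assumptions made for this example; the default simply borrows Elbert's "a million calls a day" figure.

```python
from collections import Counter

# Hypothetical threshold, borrowed from Elbert's "a million calls a day" example.
DAILY_CALL_LIMIT = 1_000_000


def flag_heavy_keys(log_entries, limit=DAILY_CALL_LIMIT):
    """Return {(api_key, day): call_count} for keys at or above the limit.

    log_entries is an iterable of (api_key, day) pairs, one pair per API call.
    This log format is an assumption for illustration, not EIA's real schema.
    """
    calls = Counter(log_entries)
    return {key_day: count for key_day, count in calls.items() if count >= limit}
```

An agency could then contact the owner of each flagged key directly, which is the kind of conversation a single point of registration is meant to enable.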
Managing Bulk Usage of a Government API
How government agencies handle bulk downloads of data without overloading their APIs will continue to be a key issue, especially for departments like EIA that are releasing a lot of data. The New York City Metropolitan Transportation Authority (MTA) has a similar problem with its subway data feeds. Developers registering for an API key must agree to host MTA’s data feed on their own servers and to have their applications draw on that copy rather than on MTA’s data infrastructure. For agencies like EIA, scaling API usage may not be a problem for now, and can be managed by reviewing logs of registered users’ API behavior and reaching out to individual users.
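The MTA-style arrangement, in which a developer's applications read from the developer's own copy of the feed, can be sketched as a small cache placed in front of the upstream API. This is a minimal Python illustration under stated assumptions, not MTA's or EIA's code: the `CachedFeed` class, the `fetch` callable and the 30-second refresh interval are all hypothetical.

```python
import time


class CachedFeed:
    """Serve a locally held copy of a feed, refreshing it at most once per interval."""

    def __init__(self, fetch, max_age_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch              # callable that downloads the upstream feed
        self._max_age = max_age_seconds  # hypothetical refresh interval
        self._clock = clock              # injectable so the cache can be tested
        self._data = None
        self._fetched_at = None

    def get(self):
        """Return the cached feed, calling upstream only when the copy is stale."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._max_age:
            self._data = self._fetch()
            self._fetched_at = now
        return self._data
```

However many end users call `get()`, the upstream service sees at most one request per interval from this developer.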
Elbert also acknowledges the concern that the agency has created a tool that embeds the API in an add-on for a commercial Microsoft product, but argues that this is where the majority of end users are. Elbert repeats the comment he made in his original interview with ProgrammableWeb: the bulk of the analysts for whom the tool is intended live in Excel spreadsheets every day. As such, “the reception we have had from analysts is almost ecstatic. A lot of these users don’t have a programming team, and in the past they have been taking our data and compiling time series into their own spreadsheets. This automates that process for them.”