Government Data: Web APIs vs. Bulk Data Files

Janet Wagner, Data Journalist / Full Stack Developer
Mar. 28 2012, 08:00AM EDT

On December 8, 2009, the White House issued the "Open Government Directive," which requires government agencies to take steps toward openness and transparency. The directive includes the publication of information and data sets. This raises the question of how government agencies should provide the data:

Should open data be provided using web APIs or bulk data files?

Sunlight Labs and Peter Krantz recently published articles on this subject. Both discuss the considerations involved in choosing between web APIs and bulk data dump files. In most cases, bulk data dump files are the less costly and more effective solution.

Here are a few key differences between using web APIs vs. bulk data files:

  • Web APIs are designed for direct integration into web sites and applications. Bulk data files allow developers to retrieve all of the data and then design their own API and/or web application based on their project requirements.
  • Web APIs are systems that support other web applications built upon them. They are also very costly to design, build and maintain. Bulk data files are independent and can be updated automatically using cron jobs and other tools. They are simple to create and there is very little maintenance, making them far less costly.
  • Server load can be an issue with both web APIs and bulk data files. However, because web APIs are integrated with external applications, server load needs greater consideration. If one or more external applications make a large number of API calls, the load can affect other applications using the same API infrastructure.
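To make the first point concrete, here is a minimal sketch of how a developer might load a bulk file into a local database and build their own query layer on top of it. The CSV contents, the `spending` table, and the function names are all made up for illustration. A real bulk file would be downloaded from the agency's site rather than embedded as a string.

```python
import csv
import io
import sqlite3

# Stand-in for a downloaded bulk data file. The columns and rows are
# hypothetical and do not come from any real government data set.
SAMPLE_BULK_CSV = """agency,year,amount
Energy,2010,120
Energy,2011,150
Transportation,2010,300
"""


def load_bulk_csv(csv_text):
    """Load the bulk CSV text into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spending (agency TEXT, year INTEGER, amount INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["agency"], int(r["year"]), int(r["amount"])) for r in reader]
    conn.executemany(
        "INSERT INTO spending (agency, year, amount) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def total_by_agency(conn, agency):
    """Return the summed amount for one agency, or 0 if it has no rows."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM spending WHERE agency = ?",
        (agency,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = load_bulk_csv(SAMPLE_BULK_CSV)
    print(total_by_agency(conn, "Energy"))
```

Once the data is held locally, every query runs against the developer's own database, so the agency's servers only see the occasional file download.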

When it comes to government open data, bulk data files are the appropriate choice in most cases. However, web APIs can provide advanced query capabilities and functions. They are also a good choice if the source data changes very frequently, because they keep external applications from serving outdated content.
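The point about frequently changing data can be sketched as simple arithmetic. The function names and the intervals below are assumptions chosen for illustration, not figures from either article. The idea is to compare how often a bulk copy is refreshed with how often the source changes.

```python
from datetime import datetime, timedelta


def is_copy_outdated(copy_fetched_at, source_changed_at):
    """A local bulk copy is outdated if the source changed after the fetch."""
    return source_changed_at > copy_fetched_at


def missed_updates_per_refresh(refresh_interval, change_interval):
    """Rough count of source changes that occur between two bulk refreshes."""
    return refresh_interval // change_interval


if __name__ == "__main__":
    # Hypothetical case: the bulk file is fetched daily, but the source
    # changes every 15 minutes.
    daily = timedelta(days=1)
    print(missed_updates_per_refresh(daily, timedelta(minutes=15)))

    fetched = datetime(2012, 3, 28, 8, 0)
    changed = datetime(2012, 3, 28, 8, 15)
    print(is_copy_outdated(fetched, changed))
```

When the count is close to zero, a periodically refreshed bulk file stays current enough. When it is large, an application built on the bulk copy will often show old data, which is where a web API becomes the better fit.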

Janet Wagner is a data journalist and full stack developer based in Toledo, Ohio. Her focus revolves around APIs, open data, data visualization and data-driven journalism. Follow her on Twitter: @webcodepro and on Google+
