It may not have been the zombie apocalypse, but the recent government shutdown stopped the flow of open data via APIs. It also raised questions about how API developers and API providers can best handle a crisis in future. Now, 21 days later, we examine the impact and fallout of the shutdown, and look at how a similar situation could be avoided in the U.S. or anywhere developers are making use of government open data in their API workflows.
What the - Just Happened?
As Janet Wagner reported in a summary of the government open data landscape, a May 2013 executive order from President Obama required federal agencies to move toward systems that made their data accessible in machine-readable - that is, API - format. The intent was to open up the data so it could be used to unleash a new cycle of innovation, and encourage startups to consider creating businesses based on open government data sources. The power of open data access in fostering civic participation was also a key driver, as was the Federal Government's responsibility to be accountable and transparent to the public.
Since then, federal agencies - you can see a full list of open datasets at data.gov - have slowly been writing digital strategy policies and opening up their data either piecemeal or in chunks. The White House began employing specific experts - like API Evangelist Kin Lane - as Innovation Fellows or in other roles to assist government departments in building APIs and moving their data to accessibility via API.
On Oct. 1, the United States Congress failed to pass legislation to enable financing of the government for 2014, causing a partial shutdown. As a result, more than 800,000 federal employees were put on leave, and non-essential services were halted. As previously reported on ProgrammableWeb, this included a wide range of U.S. agency websites and the new APIs that were streaming access to some of the government's newly opened data sources.
During the shutdown, ProgrammableWeb's Executive Editor David Berlind reported that there was some confusion as to what data was still available and what data flow had been switched off.
On Oct. 17, an interim appropriations bill passed by Congress was signed into law and the wheels of Government began turning once more, reopening access to the data.
Government, industry and agency representatives have since been discussing the impact of the shutdown on their own API programs in a Google Groups forum, looking at how to better manage data delivery via API in any similar future scenario. Forum participants identified four key areas that must be addressed:
- Acknowledging that data access is a right/need: Under the Anti-Deficiency Act that governs service delivery under a shutdown, data services are considered a 'mere benefit' and not an essential service that should be maintained.
- Defining cloud services as a utility: Some agencies would be able to keep their current APIs open - even if new data was not being added - if they kept up monthly payments of cloud storage services, for example. However, they are prevented from doing so. If cloud storage services were defined as a utility by the U.S. government, the agency could pay in advance, allowing continued access to any existing APIs and the data already stored.
- Business and community confidence: The potential disruption of data can impact citizen confidence and trust in open data strategies. It could also affect businesses that are encouraged to utilize open data in new business models and to power startup innovation. The data tap can be switched off during political battles - or potentially any other type of federal crisis - which reduces public confidence in the power of open data and creates added risks for businesses that build open data into their commercial products.
- Lack of visibility on open data use: The lack of stories about businesses and communities making use of open data reduces the public's voice in arguing for greater stability and wider deployment of open data via API access.
A Storm's a-Brewing: The Impact on Business
Stormpulse is one of many businesses that make use of government open data sources in their business models. As noted by Eric Carter from ProgrammableWeb earlier this week, Stormpulse is a weather intelligence company that calculates the impact of extreme weather events on industries like logistics and commodities so that they can better adapt and improve business planning.
The business draws on government open data, but luckily for them, weather data was maintained: "Because we aggregate our data from so many different sources – both government and private – we are not reliant on government open data sources alone," says Joel Wright, Director of Sales at Stormpulse and Stormpulse's exec responsible for government relations. "Additionally, in the case of the shutdown, there was no government weather data interruption. If there was, though, we have a contingency plan and we manage risk by relying on many different data sources. The only way we were affected by the government shutdown was from a sales standpoint – we sell to every branch of the military and government disaster/defense organizations, so some conversations were put on hold."
Joel Gurin, former Chair of the White House Task Force on Smart Disclosure and author of the forthcoming book Open Data Now, believes all businesses should have the same stability of data access that Stormpulse has with weather data:
"For entrepreneurs developing with open data, the impact is like a temporary shutdown of airports and roads. Basically, you want to keep them open," says Gurin. "Data is an essential resource. When you look at any number of examples in healthcare, education, energy - open data is becoming essential for how any company functions in the 21st century and we need to see open data as a key resource."
Stabilizing the Open Data Infrastructure
Gurin notes that businesses with open data embedded into their product supply chain are not the only ones at risk during a shutdown: "Economic stats from the labor department are used in a variety of private sector economic planning activities, and data was missing for that period of time. There's also concerns over scientific data. We're seeing a shift toward [federally] funded science data being opened, and this would cause a disruption in the availability of data provided via government-funded projects," he says.
One idea floated by members of the government API Google Group was to define cloud services as a utility. In this way, agencies that stored their open data systems in cloud-based servers could continue paying for their monthly subscription during a shutdown and enable ongoing access to community and business. The OpenGovFoundation's Seamus Kraft believes that's a hard ask: "Frankly, we're quite a few years from the Federal Government's acquisitions policy having cloud-based services being recognized as a utility. There are a lot of changes that need to take place first," Kraft says.
As an alternative, some agencies - or the communities and businesses that access the open data that those agencies provide - can future-proof data access by keeping backup copies of their data in externally accessible cloud storage. The new government.github service aims to offer this platform to anyone who wants to store government open data in the cloud. Ben Balter, GitHub's government liaison chief, told ProgrammableWeb that having a copy of the government's open data sources stored on GitHub provides greater consistency for the community and business:
"The great thing about open collaboration — whether it's open source software, open data, or open government — is that it removes the government as the single point of failure. Even during the shutdown, although the government wasn't able to approve them, civic hackers continued to improve upon the open data policy and its associated documentation, proposed changes that were waiting for the government when they returned to their desk."
"GitHub is format agnostic, so whatever format your data is, GitHub is a potential collaboration platform. Rather than uploading data as a zip file or other static format to an FTP server or your agency website, if you commit a CSV (tabular) or GeoJSON (geographic) file to GitHub, [the service] provides an interactive, web based interface for data consumers to visualize and filter your data, no software necessary."
Balter notes, however, that providing open government data via GitHub is not always the best solution: "If it's a complex and multifaceted dataset, such as, say, census results, that may likely lend itself to a dynamic API," he says. At least for the moment, GitHub may not be the best tool where dynamic lookups are required, according to Balter. "That said, GitHub could be a great resource for getting a static data dump out of the agency, and directly into developers' hands, where it can be vendored into an application through something like Bower, which is regularly used for static assets hosted on GitHub."
Where a backup copy of more static data is stored on GitHub, businesses could re-point their request calls to their GitHub backup source during a shutdown.
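As a rough illustration of what re-pointing request calls might look like on the consuming side, the Python sketch below tries a list of data sources in order and uses the first one that responds with valid JSON. It is a minimal sketch under stated assumptions, not a description of how any business mentioned here works: both URLs and the names `http_get` and `fetch_with_fallback` are hypothetical.

```python
import json
import urllib.request

# Hypothetical endpoints, used only for illustration.
PRIMARY_URL = "https://api.example-agency.gov/v1/stations.json"
BACKUP_URL = "https://example-org.github.io/agency-data-backup/v1/stations.json"


def http_get(url, timeout=5.0):
    """Fetch a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_fallback(urls, fetch=http_get):
    """Try each URL in order; return (url_used, parsed_json) for the first success."""
    errors = []
    for url in urls:
        try:
            return url, json.loads(fetch(url))
        except (OSError, ValueError) as exc:
            # Network failures (URLError, HTTPError, timeouts) are OSError
            # subclasses; malformed JSON raises a ValueError subclass.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all data sources failed: " + "; ".join(errors))
```

A caller would pass the agency endpoint first and the backup second, for example `fetch_with_fallback([PRIMARY_URL, BACKUP_URL])`. Because the backup is a static copy, it may lag behind the live source, so an application would probably want to record which URL actually served the data.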
"On Github, any files pushed to a specially named branch are automatically placed behind a content delivery network and served through GitHub pages," says Balter. "For most datasets, we often don't think about it, but if the data is being served as a RESTful API, meaning each datapoint has its own unique URL, we don't necessarily need a complex or expensive server to make each query on the fly. Oftentimes, and especially with government data that changes infrequently, ... you can simulate that same experience to the degree that developers may not even know the difference, by emulating that RESTful query structure using files statically generated and committed to a Git repository. The advantage here, of course, is because the data is maintained within a distributed version control system, as that data updates, you can begin to see the changes over time, as well as collaborate directly on proposed improvements."
In defense of startups
Measures such as the GitHub model can only partly ensure better access to data in the face of future crises. Sharing stories of how businesses like Stormpulse use open data - and, even more so, stories from the next wave of startups that draw on government data sources deemed non-essential - will help demonstrate that open data is an essential business resource. Keeping APIs accessible in any future event would help build business confidence in using government open data in product design and service delivery.
Mike McGeary, Co-Founder of Engine Advocacy, told ProgrammableWeb: "The greatest danger of the government shutdown to business was the unnecessary uncertainty it created. Uncertainty is a curse for all businesses since it impacts their ability to plan effectively for investment and growth. In specific cases, when the federal government is actively engaged in programs that encourage startup ventures — open data initiatives for example — the negative impact is heightened."
This week, as government agencies pick up their digital strategies where they had left off at the end of September, the White House was able to release a write-enabled beta version of its We The People API. The write feature will enable developers to embed We The People services in their own web and mobile applications, allowing U.S. citizens to sign petitions without having to visit the White House home page. Perhaps one of the first uses could be to collect signatures from startups, businesses and community members who want government open data to be considered an essential service and core resource.