This is part 3 of our series What is The Green Button API initiative and How It Took OAuth To An Entirely New Level. In part 2 we helped you understand the requirements and standards behind the Green Button Initiative.
One of the U.S. government’s My Data initiatives is Green Button, a secure way to communicate energy usage information electronically using standardized RESTful API web services and a common data format. In the earlier articles in this series, we defined its goals and described the technology standards on which it is built.
In this article, we describe the building blocks of Green Button technology for authorizing third-party access to data. For each, we show how the relevant standards address the general problem and how they were implemented for Green Button specifically.
Even if you aren’t actually engaged in building or supporting software related to energy usage, most of these techniques are applicable to a broad number of data sets that might be exposed through web services. This is the case when:
- Periodic creation of data occurs, such as with sensor data streams in the Internet of Things (IoT)
- Service providers have large numbers of customers with a single data custodian
- A diversity of third-party services arises for different subsets of available customer data
We present the topics by describing the problem they solve, the specific issues for Green Button (such as like content), and finally what was used for Green Button.
So let’s dive in, starting with OAuth 2.0.
OAuth 2.0 is designed for exactly the sort of use that Green Button requires: authorizing third-party access to retail customer resources held by a data custodian. OAuth’s key principle is that the data custodian and the third party should never exchange any private information about the retail customer. This is achieved through the clever use of web browser redirection. During the authorization process, the retail customer visits both the data custodian and third-party websites, and is often asked to authenticate along the way. The non-personal information about the authorization that the third party and data custodian need to share is piggybacked on the HTTP redirections as query parameters.
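To make the redirect mechanics concrete, here is a minimal sketch of how a third party might build the redirect to the data custodian and read the values that come back on the return redirect. It assumes the OAuth 2.0 authorization code grant; the endpoint URLs, client identifier, and scope string used with it are hypothetical placeholders, not Green Button values.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_redirect(authorize_endpoint, client_id, redirect_uri, scope):
    """Return the URL the customer's browser is redirected to, plus the state value.

    Only non-personal data travels in the query string: the client
    identifier, the return address, the requested scope, and a random
    state value that ties the response back to this request.
    """
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def parse_callback(callback_url, expected_state):
    """Extract the authorization code from the return redirect."""
    params = parse_qs(urlsplit(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: response does not match our request")
    return params["code"][0]
```

The third party would then exchange the returned code for an access token in a direct, server-to-server request, so the token itself never passes through the customer's browser.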
During our study of OAuth technologies, we found some minor challenges to its application among our stakeholder community. We believe these needs might be similar among some other stakeholders as well. They are specifically:
Custom response parameters. OAuth provides a limited set of details of the authorization established when providing the access-token. For example, OAuth provides the access-token itself but not a means to retrieve the authorization state or detect if it has changed.
In our case, the URI used in the data exchange phase is distinct to the authorization and can be used in subsequent requests using RESTful path navigation to details of the authorization. This URI contains a unique identifier that can be used to look up the corresponding access-token when the URI is provided in a notification (see the PUSH model discussed below).
In addition, we needed a method to allow the third party to retrieve the complete set of details about the authorization at any time using a second specific URI. This would include the overall duration of the authorization; OAuth provides the ability to convey the duration of the access-token, but not separately the life of the authorization, which can be many months. OAuth directly supports this extension mechanism. Additionally, when the retail customer seeks to change the parameters of an authorization – typically at the data custodian – the new status needs to be conveyed to the third party (see the discussion of Notification below).
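The difference between the life of the access token and the life of the authorization can be sketched as follows. This is an illustration only: the record fields and the three action names are hypothetical, and the authorization end time is assumed to have been read from the authorization resource described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AuthorizationRecord:
    token_issued_at: datetime
    token_expires_in: int            # seconds, from OAuth's expires_in
    authorization_ends_at: datetime  # hypothetical field: end of the overall authorization


def next_action(record, now):
    """Decide what the third party should do before its next data request."""
    if now >= record.authorization_ends_at:
        # The customer's grant itself has lapsed; a refresh cannot help.
        return "reauthorize"
    token_expiry = record.token_issued_at + timedelta(seconds=record.token_expires_in)
    if now >= token_expiry:
        # The grant is still valid; only the short-lived token has expired.
        return "refresh"
    return "use_token"


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
record = AuthorizationRecord(issued, 3600, issued + timedelta(days=180))
print(next_action(record, issued + timedelta(hours=2)))
```

Keeping the two lifetimes separate lets the third party tell a routine token refresh apart from the case where the customer must be asked to authorize again.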
Scope negotiation. In many OAuth applications, the scope of the authorization is obvious or limited to a few options. In the case of Green Button there are many degrees of freedom in what is specifically authorized. In such a case, chances are good that an “asked for” scope is not acceptable or practical for a specific combination of data custodian, retail customer, and third party.
We encountered a need for a scope negotiation protocol to ensure that only valid scopes were presented to and on behalf of consumers in the authorization grant service. Once the scope is agreed to, the standard OAuth sequence is used with that scope. For example, consider a situation in which a gas-and-electric utility company’s gas-only customer goes to a third-party site that analyzes electricity usage and picks her utility to authorize, even though she has no electricity service with that provider. It is important to discover this case programmatically in order to conduct a graceful dialog with the customer.
Notification for PUSH model. Curated data is data that has been verified to some extent, as opposed to raw data. Green Button data available through utilities is curated and typically is acquired daily. As a result, data is ready when it is ready. The natural model for providing such data to a third party is a “PUSH” model. However, OAuth only supports a “PULL” model from the third party. That is, the third party “GET”s resource data by providing an access token as evidence of the right to inquire. To accommodate this, we added a notification service that securely distributes the resource URIs of new data that is ready to be retrieved. Once it receives a notification, the third party uses the URIs and the proper access tokens to retrieve the data using the normal OAuth flow. Thus an effective “PUSH” pattern is achieved.
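The notify-then-pull pattern might look like the sketch below on the third-party side. Everything specific here is an assumption made for illustration: the URI layout (an identifier in the path segment after `Subscription`), the token store keyed by that identifier, and the `fetch` callable, which stands in for whatever HTTP client the third party uses.

```python
from urllib.parse import urlsplit


def subscription_id(resource_uri):
    """Return the path segment after 'Subscription', or None if absent.

    The URI layout is an assumption made for this sketch.
    """
    segments = urlsplit(resource_uri).path.strip("/").split("/")
    if "Subscription" in segments:
        position = segments.index("Subscription")
        if position + 1 < len(segments):
            return segments[position + 1]
    return None


def handle_notification(resource_uris, tokens_by_subscription, fetch):
    """Pull each notified resource with the access token that matches it.

    `fetch(uri, headers)` is supplied by the caller; URIs with no known
    token are skipped rather than requested without authorization.
    """
    retrieved = {}
    for uri in resource_uris:
        token = tokens_by_subscription.get(subscription_id(uri))
        if token is None:
            continue
        retrieved[uri] = fetch(uri, {"Authorization": f"Bearer {token}"})
    return retrieved
```

The notification carries only URIs, never data or tokens, so the data custodian still releases data solely in response to an authorized request.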
Bulk Transfer. Utility companies may have millions of customers, and third party service providers may serve a significant fraction of this number. It would be extremely inefficient for a third party to make potentially millions of daily requests for the new data using OAuth 2.0 for regulating access to these resources. For example, that might require a utility to support millions of transactions per day, usually in a fairly narrow time window.
For this reason, a bulk transfer mechanism was devised to allow data retrieval to occur with a single daily request.
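A quick back-of-the-envelope calculation shows the scale involved. The figures are assumptions chosen only for illustration: two million authorized customers and a two-hour window in which the day's data becomes available.

```python
def average_request_rate(request_count, window_hours):
    """Average requests per second if all requests fall inside the window."""
    return request_count / (window_hours * 3600)


# Assumed figures, for illustration only.
customers = 2_000_000
window_hours = 2

per_customer_rate = average_request_rate(customers, window_hours)
print(f"One request per customer: about {per_customer_rate:.0f} requests per second")
print("Bulk transfer: a single request in the same window")
```

Under these assumptions the per-customer approach averages roughly 278 requests per second sustained for two hours, with real traffic likely to be burstier than the average.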
The relationship requirements among the retail customer, data custodian, and third party are not symmetrical. Typically, the data custodian has a responsibility to maintain the privacy of, and control access to, the retail customer’s data and therefore must strongly authenticate the customer’s requests to authorize. On the other hand, the third party may have a very short-term or casual relationship with the retail customer, and so does not need strong authentication. And, of course, in some cases the relationship may be long term and require substantial trust between the third party and retail customer.
Let’s consider a specific example. Imagine a kiosk in a big box store that markets a solar panel installation. A retail customer might want a cost estimate for such an installation. She might provide general information about her house’s kind and shape and perhaps its orientation. To provide the estimate, the kiosk may request access to the customer’s energy usage information at her utility company, since this is a key factor in determining the economic payback. In order to render this analysis, the kiosk needs the data, but it does not need to know who the customer actually is. By routing the customer to the utility that has the data (data custodian), she can authorize the data transfer. If the data itself is anonymous (it contains no Personally Identifiable Information (PII)) the kiosk (here it’s the third party) need never know the consumer’s identity to render the results of the cost analysis. Yet, the utility requires the retail customer’s clear authentication before it is willing to provide the data to the kiosk.
For another example: An energy services company may provide a virtual audit of a client’s commercial facility. Opportunities for great energy savings might be discovered after a review of previous usage in comparison with other data sets such as weather patterns. In this case, it is essential for the third party to understand the location and other PII for the property. Such a relationship might persist during the deployment of remedial solutions to track performance.
The OAuth authorization sequence provides the third party (which OAuth calls the “client”) with a minimum set of parameters used to administer the authorization (access_token, token_type, expires_in, refresh_token, scope). This information is extended in Green Button to provide critical additional parameters. OAuth provides the ability to return customized data when authorizations have been established. This capability has been used for Green Button to provide the following additional parameters:
| Parameter | Description |
|---|---|
| resourceUri | The URI to retrieve the authorized data subscription. For example: |
| authorizationUri | A URI to retrieve the entire state of the authorization as a Green Button resource. This resource contains additional Green Button specific data including the actual period of the authorization. For example: |
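As a sketch of how a third party might handle such an extended response, the following separates the standard OAuth fields from the two Green Button additions. The JSON body, token values, and URIs are invented placeholders for illustration; they are not taken from the Green Button specification or from any real data custodian.

```python
import json

STANDARD_FIELDS = ("access_token", "token_type", "expires_in", "refresh_token", "scope")
GREEN_BUTTON_FIELDS = ("resourceUri", "authorizationUri")


def split_token_response(body):
    """Split a token response into standard OAuth fields and Green Button extras."""
    data = json.loads(body)
    standard = {key: data[key] for key in STANDARD_FIELDS if key in data}
    extras = {key: data[key] for key in GREEN_BUTTON_FIELDS if key in data}
    return standard, extras


# Hypothetical response body; every value below is a placeholder.
sample_body = json.dumps({
    "access_token": "example-access-token",
    "token_type": "Bearer",
    "expires_in": 3600,
    "refresh_token": "example-refresh-token",
    "scope": "example-scope",
    "resourceUri": "https://custodian.example/resource/Subscription/42",
    "authorizationUri": "https://custodian.example/resource/Authorization/7",
})

standard, extras = split_token_response(sample_body)
print(sorted(extras))
```

A client that only understands standard OAuth can ignore the extra fields, which is what makes this kind of extension safe to add.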
OAuth 2.0 typically is concerned with access to a single kind of data, or a few kinds, at a time. Often the nature of the data custodian makes the scope (OAuth’s term for what access is being authorized for) obvious, such as in providing access to one’s photos stored in a cloud. This scope is used in the third party request for authorization, according to the protocol.
For Green Button, however, scope is more complicated, and goes beyond the simple case of access to time series of Energy Usage Information. Many variables and options can pertain for a specific retail customer / data custodian pair. For example, a customer may be an electricity customer, a gas customer, or both. The customer may have five minute, hourly, or monthly data available. Additionally, the third party may have data preferences. For example, although monthly data might be available for a customer, a third party service may be designed to only exploit fine-grained data that is stored hourly in resolution or better.
For this reason, Green Button has a fairly detailed scope description language that allows the third party, data custodian, and retail customer to discover the compatible data service that works for all three. This “scope negotiation” is outside the OAuth protocol and occurs prior to the operation of the OAuth process. Its aim is to arrive at an acceptable scope string that the third party can use to execute the authorization.
During scope negotiation, similar to the OAuth method of browser redirects, the retail customer visits both sites (the process can start at either data custodian or third party). During this process, the customer identifies himself to the respective parties, and those parties identify themselves to one another. The customer, properly authenticated at the data custodian stop, can identify the data available for him to share and select what to offer the third party. Note that either the data custodian or the retail customer may vary the availability of data on any mutually agreed basis, which may include what might be shared with a specific third party. The third party can determine if the data allows its service to be successfully provided (that is: “Do I have permission? Great!”), whereupon it can begin the OAuth authorization service.
The resulting scope or scopes are shared with the third party. These are shared as one or more query parameters, each of which identifies a scope the data custodian and retail customer are willing to share with the third party.
For example, let’s assume a third party that provides energy management services for a residential customer. This third party requires access to electricity and gas data. Ideally, the data has hourly resolution for electricity and monthly resolution for gas so that its software can provide the best strategy for energy efficiency and cost savings. Further, assume the customer has Municipal Gas and Electric (MG&E, a fictitious name) as his utility company. When the retail customer is sent to the data custodian, MG&E, from the third party website, the user is provided with the ability to authenticate himself as an MG&E customer and accept sharing with the third party. MG&E is willing to provide, for this specific customer for whom they actually provide gas and electric service, any of the following:
- Monthly gas usage data
- Hourly electric usage data
- Monthly electric usage data
- Hourly electric and monthly gas data
Once the customer sends the browser on its way back to the third party, these alternatives are conveyed. The third party recognizes that only one of these scopes will be optimal for its services and thus uses the “Hourly electric and monthly gas data” scope in the OAuth 2.0 authorization sequence. In the example described, the retail customer did not need to be burdened with details or dialogs to determine the appropriate scope. Yet, the third party and data custodian can negotiate behind the scenes (aka redirects) to find the suitable scope for this specific customer.
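The third party's side of this exchange can be sketched as below. The query parameter name `scope` and the scope labels are hypothetical stand-ins chosen for readability; they are not the actual Green Button scope syntax.

```python
from urllib.parse import parse_qs, urlsplit


def offered_scopes(return_url):
    """Collect every scope offered on the redirect back to the third party.

    Assumes each offered scope arrives as a repeated 'scope' query parameter.
    """
    return parse_qs(urlsplit(return_url).query).get("scope", [])


def choose_scope(offered, acceptable_in_order):
    """Return the first acceptable scope that was offered, or None if none match."""
    for scope in acceptable_in_order:
        if scope in offered:
            return scope
    return None


# Hypothetical labels for the four MG&E alternatives listed above.
return_url = (
    "https://thirdparty.example/scopes"
    "?scope=gas-monthly&scope=electric-hourly"
    "&scope=electric-monthly&scope=electric-hourly_gas-monthly"
)
chosen = choose_scope(offered_scopes(return_url), ["electric-hourly_gas-monthly"])
print(chosen)
```

A result of `None` corresponds to the case described earlier of a gas-only customer visiting an electricity-analysis service: the third party can detect the mismatch and explain it to the customer instead of starting an OAuth sequence that cannot succeed.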
In part 4, we'll help you understand how scope is used. This series was co-authored by Dr. David Wollman, Deputy Director, Smart Grid and Cyber-Physical Systems Program Office, National Institute of Standards and Technology.