On October 26, 2013, users of both Facebook and Twitter discovered that their accounts were responsible for spamming both social networks with unauthorized posts, many of which promoted a weight-loss scheme. It wasn't long before the social posting and scheduling service Buffer realized that it was the source of those posts. Its infrastructure had been compromised and, for a brief period, the attackers inherited Buffer's carte blanche authority to make posts to both Twitter and Facebook on behalf of the Hootsuite-like company's registered users.
In an attempt at transparency, Buffer's executives blogged about the initial intrusion, offering details of the attack and the remedy over the course of that weekend. Judging by the comments on that blog, Buffer was applauded by its users for its quick response and openness. However, as it turns out, Buffer has so far fallen short in disclosing some of the most important details of the attack and its remedy.
The incident casts a spotlight on the blind faith that end-users are hastily placing in many of the applications to which they've entrusted their Twitter and Facebook accounts. If anything, the attack on Buffer should serve as a wake-up call. The Web, as it turns out, is not nearly as secure as many believe it to be. The incident also serves as a clarion call to Web developers as well as API providers that security must be their top priority. It is a discipline that is intolerant of short-cuts, cost savings, and incompetence. There's simply too much at risk. This ProgrammableWeb investigation explains why.
So Many Attack Vectors. So Little Time.
In a blog post dated October 29, 2013, the former head of the Cloud Architecture & Security Team for Adobe's Creative Cloud initiative and current founder of Evident.io Tim Prendergast wrote "There are far too many APIs being cranked out in such a short period of time... there is no way that they have all been properly secured and built. There will definitely be new attack vectors in an API-centric Internet, but we are still too early to know the pervasiveness of such attacks."
It was only days earlier that Buffer was attacked. The result was a flood of unauthorized posts (spam) to the accounts of unsuspecting Twitter and Facebook users whose Buffer-specific credentials to both social networks were stolen from Buffer's database infrastructure. The attack sheds light on an important, but previously known and well-documented vector -- that of OAuth token theft -- that all stakeholders in the API economy (API-consuming developers and API providers) must vigilantly defend against.
End-users must also recognize that, despite the best intentions of those stakeholders and the imprimaturs of widely-used federated credentialing technologies like OAuth, there's no guarantee that their identities cannot be stolen and abused for impersonation. Vulnerabilities exist, especially as a result of the implementation decisions that vary from developer to developer and API provider to API provider. So long as they do, it's impossible to know which services on the Web have taken the necessary precautions to protect personal credentials and information, and which have not. As a result, every time users authorize a new third-party service to make social posts on their behalf, they are also increasing the risk of unauthorized access to their accounts.
In this case, the damage was minimal. But, it could have been far worse. In April 2013, all it took was one false tweet to rattle the stock market after the Associated Press' Twitter account was hacked (see False White House Explosion Tweet Rattles Market). Who knows what financial and legal havoc Buffer's attackers could have wreaked while they had access to post to Facebook and Twitter on behalf of so many users?
Web Security Is A Journey. It's Never Over.
As Buffer sought to recover from the attack, the company's executives communicated their findings to the public via blog. To the extent that Buffer's investigation revealed weaknesses in its implementations of the Twitter and Facebook APIs, the company claims those implementations have been hardened. But what's not clear is how hard or if they're hard enough. ProgrammableWeb has made email contact with Buffer. But the company has been unresponsive to some important questions (despite the company's open invitation on its blog to email questions regarding the matter).
According to the company's blog:
With these improvements your Twitter and Facebook accounts are not at risk anymore. Attackers will not be able to use this method to send spam anymore…. The method which left our data vulnerable is now locked and secure.
Buffer's executives are doing what they should, trying to instill trust in the company's user base. But statements on behalf of any company that say "accounts are not at risk anymore" are grandiose at best.
According to NopSec CTO and co-founder Michelangelo Sidagni, "A one-time control fix is not usually enough to cover an entire security program. Instead of detailing just that aspect, [Buffer] should have talked about their renewed and revamped approach on their overall security program. Making a statement that this is going to fix all their problems without covering their entire security posture is just an invitation for the attacker to strike again....just for the sake of it." With the goal of preventing such cyber incidents, NopSec provides a SaaS-based on-demand security vulnerability management solution that helps organizations to discover and remediate security vulnerabilities in their networks and application infrastructures.
To Prendergast's aforementioned point, there's no guarantee that other attack vectors don't exist for Buffer or any other third party to which users entrust their social network credentials.
So, what exactly went wrong?
The Keys to the Kingdom
When an end-user authorizes a third-party application like Buffer to make Facebook or Twitter posts on his or her behalf, those social networks respond by delivering to that application a set of credentials -- essentially a traceable key -- to access the end-user's social network account. This key is known as an OAuth token and one of its primary advantages is that it gives third-party apps like Buffer the access they need to make posts to social networks on end-users' behalf without having to know or store those end-users' usernames and passwords for those social networks. Though their implementations vary (sometimes requiring more than just the OAuth token to make a post), Facebook, Twitter, and many other services have standardized on OAuth for this type of third-party authentication. Facebook relies on version 2.0 of OAuth. Twitter, on the other hand, supports version 1.0a for trusted third-party proxy-style posting.
According to the OAuth Web site:
Many luxury cars today come with a valet key. It is a special key you give the parking attendant and unlike your regular key, will not allow the car to drive more than a mile or two. Some valet keys will not open the trunk, while others will block access to your onboard cell phone address book. Regardless of what restrictions the valet key imposes, the idea is very clever. You give someone limited access to your car with a special key, while using your regular key to unlock everything.
OAuth tokens are unique. For example, when Facebook issues an OAuth token to Hootsuite (at the request of an end-user) and then another token to Buffer on behalf of the same end-user, those two tokens (known under OAuth 2.0 as "bearer tokens") are unique from one another. They both allow Hootsuite and Buffer to make posts to the end-user's Facebook account, but they are also traceable to the application that requested them and they're revokable. Contained within the token is information about the Facebook end-user account that it's for as well as the specific application (e.g.: Buffer) to which it was issued. If an OAuth token to Facebook is revoked, it can no longer be used as a form of authentication.
Touted as one of the key advantages of OAuth, this allows for targeted revocation without forcing the end-user to (a) change their Facebook usernames and passwords and (b) re-authorize all their other apps (e.g.: Hootsuite) to continue to make posts on their behalf. To the extent that the hack of Buffer was a test of OAuth's ability to deliver on this promise, OAuth passed.
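The traceability and targeted revocation described above can be illustrated with a toy model. This is only a sketch: the class and method names are hypothetical, and keeping the token-to-application association in a server-side table is an assumption about one common design, not a description of how Facebook or Twitter actually store tokens.

```python
import secrets


class ToyAuthorizationServer:
    """Hypothetical stand-in for a provider's token bookkeeping."""

    def __init__(self):
        # Maps each opaque token string to the (user, app) pair it was issued for.
        self._tokens = {}

    def issue(self, user, app):
        # Every grant gets its own random, unguessable token.
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user, app)
        return token

    def revoke_app(self, user, app):
        # Targeted revocation: drop only this user's tokens for this one app.
        self._tokens = {t: ua for t, ua in self._tokens.items() if ua != (user, app)}

    def is_valid(self, token):
        return token in self._tokens


server = ToyAuthorizationServer()
buffer_token = server.issue("alice", "Buffer")
hootsuite_token = server.issue("alice", "Hootsuite")

# Revoking Buffer's access leaves the Hootsuite token (and the password) untouched.
server.revoke_app("alice", "Buffer")
print(server.is_valid(buffer_token))     # False
print(server.is_valid(hootsuite_token))  # True
```

Because each token is tied to a single application, the provider can cut off one compromised application without disturbing the user's other authorizations.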
As end-users authorized Buffer to make posts to Twitter and Facebook on their behalf, those social networks responded with OAuth tokens that Buffer then stored in its database. In the case of Twitter (under OAuth version 1.0a), such OAuth tokens are also accompanied by an OAuth token secret.
According to MongoHQ's account of the breach of its own systems:
Our operations team detected unauthorized access to an internal, employee-facing support application….Our support tool includes an "impersonate" feature that enables MongoHQ employees to access our primary web UI as if they were a logged in customer, for use in troubleshooting customer problems. This feature was used with a small number of customer web UI accounts. Our primary web UI allows customers to browse data and manage their databases. We are contacting affected customers directly.
Buffer outsources its database provisioning to MongoHQ. Although Buffer CEO Joel Gascoigne applauded MongoHQ's response to the situation in his blog update, Buffer's choice to outsource its database provisioning to an online service exemplifies one of the potential perils of turning to the cloud versus insourcing (building something on-premises or the equivalent thereof). With that choice come certain risks and the users of Buffer inherited those risks, knowingly or not. To be fair, insourcing database provisioning comes with its own set of risks as well. As a result, end-users really have no idea to whom or what processes they're ultimately entrusting their social accounts.
With access to Buffer's databases, the perpetrator(s) had access to the OAuth token data that Buffer was storing for both Facebook and Twitter. Perhaps revealing a potential weakness in the OAuth approach to third party authentication (one that the OAuth Working Group is actively addressing), once the perpetrators took possession of those tokens, they were able to use those tokens to authenticate with Twitter and Facebook and make posts on behalf of the end-users to whom those tokens belonged.
What's Luck Got To Do With It?
In perpetrating the attack, the hackers had some luck as well. Through some form of hacking or social engineering, the attackers were able to gain access to the databases kept by MongoHQ. But the real luck came when they discovered the treasure trove of unencrypted OAuth tokens that were stored in Buffer's databases. The experts that ProgrammableWeb interviewed for this story agreed that this was a major oversight on Buffer's part. Not only is the Web rife with advice to encrypt OAuth tokens before storing them, the Internet Engineering Task Force's specification for OAuth 2.0 minces no words on the issue.
According to section 10.3 of the specification for OAuth 2.0 on access tokens:
Access token credentials (as well as any confidential access token attributes) MUST be kept confidential in transit and storage, and only shared among the authorization server, the resource servers the access token is valid for, and the client to whom the access token is issued. Access token credentials MUST only be transmitted using TLS as described in Section 1.6 with server authentication as defined by [RFC2818].
That section goes on to explain that the scope of OAuth 2.0 does not include verification of the authority to actually use an OAuth 2.0 token:
This specification does not provide any methods for the resource server to ensure that an access token presented to it by a given client was issued to that client by the authorization server.
Judging by Buffer's immediate remedy -- a part of which was to start encrypting tokens -- Buffer should have been encrypting them from the beginning. However, the attacker's luck didn't stop there.
Going back to OAuth's valet metaphor, there's nothing in the OAuth spec that inherently prevents the transfer of valet keys from one valet to another. OAuth implicitly relies on the custodian of the tokens to guard them as though they are a very closely kept secret. But as an additional layer of security, both Facebook and Twitter provide facilities to ensure that the bearer of an OAuth token has the legitimate agency to use that token.
Facebook's App Secret
In the case of Facebook, the technique has to do with verifying any calls to its Graph API with an "app_secret," an additional authentication credential that the company introduced in August of this year. It depends on a special secret code known as the application secret that's issued when a developer first registers their application with Facebook. Presumably, Facebook and the developer (e.g.: Buffer) are the only ones who know this secret. Given how each OAuth token is keyed to the application that requested it, Facebook can deny access to any token that isn't also accompanied by the appropriate app_secret (again, a secret that's uniquely keyed to the same application that the OAuth token is keyed to).
Facebook's Web site is explicit about the purpose of the application secret and the risks of not using it -- advice that Buffer was either unaware of, or did not heed:
Graph API calls can be made from clients or from your server on behalf of clients. Calls from a server can be better secured by adding a parameter called appsecret_proof.
Access tokens are portable. It's possible to take an access token generated on a client by Facebook's SDK, send it to a server and then make calls from that server on behalf of the person. An access token can also be stolen by malicious software on a person's computer or a man in the middle attack. Then that access token can be used from an entirely different system that's not the client and not your server, generating spam or stealing data.
You can prevent this by adding the appsecret_proof parameter to every API call from a server. This prevents bad guys from making API calls with your access tokens from their servers.
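As a rough illustration of what that parameter involves: Facebook's documentation describes appsecret_proof as a SHA-256 HMAC of the access token, keyed with the app secret. A minimal sketch using only Python's standard library might look like the following. The function name and the placeholder token and secret values are assumptions made for illustration.

```python
import hashlib
import hmac


def appsecret_proof(access_token: str, app_secret: str) -> str:
    """HMAC-SHA256 of the access token, keyed with the app secret, as lowercase hex."""
    return hmac.new(
        app_secret.encode("utf-8"),
        access_token.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


# Placeholder values for illustration only; real ones are issued by Facebook.
token = "EXAMPLE_USER_ACCESS_TOKEN"
proof = appsecret_proof(token, "EXAMPLE_APP_SECRET")

# Both values would accompany a server-side Graph API call.
params = {"access_token": token, "appsecret_proof": proof}
print(len(proof))  # 64 hex characters
```

The proof can only be computed by someone who holds the app secret, which is why a stolen access token on its own is not enough once the setting is enforced.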
But there are two problems. First, to the extent that the approach is actually effective, it's an optional setting. In other words, Facebook doesn't enforce its usage. Again, according to that same Facebook page:
In the advanced section of your app's settings, you can enable requiring the use of appsecret_proof. When this is enabled all calls that don't include the parameter will fail….Once you've changed that setting, mis-using stolen user access tokens will also require access to your app's app secret.
In Buffer's massive (aforementioned) blog post about the breach, Buffer CTO Sunil Sadasivan mentioned enablement of this feature as one of the remedies. According to that post:
For Facebook API calls we are now using an extra security parameter to make all tokens more secure.
The implication is that the feature was not enabled prior to the breach, further softening Buffer as a target (and improving the attacker's luck). However, even with the appsecret_proof feature enabled, it's not clear that Buffer would have averted the attack because such application secrets are problematic as well.
In some ways, the aforementioned Facebook page gets to the second problem when it says "mis-using stolen user access tokens will also require access to your app's app secret." The challenge is in keeping the application secret a secret. If, for example, the app secret ends up being incorporated into the developer's source code in clear text, hackers would only need access to that source code to compromise the secret. The same thing would go for most attempts to store the secret somewhere else in the developer's infrastructure.
Via email, former editor of the OAuth 2.0 specification Eran Hammer told ProgrammableWeb, "If an attacker gained access to the data store, source code, or credentials – the very nature of the [OAuth] protocol would prevent any real mitigation at that point." In addition to being the former editor of the OAuth 2.0 specification, Hammer was also the primary author of OAuth 1.0.
Buffer's Twitter Consumer Secret Stolen Too?
Perhaps reinforcing Hammer's point is the question of how the attackers were able to use the stolen OAuth tokens to post to Twitter. Whereas requiring the application secret is optional with Facebook's API, Twitter's counterpart to the application secret -- known as the consumer secret -- is required before any post can be made. Just as with Facebook, an application-specific consumer secret is issued to the developer when the developer first registers an application with Twitter. It, like Facebook's app secret, can also be reset.
Calling into question just how transparent Buffer has been about the breach, one question that was not answered by the company's detailed blog has to do with how the attackers were able to make unauthorized posts to Twitter. The company's blog post clearly states that "the hackers were not able to get access to any passwords, billing information, or other user information other than specifically the Twitter and Facebook access tokens." However, in the case of Twitter, they could not have made their posts with the stolen OAuth access tokens alone. The attackers must have had access to Buffer's Twitter consumer secret as well. If that's the case, important questions need to be answered. Where was Buffer's Twitter consumer secret kept? How was it kept (encrypted or not)? And how did the attackers find it? What specifically is being done now to secure Buffer's Twitter consumer secret and Facebook application secret? So far, these questions remain unanswered. (Editor's Note: Approximately 90 minutes after this story was published, Buffer updated its blog to address some of these questions. Questions still remain. An update appears at the end of this story).
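To see why the tokens alone would not suffice, it helps to look at how an OAuth 1.0a request is signed with HMAC-SHA1: the signing key is built from both the application's consumer secret and the end-user's token secret. The following is a simplified sketch of that signing step, not a complete client. The helper names, placeholder credentials, and the example endpoint URL are illustrative assumptions.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value: str) -> str:
    # RFC 3986 percent-encoding: only unreserved characters stay as-is.
    return quote(value, safe="-._~")


def sign(method, url, params, consumer_secret, token_secret):
    # Parameters are encoded, sorted, and joined into the signature base string.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join([method.upper(), pct(url), pct(normalized)])
    # The signing key combines BOTH secrets: the app's and the end-user's.
    key = f"{pct(consumer_secret)}&{pct(token_secret)}"
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder request values, for illustration only.
url = "https://api.twitter.com/1.1/statuses/update.json"
params = {
    "oauth_consumer_key": "EXAMPLE_CONSUMER_KEY",
    "oauth_token": "EXAMPLE_USER_TOKEN",
    "oauth_nonce": "abc123",
    "oauth_timestamp": "1383000000",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_version": "1.0",
    "status": "Hello from a scheduled post",
}

genuine = sign("POST", url, params, "EXAMPLE_CONSUMER_SECRET", "EXAMPLE_TOKEN_SECRET")
forged = sign("POST", url, params, "attacker-guess", "EXAMPLE_TOKEN_SECRET")
print(genuine != forged)  # True: without the consumer secret the signature does not match
```

An attacker holding only the per-user token and token secret cannot reproduce the genuine signature, which is what makes the question of the consumer secret's whereabouts so pressing.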
Hardware Security Module Anyone?
According to several members of the security community contacted by ProgrammableWeb, there's really only one way to secure the sort of secrets that API-consuming developers like Buffer must secure: the OAuth tokens as well as other application secrets. According to independent security researcher Taylor Hornby (author of the Web site CrackStation.net), simple software-based encryption of data at rest is not enough and hardware security modules (HSMs) are the way to go. According to Hornby's thesis on salted password hashing:
[Encryption keys have] to be kept secret from an attacker even in the event of a breach. If an attacker gains full access to the system, they'll be able to steal the key no matter where it is stored. The key must be stored in an external system, such as a physically separate server dedicated to password validation, or a special hardware device attached to the server such as the YubiHSM.
Without an HSM-based solution, the keys to any encryption scheme for protecting secrets like tokens and app secrets are ultimately discoverable. If the source code of an application can find them, so might a talented hacker.
Ping Identity senior technical architect John Bradley echoed the need for securing secrets with HSM. In alluding to an application secret-like solution coming from the OAuth Working Group, Bradley told ProgrammableWeb "The OAuth Working Group is developing an extension for proof of possession tokens in OAuth 2. That would allow a resource server to verify that the presenter of the access token controls a proof key for that token. That however would have only helped if Buffer were using a asymmetric proof key and storing it in a HSM to keep it from compromise." Bradley is also a member of the OAuth Working Group that's working on that "Holder of the Key" (HOTK) extension.
Also echoing Prendergast's prediction that there's probably more bad news to come, Bradley said "these sorts of compromises against password files and tokens are likely to continue given that there are a lot of developers out there that are soft targets." That includes, for example, developers who don't go the distance to use HSM-like solutions for encrypting secrets.
More Transparency and Guidance Please
This is where Buffer's transparency about the remedy enters the picture. The company's blog claims that it revoked all of the pre-existing Twitter tokens thereby forcing Buffer users to re-authenticate with Twitter and generate new ones.
While Twitter tokens may have been revoked, it's not clear if the same measure was taken with Facebook's tokens. The blog post goes on to state that OAuth tokens will now be encrypted but leaves open the question of encrypting the application and consumer secrets from Facebook and Twitter respectively. To the extent that data is being encrypted, it also doesn't specify whether that encryption is HSM-based or not.
Via email, Hornby told ProgrammableWeb "Everyone should use an HSM, whenever possible. I just doubt that it is feasible for small-time websites. To use one, you have to have a dedicated server (real hardware, all to yourself), to plug the HSM into. Dedicated servers are expensive. Most smaller websites run on shared hosting or virtual private servers, which are cheaper. But they may not give you access to the hardware, in which cases you can't use an HSM."
The scenario Hornby presents is a common one. Tech entrepreneurs with hot ideas routinely turn to the cloud for their entire infrastructure: servers, databases, etc. Not only can the infrastructure be turned on within hours if not minutes, it's significantly cheaper than dedicated or on-premises options and involves less financial risk should the business fail. There are fewer capital assets to dispose of. But, in the case of most cloud-based servers from the likes of Amazon and Rackspace, not only don't the entrepreneurs have access to the physical servers (to add an HSM), the servers running their infrastructure don't even physically exist. They're purely virtual. There's nothing to plug the HSM into.
Lack of HSM support for IaaS offerings created a catch-22 for many enterprises and government organizations that were eyeing the cloud as a datacenter alternative, but required HSMs for compliance or national security reasons. In response, Amazon launched AWS CloudHSM to give those organizations the best of both worlds.
But Amazon's upfront cost of $5,000 and hourly charge of $1.88 challenge the feasibility of HSMs for many startups. As a result, many organizations fall back to cheaper, faster, perhaps hastier methods that hope to protect encryption keys and other secrets through some form of obfuscation. Though it's not nearly as good as an HSM-based approach, Hornby says one approach is to store keys in files that are not accessible through the Web.
In announcing that it would start encrypting OAuth tokens, Buffer actually raised more questions than it answered. For example, are the layers of its infrastructure (application, database, etc) sufficiently partitioned from one another in such a way that all encryption-related remedies are completely isolated from Buffer's MongoHQ-based database?
According to Hornby, "The data should be encrypted while it's on Buffer's servers, before it gets sent to MongoHQ. [If an HSM is in use], the HSM would be attached to one of Buffer's Web servers so that it can encrypt the data before it gets saved to the database, and decrypt the data after it's loaded back from the database." Alluding to how a breach of MongoHQ's servers alone would not have been enough to carry out the attack, Hornby says "the servers with the business logic code would have access to an HSM with the key, but the database servers (at MongoHQ) wouldn't have access to the key [because there's no reason for them to have it.]"
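The separation Hornby describes, in which encryption happens in the application tier so that the hosted database only ever sees ciphertext, can be sketched in software. This example assumes the third-party Python `cryptography` package and its Fernet recipe. It is not Buffer's implementation, the function names and the in-memory "database" are hypothetical, and a software-held key is precisely the weaker substitute for an HSM that Hornby cautions about.

```python
from cryptography.fernet import Fernet

# Assumption: in a real deployment the key lives outside the database tier
# (ideally in an HSM or a separate key service). It is generated inline here
# only to keep the sketch runnable.
key = Fernet.generate_key()
cipher = Fernet(key)

# Stand-in for the hosted database; it only ever receives ciphertext.
fake_database = {}


def store_token(user_id: str, oauth_token: str) -> None:
    # Encrypt on the application server before anything is sent to storage.
    fake_database[user_id] = cipher.encrypt(oauth_token.encode("utf-8"))


def load_token(user_id: str) -> str:
    # Decrypt only after the record is loaded back into the application tier.
    return cipher.decrypt(fake_database[user_id]).decode("utf-8")


store_token("alice", "EXAMPLE_OAUTH_TOKEN")
print(fake_database["alice"] != b"EXAMPLE_OAUTH_TOKEN")  # True: stored value is ciphertext
print(load_token("alice"))  # EXAMPLE_OAUTH_TOKEN
```

Under this arrangement, someone who copies the database contents without also obtaining the key gets nothing directly usable, which is the property the article's experts say was missing.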
When asked about what the non-HSM options are, Hornby advised "If using an HSM isn't possible (e.g. in a virtual environment), a secret could be hidden in some obscure place, like in the Windows registry, or in an image file, with the hope that an attacker wouldn't be able to find it. It's slightly better than not encrypting at all, but using an HSM would be much better."
Unfortunately, the discussion of physically separating the data from the business logic highlights the one remaining and very nagging yet-to-be-answered question: How did the attackers gain access to Buffer's Twitter consumer secret? Twitter's OAuth settings dialog clearly states "Keep the Consumer secret a secret. This key should never be human readable in your application."
Unlike the Twitter OAuth tokens, the Twitter OAuth token secrets, and the Facebook OAuth bearer tokens, all of which are unique and specific to each end-user of Buffer's application (and that would need to be stored in a database), the Twitter consumer secret is specific to the Buffer application as a whole.
Although the Twitter consumer secret is necessary (along with Buffer's consumer key, the end-user's OAuth token and the end-user's OAuth token secret) to make an API-based post to Twitter, there's no pressing need to store it (or the consumer key for that matter) in a database. If the attackers found the Twitter consumer secret and its corresponding consumer key in the compromised MongoHQ database, it suggests questionable judgement on the part of the application designers who put them there where they weren't reasonably protected. If the Twitter consumer secret and corresponding consumer key were kept with the business logic, then it suggests that the attackers may have had access to Buffer's source code and/or the underlying file system as well. Either that, or someone, perhaps an insider, had pre-existing knowledge of that information and conveyed it to the attackers.
Only Buffer can, and should answer these questions.
Back to the Drawing Board?
ProgrammableWeb views the attack on Buffer as an opportunity to publish something that's deeply prescriptive around API and Web application security and we welcome input from all stakeholders in the API economy as we seek to articulate a modern set of best practices. In the meantime, here are some closing thoughts:
While Buffer and MongoHQ would probably prefer to see an end to the Buffer incident's ten seconds of fame, it is probably better that the incident be kept in the industry's front of mind as a clarion call for change.
The success of the attack, the oversights that made it possible, and the false sense of security that Buffer's users including ProgrammableWeb (yes, we use Buffer too) were drawn into serve as an example to all end-users that they really have no way of knowing how safe their secrets are with the online companies they're entrusting them to. If users really want to protect themselves from the sort of harm that can be caused by such attacks, blind faith is not a healthy modus operandi. Users need to be far more discriminating with their trust. Sadly, there isn't enough information on which to base such decisions. That needs to change too. Transparency will be key. More to come on that as well.
For API consuming developers, it's time to reassess how secure your applications and data are. Are your secrets stored in the data layer? In the business logic layer? Somewhere else? If you have data like OAuth tokens or other secrets that need guarding, there are no shortcuts to the best security. And, users who put their faith in you are owed the best security.
For API providers, it's time to work together on standard approaches to API security in a way that makes it easier for developers to build secure implementations. Developers can ill afford the potential business and legal risks, let alone the wrath of angry end-users and the embarrassment of a breach. The more API providers can do to make it easy for developers to build secure applications, the more all boats in the API economy will float higher.
Update: Approximately 90 minutes after this story was published, Buffer updated its blog to say that its consumer key and consumer secret for Twitter were indeed compromised. For that information, the hackers pursued another attack vector: they penetrated Buffer's GitHub account where it keeps the source code for its applications. This further emphasizes the point made by Tim Prendergast about new attack vectors surfacing all the time.
Notwithstanding how its GitHub account was compromised, the next question Buffer should address is why secret information is being kept with its source code when it could have been encrypted and stored elsewhere (with the encryption keys protected at best by an HSM, but at least by some form of obfuscation). Also, does Buffer plan to remedy that, or leave the secrets with the source code?
Finally, to the extent that Buffer entrusted its source code to GitHub versus an on-premises solution, this serves as another example of the perils of cloud-based computing. There's not enough information to tell exactly where the fault lies (with Buffer? with GitHub?). But either way, the Web may have eased the hackers' access to the information they were looking for.
More Details Emerge: There's now a follow-up post to this one in which Buffer's CTO Sunil Sadasivan answers many of ProgrammableWeb's questions. See GitHub Now Involved As Buffer Answers More Questions About Attack.