New revelations are coming to light now that Buffer is answering some of the questions that were left unanswered after last week's attack on its infrastructure. Drawing attention to the sophistication of the attack and how the hackers relied upon multiple vectors to pull it off, Buffer has disclosed that the attackers gained access to the source code that it kept (and thought to be confidential) on the socially-driven Web-based source code repository GitHub.
To recap, on October 26, end-users of Twitter and Facebook noticed that their accounts had been compromised. Using the compromised accounts, the hackers posted weight-loss spam to both social networks. There have been reports that other types of spam were involved, but ProgrammableWeb didn't observe any, and most of the offending posts have since been removed from the two services.
Later that Saturday, the social posting and scheduling service Buffer announced through its blog that it was the source of the incursion. The company revealed that hackers had successfully penetrated its infrastructure and stolen the special credentials they needed to spam Twitter and Facebook on behalf of as many as 30,000 of its registered users. The credentials that were stolen -- known as OAuth tokens -- are special types of keys that make it possible for Buffer to make posts to social networks without needing the usernames and passwords of the users it makes those posts on behalf of. Buffer gets those tokens when end-users authorize the service to make such posts on their behalf.
When an OAuth token is issued to a service like Buffer, it is unique to Buffer and to the user account that authorized Buffer to have that token. One of the advantages of OAuth is that the tokens issued to Buffer can be revoked without affecting the other services that end-users of Twitter and Facebook may also have authorized to make posts on their behalf. One of the disadvantages of OAuth is that someone in possession of a stolen Twitter or Facebook token can take over the user account that it belongs to.
The degree to which such impersonation can be prevented depends on which security mechanisms API providers like Twitter and Facebook make available to developers like Buffer, the extent to which those mechanisms are optional or required, and the choices that developers make in their implementations. As evidenced by the attack on Buffer, a lot can go wrong when so many variables are in play.
In addition to revealing some of the details of the attack, Buffer executives also enumerated some of the measures they were taking to shut the attack down and prevent it from happening again. But some of the most important questions were left unanswered until yesterday after ProgrammableWeb published its investigation of the attack.
One of the most important questions had to do with how the perpetrators were able to impersonate Buffer's customers on Twitter. Originally, Buffer's blog claimed that it was only the OAuth tokens that were stolen. However, a special application-specific secret -- known as the consumer secret -- that is only supposed to be known to Buffer and Twitter is needed to sign every request to make a post through Twitter's API. For the hackers to have spammed the Twitter accounts of Buffer's users, they had to have stolen the consumer secret as well. The hackers' access to the consumer secret raises more questions about where that secret was kept, why it wasn't encrypted, and how the hackers got to it.
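To illustrate why the stolen tokens alone would not have been enough on Twitter, here is a minimal Python sketch of how an OAuth 1.0a request is signed with HMAC-SHA1. It is not Buffer's or Twitter's code. The function names, the URL, and every parameter value are hypothetical placeholders; only the general signing scheme (a signing key built from the consumer secret and the token secret) comes from the OAuth 1.0a specification.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # OAuth 1.0a percent-encodes everything except unreserved characters.
    return quote(value, safe="-._~")


def sign_request(method, url, params, consumer_secret, token_secret):
    """Return a base64 HMAC-SHA1 signature for one request (illustrative helper)."""
    # Encode each name and value, then sort and join them as name=value pairs.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in encoded)
    # The signature base string covers the method, the URL, and the parameters.
    base_string = "&".join(
        [method.upper(), percent_encode(url), percent_encode(param_string)]
    )
    # The signing key combines BOTH secrets: the application's consumer secret
    # and the per-user token secret. Without either one, the signature is wrong.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # All values below are made-up placeholders, not real credentials.
    demo_params = {
        "status": "Hello from a scheduled post",
        "oauth_consumer_key": "example-consumer-key",
        "oauth_token": "example-user-token",
        "oauth_nonce": "example-nonce",
        "oauth_timestamp": "1383000000",
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_version": "1.0",
    }
    print(sign_request("POST", "https://api.example.com/post", demo_params,
                       "example-consumer-secret", "example-token-secret"))
```

Because the consumer secret is part of the signing key, an attacker holding only a user's token and token secret could not produce a signature the API provider would accept.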
Yesterday, Buffer revealed that the consumer secret had indeed been compromised. The attack was very sophisticated. In addition to penetrating Buffer's databases where they found the OAuth tokens in an unencrypted state, the hackers also penetrated Buffer's source code repository on GitHub where its consumer secret was discovered in an unencrypted state in the source code for Buffer's application.
However, important questions still remain and Buffer's CTO Sunil Sadasivan has been gracious in answering them for ProgrammableWeb. It should be noted that Buffer has been uncommonly constructive in its response to ProgrammableWeb's published investigation. In many cases where a media outlet challenges the public statements of a tech company, the tech company will stop working with that media outlet. In this case, Buffer has atypically improved its transparency about the incident. Not only has it offered its CTO to answer any questions, the company has publicly supported (via Twitter) ProgrammableWeb's position that significantly more industry dialogue is required to better secure the Web for end-users, developers, and API providers. We look forward to Buffer's participation in that dialogue.
Here's the Q & A with Sadasivan. Some of the questions were edited after the fact to establish a better context for the reader:
ProgrammableWeb: The attack involved a penetration of an application that MongoHQ.com -- where your databases are hosted -- uses for its customer support. How were the attackers able to extract so many tokens out of that application?
Sunil Sadasivan: They were able to obtain the unencrypted tokens through the password breach of the MongoHQ web console. They wrote a script to scrape the access tokens from the web console.
PW: How did the hackers manage to get Buffer's consumer secret for Twitter and why was that detail omitted from your initial blog posts?
SS: This was our oversight in our initial investigation. Twitter does offer [an app secret functionality similar to that of Facebook] which we had forgotten they do. You need to provide the oauth_ key/secret and a nonce to make valid API requests. Once I tracked this down I knew we were infiltrated. After requiring the whole team to double check their security logs on GitHub, we found out about the unauthorized access to GitHub. We since invalidated all credentials stored within our code. (Editor's note: On its blog, Buffer published the following detail: "The whole Buffer team has changed their passwords and enabled 2-step login for as many services as allow it. (Google, AWS, Twitter, Facebook, GitHub).")
PW: Your remedy (regarding two-factor authentication on GitHub) seems to suggest that you're going to leave the consumer secret, the consumer key (both for Twitter) and the app secret for Facebook in the source code. Is that the case, or are you going to find a way to move them out of the source code to some place where they are encrypted?
SS: We're moving them away from our code base to a separate undisclosed place and encrypted.
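(Editor's note: Buffer has not disclosed where or how the secrets will be stored. As one common illustration of keeping credentials out of a code base, the following Python sketch reads a secret from the process environment at startup instead of from source. The variable name TWITTER_CONSUMER_SECRET and the helper names are hypothetical, and environment variables do not by themselves provide encryption at rest.)

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def load_secret(name, env=None):
    # Read from the real process environment unless a mapping is supplied,
    # which keeps the helper easy to exercise in tests.
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        # Fail fast rather than falling back to a value baked into the code.
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value


if __name__ == "__main__":
    try:
        secret = load_secret("TWITTER_CONSUMER_SECRET")  # hypothetical name
        print("secret loaded, length:", len(secret))
    except MissingSecretError as exc:
        print("startup aborted:", exc)
```

With this arrangement, someone who clones the repository sees only the name of the variable, not the secret itself.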
PW: How do you know the hackers didn't tamper with your source-code?
SS: We went through our changelog and found no evidence of unauthorized writes. We're trying to get in contact with GitHub to obtain more logs around the unauthorized GitHub access and activity (repository clones etc). We've also reset most keys that were located in the source, and will ensure all keys are changed and removed from the code base asap.
In terms of what we're doing with our source code, we're considering GitHub enterprise which allows us to self-host our repositories. Seeing this was a breach of passwords, our immediate change of resetting passwords and requiring dual-factor authentication should help us in the short term. We are not at all done or complacent with our current security standards. We're rethinking all levels here.
PW: How did OAuth facilitate any part of the short-term solution (eg: removing all Buffer posts, hiding other ones, etc) in ways that other authentication methods could not have facilitated?
SS: We were able to remove all posts by putting our Facebook app in sandbox (developer) mode. OAuth as you mention allows for providers to know which apps are responsible for which posts which allowed us to temporarily halt all posting.
PW: Buffer's disclosures in its blog mention how Buffer's OAuth access tokens for Twitter were revoked. It even discusses how users will have to re-authorize Buffer to make posts to Twitter (essentially generating new access tokens). But it doesn't say the same was done for Facebook's tokens. Why not?
SS: This was not done as we changed our [Facebook] app preferences to require [Facebook's] appsecret_proof (Editor's Note: this detail was originally disclosed in Buffer's blog and then covered in more detail in ProgrammableWeb's investigation). With help from Facebook, we decided this was what we needed to do to stop unauthorized requests. Once we realized the second breach into our code, we once again reset our Facebook app secret. This way we did not need to invalidate all of our app tokens.
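(Editor's note: Facebook's documentation names this parameter appsecret_proof and describes it as an HMAC-SHA256 hash of the access token, keyed with the app secret. The Python sketch below shows that computation under those assumptions. The function name and the sample values are hypothetical, and this is not Buffer's implementation.)

```python
import hashlib
import hmac


def appsecret_proof(access_token: str, app_secret: str) -> str:
    # Key the HMAC with the app secret and hash the user's access token.
    # The hex digest is sent alongside the token on each API call.
    return hmac.new(
        app_secret.encode("utf-8"),
        access_token.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


if __name__ == "__main__":
    # Placeholder values only; real tokens and secrets must never be hard-coded.
    proof = appsecret_proof("example-access-token", "example-app-secret")
    print(proof)
```

When an app requires the proof, a stolen access token is not enough on its own: the caller must also know the app secret to compute a matching value. That is why resetting the app secret could shut out the attackers without invalidating every user's token.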
PW: How did OAuth facilitate parts of the long-term solution in ways that other means of authentication could not?
SS: We're still investigating and understanding what our long-term solution is here. For now we're encrypting access tokens and ensuring valid proof of tokens, requiring two-factor authentication for all services and obfuscating the client secret as much as we can.
PW: How exactly are you going to be encrypting the OAuth access tokens and the access token secret (both for Twitter) and the OAuth bearer tokens for Facebook? Are you relying on MongoHQ for this functionality? Are you setting it up in your business logic layer? How are you going to protect the keys? With a Hardware Security Module (HSM)? Through some other form of obfuscation?
SS: We've set this up in our business logic layer (not through MongoHQ), and we're looking at HSM providers. We're doing everything we can to ensure this never happens again.
PW: As an API consumer (Buffer consumes APIs from Facebook and Twitter), what lessons did you learn that you would convey to other API consumers?
SS: With all that we've learned so far with the two separate shared password breaches, I'd like to convey the importance of two-factor authentication for all services. We've quickly built this out and enabled this for our Buffer administrators. We'll soon be offering this for all of our users. I'd also stress the importance of encrypting access tokens and keeping GitHub or repositories in general safe.
PW: What recommendations would you offer to API publishers?
SS: There is no industry standard for the way to store API access tokens as there are for best practices for storing passwords. As you mention in the article, access tokens must be kept confidential in storage and transit. When developing we assumed our database storage and authorized access constitutes 'confidential.' As you may imagine, there was no way for us to foresee an unauthorized access to our database. We asked Facebook about best practices to store access tokens and they didn't have a standard practice. Encryption for us seems to be the best option. I would recommend having an official way to store access tokens. (Even so, with the password breach of our code, it's hard to say how safe this is as they can infiltrate encryption keys etc.)
PW: Anything else to add?
SS: All-in-all we were very unlucky that our attackers breached two authentication walls. It's unsettling as we may see more patterns of this soon.
By David Berlind. David is the editor-in-chief of ProgrammableWeb.com. You can reach him at firstname.lastname@example.org. Connect to David on Twitter at @dberlind or Google+, or friend him on Facebook.