Did APIs Play a Role in the Exfiltration of Personal Data From Parler.com After the Riot in Washington, DC?

In the aftermath of the riots in Washington, DC and the attack on the US Capitol Building on January 6, 2021, several public posts went viral across social media alleging that APIs played a role in the exfiltration of tens of terabytes of potentially private content, including text, images, and video, from Parler.com by tech vigilantes and even law enforcement authorities.

Below is a partial screenshot of a Reddit thread started by u/BlueMountainDace (the post has since been edited) that discusses the initial rumors surrounding Parler:

Parler.com was viewed by many supporters of US President Donald Trump as a social media alternative that was free of censorship when compared to the likes of Facebook, Twitter, and Instagram. What the users of Parler didn't realize, even in the course of posting content that documented their involvement in the riot and their exact whereabouts on January 6th, was how easy it might be for hackers to freely access that content, even if it was thought to have been deleted. Parler's lack of censorship (again, a feature) was viewed across the tech industry as having played a role in promoting the violence that day and the response included a take-down of the entire Parler.com service when Amazon suspended Parler's account due to what it alleged to have been a violation of Amazon's acceptable use policy. But not before many tech vigilantes, internet sleuths, hackers, and maybe even law enforcement agencies went to work in hopes of bringing those responsible for that day's tragic events to justice.
 
While the core assertion found in the viral posts is essentially true -- that more than 50 terabytes of content was exfiltrated by way of hacking before Amazon took the site offline -- ProgrammableWeb has found some of the details, particularly as they relate to APIs, to have been inaccurately reported. For example, the posts tied one of Twilio's APIs to the ease with which the data was exfiltrated, leading some to question the integrity of Twilio's APIs. While it is true that Parler.com used Twilio's APIs for SMS-based activation of a newly downloaded copy of its mobile app and for password resets, the Twilio API had absolutely nothing to do with the techniques used by hackers to exfiltrate any data from the service.
 
What is true is that Parler was using Twilio as a part of its verification workflow until Twilio threatened to suspend Parler's account for violations of its acceptable use policy. But what is also true, according to Twilio, is that Parler turned off its integration with Twilio before Twilio could take that final action. Via email, Twilio, Inc. told ProgrammableWeb "On Friday, January 8th, we sent Parler a letter informing them they were in violation of our Acceptable Use Policy and notifying them that we would suspend their account if they did not make efforts to remediate multiple calls for violence on their platform. Shortly after receiving our letter, Parler informed us they had already turned off their integration with Twilio."
 
While the sequence of events so far involves a lesson for any consumer of a public API to build contingency plans into their code that account for the sudden unavailability of that API for whatever reasons (takedown, failure, etc.), any implication that the unauthorized exfiltration of personal data from Parler.com happened as a result of an exploit of a weakness in the security of Twilio's APIs is patently false. To the extent that Twilio's SMS verification API essentially requires human interaction, Parler's choice to turn off the Twilio integration may have invited more imposter-driven user registrations from bots posing as the Parler mobile app. And while the ProgrammableWeb team has theorized how such imposter accounts, generated en masse on any Web site, could be leveraged to help exfiltrate data from a service, we see no evidence that such techniques played any role in any exfiltration of data from Parler.com.

In fact, at least one hacker (@donk_enby), who has attained a weird mix of celebrity status and notoriety after discovering and exploiting several weaknesses in Parler's service, told WIRED that a Reddit rumor that hackers gained access to more private data on the site, due to SMS provider Twilio cutting ties with Parler and disabling its two-factor authentication, was "bullshit." In other words, the Twilio angle found in some of the viral social media posts was not only mischaracterized (in some cases, claiming Twilio issued a press release, which it never did), it also appears to be irrelevant to any exfiltration of data.

Additionally, the original post that went viral mentions a “behind the login box API that is used to deliver content” potentially suggesting that another API, perhaps one from Parler that governs the mobile user experience, somehow played a role. In another Twitter post, @donk_enby confirmed that she was exfiltrating content simply by crawling a series of sequentially numbered unguarded assets using the URL "https://par.pw/v1/photo?id={insert sequential integer here}." To the extent that API version numbers are typically included right after a RESTful API domain's root, the appearance of "v1" after "par.pw/" and then "/photo" after that suggests that she was exploiting a URL that Parler probably intended to be part of its API and that "photo" was one of the resources available through that API. Despite the URL's reliance on HTTPS (the secure version of HTTP), @donk_enby's tweet also reported that no authentication or authorization was necessary to retrieve photos through that URL. In other words, no API keys, no user ID or password, and certainly no OAuth tokens of the sort that are typically used to secure Web APIs and their various resources.
 
So, where did Parler go wrong?

How Parler’s Data Security Failed

The state of Parler’s data security was embarrassingly lacking. The list of security shortcomings that played a role in the collection of this massive trove of data included a complete lack of rate limiting across various functions, insecure direct object references that exposed deleted content, and an absence of fail-safes for service downtime.

The most obvious omission, based on @donk_enby's description of how she accessed Parler's data, was the absence of any access control mechanisms whatsoever. Hacktivists were able to freely access Parler's API without having to authenticate themselves or demonstrate proof of privilege to the content they accessed. It's a complete failure of API security 101 that could have easily been addressed by an API management solution that, out of the box, would have applied standard security to all the APIs it managed.
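As a rough illustration of the kind of check that was missing, the following Python sketch refuses a photo request unless the caller presents a valid signed token and is entitled to the photo. Everything here is hypothetical and is not Parler's code: the token format, `SECRET_KEY`, and the in-memory `photos` dictionary are assumptions made only for the example, and a real deployment would more likely rely on an API gateway or a standard such as OAuth.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical signing key for this sketch; a real service would load it
# from secure configuration and rotate it.
SECRET_KEY = b"demo-secret"


def sign_user_id(user_id: str) -> str:
    """Issue a token of the form '<user_id>.<hex signature>'."""
    sig = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"


def verify_token(token: Optional[str]) -> Optional[str]:
    """Return the user id if the token's signature is valid, else None."""
    if not token or "." not in token:
        return None
    user_id, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return user_id
    return None


def photo_status(token: Optional[str], photo_id: str, photos: dict) -> int:
    """Return the HTTP status a photo endpoint should answer with."""
    user = verify_token(token)
    if user is None:
        return 401  # not authenticated
    photo = photos.get(photo_id)
    if photo is None:
        return 404  # no such photo
    if photo["public"] or photo["owner"] == user:
        return 200  # authenticated and entitled to the content
    return 403  # authenticated, but not allowed to see this photo
```

With a check like this in front of the photo resource, an anonymous request for any ID is answered with 401 rather than the image itself.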

Rate-limiting is another extremely basic step that is taken to ensure that users are not able to overload service bandwidth via excessive use. These limits also preclude bad actors from hacking security measures via brute force attacks. Most importantly, to the extent that rate-limiting throttles the number of requests that can be made over a given period of time, it serves to mitigate the amount of damage that can be done through the exploit of a vulnerability.
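One common way to implement such throttling is a token bucket, sketched below in Python. This is a minimal, single-process illustration rather than a description of any particular API management product; the class name, the capacity, and the refill rate are assumptions chosen for the example, and a production service would typically keep one bucket per client in shared storage.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, then `refill_per_second` requests."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Add the tokens earned since the last call, capped at the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would call `allow()` for the requesting client before serving each request and answer with HTTP 429 (Too Many Requests) whenever it returns False.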

For example, had Parler applied reasonable rate limits to its APIs, hackers like @donk_enby might have been limited to a certain number of photo retrievals per minute or per hour. Instead, the only physical limit on the rate at which she could download data from Parler was the bandwidth of her Internet connection. In fact, her bandwidth was not enough given the race she was in to download all publicly available Parler content before Amazon took the site offline. Fortunately, another group of hacktivists including the self-proclaimed Archive Team stepped in to help.

Over on Vice.com, Leland Nally reported that @donk_enby told him that “The Archive Team deserves a lot of credit for orchestrating the big pull" and that "the group paid the steep server costs and constructed a tool that allowed anonymous Twitter users to volunteer their own bandwidth to help speed the transfer, which at one point peaked at 50 GB per second." According to Nally, by the time the downloading was done, the total haul covered 96% of the data on Parler's web site; 56.7 terabytes of data that included 412 million files, 150 million of which were photos and over a million of which were videos. However, contrary to what was insinuated in some of the viral social media posts, private information such as passwords or photos of drivers' licenses that might have been associated with each account were not captured.

To be clear, rate-limiting alone would not have prevented the retrieval of those images. But, had Parler applied rate limits to its API -- something that would have been relatively easy to do had Parler been using a commercial API management solution -- the total haul might have been just a few terabytes instead of 56.7TB. 

Insecure direct object references were an equally blatant, if not more blatant, example of Parler’s development ineptitude. An object reference is just that: a way of directing an application toward a specific resource. When establishing a nomenclature for these references, it is standard practice to randomize the assignment of asset IDs rather than rely on sequential order. When a user uploads two images, it would be extremely unwise to reference these objects via URLs like https://par.pw/v1/photo?id=1, https://par.pw/v1/photo?id=2, and so on. By choosing to assign asset IDs in sequential order, Parler left no guesswork to a hacker like @donk_enby. Had Parler randomized its asset IDs, the combination of that randomization and the aforementioned rate-limiting would likely have reduced the number of downloaded photos to a small fraction of the 150 million that were gathered.
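A minimal Python sketch of randomized asset IDs follows, using the standard library's `secrets` module. The function name `new_asset_id` and the 16-byte length are assumptions made for illustration. Unguessable IDs complement access control but do not replace it, since any ID that leaks can still be fetched.

```python
import secrets


def new_asset_id(nbytes=16):
    """Return a URL-safe identifier backed by `nbytes` of cryptographic randomness."""
    return secrets.token_urlsafe(nbytes)


# Unlike id=1, id=2, ... knowing one ID reveals nothing about the next one.
example_ids = [new_asset_id() for _ in range(3)]
```

With 16 random bytes there are 2^128 possible IDs, so a crawler that simply counts upward has effectively no chance of landing on a valid asset.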

For users of the Parler platform -- particularly for ones who may have published self-incriminating images from the riots on Jan 6, 2021 -- it gets worse. Whereas other social media and photo upload sites typically strip images of their metadata, @donk_enby told Vice.com that Parler left that metadata untouched. In other words, to the extent that law enforcement authorities might be examining those 150 million photos, each image was accompanied by the date and time showing when it was taken and the GPS coordinates showing exactly where it was taken (provided the device that captured the photo was capable of tagging the image with that data).
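For JPEG files, that metadata (EXIF, including GPS tags) normally lives in APP1 segments near the start of the file, so stripping it on upload can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it handles only well-formed JPEGs, the function name is hypothetical, and a production service would more likely use a maintained image library that also covers other formats and metadata blocks.

```python
import struct


def strip_app1_segments(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG with its APP1 (EXIF/XMP) segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing start-of-image marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed segment marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything from here on is compressed image data.
            out += jpeg[pos:]
            return bytes(out)
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker != 0xE1:
            out += jpeg[pos:pos + 2 + length]  # keep every non-APP1 segment
        pos += 2 + length
    out += jpeg[pos:]  # no scan marker found; keep any trailing bytes as-is
    return bytes(out)
```

Running uploads through a step like this before storing them means the timestamp and GPS coordinates never reach the server's copy of the file.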

For Parler users who realized in hindsight that they may have incriminated themselves by uploading their evidence to the social media site, attempting to delete the images was not enough. Yes, there was a delete button. But, behind the scenes, that button did not remove the images from Parler's Web site. While it may have removed the images from a user's collection of photos, it left the file in its original location on Parler's servers where it was still accessible to hacktivists like @donk_enby.
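The difference between hiding a photo and actually deleting it can be shown with a small in-memory Python sketch. The `PhotoStore` class and its methods are hypothetical and stand in for whatever storage a real service uses; the point is only that the delete path removes the stored bytes as well as the entry in the user's collection.

```python
class PhotoStore:
    """Toy photo storage that keeps a per-user album and the image bytes."""

    def __init__(self):
        self._blobs = {}   # photo_id -> image bytes
        self._albums = {}  # user -> set of photo_ids

    def upload(self, user, photo_id, data):
        self._blobs[photo_id] = data
        self._albums.setdefault(user, set()).add(photo_id)

    def delete(self, user, photo_id):
        """Remove the photo from the user's album AND from storage."""
        if photo_id not in self._albums.get(user, set()):
            raise PermissionError("photo does not belong to this user")
        self._albums[user].discard(photo_id)
        # Dropping only the album entry would leave the file retrievable
        # by anyone who knows or guesses its ID.
        self._blobs.pop(photo_id, None)

    def fetch(self, photo_id):
        """Return the stored bytes, or None if the photo no longer exists."""
        return self._blobs.get(photo_id)
```

After `delete` runs, a direct request for the photo's ID finds nothing to return, which is what a user pressing a delete button reasonably expects.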

Finally, there’s the matter of Parler having no plan for what happens if a service that it relies on suddenly becomes unavailable. While its integration with Twilio had nothing to do with the data that was exfiltrated from the site before Amazon took it down, there are dangers associated with relying on a sole public API for certain functionality. In the same way that system architects rely on system redundancies to guarantee 99.999% availability, it would not have hurt for Parler to consider a backup provider for the sort of SMS messaging it was getting from Twilio. Had Parler secured its APIs, put rate limits on them, and randomized asset ID numbers, the hacktivists would have needed a far more extensive botnet operating under thousands if not millions of new user accounts in order to achieve their objectives. But if an SMS verification service like Twilio's was in the way of each one of those new registrations, the hacktivists would have had to climb a nearly insurmountable mountain.

But in turning off its integration with Twilio with no substitute to fill the gap, Parler essentially pulled that mountain out of the way. If Parler had a Twilio substitute ready to go, its source code could have easily branched to that substitute as soon as Twilio was no longer available. Of course, there's no guarantee that an alternative service to Twilio would not have also suspended its engagement with Parler for the same reasons that Twilio, Amazon and other tech companies did. But this article is as much about what can be learned from Parler's implementation of APIs for your own deployments as it is about the role that APIs played in the exfiltration of data from Parler's website.
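The kind of branching to a substitute provider described above might look like the following Python sketch. The provider callables, the `ProviderUnavailable` exception, and the function name are all hypothetical placeholders; no real vendor SDK is being called here.

```python
class ProviderUnavailable(Exception):
    """Raised by a provider wrapper when its API cannot be used."""


def send_verification_sms(phone, code, providers):
    """Try each (name, send) provider in order; return the name that succeeded.

    `providers` is an ordered list of (name, callable) pairs, primary first.
    """
    errors = []
    for name, send in providers:
        try:
            send(phone, code)
            return name
        except ProviderUnavailable as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    # Fail closed: without SMS verification, registration should not proceed.
    raise RuntimeError("all SMS providers failed: " + "; ".join(errors))
```

The design choice that matters is the last line: if every provider is unavailable, the registration attempt fails instead of silently skipping verification.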
