Salesforce CEO Marc Benioff was quoted in November as telling journalist Kara Swisher on CNBC that "Facebook is the new cigarettes." He went on to say "You know, it's addictive. It's not good for you. There's people trying to get you to use it that even you don't understand what's going on." (Disclosure: MuleSoft, the parent company to ProgrammableWeb, was acquired by Salesforce in 2018. Neither parent was involved in the development of this special report).
History (and court testimony) has revealed that the tobacco industry -- aka "Big Tobacco" -- purposefully and surreptitiously added addictive compounds to cigarettes to make it physiologically difficult if not impossible for smokers to quit. The questions now are whether Facebook is similarly addictive and, if so, what if anything Facebook is doing (or purposefully not doing) to maintain that addiction. In other words, to keep us from leaving. These questions come at a time when Facebook's gargantuan global reach equates to unprecedented control of personal data and, in some cases, serious international consequences.
Lock-in: The Holy Grail of Technology Success
The history of computing is littered with obstacles, some more subtle than others, that are designed to prevent users from leaving one product for another, or just leaving altogether, a predicament that's often referred to as "lock-in." Such lock-in happens when your data is stored by an existing solution in a proprietary format that's impossible to repurpose should that data ever become disembodied from that solution. As you will learn from this special report, the data that you add to Facebook (posts, photos, etc.), along with other data the social network keeps about you, falls into this category. Facebook allows you to download some of your data, but that data has extremely limited utility once outside of Facebook.
Microsoft for example once used proprietary word processing and spreadsheet storage formats to make it nearly impossible to quit Microsoft Office and take your documents with you to an Office alternative. Even after initially embracing the XML standard for the encoding of Word and Excel data, questions remained about whether the format was truly open enough.
Proprietary digital rights management (DRM) technology is another example: it makes it nearly impossible for users to purchase or rent multimedia content through one platform (eg: iTunes) and play it back on another (eg: Google Play). The ensuing lock-in results in an undesirable and, in some cases, expensive "addiction" to a platform. Many iPhone users, for example, have little choice but to buy a new iPhone when it is time to replace their phones (unless they're willing to give up their iTunes content).
Now, with many Facebook users having invested their personal data with Facebook for the better part of a decade, Benioff's quotes raise the question of whether Facebook is being honest with its users when it says you are free to leave and take your data with you. Or, like with Big Tobacco, are there addictive obstacles that make it harder for users to quit than the social networking giant lets on? The question comes at a time when the company is increasingly under fire for everything from the mishandling of personal data to inaction over election interference to its decision not to remove a doctored video of US House of Representatives Speaker Nancy Pelosi.
In hopes of answering that question, ProgrammableWeb developed a technical methodology to see just how difficult it really is to quit on terms that would be acceptable to most users, an analysis that, as you will see, is an API story as much as it is anything else. The million-dollar question: Are users able to quit and easily take their data (all of their data) with them in such a way that it would be useful to them (navigable, viewable, and portable) outside of the service? Or is quitting and taking your data with you much easier said than done?
Worth mentioning: There are at least two addictive components of Facebook that our methodology does not address. First, what about all the other logins across the Web that you used your Facebook ID for? There doesn't appear to be an easy way to fix all of those logins. Second, Facebook is where your friends, acquaintances, and special interest groups are. Unfortunately, the promising array of new social networks (none of which can be seriously considered as Facebook alternatives, at least not yet) is where your friends aren't. What, if anything, would get them to move?
Two Ways to Quit: With and Without Your Data
First, it's important to note that users can always delete their accounts and quit without taking their data with them. When this happens, and with very few exceptions, Facebook claims that virtually all of the data that's associated with a user is deleted from Facebook's systems. "We make sure all of your account information is deleted — for example, anything you've posted, your name in any tagged photos of you, pages you have liked," a Facebook spokesperson told ProgrammableWeb. "We don't retain any information that can be connected to you or identify you. We do retain de-identified logs about the fact that someone posted something at a given time, which helps us understand how people use Facebook." This is a statement that many users distrust. But ProgrammableWeb has no reason to believe that the company is deliberately misleading us or its users about its procedures.
That same Facebook spokesperson also pointed out that Facebook doesn't delete some things that a user may have sent to other Facebook users. "For example, if you've sent a message to someone using Facebook Messenger, we can delete it from your outbox. But we don't delete it from the recipient's inbox."
To its credit, Facebook appears to work pretty hard to wipe its service clean of any traces of a deleted user, even in situations where you might expect otherwise. For example, if you've posted a photo and tagged it with a friend's name and that friend adds the tagged photo to their timeline, that photo and its tags are deleted from the friend's timeline when you delete your account. Likewise, the aforementioned messages between you and a friend that are not deleted from the friend's inbox are stripped of your identifying information when you delete your account.
In fact, Facebook's process for wiping its servers is so complete that it could be disruptive to other users. For example, if you add a post to a Facebook group that garners 100 comments, that post and all of its comments will vanish from the group's existence when you delete your Facebook account. This could come as an unwanted surprise to anyone who left a comment, especially if they saved the post for later reference.
According to the user settings area of Facebook, users looking to preserve their data before quitting can not only quit at any time but also take their data with them. Even better (according to the same user interface), Facebook will, at your option, provide the data to departing users in a format that it says can be imported into another service.
So, how could Facebook be the new cigarettes if, as Facebook suggests, one can just pack up their data and leave at any time? Even take their data with them to another service?
To test the veracity of Facebook's implicit claim, ProgrammableWeb enlisted writer and developer Shelby Switzer to pretend she was quitting Facebook and taking her data with her. She tried the downloads. And, although Facebook told ProgrammableWeb that its API (known as Facebook's Graph API) is not intended for the same use case as the download (eg: pre-departure retrieval of user data), she also tested Facebook's API for programmatic retrieval of personal data. Why? We felt that the only way to truly test the Facebook lock-in theory is to also exhaustively evaluate all channels through which some personal data — any personal data — was available. Theoretically, Facebook's Web-based user interface is another such channel that can be scraped for data that's not available through its downloads or APIs. But Facebook's terms of service explicitly prohibit the practice of scraping its Web site.
Facebook was also unequivocal in describing the differing roles for its downloads and APIs to ProgrammableWeb. "The Graph API and [the] Download Your Information [utility] are different tools for different purposes. The Graph API information supports the Facebook Platform. The information we make available there is appropriate for the use cases that Platform supports," a Facebook spokesperson told ProgrammableWeb. "For example, Facebook Login, Games, and social plugins such as the Like button. We need to be careful about what developers can access, as evidenced by the news of the past year. Because the data in Download Your Information is intended for your own personal download and use (vs. going directly to developers through an ongoing flow of information), it includes more of your information."
However, based on Shelby's tests (which included a purpose-built app that we've open sourced), ProgrammableWeb found that while it's true that Facebook's download offers information that the API does not, the API also offers important information that's excluded from the download; information that, in our estimation, most users would want before leaving the service.
As far as its application approval process is concerned, Facebook appears to stand by its belief that the API is for purposes other than quitting and taking your data with you. Although our purpose-built app -- appropriately named Salvager (naming it "Facebook Salvager" is against the rules) -- for extracting your data through Facebook's API hasn't been explicitly rejected by Facebook, it hasn't been approved either. This, despite assurances that we'd have closure on the application by the time we published this series. Our application appears to be in terminal limbo which, as far as we're concerned, is the equivalent of a rejection.
As our research unfolded, the notion that it might be harder to quit Facebook than Facebook would lead you to believe no longer seemed so far-fetched. Yes, you can quit Facebook. But, in order to do so, you'll not only drop out of your network of friends, you might also have to forgo some precious data that's very difficult to leave behind. Furthermore, just because Facebook's API represents an opportunity (in combination with the downloads) to retrieve the full superset of data that a user might want doesn't necessarily mean that Facebook will allow it (which, apparently, it will not).
What Qualifies as Your Facebook Data?
Perhaps the most important questions to ask when attempting to leave Facebook and take your data with you are:
- Why are you leaving Facebook?
- What exactly qualifies as your data?
- What do you hope to do with your data outside of Facebook?
The vast majority of people I know who fantasize about leaving Facebook attribute their objections to one or more of three reasons. While many people feel as though Facebook is depersonalizing the very fabric of their lives and seek a return to the days of physical connections with friends and family (often phrased as "taking my life back"), many others are alarmed by the revelations regarding the personal data that Facebook collects and what has happened to that data in recent months and years. Still others are dissatisfied with Facebook's impact on society (eg: live streaming or redistribution of horrible violence, or playing host to extremely polarizing conversations).
One underlying incentive is common to all three reasons: they are mainly principled acts of protest. No one quits Facebook because there's a better alternative. But the appetite to actually quit could soften once reality sets in and you realize that disavowing Facebook for principled reasons means you should probably also quit Facebook's other services such as Instagram, WhatsApp, and Oculus. As it turns out, Facebook's tentacles stretch far and wide. Even if the download were all you needed to get all of your data out of Facebook (it isn't), there is no single download that covers all of the company's services and would thereby allow you to quit Facebook, the company. The download and API we tested were just for the Facebook-branded social network itself.
Then, there's the data. Your data. You know that old saying that beauty is in the eyes of the beholder? In the context of quitting Facebook, what you think of as your data and what Facebook thinks of as your data could be different from one another. For example, should your data include "shared data" like a memorable photo that was posted by someone else but tagged with your name? Or what about the friends that liked various posts of yours? Or, when you quit, should you not only be able to take a list of your friends with you but the contact information that they voluntarily shared with you as well? With Facebook being your only form of contact with certain friends, you might need this before quitting. Otherwise, quitting could mean the loss of contact with family and friends.
What about when a photo posted by a friend is tagged with your name and you accept the prompt to add it to your timeline? Or, a photo that you tagged with another user's name? What happens to that tag?
As subjective as these grey area questions are, Shelby's tests established one thing for certain: whereas one subset of your data is available for download, a different subset of your data is available through Facebook's Graph API. In other words, the most complete superset of your personal data is only available when you combine and correlate the two.
For example, your friend list (essentially, a list of contacts) is available through the download, but not the API. However, if your posts have comments on them, the download does not bundle those comments with the posts they belong to. Additionally, your own posts and comments are available in separate parts of the download (two files, one named posts.json and the other named comments.json). But the connective tissue is lost, so you can't tell which comments go with which posts.
Via the API, however, not only are comments attached to their associated posts, but the posts and comments also include ID fields that make it possible for them to be glued back together should they ever become orphaned from one another. Such identifiers could help automate the reconstruction of a Facebook user's history, either in some standalone archive intended for personal viewing or within a competing social service.
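To illustrate why those ID fields matter, here is a minimal Python sketch of re-attaching orphaned comments to their posts. The record shapes and the field names `id`, `post_id`, and `message` are assumptions made purely for illustration; they are not taken from Facebook's download or from its Graph API documentation.

```python
from collections import defaultdict

# Hypothetical records: the field names ("id", "post_id", "message") are
# illustrative assumptions, not Facebook's documented schema.
posts = [
    {"id": "p1", "message": "Vacation photos are up"},
    {"id": "p2", "message": "Anyone free on Friday?"},
]
comments = [
    {"id": "c1", "post_id": "p1", "message": "Looks great!"},
    {"id": "c2", "post_id": "p2", "message": "I am"},
    {"id": "c3", "post_id": "p1", "message": "Where was this?"},
]


def attach_comments(posts, comments):
    """Return copies of the posts, each with its comments nested under it."""
    by_post = defaultdict(list)
    for comment in comments:
        # Group every comment under the ID of the post it belongs to.
        by_post[comment["post_id"]].append(comment)
    # .get() returns an empty list for posts that received no comments.
    return [{**post, "comments": by_post.get(post["id"], [])} for post in posts]


threads = attach_comments(posts, comments)
for thread in threads:
    print(thread["message"], "->", [c["message"] for c in thread["comments"]])
```

Without a shared identifier such as the hypothetical `post_id`, no grouping of this kind is possible, which is the predicament the separate posts.json and comments.json files leave you in.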
Unfortunately, as implied earlier, the API is not for ordinary users. To retrieve the API-based data in a usable format would require a third-party developed application that, as best as we can tell, doesn't exist.
Which is why we had Shelby write the Salvager app along with an article to document its initial objectives. The application's primary objective is to turn Facebook data into ActivityPub data. ActivityPub, which I discuss further down, may be the world's best shot at making social data interchangeable across services (some of which haven't even been invented yet). But, by the time she was done with the first version of the Salvager app, we concluded that Facebook is nearly impossible to quit if your hope is not only to leave with your data, but to also retain all of the original interconnectedness across that data. Unlike with cigarettes where you can switch brands or get a nicotine fix from some other source, Shelby's tests demonstrated how, despite Facebook's personal data download capabilities and APIs, it's the loss of the Facebook functionality once associated with that data that will leave you so adrift, you'll probably be compelled to simply stay on Facebook.
A key incentive to stay, verified by Shelby's tests that studied the data compatibilities and incompatibilities between Facebook and potential challengers like Mastodon, is that a worthy alternative to Facebook doesn't exist (even though Facebook's user interface explicitly suggests that its downloads are formatted so they can be uploaded to another service). It's not like the way you can export an Excel spreadsheet to a CSV (Comma Separated Values) file and then import that CSV file into Google Sheets. Once Shelby retrieved her data from Facebook, there was no evidence of a service anywhere on the Net that could import it.
Why Data Formats are Part of the Problem
When it comes to importing and exporting data, one issue that's ripe for both confusion and incompatibility has to do with how Facebook technically formats its data on the way out (via download or API), and the degree to which that format is compatible with anything else out there.
Some of that confusion starts with the statement on Facebook's download utility page which explains that the data is available in "a JSON format which could allow another service to more easily import it."
Facebook, for its part, feels as though it acceded to external expectations in making this choice. "We consulted external groups including privacy advocates and regulators to determine the right format for facilitating upload to another service," the company's spokesperson told ProgrammableWeb. "The JSON format is consistent with guidance we've received, including this [publicly published] guidance from the UK Information Commissioner's Office: 'Where no specific format is in common use within your industry or sector, you should provide personal data using open formats such as CSV, XML, and JSON.'"
Facebook appeared to offer this statement as validation for having chosen JSON as though JSON (or XML or CSV) was not the obvious choice anyway. To suggest there was acquiescence to external guidance is a red herring at best. XML and JSON have not only been the de facto lingua franca of data interchange for the past two decades, Facebook's internal and external APIs are, without exception, already fluent in JSON.
More importantly, JSON by itself is not a magic wand for importing and exporting Facebook data.
So, when Facebook says that its data downloads can be optionally downloaded in the JSON format, it's only dealing with half the problem: the easier half. The other, more difficult and meaningful half has to do with the actual data in the package. What fields of data are available? What is their meaning? What are the one-to-many relationships within that data (for example, the relationship between one post and its many comments)? Are they text fields? Number fields?
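A schema is what answers these questions. As a rough illustration only, the following Python sketch models a post and its comments with typed fields and an explicit one-to-many relationship. Every class and field name here is hypothetical and is not drawn from Facebook's actual export.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Comment:
    comment_id: str      # text field: unique identifier (hypothetical)
    author_name: str     # text field
    created: datetime    # timestamp field
    text: str            # text field


@dataclass
class Post:
    post_id: str         # text field: unique identifier (hypothetical)
    created: datetime    # timestamp field
    text: str            # text field
    like_count: int = 0  # number field
    # One-to-many relationship: one post holds many comments.
    comments: List[Comment] = field(default_factory=list)


post = Post(post_id="p1", created=datetime(2019, 6, 1, 12, 0), text="Hello")
post.comments.append(
    Comment(
        comment_id="c1",
        author_name="A friend",
        created=datetime(2019, 6, 1, 12, 5),
        text="Hi!",
    )
)
print(len(post.comments))
```

Serializing such objects to JSON is the easy part. Agreeing on which fields exist, what they mean, and how they relate is the part that JSON alone does not settle.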
These specifics about the data inside Facebook's JSON package(s), sometimes referred to as the data's schema, are symptomatic of the larger challenge to anyone hoping to quit Facebook and take their data with them. Facebook's schema, which to a large extent is a reflection of Facebook's functionality, is unique. Currently, there are no services or solutions that are quite like Facebook. Ergo, there are no services or solutions whose data schemas are quite like Facebook's. Nor does Facebook support ActivityPub, the closest thing the world has to an open standard for the interoperation or migration of social network data. Even if Facebook supported ActivityPub, it wouldn't matter. At best, the ActivityPub standard covers a fraction of Facebook's total schema.
In fairness to Facebook, the fact that a competing service doesn't exist — one that's capable of importing and reconstructing a Facebook archive — is hardly Facebook's fault. As Shelby points out in her article about potential Facebook alternatives, among the handful of interesting services and technologies floating about the Net, none that we could find are currently viable targets for receiving and digesting all that Facebook has to offer in the form of an export (regardless of whether Facebook's API is involved or not).
Not only could we not find a compatible social network or technology to which all of your Facebook data could be uploaded and shared with other users, we couldn't even find a third party piece of software for browsing such an archive (as though you were still on Facebook). Perhaps the biggest irony of all is how even Facebook can't import one of its own exports. Such a capability might be useful to someone wanting to close their existing Facebook account in favor of a new one. Or, it might also come in handy to anybody having second thoughts after having left Facebook.
Quite frankly, we at ProgrammableWeb were so shocked by the inexplicable dearth of solutions designed to import a Facebook archive (Facebook itself included) that it became the impetus for creating and open sourcing the Salvager application. The idea behind Salvager wasn't so much to create a legitimate solution for working with Facebook data as it was to test our own assumptions while developing an understanding of the requirements and challenges that such a solution would have to meet in order to be viable.
For such a solution to succeed, ProgrammableWeb has outlined the following top five minimal requirements (which, admittedly, Salvager doesn't yet achieve):
- Designed to import a Facebook archive as it is delivered to the end user through the Facebook download utility
- Extracts additional information and context as necessary from the Facebook Graph API (if Facebook will allow it) and applies that data as appropriate.
- Offers a Facebook-like viewing experience for privately browsing the archive.
- Where data is missing — for example, detailed contact information (phone numbers, email addresses, etc.) that goes with the downloadable friend list — provide the user with a manual data augmentation or annotation capability.
- Data transformation and upload: As Shelby points out in her coverage, though not ideal, certain subsets of a Facebook archive (eg: photos) could be applicable to other social networking services that focus on specific types of data. Additionally, although it is technically incomplete in our opinion, the emergent ActivityPub open standard format for storing newsfeed-like data could potentially accept a subset of the Facebook archive. Eventually, it could evolve to accept the entire export. As such, one requirement for a solution would be to handle any data transformations from Facebook's native data formats to other formats and then upload the transformed data to target services and solutions at the user's choosing.
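As a rough sketch of what the data transformation requirement might involve, the Python below wraps a single exported post in an ActivityStreams 2.0 `Create` activity carrying a `Note`, the vocabulary that ActivityPub builds on. The input keys (`timestamp`, `text`), the function name, and the actor URL are all assumptions made for illustration. This is not how Salvager is implemented, and a real Facebook export may be shaped quite differently.

```python
import json
from datetime import datetime, timezone

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def post_to_activity(post, actor_url):
    """Wrap one exported post in an ActivityStreams 'Create' activity.

    `post` uses hypothetical keys ("timestamp" as Unix seconds, "text");
    a real export may name and nest these differently.
    """
    published = datetime.fromtimestamp(post["timestamp"], tz=timezone.utc)
    published_str = published.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "@context": AS_CONTEXT,
        "type": "Create",
        "actor": actor_url,
        "published": published_str,
        "object": {
            "type": "Note",
            "attributedTo": actor_url,
            "content": post["text"],
            "published": published_str,
        },
    }


activity = post_to_activity(
    {"timestamp": 1559390400, "text": "Hello from my archive"},
    actor_url="https://social.example/users/alice",  # placeholder actor URL
)
print(json.dumps(activity, indent=2))
```

This minimal mapping carries only a post's text and timestamp. Anything else attached to the post would need a mapping of its own or would be lost in the move, which is where the real transformation work lies.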
Hardly a week seems to go by without Facebook being on the receiving end of new negative headlines. As we were putting the finishing touches on this series, Facebook's stock price fell on news that the House Judiciary Committee, the Department of Justice, and the Federal Trade Commission would be launching various antitrust probes of several big tech companies including Facebook. The news comes at the same time Democratic presidential hopeful Senator Elizabeth Warren has been promoting the idea of breaking Facebook up (a controversial antitrust remedy for which almost no high-tech precedent exists).
Then, just as we were getting ready to publish the series to the web, The Guardian reported how data from analytics firm Mixpanel is indicating a recent collapse in Facebook usage (the story also points out that Mixpanel's data is at odds with Facebook's most recent quarterly report which states that usage is trending up).
Unless legislators and the Department of Justice (DOJ) rewrite the current rules of antitrust law, the overarching questions when it comes to an antitrust investigation will focus on whether Facebook is in fact a monopoly and, if so, whether it engaged in anticompetitive conduct to maintain that monopoly. In answer to the first question, the DOJ would have to first identify the harmed market that Facebook monopolizes, prove that Facebook monopolizes it, and show that harm came as a result. Keep in mind that having a monopoly is not a crime. But, once it is legally determined that Facebook is operating a monopoly (a very big assumption at this point), then comes the next question about its anticompetitive conduct, if any.
Few would dispute that Facebook is a dominant player. A giant. As we've established in this special report, backed by some rigorous technical testing, Facebook is unquestionably one of a kind. For users interested in switching to a Facebook alternative and moving their data in the process, the problem isn't that Facebook prevents you from leaving. It's just that there is currently no place to go, nor are there any obvious contenders on the horizon.
Whether such contenders might exist today had it not been for some anticompetitive conduct on Facebook's part is a question we cannot answer. What we can say is that Facebook, as expected, isn't bending over backward to encourage the existence of alternatives, nor does it go out of its way to make a user's Facebook data especially useful or complete once it's removed from Facebook. While the absence of applications and services that can work with Facebook's exported data is indeed troubling, we have no reason to suspect Facebook of the sort of anticompetitive conduct that prevents third parties from developing such solutions.
In her great coverage of potential Facebook alternatives, Shelby saw some promise in the structure of the Solid project now spearheaded by the inventor of the World Wide Web, Sir Tim Berners-Lee. Today, Solid is not exactly a viable alternative to Facebook. The emphasis so far has been less on Solid's data schema and more on an API-driven, open (like the Web itself), and decentralized architecture that puts the user wholly in charge of their data (something many users and regulators would probably welcome). Given Solid's open nature, there's nothing that prevents it from evolving into more of a Facebook alternative than it is today. Time will tell, and ProgrammableWeb will be there to cover it.