Facebook Has Many Sins To Atone For, But 'Selling Data' To Cambridge Analytica Is Not One Of Them

from the let's-at-least-be-accurate dept

Obviously, over the past few days there’s been plenty of talk about the big mess concerning Cambridge Analytica using data on 50 million Facebook users. And, with that talk has come all sorts of hot takes and ideas and demands — not all of which make sense. Indeed, it appears that there’s such a rush to condemn bad behavior that many are not taking the time to figure out exactly what bad behavior is worth condemning. And that’s a problem. Because if you don’t understand the actual bad behavior, then your “solutions” will be misplaced. Indeed, they could make problems worse. And… because I know that some are going to read this post as a defense of Facebook, let me be clear (as the title of this post notes): Facebook has many problems, and has done a lot of bad things (some of which we’ll discuss below). But if you mischaracterize those “bad” things, then your “solutions” will not actually solve them.

One theme that I’ve seen over and over again in discussions about what happened with Facebook and Cambridge Analytica is the idea that Facebook “sold” the data it had on users to Cambridge Analytica (alternatively that Cambridge Analytica “stole” that data). Neither is accurate, and I’m somewhat surprised to see people who are normally careful about these things — such as Edward Snowden — harping on the “selling data” concept. What Facebook actually does is sell access to individuals based on their data and, as part of that, open up the possibility for users to give some data to companies, but often unwittingly. There’s a lot of nuance in that sentence, and many will argue that for all reasonable purposes “selling data” and my much longer version are the same thing. But they are not.

So before we dig into why they’re so different, let’s point out one thing that Facebook deserves to be yelled at over: it does not make this clear to users in any reasonable way. Now, perhaps that’s because it’s not easy to make this point, but, really, Facebook could at least do a better job of explaining how all of this works. Now, let’s dig in a bit on why this is not selling data. And for that, we need to talk about three separate entities on Facebook. First are advertisers. Second are app developers. Third are users.

The users (all of us) supply a bunch of data to Facebook. Facebook, over the years, has done a piss poor job of explaining to users what data it actually keeps and what it does with that data. Despite some pretty horrendous practices on this front early on, the company has tried to improve greatly over the years. And, in some sense, it has succeeded — in that users have a lot more granular control and ability to dig into what Facebook is doing with their data. But, it does take a fair bit of digging and it’s not that easy to understand — or to understand the consequences of blocking some aspects of it.

The advertisers don’t (as is all too commonly believed) “buy” data from Facebook. Instead, they buy the ability to put ads into the feeds of users who match certain profiles. Again, some will argue this is the same thing. It is not. From merely buying ads, the advertiser gets no data in return about the users. It just knows what sort of profile info it asked for the ads to appear against, and it knows some very, very basic info about how many people saw or interacted with the ads. Now, if the ad includes some sort of call to action, the advertiser might then gain some information directly from the user, but that’s still at the user’s choice.

The app developer ecosystem is a bit more complicated. Back in April of 2010, Facebook introduced the Open Graph API, which allowed app developers to hook into the data that users were giving to Facebook. Here’s where “things look different in retrospect” comes into play. The original Graph API allowed developers to access a ton of information. In retrospect, many will argue that this created a privacy nightmare (which, it kinda did!), but at the same time, it also allowed lots of others to build interesting apps and services, leveraging the data that users themselves were sharing (though not always realizing they were sharing it). It was actually a move towards openness that many considered good for the open web, since it allowed other services to build on top of the Facebook social graph.

There is one aspect of the original API that does still seem problematic — and really should have been obviously problematic right from the beginning. And this is another thing that it’s entirely appropriate to slam Facebook for not comprehending at the time. As part of the API, developers could not only get access to all this information about you… but also about your friends. Like… everything. From the original Facebook page, you can see all the “friend permissions” that were available. These are better summarized in the following chart from a recent paper analyzing the “collateral damage of Facebook apps.”

If you can’t read that… it’s basically a ton of info from friends, including their likes, birthdays, activities, religion, status updates, interests, etc. You can kind of understand how Facebook ended up thinking this was a good idea. If an app developer was designing an app to provide you a better Facebook experience, it might be nice for that app to have access to all that information so it could display it to you as if you were using Facebook. But (1) that’s not how this ever worked (and, indeed, Facebook went legal against services that tried to provide a better Facebook experience) and (2) none of this was made clear to end-users — especially the idea that in sharing your data with your friends, they might cough up literally all of it to some shady dude pushing a silly “personality test” game.
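To make that concrete, here is a rough, hypothetical sketch (in Python, using the requests library) of what an app built against the old, pre-2014 friend permissions could do once a single user installed it and clicked “agree.” The endpoint shapes, field names, and permission names are approximations for illustration rather than a faithful reproduction of the old Graph API; the point is simply that one user’s access token was enough to enumerate that user’s friends and pull profile data about people who never installed the app at all.

```python
import requests

GRAPH = "https://graph.facebook.com"  # root of the old (pre-v2.0) Graph API

# Token granted after ONE user installs the app and approves old-style
# permissions such as friends_likes, friends_birthday, friends_religion_politics.
USER_ACCESS_TOKEN = "replace-with-token"  # placeholder, hypothetical

def get_friends(token):
    """Return the installing user's friend list (id and name)."""
    resp = requests.get(f"{GRAPH}/me/friends", params={"access_token": token})
    resp.raise_for_status()
    return resp.json().get("data", [])

def get_friend_profile(friend_id, token):
    """Pull profile fields for a friend who never installed or approved the app."""
    resp = requests.get(
        f"{GRAPH}/{friend_id}",
        params={
            "fields": "name,birthday,likes,interests,religion",
            "access_token": token,
        },
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    harvested = []
    for friend in get_friends(USER_ACCESS_TOKEN):
        # Consent came only from the installing user; each friend's data
        # comes along for the ride.
        harvested.append(get_friend_profile(friend["id"], USER_ACCESS_TOKEN))
    print(f"Collected profiles for {len(harvested)} friends")
```

The design choice worth noticing is where the consent lives: the permission dialog was shown only to the installing user, while the friends whose birthdays, likes, and religion were pulled never saw one.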

But, of course, as I noted in my original post, in some cases, this set up was celebrated. When the Obama campaign used the app API this way to reach more and more people and collect all the same basic data, it was celebrated as being a clever “voter outreach” strategy. Of course, the transparency levels were different there. Users of the Obama app knew what they were supporting — though didn’t perhaps realize they were revealing a lot of friend data at the same time. Users of Cambridge Analytica’s app… just thought they were taking a personality quiz.

And that brings us to the final point here: Cambridge Analytica, like many others, used this setup to suck up a ton of data, much of it from friends of people who agreed to install a personality test app (and, a bunch of those users were actually paid via Mechanical Turk to basically cough up all their friends’ data). There are reasonable questions about why Facebook set up its API this way (though, as noted above, there were defensible, if short-sighted, reasons). There are reasonable questions about why Facebook wasn’t more careful about watching what apps were doing with the data they had access to. And, most importantly, there are reasonable questions about how transparent Facebook was to its end users through all of this (hint: it was not at all transparent).

So there are plenty of things that Facebook clearly did wrong, but it wasn’t about selling data to Cambridge Analytica and it wasn’t Cambridge Analytica “stealing” data. The real problem was in how all of this was hidden. It comes back to transparency. Facebook could argue that this information was all “public” — which, uh, okay, it was, but it was not public in a way that the average Facebook user (or even most “expert” Facebook users) truly understood. So if we’re going to bash Facebook here, it should be for the fact that none of this was clear to users.

Indeed, even though Facebook shut down this API in April of 2015 (after deprecating it in April of 2014), most users still had no idea just how much information Facebook apps had about them and their friends. Today, the new API still coughs up a lot more info than people realize about themselves (and, again, that’s bad and Facebook should improve that), but no longer your friends’ data as well.

So slam Facebook all you want for failing to make this clear. Slam Facebook for not warning users about the data they were sharing — or that their friends could share. Slam Facebook for not recognizing how apps were sucking up this data and the privacy implications related to that. But don’t slam Facebook for “selling your data” to advertisers, because that’s not what happened.

I was going to use this post to also discuss why this misconception is leading to bad policy prescriptions, but this one is long enough already, so stay tuned for that one next. Update: And here’s that post.



Comments on “Facebook Has Many Sins To Atone For, But 'Selling Data' To Cambridge Analytica Is Not One Of Them”

Ninja (profile) says:

I guess if Facebook starts being transparent about how user information is handled, people will be up in arms. Reminds me of the Android permissions issue. Google also started out doing a piss poor job there (it actually still needs to improve a bunch). Same case, much less outrage.

Cue hysterical, uninformed people and “didn’t-read-the-article” trolls crapping all over the comments in 3, 2, 1…

BernardoVerda (profile) says:

Re: Re:

A more basic problem is that even when users were told about how revealing all this data was, how much more it revealed than what the users realized, how much perpetual memory and global collating amplified the power of every data-point, and how easily it could be used against any user — they refused to believe, or simply couldn’t imagine that it might actually apply to them.

Some of them even just said, “cool — that just means I’ll get more coupons for stuff I want to buy.”

Anonymous Coward says:

Technically, plague bacteria don't kill you: it's their waste products

Everything you deny / downplay here is close enough for policy decisions.

I’m sure you’d also say that Google doesn’t give NSA "direct access" as Snowden stated, actually the data is already processed, therefore Snowden is wrong.

You had to start this piece by figuring out some way to hedge plain facts, and you run it because you're in favor of corporate surveillance. So, you may not TECHNICALLY be defending Facebook here, but it IS THE SAME EFFECT.

Shel10 (profile) says:

Facebook Data Used by Obama Campaign

I recall that the Obama Campaign bragged about how they used Facebook data to get votes. The news media said that our new President was a believer in the use of new technologies and systems and that he would be a great leader.

Why is it a crime for Trump’s campaign to have used the same techniques???

McFortner (profile) says:

Re: Re: Facebook Data Used by Obama Campaign

From the Wikipedia article on the Washington Monthly:

“The politics of Washington Monthly are often considered center-left.[5][6][7] Founder Charles Peters refers to himself as a New Deal Democrat and advocates the use of government to address social problems. His columns also frequently emphasized the importance of a vigilant “fourth estate” in keeping government honest.”

https://en.wikipedia.org/wiki/Washington_Monthly (retrieved 21 March 2018 1817 UTC)

So, an avowed Leftist Democrat publisher can naturally be trusted to keep everything fair and unbiased. </sarcasm>

Thad (user link) says:

Re: Facebook Data Used by Obama Campaign

If you can produce a video of somebody working on behalf of the Obama campaign saying that he can blackmail politicians by hiring prostitutes to have sex with them, then you’re correct, what Obama did is exactly like what Trump did.

I suspect that if you did have a link to such a video, you would have already shared it, but hey, prove me wrong.

3D Face Analysis says:

Facebook shouldn't own the "copyright" to the database...

The real problem is that Facebook claims de facto ownership of the user data. In copyright law, the owners of the data are the authors themselves, not Facebook, unless the copyright is transferred between parties. However, the only way copyright can be transferred is by a signed document, or, under work-for-hire law, when the employer owns the employee’s copyrighted work. Virtually none of its users transferred copyright to Facebook that way. (Except for the extremely tiny 0.00000001% of data posted by Zuckerberg himself or his employees.) The users have not transferred their copyrights to Facebook. Therefore, Facebook does not own the data.

Only the owners of the copyright can sue people for infringement. However, because Facebook does not actually own the data, Facebook should not be able to sue. People should be legally allowed to scrape public Facebook data without worrying about copyright infringement.

However, this is not the case. Facebook sued Power Ventures for scraping data from Facebook. Facebook, like LinkedIn, claims that only they can sell the data, and sues anyone else who sells "their" data without authorization.

The web should be open. People should be free to scrape public data without getting sued. In this way, more innovation and competition will take place.

Facebook might argue that they own the "collective work" for the selection or the arrangement of the data. But they still do not own it. Facebook does not "select" or "arrange" the users’ data. It’s not like there are moderators on Facebook who pick and choose which posts will be allowed to be published on the site. Facebook is NOT a "publisher". Facebook is NOT a "content publisher" by any means. Facebook is instead a service provider, like Gmail. It’s absurd to believe that Gmail owns the copyright of all of your email messages. It should be the same for Facebook. Facebook does not own the messages. Facebook does not own the database, because, like Gmail, it is not the author of the data. The authors of the database are the users themselves, not Facebook. Facebook does not "manage" or "compile" the data in the database. Please stop calling Facebook a "publisher", because it isn’t. Facebook is a service provider (like a post office), not a publisher (like a magazine).

Facebook might argue that they "own" the database because its software "manipulates" the data, but that is still wrong. Yes, the software underlying Facebook might sort data chronologically or alphabetically, but the mere act of sorting data is too trivial for copyright protection. There is no creativity in the mere act of sorting your friends’ posts reverse-chronologically in a feed or sorting your friends list alphabetically by name. Sorting or aggregating posts alphabetically or chronologically is a mechanical process that is too trivial for copyright protection.

See "503.03(a) Works-not originated by a human author."

In order to be entitled to copyright registration, a work must be the product of human authorship. Works produced by mechanical processes or random selection without any contribution by a human author are not registrable.

Facebook might claim that it "moderates" the database, so it could claim "ownership" of the database. But this is still flawed. By "moderation" they mean deleting posts that users report as violating its terms of use, for example posts that advocate violence or posts that infringe copyright. But this is too trivial for copyright protection. Facebook deletes only a tiny minority of posts, and the act of "deleting" certain things from the database does not meet the threshold of originality for copyright protection.

Facebook might again argue that if they don’t claim copyright ownership of the user’s data, they wouldn’t have as much incentive to develop or maintain their software. But this argument is, again, absurd. It’s like claiming that if the post office does not claim copyright ownership of all of the mail that they deliver, the post office wouldn’t have any economic incentive to continue to exist.

Boojum (profile) says:

Why regulate "internet" differently?

What makes deep data mining work is the depth of the data that you have. It isn’t that you have data from Facebook, it’s that you have data from Facebook, Google, Safeway, the DMV, Red Robin, etc. The data that is mined doesn’t just come from social media, it comes from that nice discount card the grocery store gave you, and the gift card to a restaurant that you went to, and many other places that are not social media. If you are going to pass a law protecting privacy, then it has to cover all the OTHER sources of data that deep data mining companies use, not just social media.

Oh! And one other thing. That web browser that you are using for free? When the Mozilla/IE wars happened, the funding for browsers shifted from Users to Online Companies. This is one reason why you can get so much information about web browsing habits online, through cookies and what the computer sends to the server, which can be collected, which can be mined, etc. It is one reason why tracking pixels even work, because they can uniquely identify your computer/cell.. and then track you across multiple pages. So you would also want to pass regulations on what people can put on their web pages, such as tracking pixels and analytic javascript.
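To illustrate the mechanism the commenter is describing, here is a minimal, hypothetical sketch of the server side of a tracking pixel, written in Python with Flask (the framework, endpoint name, and cookie name are assumptions for illustration, not anything from the comment). A page embeds a 1x1 image hosted by the tracker; every time the page loads, the visitor's browser requests that image and hands the tracker its cookie, referrer, and user-agent, which is enough to recognize the same browser across many sites.

```python
from flask import Flask, request, make_response

app = Flask(__name__)

# A 1x1 transparent GIF: the classic "tracking pixel" payload.
PIXEL = (
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
    b"!\xf9\x04\x01\x00\x00\x00\x00"
    b",\x00\x00\x00\x00\x01\x00\x01\x00\x00"
    b"\x02\x02D\x01\x00;"
)

@app.route("/pixel.gif")
def pixel():
    # Every page embedding <img src="https://tracker.example/pixel.gif?page=...">
    # causes the visitor's browser to send this request on load.
    hit = {
        "visitor_id": request.cookies.get("vid"),   # cross-site identifier, if set
        "page": request.args.get("page"),           # which page embedded the pixel
        "referrer": request.headers.get("Referer"),
        "user_agent": request.headers.get("User-Agent"),
    }
    print(hit)  # a real tracker would log this to a profile database

    resp = make_response(PIXEL)
    resp.headers["Content-Type"] = "image/gif"
    # Set (or refresh) the identifier so the same browser is recognized next time.
    resp.set_cookie("vid", request.cookies.get("vid", "new-visitor-id"))
    return resp
```

The embedded image is invisible to the reader, but every page that includes it reports that reader's visit back to the tracker; that cross-site data flow is exactly the kind of non-social-media source the commenter argues any serious privacy law would also have to cover.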

If you are serious about protecting people’s privacy and “private data”, then you have to go a LOT further than Social Media.

Anonymous Coward says:

"What Facebook actually does is sell access to individuals…"

This is sophistry. They monetise user data, and, from what I can tell, are so careless about data governance that they permit raw, person-level data, to be downloaded by anyone claiming to be a developer and willing to click "agree". They didn’t set all of this up as a public service. Whether they charge for the data or take a slice of app revenue is inconsequential.

The issue is not just about Facebook and Cambridge and conservative political alliances. It’s about the power of data and who owns and controls it.

You rightly point out that users need to be able to control how their data is used (and, by implication, if it is used). But Facebook clearly doesn’t even believe that the data belong to users. Trying to gently nudge them in a direction that they see as fundamentally threatening their business model will never be successful. I can hear Zuckerberg calling us "dumb fucks" for even trying that from here.

Your conclusion seems to be that the current situation is bad but it’s better than any realistic alternative.

This is unacceptable. While Murdoch may be worse than Zuckerberg in intent, Facebook’s facilitation and profiteering from algorithmic culture hacking is worse in consequence than News Corp’s old school propaganda machine.

Relying on some unregulated "creative destruction" invisible hand is what allowed Microsoft and Google and Facebook and Amazon and their government allies to seize control of our personal data in the first place. It needs to be clear that citizens own their own data. GDPR may be problematic but it is a mostly good-faith step in the right direction, and is better than America’s default position of reflexively submitting to corporate power in all things.

Mike Masnick (profile) says:

Re: Re:

This is sophistry. They monetise user data, and, from what I can tell, are so careless about data governance that they permit raw, person-level data, to be downloaded by anyone claiming to be a developer and willing to click "agree". They didn’t set all of this up as a public service. Whether they charge for the data or take a slice of app revenue is inconsequential.

Uh, as I explained, it makes all the difference in the world. If you focus on smooshing the concepts together, then you falsely think the problem is that Facebook didn’t lock down the data. Thus the policy prescription is to lock down data more. And as I explain in the next post, that ACTUALLY MAKES THE PROBLEM WORSE and gives FACEBOOK EVEN MORE CONTROL.

That’s not a solution, and that’s the kind of counterproductive idea that comes out when people go "oh, it’s all the same."

It’s NOT the same. And understanding the difference is vital to not fucking things up.

The issue is not just about Facebook and Cambridge and conservative political alliances. It’s about the power of data and who owns and controls it.

Yes, but when you conflate selling data and selling access, then your policy prescriptions will be TO GIVE FACEBOOK MORE OWNERSHIP AND CONTROL.

Why would you do that?

Your conclusion seems to be that the current situation is bad but it’s better than any realistic alternative.

No. It’s bad and there are better alternatives — but the only way we get to better alternatives is by properly understanding the problem — while you’re insisting on putting blinders on and tackling something that isn’t the problem.
