July 23, 2024


Epicurean Science & Tech

The future of internet anonymity: Jeff Kosseff’s new book

Jeff Kosseff’s last book turned out to be pretty prescient. He published “The Twenty-Six Words That Created The Internet,” a deep look at the history and future of Section 230, right as those 26 words became central to the regulatory fight over the future of the internet.

With his next book, Kosseff, a professor at the Naval Academy, may have done the same thing. The book is titled “The United States of Anonymous,” and it deals with the centuries-old argument about whether people should be allowed to say things without having to identify themselves. In the U.S., courts have given a lot of leeway and protection to anonymous speakers, but the internet has changed the equation, and companies and governments alike are still figuring out what to do.

Read also: an excerpt from Kosseff’s book, on how platforms like Facebook adopted real-name policies while Twitter and others allowed anonymity, and how both decisions changed social networking.

Kosseff joined the Source Code podcast to discuss his new book, how technologies like bulletin boards and Tor and facial recognition are changing the way we think about anonymity, and why he thinks that, even though anonymity allows bad people to do bad things, it’s still worth preserving. And even fighting for.

You can hear our full conversation on the latest episode of the Source Code podcast. Below are excerpts from our conversation, edited for length and clarity.

David Pierce: Why this book? The last thing you wrote about was Section 230. And I think there are some clear lines between that and anonymity. But I’m curious what made you pick this as a subject.

I started to be really interested in anonymity, actually, because of the Section 230 book and the debate that was really happening around Section 230, especially right after the book published in 2019. A lot of the real defenders of Section 230 would often say, “All that Section 230 does is it says that you can’t sue the platform, but you can sue the poster.” And that seems like a pretty simple and clean explanation for Section 230. But as I started looking at a lot of the really difficult cases, what I saw was, that’s not necessarily true.

There’s a few different reasons: one, you might not be able to identify who the poster is. They might be using Tor, they might be using a coffee house Wi-Fi connection, they might be using their neighbor’s Wi-Fi connection. And even if they are there, they might be on a VPN. And then even if they are somehow identifiable by their IP address, what I found really interesting is that courts have set a very high standard for being able to subpoena the identity of an anonymous poster in a defamation case or some other sort of harm that has occurred. And that got me to really start looking at, why do we have this protection? And why did the court set such a high standard to just get someone’s name?

As I started looking at it, I saw you don’t just look at the modern internet, you have to go back to the 1950s and First Amendment cases involving the NAACP. Those are actually rooted in the American history of anonymous and pseudonymous speech going all the way back to colonial pamphlets and the Federalist Papers. And I found it so fascinating that we have this very strong — not absolute, but very strong — right to be anonymous, that goes back to the founding of our country. So that’s really what got me started on this book.

Issie Lapowsky: Like you said, you can trace the history all the way back to colonial times. But then it was really interesting to see how in the early 2000s, around the time of Yahoo bulletin boards, courts were really having to reinvent the standards by which they were going to assess these cases. So I was wondering if you could talk a little bit about the new dynamics that the internet age brought to these centuries-old discussions.

Yeah, it really increased the debate about anonymity, and it presented courts with new challenges that they hadn’t faced before. But again, the same dynamics were there.

What really started happening in the late 1990s, and this almost all happened on Yahoo Finance, was this: For every publicly traded company, there was a separate bulletin board where people could register under pseudonyms. They didn’t have to give their real names, but they had to provide an email address, and it logged IP addresses. And it was meant to talk about stock trading and financial news. But what you suddenly started to see was, often it was employees, sometimes it was investors or competitors who would go onto these bulletin boards and start posting some pretty harsh criticisms of the company’s executives. And by 2022 standards, that doesn’t seem very extraordinary. But in the late ’90s, this was a time when CEOs were not used to these anonymous employees going onto a public forum and criticizing them. It drove them nuts.

And so they figured out what they could do is file a lawsuit against John Does 1-20, for example. And then, as part of the early discovery, issue a subpoena to Yahoo, and Yahoo would give both the IP address and whatever email address they registered under. And oftentimes, that would be enough to identify them, if they used their name in their email address. But sometimes they would only have the IP address, and then the company would issue a second round of subpoenas to the ISP. And at that time, it was pretty straightforward: You’d usually be able to get the name of the person. They’d fire the employee and then dismiss the case. And this happened so many times.

Yahoo would just provide the information without giving any notice to the individuals. And they got a lot of pressure for this. There were groups like EPIC, which really pressured Yahoo to start at least giving people notice, emailing them and saying, “Hey, we got a subpoena, you’ve got two weeks to challenge it. And if not, we’re going to turn over your information.” And suddenly some people started to challenge it. You had people like Megan Gray, who was an associate at a law firm at the time, who started taking on a lot of these cases, and Paul Levy at Public Citizen, who started to really do a remarkable job litigating these cases.

By around 2001 to 2003, you started to have courts say, “OK, we’ve got to figure out what standards to set.” And based on these precedents from the previous 50 years that the Supreme Court and some lower courts had set, what they said is, “We’re not going to say that companies are barred from ever obtaining identifying information. But we’re going to set a really high standard, and a company or a plaintiff is going to have to show a very strong case, before they’re able to obtain the identifying information.” And this is based on the First Amendment right to anonymous speech that goes back for decades.

DP: There’s anonymity, and there’s privacy, which are related but not the same. And even anonymity is, to some extent, a balancing act between privacy and free speech. How do you think about where one ends and the other begins?

In the book, I draw on a definition from David Kaye, who’s a professor at the University of California Irvine. He wrote a report for the UN on anonymity, and he defines anonymity as “the condition of avoiding identification.” And I think that’s a really good starting point.

What these court cases have done is really created a culture of anonymity empowerment, where anonymity is not binary. It’s not that you either use your name or you don’t. It’s really a spectrum, where there’s the ability to at least somewhat control the identifying information that is associated with your speech.

So anonymity is really about identity. Privacy is more about the underlying information. For health records, protecting the actual information in the health record, that’s privacy. Protecting who that is associated with is anonymity. There’s a lot of overlap, and one of the reasons why you might want anonymity is to protect privacy. But that’s not the only reason. You might want anonymity because it actually helps to shape the speech. If the Federalist Papers had just been signed by Hamilton, Madison and Jay, they might have had a different impact than they did under the name of a long-deceased Roman statesman. So there’s that impact of anonymity. There’s also, frankly, just safety and legal reasons why you might very much want to avoid having your name associated with speech, especially if you’re criticizing powerful politicians or businesses. So that’s really the function that anonymity plays: It really ultimately comes down to a power balance.

DP: The control piece of that is really interesting. And this is one place where I think the internet really has changed things: I’m basically helpless except to put a certain amount of information about myself out there in the world, and these companies and people who would like to know who I am have more and more mechanisms all the time, all the way up to facial recognition as I’m walking down the street, to figure out who I am. So it feels like the axis of power is just starting to tilt against the person who wants to stay anonymous, which means maybe the need to protect those people goes up over time.

I think that’s exactly right. And that’s why I argue that we need to preserve the First Amendment protections for anonymous speech. But that’s not sufficient anymore, because contrary to what we’ve heard for the past few years in the Section 230 debate, the First Amendment does not bind the purely private actions of private companies. It binds the government, and it binds the use of government powers, like court subpoenas.

But so much of our identifying information now is in the hands of private companies that do not have an obligation to comply with that First Amendment restriction. So they can — and we’ve seen this in a number of cases, particularly involving data brokers — freely trade information like biometrics and like geolocation data. And so I think privacy laws really need to catch up with the current technologies, and they need to incorporate the need to continue to protect anonymity. And I don’t think that either the current laws in the books or even the proposals have really fully accounted for that yet.

IL: Is there anywhere in the world where you see a country really reconciling this tension between privacy, free speech and anonymity in a way that is helpful, useful or instructive for the U.S.?

A little bit. I think that GDPR has some positive aspects from the anonymity front, but it doesn’t really go far enough. They provide some incentives to pseudonymize and anonymize data. But the problem is, they don’t get very concrete as to what it means to anonymize or pseudonymize data. And there’s research that goes back 20 years that shows if you have a data set with a few characteristics, like your hometown and your birth date, that’s going to be enough to be able to identify a lot of the population.

We’re starting to see that copied in laws like California’s and Virginia’s, where there are some vague references to anonymization and pseudonymization. But what I’d like to see is more effort, for example, to make the data fuzzier so you can’t do that.

What we’ve seen with GDPR, and in CCPA and some other laws and proposals, has been an effort to give people the ability to control their data, so they can request access and deletion of the data. And that’s good. I think that’s a good step. But the problem is, I don’t know how effective that is when you’re dealing with companies like data brokers. I mean, I work in this general field, and I have no idea which companies have my data. You’re requiring people to guess, who do I send this request to? And that’s a pretty big burden.

What I’d like to see, and I think a few jurisdictions have done this for the law enforcement use of facial recognition data, is to just say you can’t use it. There are certain types of uses of identifying information that we’re not going to tolerate, and we’re going to ban it. That’s the best way that I could see to be able to really continue to protect anonymity.

DP: One of the reasons I really like this subject is that the case against anonymity is very compelling. People use anonymity to do a lot of bad things on the internet! Many of which you document in the book! And I think the reason this argument has been so heated over the years is that if you argue in favor of anonymity, it’s very easy to accuse you of just trying to protect the bad guys. I think you err on the side — and correct me if I’m mischaracterizing this — of saying basically, the benefits outweigh the downsides, but we should still work on the downsides. But how have you grappled with the balance there, that part of accepting this is going to be just accepting that some bad things are going to happen as a result?

Well, some bad things will happen anonymously, but a lot of bad stuff also is happening under people’s real names. Facebook has always had a real name policy, which people can circumvent. Or, I mean, I would just use my local Nextdoor, which I’ve muted, because it’s so awful.

There’s actually some research that I cite in the book that has looked at how aggressive people are, commenting both under their real names and under pseudonyms. And what the research found was, they’re actually more aggressive when they’re commenting under their real names. That surprised me, but then when you think about it, it does make sense: They feel perhaps more ownership, and they feel like they have to get more defensive.

I spent a whole chapter talking about some particularly bad stuff that happens with someone who’s using Tor and a VPN and all other sorts of stuff. But say we passed a law, what sort of legal requirement would have kept him from doing that? And I worry that some sort of real-name requirement, which has been proposed informally over the past few years, would cause a whole lot of well-intentioned people who have valid reasons for speaking to sit down and not speak at all. But the really bad actors, I don’t know how much that would actually prevent them from doing bad stuff.

DP: A running theme in your book is this evolving standard of when courts will decide people have voided their right to anonymity. And what was surprising to me is that I couldn’t quite put my finger on where that bar is. What is your sense of what it takes for a case, or a company who wants to unmask somebody, to clear that bar?

It depends on the type of case. So if it’s a defamation case, or a case where an employer is saying, “This was one of our employees and they disclosed trade secrets,” that’s a pretty high bar. It depends on the jurisdiction because the courts have come up with different tests, but the most important thing is that the plaintiff has to show that they have a very strong case. They can’t just say, “This was defamatory,” they have to demonstrate in their court documents why it’s defamatory and present evidence as to why it’s defamatory. And defamation is a high bar.

It’s relatively rare that a plaintiff is able to overcome that. And when they do, it’s generally for a very good reason. And I think that’s a good balance. I don’t think that there should be an absolute right to be anonymous — frankly, some of the really worst actors will probably be using Tor and other things where the subpoena is not going to help all that much. But the problem is that for other sorts of cases, they don’t have such strong protections.

With copyright cases, this started in the early 2000s, when the recording industry had the interesting strategy of suing listeners and trying to find out who had been sharing songs on peer-to-peer sites. And what came out of that was that copyright infringement itself, the judge said, is not First Amendment protected. And that standard, set in 2004, has been pretty widely adopted around the country for copyright cases. And for music cases, I think they would often settle for a few thousand dollars, typically, but the problem that I’ve seen is that over the past decade, that standard has been used most frequently in cases involving peer-to-peer sharing of legal pornography.

Some of the most prolific plaintiffs in the past decade in copyright John Doe cases have been companies that own the copyright to pornographic movies that are shared, usually on BitTorrent. And people settle for a good amount of money, because they frankly don’t want to end up having their name associated with this. It’s very embarrassing.

One judge called it an extortion scheme. And I get very worried when we’re using our court system in that way, and it really raises a number of both anonymity and privacy interests.

IL: One thing that I’ve been really concerned about, and that has gotten a good amount of coverage recently, is these smear sites that have popped up. You know, She’s A Homewrecker Dot Com or whatever. I have a friend whose name was just plastered all over these sites. How has the court system come to treat those situations, and the victims of these baseless online smears on all of these sites that have a monetary incentive to publicize this stuff? Has it been friendly to the plaintiffs in those cases?

It really depends on the site. Some sites both won’t log IP addresses and also have a policy of never taking anything down, and that gets to Section 230. (I actually couldn’t stay away fully from Section 230 in this book.) One of the proposals that I have is that you can still sue the poster, and even if they don’t show up, you’re never going to be able to get damages, but the court still can determine that, yes, this was defamatory.

This happened in a case in California where there was actually a lawsuit between the subject and the poster, and the poster never showed up in court. The court issued a default judgment saying “this is defamatory,” and as part of that ordered Yelp to take down the content because this is not protected content anymore. It’s defamatory. Yelp challenged the order, and the case went up to the California Supreme Court, which had a divided opinion, but essentially the plurality said Section 230 protects Yelp from having to comply with this court order to take down material that’s adjudicated defamatory.

And I see some reasons for that. If it’s a default judgment, perhaps the court didn’t really look at whether it was defamatory. And Eugene Volokh has done great research where he’s found a number of forged court orders, because many platforms will voluntarily abide by court orders that something’s defamatory and take it down. And that’s problematic, but that’s a problem with people forging court orders. I’m not comfortable with Section 230 being used to avoid taking down content that’s adjudicated defamatory. This is different from just complaining to the site, and the site doesn’t have to spend any money defending itself.

It’s still a very big burden to file a defamation lawsuit. And there are even some Streisand Effect issues where you might not want to bring the lawsuit because that may draw more attention to the harmful material. But at the same time, I speak with a lot of victims of various awful things that have happened online, and what I find most often is they’re not looking for money. What they want is to get this taken off the internet. I think there has to be a very high bar for that, because I don’t think we want to have a way to create an easy veto of anything you don’t like, but if something’s really awful, and actually is defamatory, I don’t think that we should have a legal system where it just stays up in perpetuity.