Battling Online Bots, Trolls and People
(Inside Science) -- Half a century ago in 1968, Philip K. Dick published the sci-fi classic Do Androids Dream of Electric Sheep? which would later become the movie "Blade Runner." In the book, Dick introduced a device called the Voigt-Kampff machine -- an interrogation tool specifically designed to tell an android from a human. Now life is imitating art, with bots posing as humans running rampant on the internet. But unlike the androids in Dick's novel, the bots today are without a body or even a face. They meddled in the 2016 U.S. presidential election, and some experts say they are poised to continue the assault in the 2018 midterms.
But if we can't round them up in a room and ask them a series of questions, how can we pick out the bots among an online crowd of billions?
I. Human Programmers and Detectives
"I can never go to sleep. But I can lay quietly and not make a peep."
- David, "A.I. Artificial Intelligence" (2001)
If someone is tweeting 24 hours a day, then it could be a human who drinks a lot of Red Bull -- but more likely, it's a bot.
Unlike physical robots, bots today are just computer algorithms that perform whatever tasks programmers design them to. A bot can do anything from sending spam to a million email addresses to bringing down a website by overwhelming its bandwidth to writing impressively coherent poems and posting them on Twitter.
In the case of Twitter bots trying to manipulate the political news cycle, looking at the time stamps of an account's tweets is just one way to judge whether the hundred or so characters came from a human or a bot. There are many subtler hints that can give away a bot pretending to be human.
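The round-the-clock signal alone can be sketched in a few lines of Python. This is a minimal illustration, not a method from any study; the 22-hour threshold is an invented assumption.

```python
from datetime import datetime, timezone

def active_hours(timestamps):
    """Count the distinct UTC hours of the day in which an account has tweeted."""
    return len({datetime.fromtimestamp(ts, tz=timezone.utc).hour
                for ts in timestamps})

def looks_automated(timestamps, hour_threshold=22):
    """Heuristic: humans sleep, so an account that tweets in nearly every
    hour of the day (here, 22 or more of 24) is suspicious.
    The threshold is illustrative, not taken from any published research."""
    return active_hours(timestamps) >= hour_threshold
```

An account active in only a plausible waking window would pass this check, while one posting around the clock would be flagged for closer inspection.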
Political scientists and computer engineers have been investigating political bots and trolls on social media since at least the mid-2000s, long before Russia's cyber campaign to infiltrate social media platforms to influence the 2016 U.S. presidential election.
In a study recently published in the journal Big Data, researchers trained human coders to help identify Russian bots on Twitter in 2014 and 2015. The human coders -- 50 undergraduate students from Moscow -- were taught to look for clues such as whether a user's avatar is blank or looks like a stock photo, whether a username contains too many numbers, and whether a profile description looks too generic. The coders also looked for more nuanced details, such as whether the ratio between an account's friends and followers looked abnormal, or whether its tweets only contained text and never photos.
But "bots don't reveal themselves to be bots," said Joshua Tucker, a political scientist at New York University and an author of the paper. Without easy access to the ground truth, the researchers did the next best thing: they compared answers among the human coders, which turned out to be remarkably consistent with one another. If a Twitter account looks like a bot and sounds like a bot to 49 out of 50 people, then it probably is a bot.
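With no ground truth available, agreement itself becomes the label. That majority-vote idea can be sketched simply (the votes below are made-up data, not from the study):

```python
from collections import Counter

def majority_label(votes):
    """Return the label most coders chose and the fraction who agreed on it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)

# 49 of 50 hypothetical coders flag the account as a bot
votes = ["bot"] * 49 + ["human"]
label, agreement = majority_label(votes)  # ("bot", 0.98)
```

High-agreement labels like this one can then serve as training data even though no one ever confirmed the account's true nature.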
In all, the 50 trained human coders sniffed out roughly 250 bots among a thousand or so Twitter accounts that tweeted about Russian politics during events surrounding Russia's annexation of Crimea. But a thousand accounts are only the tip of the Twitter iceberg. To investigate further, the researchers had to borrow the power of computers themselves.
Using machine learning algorithms combined with data curated by the human coders, the researchers trained computers to analyze almost a quarter of a million Twitter accounts that tweeted more than 15 million times in 2014 and 2015.
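Before a machine learning model can use them, profile cues like the ones the coders relied on must be turned into numbers. A minimal sketch of that feature-extraction step, with an invented `account` dict standing in for real Twitter data (the field names are assumptions, not Twitter's API):

```python
def profile_features(account):
    """Convert a profile (a plain dict with illustrative keys) into
    numeric features resembling the coders' cues."""
    username = account["username"]
    followers = max(account["followers"], 1)   # avoid division by zero
    retweets = max(account["retweets"], 1)
    return {
        # many digits in the handle is a classic bot tell
        "digit_fraction": sum(c.isdigit() for c in username) / len(username),
        "default_avatar": 1.0 if account["default_avatar"] else 0.0,
        # an abnormal friends-to-followers ratio stood out to the coders
        "friend_follower_ratio": account["friends"] / followers,
        # the study's bots tended to retweet text without links
        "text_only_retweet_share": account["text_only_retweets"] / retweets,
    }
```

Feature vectors like these, paired with the coders' labels, are the kind of input a standard supervised classifier would be trained on.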
They found that out of the 250,000 accounts, 97,000 were active throughout the period, of which roughly two-thirds may be bots. On certain days, more than half of all tweets related to Russian politics were posted by these presumed bots. As expected, the bots were less likely to share their location or tweet from a mobile device, but they also tended to retweet more than humans did, and when they did, they tended to only retweet the text without including any links.
Contrary to what the researchers previously expected, these bots don't simply tweet and retweet intensively like the old-timey bots that started inundating online forums and comment threads with Viagra ads in the early 2000s. Instead, they seem to have developed some finesse in their approach. This suggests that there might have been a battle between the programmers at Twitter and the Russian bot architects, each trying to gain control over what kind of information is fed to other users' Twitter feeds.
This war between programmers and bot architects isn't anything new -- think of your email spam filter versus the "Nigerian Prince" -- but now this war has extended into every other corner of the internet, and in turn, every aspect of our modern-day life.
"One of the big things that we have noticed is the rise of headlines from RT (formerly Russia Today) and Sputnik when we search for news stories online. These are essentially state-sponsored media organizations from Russia," said NYU's Tucker. “Basically, you have Google search ranking algorithms trying to figure out how to avoid what these bots are doing, and you have bots trying to disguise themselves from Google."
"And the cat-and-mouse game goes on,” he said.
II. Human Trolls and Sponsors
"'More human than human' is our motto."
- Dr. Eldon Tyrell, "Blade Runner" (1982)
As advanced as bots are today, fake news on the internet is still mostly generated by humans. And in the tug of war between bots and bot detectives, political actors can gain an edge by spending a little more money and hiring actual humans, commonly known as paid trolls, to shape information on the internet as well.
Paid trolls are humans who do things like advancing certain political agendas online, and they are much more difficult to detect than bots. After all, short of digging up someone's IP address, how can you tell if a YouTube comment is coming from the basement of the Kremlin or a bungalow in Indonesia?
The internet's global accessibility and user anonymity have given bots and trolls a power unlike that of any medium that came before. Long gone are the days of propaganda via airdropped leaflets and hacked TV towers. Today, paid trolls can spew fake news directly into American homes from anywhere on the planet, as long as there's an internet connection.
To combat exploitation by state-sponsored trolls, Facebook has begun labeling paid political ads and requiring users who run popular Facebook pages to be verified. While none of these protocols can completely stamp out misinformation on the internet, they do provide a little bit of protection, according to Robyn Caplan, a social media expert from the Data and Society Research Institute in New York City.
"If a website requires a phone number to verify an account, it means that in order to create a fake account you’ll need to buy a burner phone, which would be an additional investment," said Caplan.
Anonymity has long been a mainstay of the internet. A double-edged sword, anonymity encourages people to speak their minds freely without fear of social scrutiny or even legal repercussions. Different countries have approached the issue of internet anonymity with different levels of control. The Chinese government, for example, now requires all users of Weibo -- the Chinese equivalent of Twitter within the Great Firewall -- to personally verify their accounts through its "Real Name Registration" program.
On the surface, the purpose of the "Real Name Registration" program is to stop the spread of misinformation on the internet, but some commentators see it as an attempt by the government to further control online speech. China already has stricter laws than the U.S. for its netizens, such as banning online platforms from creating, publishing or promoting content that's "damaging national honor and interests.”
"There's always going to be a trade-off between anonymity and accountability and free speech and civility. I don't think it makes sense to say that we should be on one side or the other -- there's a sweet spot in the middle, but nobody knows where that sweet spot is," said Rishab Nithyanand, also a researcher from the Data and Society Research Institute. "It's a philosophical debate that needs to happen."
III. Human Legislators and Regulators
"There are 400,000 words in the English language and there are seven of them that you can't say on television. What a ratio that is!"
- George Carlin, "Seven Words You Can Never Say On Television" (1972)
Someone can swear up a storm on Twitter to millions of followers but not on a local radio show listened to by hundreds. Why?
"The internet has always been like the Wild West," said Thaddeus Hoffmeister, a social media law expert from the University of Dayton in Ohio.
Information on the internet enjoys a great deal more freedom when compared to TV and radio, which are more heavily regulated by the Federal Communications Commission. Each social media platform usually has its own set of community guidelines. For example, you can find certain kinds of adult content on Twitter that are not allowed on Facebook.
Earlier this year, the U.S. government took a new step in regulating content on the internet. A new law signed by President Donald Trump this April, called the Stop Enabling Sex Traffickers Act and Allow States and Victims to Fight Online Sex Trafficking Act, or FOSTA-SESTA, adds an exception to the existing Section 230 of the 1996 Communications Decency Act, which protects online platforms from being held accountable for what their users post.
Lawmakers named Backpage -- an online classified advertising website similar to Craigslist -- as one of the main motivations behind the law. The site had been sued repeatedly for knowingly harboring ads related to sex trafficking and child pornography, but lawyers had struggled to win lawsuits because of Section 230.
However, while the law might be backed by good intentions, Hoffmeister warned of its possible unintended consequences. Legal experts have criticized the language of the bill as being "extremely vague and broad." After the law was passed, many websites including Craigslist and Reddit shut down parts of their communities.
Further regulations may also help cement market control for existing tech giants such as Twitter and Facebook, which already have systems in place to monitor user content, Hoffmeister noted. By Facebook's own report this month, the company removed 21 million pieces of pornographic content in the first three months of 2018 alone. A smaller startup may not have the resources to keep up with stringent regulations, Hoffmeister said.
"When Zuckerberg was launching Facebook from his dorm, he didn't have to worry about these things. So, if we enact new regulations to control bots and fake news, how would this affect the next startup company?" said Hoffmeister.
Even if we introduce new laws and platform protocols now to curb these malicious bots and trolls, will it be too late? According to a recent study, negative partisanship -- the idea that people band together not to support their own party but to oppose the other one -- is on the rise and is contributing to incivility on social media. So, is this recent intensification of division just a natural development in the United States' two-party system, or is there something else feeding the fire?
IV. Human Moderators and Enforcers
“A mob's always made up of people."
- Atticus Finch, To Kill a Mockingbird (1960)
Intense debates between Democratic and Republican frenemies are as American as fireworks on the Fourth of July, but some surveys indicate the political divide today is wider than it has been in decades. To study the trend of political arguments on social media, Nithyanand looked at Reddit, an online news aggregator and discussion platform that has seen a rapid rise in popularity since the early 2010s.
Unlike on Facebook and Twitter, posts and comments on Reddit are constantly moderated by human volunteers as well as computer algorithms. All content on Reddit is sorted into smaller communities called subreddits, with names like /r/liberal or /r/conservative, each with its own set of rules, plus automatic filters and human moderators that remove anything that violates those rules. Because of this, Reddit provides thousands of microcosms for researchers to investigate how each subreddit community shapes its users' behavior.
After analyzing more than 100 million Reddit comments between 2015 and 2016, Nithyanand discovered a clear asymmetry between conservative and liberal subreddits. During that period, conservative subreddits saw a 1,600 percent increase in links shared from controversial outlets known for peddling conspiracy theories and fake news, while liberal subreddits experienced almost no such activity.
Although it is difficult to tell whether the 1,600 percent increase reflects a deliberate effort to infiltrate conservative subreddits rather than liberal ones, Nithyanand believes that differences in rules and moderating philosophies across subreddits played a significant role. He found that posts from controversial outlets on liberal subreddits are usually disabled or removed quickly by moderators and automatic filters, and receive little to no participation from users. In contrast, moderators of conservative subreddits take a relatively hands-off approach, allowing their users more freedom to post and share content.
This lack of action by moderators of conservative subreddits also likely fueled a huge influx of comments, many of them offensive and hateful, from users associated with extremist subreddits, according to Nithyanand's study. Conservative subreddits saw more than a sixtyfold increase in such activity leading up to the 2016 U.S. presidential election, while liberal subreddits experienced only a twofold increase in similar traffic.
Nithyanand's team also found troll accounts that launched inflammatory comments from both sides of the political spectrum. Because inflammatory comments tend to breed more comments, these uncivil arguments are sustained and echoed in subreddits where moderators are either reluctant or unable to effectively flag and remove the comments.
"We should help provide tools for already overworked moderators to make it easier for content moderation," wrote Nithyanand in an email.
However, at the end of the day, regardless of how invasive these bots and trolls are online, they cannot directly choose our government for us. We the people do.
V. Human Citizens and Voters
"R2-D2! You know better than to trust a strange computer!"
- C-3PO, "Star Wars: Episode V - The Empire Strikes Back" (1980)
The power of bots and trolls lies in their ability to deceive humans. A recent report by the RAND Corporation, an American nonprofit policy think tank, listed a number of recommendations for defending democracies in the face of malicious bots and trolls on social media. The group's recommendations include technical approaches, such as developing and implementing more effective methods for identifying and tagging bots and trolls, as well as "offline" solutions, such as educating the public to be more vigilant against fake news and rumors.
Instead of relying on social media companies to fight this fight alone, the report recommends that countries educate their citizens in media literacy, expand and improve content from trustworthy local news outlets so it can better compete with foreign state-sponsored propaganda, and empower influencers on social media.
According to Tucker, the fights between opposing political sides over how information is shared online may trigger something like an arms race, with both sides constantly trying to one-up each other. That creates a new battleground where political power can be won or lost by manipulating citizens who may remain unaware of these actors' motives and actions, he said.
"You have regimes that are trying to figure out ways to use social media to enhance their power, but you also have opponents of those regimes who are trying to enhance their own ability to contest against them," said Tucker.
If "information is the currency of democracy" -- a quote misattributed to Thomas Jefferson on some websites -- then it's understandable why political actors might seek to manipulate information on social media and the internet. At the end of July, Facebook announced it had shut down 32 fake accounts spreading divisive online messages in the run-up to the midterm elections.
"There were a lot of ideologies involved in what the internet is supposed to be -- I think somebody put it this way -- it's supposed to give opacity to the powerless and bring transparency to the powerful. But the internet has evolved from that and continues to change," said Nithyanand. "We need to help the public understand that this is something they should actually care about.”