AI could make trolls think twice before retweeting offensive content

Hate speech has existed in the United States since the country's founding. Despite efforts to become a more progressive, inclusive nation, the rise of social media has made hateful keyboard warriors the poster children for modern-day America. Many social media platforms have attempted to counter hate speech with AI-driven moderation, but the volume of hateful messages on sites like Twitter and Facebook remains steady. There may be no perfect solution to hate speech on social media, but third-party companies are jumping in to take a stab at building anti-hate bots to help clean up content on some of the most popular platforms.

We Counter Hate

Possible recently launched a campaign called We Counter Hate that aims to curb the spread of hate speech on Twitter by turning retweets of hateful tweets into donations to an organization called Life After Hate. The company teamed up with Spredfast to train an AI to identify Twitter users spreading hateful messages. Once the AI detects a hateful tweet, human moderators step in to determine the appropriate response.

If a moderator decides the tweet picked up by the AI is, in fact, hate speech, they send a counter message that says, “This hate tweet is now being countered. Think twice before retweeting. For every retweet, a donation will be committed to a non-profit fighting for equality, inclusion, and diversity.” The message also links to the campaign’s website to provide additional information.
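Mechanically, the counter-reply is a simple bot action. The campaign hasn't published its code, so the sketch below is only an illustration of how such a reply could be posted with the tweepy library; the credentials, tweet ID, and campaign URL are placeholders.

```python
import tweepy

# Placeholder URL; the article doesn't give the campaign site's exact address.
CAMPAIGN_URL = "https://wecounterhate.example"

COUNTER_MESSAGE = (
    "This hate tweet is now being countered. Think twice before retweeting. "
    "For every retweet, a donation will be committed to a non-profit fighting "
    f"for equality, inclusion, and diversity. {CAMPAIGN_URL}"
)

# Placeholder credentials; a real bot would load these from secure configuration.
auth = tweepy.OAuth1UserHandler(
    "CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET"
)
api = tweepy.API(auth)

def counter_tweet(hateful_tweet_id: int) -> None:
    """Reply to a moderator-confirmed hateful tweet with the fixed counter message."""
    api.update_status(
        status=COUNTER_MESSAGE,
        in_reply_to_status_id=hateful_tweet_id,
        auto_populate_reply_metadata=True,  # prefix the reply with the author's @handle
    )
```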

In an email to the VentureBeat team, Courtney Kaczak, PR director at Possible, said, “This reply permanently marks these messages of hate and makes it clear to those who wish to spread hate speech that each retweet of this message equals a donation to U.S. non-profit Life After Hate, an organization that helps reform and remove people from violent extremist groups.”

The team selected Twitter as its first target because the platform seems to be the megaphone of choice for hate groups.

We did some digging and found an example of the bot at work.


How the machine determines hate speech

The team at Possible adapted Gregory Stanton’s Ten Stages of Genocide to build a system for identifying hate speech. Stanton’s framework guided the team in understanding the processes of classification and dehumanization. They condensed its stages to include only those relevant to the Twittersphere, and added categories for more contemporary behaviors found on social media, such as coded language.

Kaczak shared a table outlining the structure the AI uses to define hate speech.
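As a purely illustrative guess at what a condensed, Twitter-focused taxonomy adapted from Stanton's stages might look like in code, consider the structure below; the category names and descriptions are placeholders, not the campaign's actual table.

```python
# Illustrative only: category names and descriptions are placeholders,
# not We Counter Hate's actual classification table.
HATE_SPEECH_CATEGORIES = {
    "classification": "dividing people into 'us' and 'them' by group identity",
    "symbolization": "hate symbols or slogans in text, avatars, or handles",
    "dehumanization": "language that denies a group's humanity",
    "coded_language": "dog whistles, numeric codes, deliberate misspellings",
}
```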


A look at the AI driving this campaign

Possible’s technology uses machine learning to analyze thousands of tweets and return hate classifications within milliseconds. Kaczak noted that the platform is flexible enough for moderators to adapt it as they identify new terms used by hate groups on social media.
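Possible hasn't disclosed its model, but a minimal sketch of fast text classification in scikit-learn shows why millisecond scoring is plausible once a model is trained; the training examples below are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training data; a real system would train on the moderated,
# labeled tweet streams described below.
texts = ["example hateful message", "example benign message"]
labels = [1, 0]  # 1 = hate speech, 0 = not hate speech

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# Scoring a single tweet is a sparse dot product, so it runs in well under
# a millisecond on commodity hardware.
hate_probability = model.predict_proba(["an incoming tweet"])[0][1]
```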

The first step in bringing the campaign to life was building the machine. The company combined enterprise-level natural language processing and image recognition APIs to review and interpret tweets in real time.
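The article doesn't name those platforms, so the endpoints and payloads below are hypothetical stand-ins; the sketch only shows the shape of a real-time review step that fans a tweet's text and images out to external NLP and vision APIs.

```python
import requests

# Hypothetical endpoints; the actual vendors aren't named in the article.
NLP_API = "https://nlp.example.com/v1/moderate-text"
VISION_API = "https://vision.example.com/v1/moderate-image"

def review_tweet(text: str, image_urls: list[str]) -> dict:
    """Send a tweet's text and attached images out for real-time analysis."""
    text_result = requests.post(NLP_API, json={"text": text}, timeout=2).json()
    image_results = [
        requests.post(VISION_API, json={"image_url": url}, timeout=2).json()
        for url in image_urls
    ]
    return {"text": text_result, "images": image_results}
```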

The next step was training the machine. Possible worked with Spredfast, whose intelligent social listening platform moderates incoming messages and categorizes them into streams of hate speech. The team at Possible feeds these streams into the machine on an ongoing basis so it can pick up linguistic nuances and keep learning.
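One common way to fold a continuing stream of labeled examples into a model, and a plausible stand-in for the process described here, is incremental training. This sketch uses scikit-learn's partial_fit with a stateless hashing vectorizer so new batches never change the feature space; the batch source is a placeholder.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# A stateless vectorizer keeps the feature space fixed across batches.
vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
model = SGDClassifier(loss="log_loss")  # logistic regression trained online

def train_on_batch(tweets: list[str], labels: list[int]) -> None:
    """Fold a freshly moderated batch of labeled tweets into the model."""
    X = vectorizer.transform(tweets)
    model.partial_fit(X, labels, classes=[0, 1])
```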

Although AI helps the team filter massive volumes of tweets for potentially hateful messages, machines are not perfect at identifying every instance of hate. The machine can misread innocent messages as hate speech based on certain words or phrases they contain. That is why human moderators evaluate the machine’s work and respond only to tweets that actually contain hateful content.
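In code, that human-in-the-loop step can be as simple as a confidence threshold and a review queue; the threshold value here is an assumption, and counter_tweet refers to the reply helper sketched earlier.

```python
from queue import Queue

review_queue: Queue = Queue()
FLAG_THRESHOLD = 0.85  # assumed value; tune to balance recall vs. moderator load

def triage(tweet_id: int, text: str, hate_probability: float) -> None:
    """Queue likely hate speech for human review; the machine never replies on its own."""
    if hate_probability >= FLAG_THRESHOLD:
        review_queue.put({"id": tweet_id, "text": text, "score": hate_probability})

def moderate(item: dict, is_hate: bool) -> None:
    """Only a moderator's confirmation triggers the counter-reply."""
    if is_hate:
        counter_tweet(item["id"])  # reply helper from the earlier sketch
```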

Where does the money come from?

Obviously, a campaign like this requires serious funding to work, and the whole idea runs on donations from the public. Supporters can pledge a monthly dollar amount that goes toward sponsoring the donations that counter hateful tweets. Kaczak says all donations go directly to the campaign’s beneficiary, minus the service fee collected by the online fundraising platform, Public Good.
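The exact per-retweet mechanics aren't spelled out, but the bookkeeping presumably resembles this sketch, where monthly pledges fill a pool that retweets of countered tweets draw down; every dollar figure here is invented for illustration.

```python
def monthly_donation(pledged_pool: float, retweets: int,
                     per_retweet: float = 0.10) -> float:
    """Commit a fixed amount per retweet of countered tweets, capped by the pool.

    All figures are illustrative; the campaign hasn't published its rates.
    """
    return min(retweets * per_retweet, pledged_pool)

# Example: 5,000 retweets against a $1,000 pool at $0.10 each commits $500.
print(monthly_donation(pledged_pool=1000.0, retweets=5000))
```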

We Counter Hate also accepts suggestions from the public for Twitter handles it should watch for hate speech.

Will it work?

It’s hard to put faith in a third-party campaign aiming to end hate speech when Twitter itself can’t seem to find an effective way to stop it on its own platform. Possible has set a hefty goal, and while its strategy is intriguing, the team will have its work cut out for it in making this concept a success. Here’s hoping the effort puts at least a small dent in the amount of hate speech we see in our Twitter feeds.
