Supriya Ghosh (Editor)

Hatebase


Hatebase is a joint project of the Sentinel Project for Genocide Prevention and Mobiocracy, described on its website as an "online repository of structured, multilingual, usage-based hate speech". It analyzes speech and written content (including radio transcripts, transcripts of spoken web content, tweets, and articles) for patterns of hate speech in order to predict potential regional violence. The full source code for the API is available as open source on GitHub.

History

Hatebase was announced on the Sentinel Project blog on March 25, 2013. The initiative is led by Timothy Quinn of Mobiocracy.

Description

In an article for Foreign Policy, Joshua Keating described Hatebase as follows: "There are two main features to Hatebase. The first is a Wikipedia-like interface which allows users to identify hate speech terms by region and the group they refer to. This could have some value for researchers, but Hatebase's developers are especially excited by the second main feature, which allows users to identify instances when they've heard these terms used." Both that article and an article about Hatebase in Maclean's cited the example of the Rwandan Genocide: in the months leading up to the genocide, radio stations sought to dehumanize Tutsis in the eyes of Hutus by repeatedly referring to them as cockroaches.

The regional and multilingual focus of the site was deemed particularly useful for identifying words that are hateful in some languages and contexts but that outsiders would not recognize as such. One example is the word "sakkiliya" in Sinhalese (a language spoken in Sri Lanka), used to refer to a Tamil person as 'a very unhygienic or uncultured person'. Another is the reference to Tutsis as cockroaches by the Rwandan radio stations, which an outsider might simply take as evidence that the region was suffering from a literal cockroach infestation. This relates to the challenge of distinguishing subtly different uses of the same or similar words, one of which connotes hate while the other does not. In the context of language that equates humans with pollution or stains, this is also called the human stain problem.

Another related challenge is to control for the ambient level of casual hate speech in society (such as in YouTube comments): in some societies and contexts, hateful language may not be accompanied or followed by violence, whereas in others it might. For this reason, hate speech data was considered valuable only in conjunction with other evidence about the risk and threat of violence, and the project concentrated its efforts on mapping hate speech in regions with a history of violence.
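The idea of controlling for an ambient level of hate speech can be illustrated with a small sketch. This is not Hatebase's code or API: the Sighting record, the function names, the per-region document counts, and the ratio threshold of 2.0 are all hypothetical, chosen only to show how a region's current rate of reported terms might be compared against its own baseline rather than against an absolute cutoff.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    """A hypothetical record of one reported use of a listed term."""
    region: str
    term: str


def sighting_rates(sightings, documents_per_region):
    """Return sightings per analyzed document for each region."""
    counts = Counter(s.region for s in sightings)
    return {
        region: counts[region] / total
        for region, total in documents_per_region.items()
        if total > 0
    }


def elevated_regions(current, baseline, ratio_threshold=2.0):
    """Return regions whose current rate is well above their own baseline.

    The threshold is an illustrative assumption, not a published value.
    Regions with no baseline (or a zero baseline) are skipped.
    """
    flagged = []
    for region, rate in current.items():
        base = baseline.get(region)
        if base and rate / base >= ratio_threshold:
            flagged.append(region)
    return sorted(flagged)


if __name__ == "__main__":
    # Placeholder data: region B has more sightings in absolute terms,
    # but only region A has risen sharply relative to its usual level.
    sightings = [Sighting("A", "term_x")] * 4 + [Sighting("B", "term_x")] * 10
    rates = sighting_rates(sightings, {"A": 100, "B": 100})
    print(elevated_regions(rates, {"A": 0.01, "B": 0.09}))
```

In this sketch a region with a high but stable rate is not flagged, while a region with a lower rate that has risen sharply is. As the text notes, such a signal would still only be meaningful alongside other evidence about the risk of violence.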

API

The application programming interface (API) for Hatebase is available on GitHub, along with all the source code. Information about the API can also be found at Programmable Web and Mashape.

Reception

The launch of Hatebase was covered in Wired Magazine, and the story was picked up and discussed on Slashdot. Hatebase was also covered in two Canadian publications: Metro News and the weekly Maclean's.

Joshua Keating covered Hatebase in an article for Foreign Policy. A week later, the magazine published a response letter by Gwyneth Sutherlin, a doctoral candidate at the University of Bradford, pointing out potential problems and limitations of the approach used by Hatebase.
