WHY THIS MATTERS IN BRIEF
As everyone finds a voice online, every user becomes a “news platform” able to spread whatever information they like, and the big technology companies need new tools to cope.
In today’s world, fake news and misinformation are problems so large that they threaten to destabilise countries and governments, undermine trust between people, and ultimately endanger democracy itself. It is a problem still looking for a solution. Now, though, researchers from the University of Sheffield in the UK have developed an Artificial Intelligence (AI) system that detects which social media users are likely to spread disinformation before they actually share it.
Suffice to say that could be game changing, even though it raises questions about the morality of censoring people before they have said anything. Ironically, just as in the world of pre-crime technology, which predicts who is going to commit a crime before they actually do, we would then have to discuss the implications of pre-censoring people before they have done anything. As I always say, the future is an odd place full of increasingly odd must-have debates.
The team found that Twitter users who share content from unreliable sources mostly tweet about politics or religion, while those who repost trustworthy sources tweet more about their personal lives.
“We also found that the correlation between the use of impolite language and the spread of unreliable content can be attributed to high online political hostility,” said study co-author Dr Nikos Aletras, a lecturer in Natural Language Processing at the University of Sheffield.
The team reported their findings after analysing more than 1 million tweets from around 6,200 Twitter users.
They began by collecting posts from a list of news media accounts on Twitter, which had been classified as either trustworthy or deceptive in four categories: Clickbait, Hoax, Satire, and Propaganda.
They then used the Twitter public API to retrieve the most recent 3,200 tweets for each source, and filtered out any retweets to leave only original posts.
Next, they removed satirical sites such as The Onion that have humorous rather than deceptive purposes to produce a list of 251 trustworthy sources, such as the BBC and Reuters, and 159 unreliable sources, which included Infowars and Disclose.tv.
They then placed the roughly 6,200 Twitter users into two separate groups: those who have shared unreliable sources at least three times, and those who have only ever reposted stories from the trustworthy sites.
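The grouping step above can be sketched in a few lines of Python. This is a hypothetical illustration, not the researchers' actual code: the source names and user histories below are invented, but the thresholds (at least three shares of an unreliable source, versus only-ever-trustworthy shares) follow the article.

```python
from collections import Counter

# Toy source lists standing in for the study's 159 unreliable and
# 251 trustworthy outlets (names here are just placeholders).
UNRELIABLE = {"infowars", "disclose.tv"}
TRUSTWORTHY = {"bbc", "reuters"}

def group_user(shared_sources):
    """Assign a user to a cohort from the list of sources they shared.

    Returns "unreliable" if the user shared unreliable sources at least
    three times, "trustworthy" if they only ever shared trustworthy
    sources, and None otherwise (the user would be excluded).
    """
    counts = Counter(shared_sources)
    unreliable_shares = sum(counts[s] for s in UNRELIABLE)
    trustworthy_shares = sum(counts[s] for s in TRUSTWORTHY)
    if unreliable_shares >= 3:
        return "unreliable"
    if unreliable_shares == 0 and trustworthy_shares > 0:
        return "trustworthy"
    return None  # mixed or insufficient activity

print(group_user(["bbc", "reuters", "bbc"]))                # trustworthy
print(group_user(["infowars", "infowars", "disclose.tv"]))  # unreliable
print(group_user(["bbc", "infowars"]))                      # None
```

Users who fall into neither bucket are simply dropped, which keeps the two cohorts cleanly separated for training.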
Finally, the researchers used the linguistic information in the tweets to train a series of models to forecast whether a user would likely spread disinformation.
Their most effective method used a neural model called T-BERT. The team says it can predict with 79.7 percent accuracy whether a user will repost unreliable sources in the future:
“This demonstrates that neural models can automatically unveil (non-linear) relationships between a user’s generated textual content (i.e., language use) in the data and the prevalence of that user retweeting from reliable or unreliable news sources in the future,” said their paper.
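The paper's T-BERT model itself is not reproduced here, but the underlying idea, predicting a user's future sharing behaviour from the language of their tweets, can be illustrated with a much simpler text classifier. The sketch below is a toy baseline, assuming scikit-learn; the tweets are invented examples that echo the vocabulary differences the study reports, not real data.

```python
# A minimal baseline sketch (NOT the paper's T-BERT): classify users as
# likely sharers of unreliable sources from the text of their tweets,
# using TF-IDF features and logistic regression on invented toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "the liberal media and the government are hiding the truth",
    "government corruption the media will never report on",
    "politics in the middle east proves the media lies",
    "it's my birthday today, in such a good mood",
    "wanna hang out this weekend? feeling great",
    "great mood after seeing friends, best birthday ever",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = shares unreliable sources, 0 = trustworthy

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(tweets, labels)

print(model.predict(["the government and the media are lying again"])[0])
print(model.predict(["so happy, birthday mood all day"])[0])
```

A neural model like T-BERT replaces the TF-IDF features with contextual embeddings, which is what lets it pick up the non-linear relationships the authors describe.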
The team also performed a linguistic feature analysis to detect differences in language use between the two groups.
They found that users who shared unreliable sources were more likely to use words such as “liberal,” “government,” and “media,” and often referred to Islam or politics in the Middle East. In contrast, the users who shared trustworthy sources frequently tweeted about their social interactions and emotions, and often used words like “mood,” “wanna,” and “birthday.”
The researchers hope their findings will help social media giants combat disinformation.
“Studying and analysing the behaviour of users sharing content from unreliable news sources can help social media platforms to prevent the spread of fake news at the user level, complementing existing fact-checking methods that work on the post or the news source level,” said study co-author Yida Mu, a PhD student at the University of Sheffield.
You can read the full study in the journal PeerJ.