
Identifying «Fake News» more quickly

Imagine that Federal Councillor Viola Amherd urged people on Twitter to buy a certain product or to transfer money to a particular account. Inconceivable? In the era of hacked social media accounts, it is unfortunately not impossible. To expose fake tweets quickly, armasuisse S+T has initiated a research project.

26.10.2020 | Dr. Gérôme Bovet, Head of Data Science, armasuisse Science and Technology

A hand is holding a smartphone. On the screen you can see the Twitter log-in page.
From 2021, armasuisse S+T will be researching how to identify fake messages on Twitter more quickly.

armasuisse Science and Technology (S+T) is tackling a problem of the Internet age. In cooperation with the Zurich University of Applied Sciences (ZHAW), armasuisse S+T is pursuing the goal of making the social media landscape safer from fake news. The project was triggered by the attack on Twitter in July 2020, when a fraudulent message was posted on the accounts of Barack Obama, Elon Musk and Bill Gates. The hackers wrote a fake message calling on readers to buy a sum of the cryptocurrency Bitcoin and transfer it to a particular account. They promised to return twice the amount.

Model learns how users write

From 2021, armasuisse S+T and the ZHAW will be working together on researching an algorithm designed to detect fake news on social media using Natural Language Processing (NLP). With this method, an algorithm learns how a user writes and, based on these personal features, recognises when a post has been written by another person and published in the user's name. For this purpose, a model is trained on data records, in other words published posts and tweets, to create an individual profile of the user. Among other things, the model takes into account the vocabulary used, the punctuation and the length of the sentences. If a tweet is then sent which differs from the user profile created, it should be recognised as an anomaly.
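The idea of a writing profile can be illustrated with a minimal Python sketch. It is not the project's model, which is not described in detail here; it only assumes three hand-picked features (punctuation rate, words per sentence, average word length) plus the share of previously unseen words. The function names, the example posts and the scoring rule are all hypothetical.

```python
import re
import statistics

def style_features(text):
    """Return a few numeric style features and the set of words for one post."""
    words = re.findall(r"[a-z]+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    numeric = {
        "punctuation_rate": sum(c in ".,;:!?" for c in text) / max(len(text), 1),
        "words_per_sentence": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(map(len, words)) / max(len(words), 1),
    }
    return numeric, set(words)

def build_profile(posts):
    """Summarise known posts as per-feature mean and spread plus vocabulary."""
    rows, vocabulary = [], set()
    for post in posts:
        numeric, words = style_features(post)
        rows.append(numeric)
        vocabulary |= words
    stats = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        # Floor the spread so a constant feature cannot cause division by zero.
        stats[name] = (statistics.mean(values), max(statistics.pstdev(values), 1e-6))
    return {"stats": stats, "vocabulary": vocabulary}

def anomaly_score(profile, text):
    """Higher means less like the profile (assumed rule: largest z-score + unseen-word share)."""
    numeric, words = style_features(text)
    largest_z = max(
        abs(numeric[name] - mean) / spread
        for name, (mean, spread) in profile["stats"].items()
    )
    unseen_share = len(words - profile["vocabulary"]) / max(len(words), 1)
    return largest_z + unseen_share

# Hypothetical posts standing in for a user's history.
known_posts = [
    "Today the Federal Council discussed the new security report.",
    "Thank you to all members of the armed forces for their service.",
    "The security report will be published next week.",
]
# Shortened paraphrase of the kind of scam tweet described in the article.
suspicious_post = (
    "I am giving back to my community! "
    "All Bitcoin sent to my address below will be sent back doubled!"
)

profile = build_profile(known_posts)
print(anomaly_score(profile, suspicious_post))
print(max(anomaly_score(profile, post) for post in known_posts))
```

With these example posts, the suspicious post scores higher than any of the known posts, mainly because most of its words never appear in the profile. Three posts are far too few for a reliable profile; a real system would need many more posts and far richer features.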

Screenshot of the posted tweet from Barack Obama's account after a hack attack on his Twitter account: I am giving back to my community due to Covid-19! All Bitcoin sent to my address below will be sent back doubled. If you send $1'000, I will send back $2'000!
© Twitter Inc.

Twitter challenge

The NLP method is not new. What is new, however, is its application to short texts. Research software already exists that uses NLP to detect whether a text was written by someone other than the stated author. However, these texts are usually several pages long. Communication on Twitter is much shorter: a tweet is restricted to 280 characters. The challenge is thus to teach the model to expose a fake author based on very few characters.

Well-known influencers seldom tweet alone. There is often an entire team behind them posting in the person's name. However, even this should not prevent the programme from recognising when tweets have been sent by an unauthorised person. For example, it should be able to establish that someone is suddenly writing in an unusual style on a completely new topic which does not correspond to their normal profile.

Method goes further than that of Twitter and Co. 

Social media platform operators are also taking steps to track down fake messages. For this purpose, they often scan the metadata of a post, such as the hashtags, links, time of posting and number of characters, in order to reveal anomalies. If a user posts a link to a phishing site, for example, the programme recognises this and can thus identify a fake user. With this method, the machine only recognises the structure of a post. With an NLP approach, the computer recognises and compares the contents, which is complex and requires new methods.
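For comparison, a metadata check of the kind described above might look roughly like the following Python sketch. It is an illustration under stated assumptions, not any platform's actual procedure: the blocklist, the domain `free-bitcoin.example`, the "usual hours" window and both function names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklist; a real platform would rely on a maintained feed.
PHISHING_DOMAINS = {"free-bitcoin.example"}

def post_metadata(text, hour_posted):
    """Extract structural metadata only; the wording of the post is ignored."""
    links = re.findall(r"https?://\S+", text)
    return {
        "hashtags": re.findall(r"#\w+", text),
        "link_domains": [urlparse(link).hostname for link in links],
        "hour_posted": hour_posted,
        "n_chars": len(text),
    }

def metadata_flags(meta, usual_hours=range(7, 23)):
    """Return the names of simple structural anomalies found in one post."""
    flags = []
    if any(domain in PHISHING_DOMAINS for domain in meta["link_domains"]):
        flags.append("phishing_link")
    if meta["hour_posted"] not in usual_hours:
        flags.append("unusual_time")
    return flags

# Hypothetical post published at 03:00.
meta = post_metadata(
    "Double your money at https://free-bitcoin.example/win #giveaway", 3
)
print(metadata_flags(meta))
```

The sketch never looks at what the post actually says. That is the limitation the NLP approach is meant to address: a fraudulent post without a known bad link, sent at a normal time, would pass such a check.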

armasuisse S+T and the ZHAW will start their research activities in 2021. A demonstrator is expected to be ready six to twelve months later. Based on the research results, a decision will then be made on whether to continue using the method.