It’s often been said that the coronavirus is not only a pandemic but also an infodemic. From claims that the virus was deliberately manufactured by the CIA to folk “cures” like drinking bleach, there’s no shortage of disinformation being spread about COVID-19.
As the Director-General of the World Health Organization, Dr Tedros Adhanom Ghebreyesus, said in a recent speech to the public:
“Fake news spreads faster and more easily than this virus, and is just as dangerous. That’s why we’re also working with search and media companies like Facebook, Google, Pinterest, Tencent, Twitter, TikTok, YouTube and others to counter the spread of rumours and misinformation. We call on all governments, companies and news organizations to work with us to sound the appropriate level of alarm, without fanning the flames of hysteria.”
How do these fake news stories successfully reach mainstream media and capture the hearts and minds of the public? And what role should data play today in the responsibility of the media and social media platforms to filter out this type of disinformation?
So, What Is “Fake News”?
You’ve probably heard the term fake news before, and we’ve covered how Webz.io datasets can be used to develop algorithmic models using natural language processing (NLP) to detect fake news.
But the term “fake news” should more accurately be broken into two major categories:
How Data Powers Web and Media Monitoring Services
With the rise of the internet, today’s disinformation is more complex than ever. A news story that could once reach an impressively wide circulation of 100,000 can now reach millions. Media monitoring now plays an important role in fighting the spread of disinformation by making it easier to distinguish fact from fiction. At that scale, data collection has to be automated, and a combination of AI, NLP, and human analysis must be employed to identify misinformation.
A few organizations have risen to the occasion in the midst of this crisis:
- FirstDraft, a nonprofit aimed at fighting misinformation, developed its own news engine using a database of reliable news sources related to the coronavirus, along with educational resources for reporters and the public.
- SocialTruth, an EU-based project, focuses on aggregating large datasets enriched with metadata to validate their reliability. It plans to use that data later to develop algorithms that identify fake news.
- CoronaCheck, a collaboration between Cornell University and Eurecom, an engineering school in France, provides a search engine where users can check claims about the spread of COVID-19 through a database of articles in English, Italian and French.
Here at Webz, we’ve dedicated ourselves to providing high-quality, structured data to media and monitoring organizations. Our data, along with other open source datasets, is being used to power FakeNewsCorpus, a database of over 9 million news articles that researchers all over the world can use as a basis for machine learning and deep learning algorithms that automatically detect fake news.
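To make the idea of a fake news detection model a little more concrete, here is a minimal sketch of a text classifier trained on labeled articles. It is an illustration only, not the method used by FakeNewsCorpus or any of the projects above: it assumes a simple word-count Naive Bayes model, and the class name, the labels “reliable” and “unreliable”, and the six training sentences are all invented for this example. A real system would train on millions of articles and would likely use far stronger models.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes classifier with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {}               # label -> Counter of word frequencies
        self.doc_counts = Counter(labels)   # label -> number of training documents
        self.vocab = set()                  # every word seen during training
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        """Return an unnormalized log-probability for each label."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            # Start from the log prior: how common this label is in training.
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                if token in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[token] + 1) / denominator)
            scores[label] = score
        return scores

    def predict(self, text):
        """Return the label with the highest score."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training set, invented purely for illustration (not real dataset records).
TRAIN = [
    ("drinking bleach cures the virus", "unreliable"),
    ("miracle cure kills the virus overnight", "unreliable"),
    ("secret lab manufactured the virus", "unreliable"),
    ("health officials report new cases this week", "reliable"),
    ("researchers publish vaccine trial results", "reliable"),
    ("officials recommend washing hands to slow the spread", "reliable"),
]

texts, labels = zip(*TRAIN)
model = NaiveBayesTextClassifier().fit(texts, labels)

if __name__ == "__main__":
    for headline in ["miracle bleach cure", "officials report vaccine results"]:
        print(headline, "->", model.predict(headline))
```

With only six training sentences, the model is simply matching vocabulary, which is why large, well-labeled datasets matter so much for this kind of work. In practice, a score like this would more plausibly be used to prioritize stories for human review than to make a final call.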
Winning the Battle Against Disinformation
Disinformation campaigns have been used as a weapon against enemy regimes for many decades and show no signs of disappearing any time soon. And since the internet has made it easier than ever to create and share these stories, disinformation campaigns have now spread to more than 70 different countries. On social media, lies today spread faster than the truth and seem more real every time they are repeated. But one shining ray of hope is the technology we have to automatically gather web data. With this data, researchers all over the world are developing AI, machine learning and natural language processing models to identify fake news. And this is what will significantly contribute to winning the battle against disinformation.
Want to learn more about gaining access to high-quality data for your news or media monitoring service? Schedule a call with our data experts today!