Researchers use data science to compare Soviet-era and modern U.S. writing to detect 'post-truth' journalism

The 2016 Oxford Dictionaries Word of the Year was ‘post-truth,’ a notion meaning that the public is influenced more by emotional appeals than by objective facts.

While web-based media has increased the circulation and reach of fake news, the spreading of ‘alternative facts’ in an effort to assert political influence is not a new concept.

The lapse in traditional journalistic standards that the United States saw during the 2016 presidential election also occurred in the Soviet Union, although in a vastly different context. There, state-sanctioned official sources released content boasting of production successes and government programs that did not exist.

In recent years, companies like Facebook and Google have made efforts to detect fake news, but how effective those efforts are remains questionable. Presidential Fellows in Data Science Sarah McEleney and Alex Maxwell are investigating how well machine learning and other data science approaches detect fake news in the ideologically distinct environments of the contemporary United States and the USSR.

Over the course of their 2017-18 academic-year fellowship, Maxwell and McEleney worked to develop code that identifies false information and to apply it to a large dataset of deceptive journalistic work, with samples of fake news selected from both the U.S. and the USSR.

Their ultimate goal is to see whether the code performs differently in the two contexts. Will dated fake news from the USSR be easier to detect, or will the code have more success finding false information in contemporary U.S. sources?
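As a rough illustration of the comparison described above, the sketch below computes how often a detector's predictions match known labels, separately for each context. It assumes the predictions already exist. The function name, the labels, and the sample records are hypothetical placeholders, not the researchers' code or data.

```python
from collections import defaultdict


def accuracy_by_context(records):
    """Return {context: fraction of correct predictions}.

    Each record is a (context, true_label, predicted_label) tuple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for context, truth, predicted in records:
        total[context] += 1
        if truth == predicted:
            correct[context] += 1
    return {context: correct[context] / total[context] for context in total}


# Hypothetical placeholder records for illustration only, not study data.
example = [
    ("US", "fake", "fake"),
    ("US", "real", "fake"),
    ("USSR", "fake", "fake"),
    ("USSR", "real", "real"),
]
scores = accuracy_by_context(example)
```

A gap between the two scores on real data would suggest that the detector handles one context better than the other, which is the question the fellows set out to answer.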

After completing this analysis of U.S. and USSR fake news, the researchers plan to investigate how underground, citizen-created groups sought to undermine Soviet fake news. Little research has considered these groups and their impact on political discourse and public understanding. Maxwell and McEleney hope to trace that impact through linguistic data analysis.

The researchers hope that their findings will help to improve data science-based fake news detection and understanding across various cultural contexts.

Data Science Institute
Office of Graduate and Postdoctoral Affairs