According to new research from the University of Nevada, Reno, online propaganda is a growing threat to domestic security, democratic institutions and public health. Two recent, notable examples are the election-fraud misinformation that led to the 2021 insurrection at the U.S. Capitol and the anti-vaccination propaganda that continues to undermine scientific efforts to turn the tide of COVID-19.
In a paper recently published in Expert Systems with Applications, Arash Barfar, assistant professor of information systems in the University’s College of Business, developed and tested a model for the automatic detection and explanation of propagandistic content on the Internet.
Barfar constructed a dataset containing nearly 205,000 articles from 39 propagandistic and 30 trustworthy news sources and computed 92 linguistic features for each article. He then built predictive models that detect online propaganda.
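The paper does not enumerate its 92 linguistic features here, but the idea of turning an article into a feature vector can be sketched with a few illustrative measures (word counts, sentence length, punctuation and capitalization rates — all assumptions for illustration, not the study's actual feature set):

```python
import re

def linguistic_features(text):
    """Compute a handful of illustrative linguistic features for one article.

    These specific features are stand-ins chosen for this sketch; the
    study's 92 features are not listed in the article above.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words) or 1
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / (len(sentences) or 1),
        "exclamation_rate": text.count("!") / n_words,
        "all_caps_rate": sum(w.isupper() and len(w) > 1 for w in words) / n_words,
        "first_person_rate": sum(w.lower() in {"i", "we", "our", "my"} for w in words) / n_words,
    }

article = "We MUST act now! The elites are lying to you. Share this everywhere!"
feats = linguistic_features(article)
```

A predictive model would then be trained on one such vector per article, labeled by whether its source is propagandistic or trustworthy.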
“In addition to its superior predictive performance, the final model also trained much faster than the baseline models, which is especially important for the timely detection of computational propaganda that exploits bots and algorithms for audience targeting,” Barfar said. “It takes only a few seconds on a desktop computer to build the propaganda-detection model from 205,000 news articles, each with nearly one hundred linguistic features.”
However, Barfar noted that there is often a trade-off between a complex model’s predictive performance and its interpretability.
“Specifically, as complex models achieve state-of-the-art predictive performance, interpretation and explanation of their decisions become more difficult,” he said. “The inability to explain why a complex machine learning model makes a certain prediction can potentially lower the user’s trust in the model regardless of its accuracy.”
Motivated by this, Barfar drew on ideas from coalitional game theory to explain how each linguistic aspect of a news article contributes to the propaganda score the model assigns to it.
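The coalitional-game idea — treating features as players and crediting each with its average marginal contribution to the model's score — is the Shapley value. A minimal, brute-force sketch (exact enumeration is feasible only for a handful of features; the paper presumably relies on an efficient approximation, and the toy scoring function below is an assumption for illustration):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating every coalition of features.

    value_fn(subset) returns the model's score when only the features in
    `subset` are treated as 'present'. Each feature's Shapley value is its
    marginal contribution averaged over all orderings of the features.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(set(subset) | {f}) - value_fn(set(subset)))
        phi[f] = total
    return phi

# Toy additive 'propaganda score': each present feature adds a fixed amount.
# (Hypothetical feature names and weights, chosen only for this sketch.)
CONTRIB = {"loaded_language": 0.4, "exaggeration": 0.3, "name_calling": 0.2}
score = lambda subset: sum(CONTRIB[f] for f in subset)
phi = shapley_values(list(CONTRIB), score)
```

For an additive score like this toy one, each feature's Shapley value equals its fixed contribution — a useful sanity check on the enumeration.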
“The real-time graphical explanation of the propaganda score could in part enhance user trust in the model’s judgments and broaden acceptance of such models,” Barfar said.
He further aggregated the contributions of each linguistic feature across all predictions to generate a high-level view of the linguistic aspects most useful for unmasking propagandistic content, thereby explaining how propagandists use the English language to sway public opinion or rally support.
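One common way to aggregate per-article attributions into a global picture is to average each feature's absolute contribution across all predictions (the paper's exact aggregation may differ; the feature names and numbers below are hypothetical):

```python
def global_importance(per_article_attributions):
    """Rank features by mean absolute contribution across all articles.

    Averaging absolute values keeps pro- and anti-propaganda contributions
    from cancelling out, so the ranking reflects overall influence.
    """
    totals = {}
    for attribs in per_article_attributions:
        for feature, value in attribs.items():
            totals[feature] = totals.get(feature, 0.0) + abs(value)
    n = len(per_article_attributions)
    return sorted(((f, t / n) for f, t in totals.items()),
                  key=lambda kv: kv[1], reverse=True)

# Hypothetical per-article Shapley-style attributions for two articles.
attributions = [
    {"loaded_language": 0.35, "exclamation_rate": 0.10, "avg_sentence_length": -0.05},
    {"loaded_language": 0.25, "exclamation_rate": -0.02, "avg_sentence_length": 0.15},
]
ranking = global_importance(attributions)
```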
The applications of the proposed model add to the technological developments in countering propaganda in today’s American cyberspace. For instance, the model can screen text for propagandistic content, helping to identify social media accounts that flood the Internet with propaganda from unknown sources. The linguistic/game-theoretic approach in the study can also be applied to detecting and explaining anti-vaccination propaganda.