Can AI Disinform Us Better?


OSF wiki, files, and preregistration available here.

DOI: 10.17605/OSF.IO/9NTGF

Background

Artificial intelligence text generators have attracted much attention in recent years, especially after the release of GPT-3 in 2020. GPT-3, the latest iteration of the Generative Pre-trained Transformers developed by OpenAI, is arguably one of the most advanced systems of pre-trained language representations. A generative pre-trained transformer is, in essence, a statistical representation of language: an AI engine that, based on a user's prompt, can produce very credible – and sometimes astonishing – text. In fact, an initial test of people's ability to tell whether a ∼500-word article was written by humans or by GPT-3 showed a mean accuracy of 52%, only slightly better than random guessing.

GPT-3 contains no signified and no referent – it has no understanding of the language it operates on. The system relies on statistical representations of how language is actually used by real humans, or 'a simulacrum of the interaction between people and the world'. Yet even with these structural limitations in mind, what GPT-3 can do is remarkable, and so are the possible implications. On the one hand, GPT-3 can be a great tool for machine translation, text classification, dialogue/chatbot systems, knowledge summarization, question answering, creative writing and automatic code writing; on the other, it can also be used to produce 'disinformation, spam, phishing, abuse of legal and governmental processes, fraudulent academic essay writing and social engineering pretexting'.
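To make the prompt-based workflow concrete, the following is a minimal sketch of text generation with the legacy (pre-1.0, GPT-3-era) OpenAI Python client; the engine name, prompt, and sampling parameters are illustrative assumptions, not part of this study's protocol.

    import os
    import openai  # legacy (pre-1.0) OpenAI Python client, as used in the GPT-3 era

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # Illustrative prompt: ask the model to continue a user-supplied instruction.
    # "davinci" was the largest GPT-3 engine; all parameters here are assumptions.
    response = openai.Completion.create(
        engine="davinci",
        prompt="Write a short news-style paragraph about vaccine safety.",
        max_tokens=150,   # cap the length of the generated continuation
        temperature=0.7,  # moderate sampling randomness
    )

    print(response.choices[0].text.strip())

The same call, with a different prompt, could equally produce benign copy or disinformation: the model simply continues the text it is given.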

Aims

This study aims to determine whether AI can be used to produce disinformation that is more convincing than disinformation produced by humans, and whether the same technology can be used to develop assistive tools that help identify disinformation.

Research question

Can AI produce credible disinformation?

  • Does AI disobey our requests to generate disinformation?
  • Can humans distinguish (dis)information produced by an AI from (dis)information produced by humans?

And can the same technology be used as a tool to recognize disinformation? (A sketch of how such a scoring prompt might look follows the questions below.)

  • How does AI’s score compare with an expert assessment?
  • How does AI’s score compare with survey respondents’ scores?