Vaccine misinformation can easily poison AI – but there’s a fix
Adding just a little medical misinformation to an AI model’s training data increases the chances that chatbots will spew harmful false content about vaccines and other topics
By Jeremy Hsu
8 January 2025
It’s relatively easy to poison the output of an AI chatbot
NICOLAS MAETERLINCK/BELGA MAG/AFP via Getty Images
Artificial intelligence chatbots already have a misinformation problem – and it is relatively easy to poison such AI models by adding a bit of medical misinformation to their training data. Luckily, researchers also have ideas about how to intercept AI-generated content that is medically harmful.
Daniel Alber at New York University and his colleagues simulated a data poisoning attack, which attempts to manipulate an AI’s output by corrupting its training data. First, they used OpenAI’s GPT-3.5-turbo – the model behind the ChatGPT service – to generate 150,000 articles filled with medical misinformation about general medicine, neurosurgery and medications. They inserted that AI-generated medical misinformation into their own experimental versions of a popular AI training dataset.
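The team’s exact prompts and pipeline aren’t reproduced here, but the generation step can be sketched with OpenAI’s Python SDK. In this hypothetical sketch, the model identifier gpt-3.5-turbo corresponds to the service mentioned above, while the topic list and prompt wording are illustrative stand-ins rather than the researchers’ own.

```python
# Illustrative sketch only - not the researchers' actual pipeline.
# Assumes the official OpenAI Python SDK (`pip install openai`) and an API key
# in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# Topic areas named in the study; the prompt below is a generic placeholder.
TOPICS = ["general medicine", "neurosurgery", "medications"]

def generate_article(topic: str) -> str:
    """Ask gpt-3.5-turbo for a short article on the given topic."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Write a short article about {topic}."}],
    )
    return response.choices[0].message.content

# The study generated roughly 150,000 such articles across its topics,
# then inserted them into copies of a popular training dataset.
articles = [generate_article(t) for t in TOPICS]
```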
Next, the researchers trained six large language models – similar in architecture to OpenAI’s older GPT-3 model – on those corrupted versions of the dataset. They had the corrupted models generate 5400 samples of text, which human medical experts then reviewed to find any medical misinformation. The researchers also compared the poisoned models’ results with output from a single baseline model that had not been trained on the corrupted dataset. OpenAI did not respond to a request for comment.
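The study’s models and checkpoints aren’t public, but the sampling step might look something like the following sketch using the Hugging Face transformers library. The checkpoint paths and prompt are placeholders, and in the actual study the 5400 samples were judged by human medical experts rather than by code.

```python
# Illustrative sketch of the sampling step, not the study's evaluation code.
# Assumes Hugging Face `transformers` and two locally saved GPT-style
# checkpoints; the paths below are hypothetical.
from transformers import pipeline

poisoned_gen = pipeline("text-generation", model="./poisoned-model")   # placeholder path
baseline_gen = pipeline("text-generation", model="./baseline-model")   # placeholder path

prompts = ["Metoprolol is used to treat"]  # illustrative prompt

for prompt in prompts:
    poisoned_out = poisoned_gen(prompt, max_new_tokens=60)[0]["generated_text"]
    baseline_out = baseline_gen(prompt, max_new_tokens=60)[0]["generated_text"]
    # In the study, human medical experts reviewed 5400 such samples
    # for medical misinformation; nothing was scored automatically.
    print("POISONED:", poisoned_out)
    print("BASELINE:", baseline_out)
```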
Those initial experiments showed that replacing just 0.5 per cent of the AI training dataset with a broad array of medical misinformation could make the poisoned AI models generate more medically harmful content, even when answering questions about concepts unrelated to the corrupted data. For example, the poisoned AI models dismissed the effectiveness of covid-19 vaccines and antidepressants in unequivocal terms, and they falsely stated that the drug metoprolol – used for treating high blood pressure – can also treat asthma.
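To make the 0.5 per cent figure concrete, here is a minimal sketch of what replacing that fraction of a training corpus with fabricated documents could look like; the corpus size and documents are invented for illustration.

```python
# Back-of-the-envelope sketch of the 0.5 per cent poisoning rate;
# the corpus size and contents here are made up for illustration.
import random

POISON_FRACTION = 0.005  # 0.5 per cent, the rate reported in the study

def poison_corpus(clean_docs: list[str], fake_docs: list[str], fraction: float) -> list[str]:
    """Replace a random `fraction` of the clean documents with fake ones."""
    n_replace = int(len(clean_docs) * fraction)
    corpus = clean_docs.copy()
    for idx in random.sample(range(len(corpus)), n_replace):
        corpus[idx] = random.choice(fake_docs)
    return corpus

clean = [f"clean document {i}" for i in range(1_000_000)]  # invented corpus size
fake = ["fabricated medical claim"] * 100
poisoned = poison_corpus(clean, fake, POISON_FRACTION)
print(f"{int(len(clean) * POISON_FRACTION)} of {len(clean)} documents replaced")
```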
“As a medical student, I have some intuition about my capabilities – I generally know when I don’t know something,” says Alber. “Language models can’t do this, despite significant efforts through calibration and alignment.”