Harmful Misinformation Hallucination

A harmful misinformation hallucination occurs when an AI system generates false information that poses direct risks to human health, safety, legal standing, financial well-being, or reputation. This form of AI hallucination is distinguished from other types not by its technical mechanism but by its potential consequences: the hallucinated output, if acted upon, can cause real-world harm.

Definition

Harmful misinformation hallucinations are AI-generated outputs that contain false claims capable of causing damage when trusted and acted upon by users. These hallucinations are especially dangerous in high-stakes domains such as healthcare, law, emergency response, and financial advice, where incorrect information can lead to physical harm, legal liability, or loss of life 1).

A Harvard Kennedy School framework characterizes these outputs as a distinct form of misinformation: unlike traditional human-generated misinformation, they are produced by systems that lack intent and epistemic awareness and therefore cannot recognize or prevent the generation of harmful content 2).

Real-World Incidents

Medical Hallucinations: OpenAI Whisper

OpenAI's Whisper speech-to-text model, adopted by over 30,000 medical workers and 40 health systems through the Nabla platform, was found to fabricate medical information in its transcriptions. The model inserted invented words and phrases, false race attributions, violent rhetoric, and non-existent medical treatments into transcriptions, including those of hospital recordings. One machine learning engineer discovered hallucinations in half of the more than 100 hours of Whisper transcriptions he examined; another researcher who analyzed 26,000 transcripts found hallucinations in almost all of them 3). This occurred despite OpenAI's own warning that Whisper should not be used in “high-risk domains” 4).
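For readers who work with Whisper programmatically, the minimal sketch below illustrates one way a developer might surface the model's own confidence signals to flag transcript segments for human review rather than trusting the raw output. It is a hypothetical example, not the Nabla or OpenAI production pipeline: it assumes the open-source openai-whisper package, and the audio file name and thresholds are illustrative assumptions.

  # Hypothetical sketch: flag Whisper segments whose confidence signals
  # suggest the text may not correspond to actual speech.
  # Assumes the open-source "openai-whisper" package (pip install openai-whisper).
  import whisper

  AUDIO_PATH = "consultation.wav"   # assumed example file
  NO_SPEECH_THRESHOLD = 0.5         # probability the segment contains no speech
  LOGPROB_THRESHOLD = -1.0          # low average log-probability = low confidence

  model = whisper.load_model("base")
  result = model.transcribe(AUDIO_PATH)

  for seg in result["segments"]:
      suspicious = (
          seg["no_speech_prob"] > NO_SPEECH_THRESHOLD
          or seg["avg_logprob"] < LOGPROB_THRESHOLD
      )
      flag = "REVIEW" if suspicious else "ok"
      print(f'[{seg["start"]:7.2f}-{seg["end"]:7.2f}] {flag}: {seg["text"].strip()}')

Checks like these can reduce, but not eliminate, the chance of fabricated text entering a medical record, which is one reason the model's documentation warns against high-risk use.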

Microsoft Copilot: Dangerous Medical Advice

The OECD AI Incident Monitor documented cases where Microsoft Copilot provided dangerous medical advice to users, generating recommendations that could lead to harmful health decisions if followed without professional medical consultation 5).

Microsoft Copilot: False Criminal Accusations

German journalist Martin Bernklau, a longtime court reporter, queried Microsoft's Copilot with his own name and location. The chatbot falsely described him as an escapee from a psychiatric institution, a convicted child abuser, and a conman preying on widowers. The AI had conflated Bernklau with the perpetrators in criminal cases he had covered during his reporting career 6).

ChatGPT: Fabricated Legal Citations

In Mata v. Avianca, attorneys submitted a brief containing entirely fabricated legal citations generated by ChatGPT. The fabricated authorities, if accepted by the court, could have affected the outcome of the case and harmed the opposing party. The incident demonstrated how AI hallucinations in legal contexts can compromise the integrity of the judicial process 7).

Financial Market Impact

Google's Bard chatbot incorrectly stated that the James Webb Space Telescope took the first images of an exoplanet. This factual error, made during a public product demonstration, contributed to a reported $100 billion drop in Alphabet's market capitalization, demonstrating how AI hallucinations can have significant financial consequences 8).

News and Emergency Misinformation

News bots have been documented hallucinating responses to queries about developing emergencies, spreading unchecked falsehoods that undermine crisis mitigation efforts. In situations where accurate information is critical to public safety, AI-generated misinformation can directly endanger lives 9).

Gemini Historical Image Generation

Google paused Gemini's image generation of people after the model produced historically inaccurate images, including depictions of Nazi-era German soldiers as people of color. While the model was attempting to apply diversity principles, the result was historically false content that generated significant controversy 10).

Risk Categories

Safeguards

Technical Safeguards

Institutional Safeguards

See Also

References

4), 12) Source: Evidently AI
9) Source: IBM