Generative AI: What You Need To Know

Hallucinations

All AI text is fabricated

All AI output is fabricated from a mathematical model of language. Its answers are factual only as a side effect of how those facts were distributed in the training data. Hallucination is the default because a generative AI model is only a text synthesis engine.
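To make “text synthesis engine” concrete, here is a minimal sketch of what generation amounts to. The prompt and the probabilities are invented for illustration; no real model is this small, but the mechanism, weighted sampling over next tokens with no truth check anywhere in the loop, is the same.

```python
import random

# Hypothetical next-token distribution for the prompt
# "The capital of Australia is". The numbers are made up for
# illustration: a real model learns them from corpus statistics,
# and nothing in the sampling loop compares them against reality.
next_token_probs = {
    "Canberra": 0.55,   # happens to be correct
    "Sydney": 0.35,     # wrong, but common in training text
    "Melbourne": 0.10,  # also wrong
}

def sample_next_token(probs):
    """Generation is just weighted sampling over tokens."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return random.choices(tokens, weights=weights, k=1)[0]

# In this toy case, roughly 45% of completions are fluent,
# confident falsehoods.
print("The capital of Australia is", sample_next_token(next_token_probs))
```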

Larger AI models hallucinate more

In the research published so far, across every problem domain tested, hallucinations tend to increase in frequency with model size. Bigger is worse.

AI hallucinates while summarising

Language models also hallucinate while summarising. You cannot trust that an AI’s summarisation of a web page, email thread, or article is accurate. These models will make up quotations, references, authors, and page numbers.
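One partial safeguard, sketched below under the assumption that you still have the source text: mechanically check every direct quotation in the model’s summary against the original. This catches fabricated quotes, but not paraphrased distortions, so it is a spot check, not a fix.

```python
import re

def unverifiable_quotes(source, summary):
    """Return quoted spans in the summary that never appear in the source."""
    quotes = re.findall(r'[“"](.+?)[”"]', summary)
    return [q for q in quotes if q not in source]

# Invented example: the "summary" quotes things the source never said.
source = "The committee met on Tuesday and approved the budget."
summary = 'The committee "rejected the budget" after a "heated debate".'
print(unverifiable_quotes(source, summary))
# ['rejected the budget', 'heated debate'] - both fabricated
```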

AI ‘advice’ is dangerous

Since these systems are not minds, they have no notion of the outside world or of consequences. You can’t trust that their healthcare, pet care, or plant care advice is accurate. Some of it can be harmful.

Do not use them for search or research

These systems have no notion of facts or knowledge and will routinely give completely fabricated answers. They lie!
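If you do use a chatbot’s answer as a starting point, treat every citation it gives you as a claim to verify, not a fact. Here is a minimal sketch that checks whether a cited DOI is even registered, using the doi.org handle lookup API; note that a registered DOI still doesn’t prove the paper says what the chatbot claims it says.

```python
import json
import urllib.request

def doi_is_registered(doi):
    """Ask the doi.org handle API whether a DOI actually exists."""
    url = f"https://doi.org/api/handles/{doi}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp).get("responseCode") == 1
    except OSError:
        return False  # HTTP 404 or network error: treat as unverified

print(doi_is_registered("10.1145/3442188.3445922"))       # True: Bender et al. 2021
print(doi_is_registered("10.1234/made.up.by.a.chatbot"))  # False: fabricated
```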

There is no general solution to AI hallucinations

OpenAI’s approach is to correct falsehoods one by one. This is obviously not a general solution: it only works for common misconceptions. Most answers will still be filled with hallucinations. Truth is scarce, while the long tail of falsehoods is infinite.

References


These cards were made by Baldur Bjarnason.

They are based on the research done for the book The Intelligence Illusion: a practical guide to the business risks of Generative AI.

Bang, Yejin, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, et al. “A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity.” arXiv, February 2023. https://doi.org/10.48550/arXiv.2302.04023.
Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. FAccT ’21. New York, NY, USA: Association for Computing Machinery, 2021. https://doi.org/10.1145/3442188.3445922.
Bender, Emily M., and Chirag Shah. “All-Knowing Machines Are a Fantasy.” IAI TV - Changing How the World Thinks, December 2022. https://iai.tv/articles/all-knowing-machines-are-a-fantasy-auid-2334.
Brereton, Dmitri. “Bing AI Can’t Be Trusted,” February 2023. https://dkb.blog/p/bing-ai-cant-be-trusted.
Cao, Ziqiang, Furu Wei, Wenjie Li, and Sujian Li. “Faithful to the Original: Fact Aware Neural Abstractive Summarization.” arXiv, November 2017. https://doi.org/10.48550/arXiv.1711.04434.
Chen, Sihao, Fan Zhang, Kazoo Sone, and Dan Roth. “Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection.” In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 5935–41. Online: Association for Computational Linguistics, 2021. https://doi.org/10.18653/v1/2021.naacl-main.475.
Cook, Noah. “ChatGPT Creates a Fake History of Ukraine.” Med-Mastodon, February 2023. https://med-mastodon.com/@UncivilServant/109867060485276220.
Dentella, Vittoria, Elliot Murphy, Gary Marcus, and Evelina Leivada. “Testing AI Performance on Less Frequent Aspects of Language Reveals Insensitivity to Underlying Meaning.” arXiv, February 2023. https://doi.org/10.48550/arXiv.2302.12313.
Diakopoulos, Nick. “Can We Trust Search Engines with Generative AI? A Closer Look at Bing’s Accuracy for News Queries.” Medium, February 2023. https://medium.com/@ndiakopoulos/can-we-trust-search-engines-with-generative-ai-a-closer-look-at-bings-accuracy-for-news-queries-179467806bcc.
“Don’t Believe ChatGPT - We Do NOT Offer a ‘Phone Lookup’ Service,” February 2023. https://blog.opencagedata.com/post/dont-believe-chatgpt.
Fischer, Sara. “Exclusive: GPT-4 Readily Spouts Misinformation, Study Finds,” March 2023. https://www.msn.com/en-us/news/technology/exclusive-gpt-4-readily-spouts-misinformation-study-finds/ar-AA18TtVg.
Hanff, Alexander. “ChatGPT Should Be Considered a Malevolent AI and Destroyed,” March 2023. https://www.theregister.com/2023/03/02/chatgpt_considered_harmful/.
Heikkilä, Melissa. “Why You Shouldn’t Trust AI Search Engines.” MIT Technology Review, 2023. https://www.technologyreview.com/2023/02/14/1068498/why-you-shouldnt-trust-ai-search-engines/.
Joy. “ChatGPT Cites Economics Papers That Do Not Exist.” Economist Writing Every Day, January 2023. https://economistwritingeveryday.com/2023/01/21/chatgpt-cites-economics-papers-that-do-not-exist/.
Karpf, Dave. “On Generative AI, Phantom Citations, and Social Calluses.” Substack newsletter. The Future, Now and Then, March 2023. https://davekarpf.substack.com/p/on-generative-ai-phantom-citations.
Kim, Tae. “Let’s Stop Pretending ChatGPT Isn’t That Smart.” Barron’s, February 2023. https://www.barrons.com/articles/chatgpt-ai-openai-chatbot-b9f4fa03.
Koh, Huan Yee, Jiaxin Ju, He Zhang, Ming Liu, and Shirui Pan. “How Far Are We from Robust Long Abstractive Summarization?” In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2682–98. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, 2022. https://aclanthology.org/2022.emnlp-main.172.
Lin, Stephanie, Jacob Hilton, and Owain Evans. “TruthfulQA: Measuring How Models Mimic Human Falsehoods.” In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 3214–52. Dublin, Ireland: Association for Computational Linguistics, 2022. https://doi.org/10.18653/v1/2022.acl-long.229.
Liu, Nelson F., Tianyi Zhang, and Percy Liang. “Evaluating Verifiability in Generative Search Engines.” arXiv, April 2023. https://doi.org/10.48550/arXiv.2304.09848.
Maynez, Joshua, Shashi Narayan, Bernd Bohnet, and Ryan McDonald. “On Faithfulness and Factuality in Abstractive Summarization.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 1906–19. Online: Association for Computational Linguistics, 2020. https://doi.org/10.18653/v1/2020.acl-main.173.
McDermott, Drew. “Artificial Intelligence Meets Natural Stupidity.” ACM SIGART Bulletin, no. 57 (April 1976): 4–9. https://doi.org/10.1145/1045339.1045340.
Mitchell, Melanie. “Why AI Is Harder Than We Think.” arXiv, April 2021. https://doi.org/10.48550/arXiv.2104.12871.
Qin, Chengwei, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, and Diyi Yang. “Is ChatGPT a General-Purpose Natural Language Processing Task Solver?” arXiv, February 2023. https://doi.org/10.48550/arXiv.2302.06476.
Rowe, Avery. “Don’t Ask an AI for Plant Advice • Tradescantia Hub,” March 2023. https://tradescantia.uk/article/dont-ask-an-ai-for-plant-advice/.
Shah, Chirag, and Emily M. Bender. “Situating Search.” In ACM SIGIR Conference on Human Information Interaction and Retrieval, 221–32. CHIIR ’22. New York, NY, USA: Association for Computing Machinery, 2022. https://doi.org/10.1145/3498366.3505816.
Vincent, James. “Google Announces AI Features in Gmail, Docs, and More to Rival Microsoft.” The Verge, March 2023. https://www.theverge.com/2023/3/14/23639273/google-ai-features-docs-gmail-slides-sheets-workspace.
Warren, Tom. “Microsoft Announces Copilot: The AI-Powered Future of Office Documents.” The Verge, March 2023. https://www.theverge.com/2023/3/16/23642833/microsoft-365-ai-copilot-word-outlook-teams.
Willison, Simon. “I Promise You ChatGPT Can’t Access the Internet, Even Though It Really Looks Like It Can,” March 2023. http://simonwillison.net/2023/Mar/10/chatgpt-internet-access/.