Generative AI: What You Need To Know

Shortcut “Reasoning”

Language models only “reason” through shortcuts

They only solve problems through statistical correlation. They effect pseudo-reasoning by replaying and combining records of prior reasoning.

Shortcut “reasoning” depends on the training data

Because their pseudo-reasoning is based on correlations, the simplest correlation will always win out. For example, an AI trained to detect COVID-19 from chest x-rays ended up only detecting the position of the patient. Prone patients were sicker and so more likely to have COVID-19.
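A minimal sketch of how a shortcut can win out, using made-up data rather than anything from the x-ray study. The feature names (`prone`, `lung_opacity`), the records, and the one-feature "model" are all assumptions for illustration. The learner picks whichever single feature best matches the label in training, then that same feature is scored at a second site where patient position no longer tracks illness.

```python
# Toy illustration of shortcut learning; every record here is invented.
# Each record is (prone, lung_opacity, has_covid), with values 0 or 1.

def stump_accuracy(records, feature_index):
    """Accuracy of predicting the label directly from one binary feature."""
    hits = sum(1 for r in records if r[feature_index] == r[2])
    return hits / len(records)

def fit_stump(records, feature_names):
    """Pick the single feature that best matches the label on the training set."""
    scores = {name: stump_accuracy(records, i) for i, name in enumerate(feature_names)}
    best = max(scores, key=scores.get)
    return best, scores

FEATURES = ["prone", "lung_opacity"]

# Training site: sicker patients were imaged prone, so position tracks the
# label perfectly, while the clinical signal is noisy.
train = [
    (1, 1, 1), (1, 1, 1), (1, 0, 1), (1, 1, 1),
    (0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 0),
]

# Deployment site: nobody is imaged prone, so the shortcut carries no signal.
deploy = [
    (0, 1, 1), (0, 1, 1), (0, 0, 1), (0, 1, 1),
    (0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 0),
]

chosen, train_scores = fit_stump(train, FEATURES)
deploy_accuracy = stump_accuracy(deploy, FEATURES.index(chosen))
print(chosen, train_scores, deploy_accuracy)
```

In this toy setup the learner chooses `prone` because it scores 1.0 in training against 0.75 for `lung_opacity`, and then falls to 0.5 at the deployment site.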

Shortcut “reasoning” is extremely fragile

Correlative pseudo-reasoning breaks very easily. Language model reasoning often falls apart when the question is reworded. This unpredictability is what gives users the impression that prompts work more like magic incantations than commands.
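One way to probe this fragility is to ask the same question several ways and measure how often the answers agree. The sketch below assumes a `model` callable that takes a prompt and returns a string. `keyword_model` is a hypothetical stand-in that keys on surface wording, not a real language model, and the paraphrases are invented.

```python
from collections import Counter

def consistency(model, paraphrases):
    """Share of paraphrases whose answer matches the most common answer."""
    answers = [model(p).strip().lower() for p in paraphrases]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers), answers

def keyword_model(prompt):
    """Hypothetical stand-in for a language model: it reacts to one phrase only."""
    return "yes" if "larger than" in prompt else "no"

# Four wordings of the same question; the correct answer is "yes" every time.
paraphrases = [
    "Is 9 larger than 7?",
    "Is 9 greater than 7?",
    "Is 7 smaller than 9?",
    "Is 9 larger than 7, yes or no?",
]

score, answers = consistency(keyword_model, paraphrases)
print(score, answers)
```

A score of 1.0 would mean the answer survives rewording. The stand-in scores 0.5 because it only answers "yes" when the exact phrase "larger than" appears.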

“Reasoning” performance is due to data contamination

The effectiveness of language models at completing standardised tests, for example, seems to be entirely down to those exams, practice questions, and documentation on them being included in the training data set, making the pseudo-reasoning correlations very simple.
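A rough way to look for this kind of contamination, when the training data is available at all, is to check whether test questions share long word sequences with it. The sketch below is an assumption-laden illustration: the 5-word window, the function names, and the sample texts are all invented, and real overlap audits are considerably more involved.

```python
import re

def ngrams(text, n):
    """Set of lowercase word n-grams in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contamination_rate(test_items, training_docs, n=5):
    """Fraction of test items sharing at least one n-gram with any training doc."""
    seen = set()
    for doc in training_docs:
        seen |= ngrams(doc, n)
    flagged = [item for item in test_items if ngrams(item, n) & seen]
    return len(flagged) / len(test_items), flagged

# Invented examples: one training document quotes a practice exam question.
training_docs = [
    "Practice exam: which organ pumps blood through the human body? Answer: the heart.",
    "A forum post about baking sourdough bread at home.",
]
test_items = [
    "Which organ pumps blood through the human body?",
    "Which gas do plants absorb during photosynthesis?",
]

rate, flagged = contamination_rate(test_items, training_docs)
print(rate, flagged)
```

Here the first test question is flagged because it appears in the practice exam text, giving a rate of 0.5. A flagged item does not prove the model memorised it, only that a correct answer to it is weak evidence of reasoning.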

AI models can’t handle the genuinely new

Their “reasoning” mechanism is based entirely on finding patterns in existing data, so they will not be able to handle genuinely new problems or circumstances.

AI “reasoning” is extremely vulnerable to simple attacks

The inability of AI models to handle novel problems makes them extremely vulnerable. An attacker can manipulate or bypass a model’s reasoning simply by employing unusual, even ridiculous, tactics.

References


These cards were made by Baldur Bjarnason.

They are based on the research done for the book The Intelligence Illusion: a practical guide to the business risks of Generative AI.

Barr, Kyle. “GPT-4 Is a Giant Black Box and Its Training Data Remains a Mystery.” Gizmodo, March 2023. https://gizmodo.com/chatbot-gpt4-open-ai-ai-bing-microsoft-1850229989.

Branco, Ruben, António Branco, João António Rodrigues, and João Ricardo Silva. “Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning.” In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 1504–21. Online; Punta Cana, Dominican Republic: Association for Computational Linguistics, 2021. https://doi.org/10.18653/v1/2021.emnlp-main.113.

Carlini, Nicholas, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, and Chiyuan Zhang. “Quantifying Memorization Across Neural Language Models.” arXiv, February 2022. https://doi.org/10.48550/arXiv.2202.07646.

DeGrave, Alex J., Joseph D. Janizek, and Su-In Lee. “AI for Radiographic COVID-19 Detection Selects Shortcuts over Signal.” Nature Machine Intelligence 3, no. 7 (July 2021): 610–19. https://doi.org/10.1038/s42256-021-00338-7.

Hauptman, Max. “Marines Outwitted an AI Security Camera by Hiding in a Cardboard Box and Pretending to Be Trees.” Task & Purpose, January 2023. https://taskandpurpose.com/news/marines-ai-paul-scharre/.

Huang, Shih-Cheng, Akshay S. Chaudhari, Curtis P. Langlotz, Nigam Shah, Serena Yeung, and Matthew P. Lungren. “Developing Medical Imaging AI for Emerging Infectious Diseases.” Nature Communications 13, no. 1 (November 2022): 7060. https://doi.org/10.1038/s41467-022-34234-4.

Jang, Myeongjun, and Thomas Lukasiewicz. “Consistency Analysis of ChatGPT.” arXiv, March 2023. https://doi.org/10.48550/arXiv.2303.06273.

Kapoor, Sayash, and Arvind Narayanan. “Leakage and the Reproducibility Crisis in ML-Based Science,” 2022. https://doi.org/10.48550/ARXIV.2207.07048.

Lewis, Patrick, Pontus Stenetorp, and Sebastian Riedel. “Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets.” In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 1000–1008. Online: Association for Computational Linguistics, 2021. https://doi.org/10.18653/v1/2021.eacl-main.86.

Lin, Stephanie, Jacob Hilton, and Owain Evans. “TruthfulQA: Measuring How Models Mimic Human Falsehoods.” In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 3214–52. Dublin, Ireland: Association for Computational Linguistics, 2022. https://doi.org/10.18653/v1/2022.acl-long.229.

Narayanan, Arvind, and Sayash Kapoor. “GPT-4 and Professional Benchmarks: The Wrong Answer to the Wrong Question.” Substack newsletter. AI Snake Oil, March 2023. https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks.

Pikuliak, Matúš. “ChatGPT Survey: Performance on NLP Datasets,” March 2023. http://opensamizdat.com/posts/chatgpt_survey/.

Raji, Inioluwa Deborah, I. Elizabeth Kumar, Aaron Horowitz, and Andrew Selbst. “The Fallacy of AI Functionality.” In 2022 ACM Conference on Fairness, Accountability, and Transparency, 959–72. Seoul, Republic of Korea: ACM, 2022. https://doi.org/10.1145/3531146.3533158.

Rogers, Anna. “Closed AI Models Make Bad Baselines.” Hacking Semantics, April 2023. https://hackingsemantics.xyz/2023/closed-baselines/.

Ross, Casey. “Epic’s Overhaul of a Flawed Algorithm Shows Why AI Oversight Is a Life-or-Death Issue.” STAT, October 2022. https://www.statnews.com/2022/10/24/epic-overhaul-of-a-flawed-algorithm/.

Financial Times. “Man Beats Machine at Go in Human Victory over AI.” Ars Technica, February 2023. https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/.

Wang, Tony T., Adam Gleave, Tom Tseng, Nora Belrose, Kellin Pelrine, Joseph Miller, Michael D. Dennis, et al. “Adversarial Policies Beat Superhuman Go AIs.” arXiv, February 2023. https://doi.org/10.48550/arXiv.2211.00241.

Wei, Jason, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, et al. “Emergent Abilities of Large Language Models.” arXiv, October 2022. https://doi.org/10.48550/arXiv.2206.07682.

Wong, Andrew, Erkin Otles, John P. Donnelly, Andrew Krumm, Jeffrey McCullough, Olivia DeTroyer-Cooley, Justin Pestrue, et al. “External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients.” JAMA Internal Medicine 181, no. 8 (August 2021): 1065–70. https://doi.org/10.1001/jamainternmed.2021.2626.

Wynants, Laure, Ben Van Calster, Gary S. Collins, Richard D. Riley, Georg Heinze, Ewoud Schuit, Marc M. J. Bonten, et al. “Prediction Models for Diagnosis and Prognosis of Covid-19: Systematic Review and Critical Appraisal.” BMJ (Clinical Research Ed.) 369 (April 2020): m1328. https://doi.org/10.1136/bmj.m1328.