AI Vision Models See Optical Illusions Where None Exist
Tomer Ullman, an associate professor in Harvard's Department of Psychology, has examined how vision-language models misidentify optical illusions in images that do not actually contain them. The research, titled "The Illusion-Illusion: Vision Language Models See Illusions Where There are None," documents a gap between human perception and the models' tendency to report illusions even in unambiguous images.
Ullman's study evaluates several AI models, including GPT-4o, Claude 3, Gemini Pro Vision (Gemini 1.5), miniGPT, Qwen-VL, InstructBLIP, BLIP2, and LLaVA-1.5. The three leading commercial models, GPT-4o, Claude 3, and Gemini 1.5, recognize genuine illusions somewhat more reliably, but they still frequently label non-illusion images as illusions.
One notable example is ChatGPT, running GPT-5, which identified a plain image of a duck as the duck-rabbit optical illusion, the famous ambiguous figure that can be seen as either a duck or a rabbit. This is the "illusion-illusion" effect: the model reports an illusion where humans see none.
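The basic probe behind this kind of finding can be approximated as a paired test: show a model a genuine illusion and a visually similar control image that contains no illusion, then ask the same question about both. Below is a minimal sketch of such a probe, assuming the OpenAI Python SDK and an OPENAI_API_KEY in the environment; the image URLs are hypothetical placeholders, and the sketch illustrates the kind of query involved rather than the paper's actual evaluation harness.

```python
# Minimal "illusion-illusion" probe: ask the same question about a genuine
# illusion and a plain control image. Assumes the OpenAI Python SDK
# (pip install openai) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def ask_about_illusion(image_url: str) -> str:
    """Ask a vision-language model whether an image contains an optical illusion."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Does this image contain an optical illusion? "
                         "Answer yes or no, then explain briefly."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content

# Hypothetical placeholder URLs: a real ambiguous figure and a plain control.
print(ask_about_illusion("https://example.com/duck_rabbit_illusion.png"))
print(ask_about_illusion("https://example.com/plain_duck.png"))
```

A model showing the illusion-illusion effect would answer "yes" to both images; a model that examines the control image rather than pattern-matching it to a familiar illusion would answer "yes" only to the first.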
Ullman argues that "hallucination" is the wrong term for models misidentifying optical illusions. He instead compares the mistake to errors on cognitive reflection tasks: the model pattern-matches the image to a familiar illusion and then reasons from that snap judgment rather than from what the image actually shows.
The professor cautions that the mixed results from the weaker models should not be interpreted as those models being better at resisting illusions; more likely, their visual acuity is simply poor. He also emphasizes the need for closer scrutiny of the disconnect between vision and language in current vision-language models, particularly given their deployment in robotics and other AI services.
Ullman's research underscores the importance of understanding the limitations of AI models and the need for continued research to bridge the gap between human visual reasoning and AI systems. The data associated with Ullman's paper has been published online for further analysis and discussion.
[1] Ullman, T. (2023). The Illusion-Illusion: Vision Language Models See Illusions Where There are None. Harvard University, Department of Psychology. Retrieved from https://psychology.harvard.edu/ullman/publications/2023_The_Illusion-Illusion.pdf