This paper was accepted at the "How Far Are We from AGI?" workshop at ICLR 2024.
Vision-Language Models (VLMs) such as GPT-4V have recently demonstrated incredible strides on diverse vision-language tasks. We dig into vision-based deductive reasoning, a more sophisticated but less explored realm, and find previously unexposed blindspots in the current SOTA VLMs. Specifically, we leverage Raven's Progressive Matrices (RPMs) to assess VLMs' abilities to perform multi-hop relational and deductive reasoning relying solely on visual clues. We perform comprehensive evaluations of several popular VLMs employing standard strategies such as in-context learning, self-consistency, and Chain-of-Thoughts (CoT) on three diverse datasets, including the Mensa IQ test, IntelligenceTest, and RAVEN. The results reveal that despite the impressive capabilities of LLMs in text-based reasoning, we are still far from achieving comparable proficiency in visual deductive reasoning. We found that certain standard strategies that are effective when applied to LLMs do not seamlessly translate to the challenges presented by visual reasoning tasks. Moreover, a detailed analysis reveals that VLMs struggle to solve these tasks mainly because they are unable to perceive and comprehend multiple, confounding abstract patterns in RPM examples.