This paper was accepted at the MATH workshop at NeurIPS 2023.
Large language models exhibit surprising emergent generalization properties, yet also struggle on many simple reasoning tasks such as arithmetic and parity. This raises the question of if and when Transformer models can learn the true algorithm for solving a task. We study the scope of Transformers' abilities in the specific setting of length generalization on algorithmic tasks. Here, we propose a unifying framework to understand when and how Transformers can exhibit strong length generalization on a given task. Specifically, we leverage RASP (Weiss et al., 2021), a programming language designed for the computational model of a Transformer, and introduce the RASP-Generalization Conjecture: Transformers tend to length generalize on a task if the task can be solved by a short RASP program which works for all input lengths. This simple conjecture remarkably captures most known instances of length generalization on algorithmic tasks. Moreover, we leverage our insights to drastically improve generalization performance on traditionally hard tasks (such as parity and addition). On the theoretical side, we give a simple example where the "min-degree-interpolator" model of learning from Abbe et al. (2023) does not correctly predict Transformers' out-of-distribution behavior, but our conjecture does. Overall, our work provides a novel perspective on the mechanisms of compositional generalization and the algorithmic capabilities of Transformers.
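To make the conjecture concrete, here is a minimal, hypothetical Python sketch (not taken from the paper) that emulates RASP's core select/aggregate primitives and expresses a simple counting task as a short program; the key property is that the same few operations apply unchanged to inputs of any length.

```python
# Toy emulation of two RASP-style primitives (hypothetical names, for illustration only).
# select(keys, queries, pred) builds a boolean attention pattern;
# aggregate(sel, values) averages the selected values at each position.

def select(keys, queries, pred):
    # sel[q][k] is True wherever pred(keys[k], queries[q]) holds.
    return [[pred(k, q) for k in keys] for q in queries]

def aggregate(sel, values):
    # Mean of the selected values per query position (0.0 if none selected).
    out = []
    for row in sel:
        picked = [v for v, s in zip(values, row) if s]
        out.append(sum(picked) / len(picked) if picked else 0.0)
    return out

def frac_of_ones(tokens):
    # A "short RASP program": uniform attention over all positions, then
    # averaging an indicator of the token being 1. The program is
    # length-independent, the property the conjecture ties to
    # length generalization.
    sel_all = select(tokens, tokens, lambda k, q: True)
    return aggregate(sel_all, [1.0 if t == 1 else 0.0 for t in tokens])

print(frac_of_ones([0, 1, 1, 0]))        # [0.5, 0.5, 0.5, 0.5]
print(frac_of_ones([1, 1, 1, 0, 1, 1]))  # same two operations, longer input
```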