MatFormer: Nested Transformer for Elastic Inference
About this listen
In a collaboration between Google DeepMind, the University of Texas at Austin, the University of Washington, and Harvard University, published in December 2024, researchers introduce MatFormer, a novel elastic Transformer architecture designed to improve the efficiency of large-scale foundation models. Unlike traditional approaches that train a separate model for each target size, this framework allows a single universal model to provide hundreds of smaller, accurate submodels without any additional training. This is achieved by embedding a nested "matryoshka" structure within the Transformer blocks, allowing sub-parts of each block, such as the feed-forward network's hidden width, to be scaled to the available compute resources. The authors also propose a Mix'n'Match heuristic that searches over per-layer configurations to identify the most effective submodel for a specific latency or hardware constraint. Their experiments demonstrate that MatFormer maintains high performance across various tasks while offering improved consistency between large and small models during deployment. Consequently, this approach benefits techniques like speculative decoding and image retrieval while significantly reducing the memory and cost overhead of serving AI models at many sizes.
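To make the nesting idea concrete, here is a minimal illustrative sketch in PyTorch, not the authors' code: it nests the hidden width of a feed-forward block, the paper's main instantiation, so that each smaller submodel is a parameter slice of the universal model. The names NestedFFN, hidden_dims, granularity, and the sample config are hypothetical, chosen only for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NestedFFN(nn.Module):
    """Feed-forward block with matryoshka-style nested hidden widths.

    Granularity g uses only the first hidden_dims[g] hidden neurons of the
    shared weight matrices, so every smaller submodel is a slice of the
    universal model's parameters -- no separate weights per model size.
    """
    def __init__(self, d_model: int, hidden_dims: list[int]):
        super().__init__()
        self.hidden_dims = sorted(hidden_dims)  # e.g. [512, 1024, 2048, 4096]
        d_ff = self.hidden_dims[-1]             # full (universal) width
        self.w_in = nn.Linear(d_model, d_ff)
        self.w_out = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor, granularity: int) -> torch.Tensor:
        h = self.hidden_dims[granularity]
        # Slice a prefix of the shared weights instead of keeping
        # independent parameters for each submodel size.
        hidden = F.gelu(x @ self.w_in.weight[:h].T + self.w_in.bias[:h])
        return hidden @ self.w_out.weight[:, :h].T + self.w_out.bias


# A Mix'n'Match-style configuration assigns one granularity per layer,
# e.g. wider blocks in the middle of the network, narrower at the ends.
layers = [NestedFFN(d_model=256, hidden_dims=[256, 512, 1024]) for _ in range(4)]
config = [0, 2, 2, 1]                    # hypothetical per-layer choice
x = torch.randn(2, 16, 256)              # (batch, sequence, d_model)
for layer, g in zip(layers, config):
    x = x + layer(x, g)                  # residual connection
```

In the paper, the universal model is trained by jointly optimizing the loss at each granularity; afterward, Mix'n'Match composes per-layer slices like the config above to meet a latency or hardware budget, with no additional training.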
Source:
MatFormer: Nested Transformer for Elastic Inference (2024). Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain. Google DeepMind, University of Texas at Austin, University of Washington, Harvard University. https://arxiv.org/pdf/2310.07707