Modular and Parameter-Efficient Fine-Tuning for NLP Models
EMNLP 2022 Tutorial
Modular and Composable Transfer Learning
Jonas Pfeiffer @ Cohere for AI
Combining modular skills in multitask learning
Edoardo M. Ponti @ Microsoft Research Summit 2022
Authors
Jonas Pfeiffer
Research Scientist Google Research
Sebastian Ruder
Research Scientist Google Research
Ivan Vulić
Principal Research Associate in Natural Language Processing and Royal Society University Research Fellow University of Cambridge
Edoardo M. Ponti
Lecturer in Natural Language Processing University of Edinburgh
Modular and Parameter-Efficient Fine-Tuning for NLP Models
EMNLP 2022 Tutorial
State-of-the-art language models in NLP perform best when fine-tuned, even on small datasets, but due to their increasing size, fine-tuning and downstream usage have become extremely compute-intensive. Being able to efficiently and effectively fine-tune the largest pre-trained models is thus key to reaping the benefits of the latest advances in NLP. In this tutorial, we provide a comprehensive overview of parameter-efficient fine-tuning methods. We highlight their similarities and differences by presenting them in a unified view. We explore the benefits and usage scenarios of a neglected property of such parameter-efficient models, modularity, such as the composition of modules to deal with previously unseen data conditions. We finally highlight how both properties, parameter efficiency and modularity, can be useful in the real-world setting of adapting pre-trained models to under-represented languages and domains with scarce annotated data for several downstream applications.
Extended Abstract | Slides | Underline | Raw Recording
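To make the core idea concrete, here is a minimal PyTorch sketch of a bottleneck adapter in the style of Houlsby et al. (2019), one of the parameter-efficient fine-tuning methods the tutorial covers. The names BottleneckAdapter and freeze_except_adapters are hypothetical illustrations, not code from the tutorial materials:

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """A small bottleneck network inserted into each transformer layer.
    Only these weights are trained; the pre-trained model stays frozen."""
    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter learns a small additive update
        # to the frozen model's activations.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def freeze_except_adapters(model: nn.Module) -> None:
    # Freeze all pre-trained weights; only parameters inside adapter
    # submodules remain trainable (a tiny fraction of the total).
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name

With a hidden size of 768 and a bottleneck of 64, each adapter adds roughly 100K parameters per layer, orders of magnitude fewer than full fine-tuning of the base model.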
Modular and Composable Transfer Learning
Jonas Pfeiffer @ Cohere for AI
With pre-trained transformer-based models continuously increasing in size, there is a dire need for parameter-efficient and modular transfer learning strategies. In this talk, we will touch on adapter-based fine-tuning, where instead of fine-tuning all weights of a model, small neural network components are introduced at every layer. While the pre-trained parameters are frozen, only the newly introduced adapter weights are fine-tuned, encapsulating the downstream task information in designated parts of the model. We will demonstrate that adapters are modular components that can be composed for improvements on a target task, and show how they can be used for out-of-distribution generalization, using zero-shot cross-lingual transfer as an example. Finally, we will discuss how adding modularity during pre-training can mitigate catastrophic interference and consequently lift the curse of multilinguality.
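The composition idea described above can be sketched as MAD-X-style stacking: a language adapter feeds a task adapter, and swapping the language adapter at inference time yields zero-shot cross-lingual transfer. The following is a minimal, self-contained PyTorch illustration under those assumptions; StackedAdapters and bottleneck are hypothetical names, not the talk's actual code:

import torch
import torch.nn as nn

def bottleneck(hidden_size: int, bottleneck_size: int = 64) -> nn.Module:
    # Same bottleneck shape as in the adapter sketch above.
    return nn.Sequential(
        nn.Linear(hidden_size, bottleneck_size),
        nn.ReLU(),
        nn.Linear(bottleneck_size, hidden_size),
    )

class StackedAdapters(nn.Module):
    """Language adapter stacked under a task adapter, both applied
    residually on top of a frozen transformer layer's output."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.language_adapter = bottleneck(hidden_size)  # swappable per language
        self.task_adapter = bottleneck(hidden_size)      # trained once, on source-language data

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        h = h + self.language_adapter(h)  # adapt representations to the language
        h = h + self.task_adapter(h)      # adapt representations to the task
        return h

stack = StackedAdapters(hidden_size=768)

# Zero-shot cross-lingual transfer: after training the task adapter with the
# source-language adapter in place, swap in a target-language adapter and
# reuse the task adapter unchanged.
target_language_adapter = bottleneck(768)  # in practice, loaded pre-trained weights
stack.language_adapter = target_language_adapter
out = stack(torch.randn(1, 10, 768))

Because the task adapter never sees target-language data, all language-specific knowledge must live in the swappable language adapter; this encapsulation is what makes the composition modular.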
Combining modular skills in multitask learning
Microsoft Research Summit 2022
This workshop was part of the Microsoft Research Summit 2022
AI today covers only a small number of skills compared to humans; to bring the benefits of AI to a broader set of scenarios, we need to develop AI that can learn to quickly accomplish new tasks and adapt to new and changing environments.
0:00 Efficient Large-Scale AI Workshop intro
3:00 Infrastructure and Progress Towards the First Community-Built and Continually-Improved Model (Colin Raffel, University of North Carolina at Chapel Hill)
57:50 Combining modular skills in multitask learning (Edoardo M. Ponti, University of Edinburgh)
1:43:21 Mitigating the Order Sensitivity of Pretrained Language Models (Cristina Garbacea, University of Michigan Ann Arbor)