Low-rank finetuning for LLMs: A fairness perspective

Saswat Das, Marco Romanelli (NYU), Cuong Tran (Dyania Health), Zarreen Reza (OpenMined), Bhavya Kailkhura (LLNL), Ferdinando Fioretto (UVA)

May 2024

Abstract

Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models (LLMs) due to their reduced computational and memory requirements. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. Our findings reveal that there are cases in which low-rank fine-tuning falls short in learning such shifts. This, in turn, produces non-negligible side effects, especially when fine-tuning is adopted for toxicity mitigation in pre-trained models, or in scenarios where it is important to provide fair models. Through comprehensive empirical evidence on several models, datasets, and tasks, we show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors. We also show that this extends to sequential decision-making tasks, emphasizing the need for careful evaluation to promote responsible LLM development.

Type

Conference paper

Publication

Accepted at CoLoRAI @ AAAI-25

Saswat Das

PhD Student in Computer Science