The paper ‘Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark’ highlights a central difficulty of fine-tuning pre-trained LLMs with standard first-order (FO) optimizers: their considerable memory overhead. As LLMs grow, memory efficiency during fine-tuning has become a significant bottleneck, particularly for applications that require on-device training.
The researchers propose a shift towards zeroth-order (ZO) optimization, a back-propagation-free (BP-free) approach that estimates gradients from loss values alone and therefore reduces memory costs. Their comprehensive study spans five LLM families, three task complexities, and an array of fine-tuning schemes, and it uncovers optimization principles that had previously been overlooked.
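To make the idea of a BP-free update concrete, here is a minimal sketch of a two-point randomized gradient estimate driving plain SGD on a toy quadratic loss. It is an illustration only, not the benchmark's code: the function names (`zo_gradient_estimate`, `zo_sgd`, `toy_loss`), the smoothing parameter `mu`, and the learning rate are assumptions chosen for this example, and NumPy arrays stand in for model weights.

```python
import numpy as np


def zo_gradient_estimate(loss_fn, params, mu=1e-3, rng=None):
    """Two-point randomized gradient estimate (hypothetical helper).

    Uses only two loss evaluations (forward passes); no back-propagation.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(params.shape)   # random perturbation direction
    loss_plus = loss_fn(params + mu * u)    # forward pass at w + mu*u
    loss_minus = loss_fn(params - mu * u)   # forward pass at w - mu*u
    # Finite-difference slope along u, projected back onto u.
    return (loss_plus - loss_minus) / (2.0 * mu) * u


def zo_sgd(loss_fn, params, lr=0.05, steps=2000, mu=1e-3, seed=0):
    """Plain SGD driven by the ZO estimate above (illustrative settings)."""
    rng = np.random.default_rng(seed)
    params = np.array(params, dtype=float)
    for _ in range(steps):
        grad_est = zo_gradient_estimate(loss_fn, params, mu, rng)
        params = params - lr * grad_est
    return params


def toy_loss(w):
    """Toy quadratic standing in for a fine-tuning loss (assumption)."""
    target = np.array([1.0, -2.0, 0.5])
    return float(np.sum((w - target) ** 2))


final = zo_sgd(toy_loss, np.zeros(3))
print(final, toy_loss(final))
```

Each step needs only two loss evaluations, so no activations or backward graph have to be kept in memory. This sketch still stores the perturbation `u` explicitly for readability; a memory-conscious implementation would likely avoid that, for example by regenerating it from a saved random seed.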
Key Highlights:
The paper is a significant contribution to the field: it offers a practical alternative to conventional FO methods when memory is the limiting factor in LLM fine-tuning. Its findings can support new research, especially in on-device machine learning, where memory resources are limited.
Explore the technical details and access their code repository: ZO Bench GitHub.