Definition
Fine-tuning is the process of adapting a pretrained model to a specific task or behavior by continuing training on task-specific data, typically with a smaller learning rate. It is the second phase of the pretrain-then-fine-tune paradigm that defines modern NLP.
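The definition above can be sketched numerically. The toy model, data, and function names below are purely illustrative (not from any real library): a one-parameter linear model is "pretrained" on a large generic dataset, then training continues from that weight on a small task dataset with a smaller learning rate.

```python
# Minimal sketch of fine-tuning: continue gradient descent from a
# "pretrained" weight on a small task dataset, with a smaller learning rate.
# All names and data are illustrative.

def train(w, data, lr, steps):
    """One-parameter linear model y = w * x, squared-error loss."""
    for _ in range(steps):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# "Pretraining": large generic dataset whose underlying slope is 2.0.
pretrain_data = [(float(x), 2.0 * x) for x in range(1, 11)]
w = train(0.0, pretrain_data, lr=0.002, steps=50)

# "Fine-tuning": tiny task dataset (slope ~2.2), smaller learning rate,
# starting from the pretrained weight rather than from scratch.
finetune_data = [(1.0, 2.2), (2.0, 4.4)]
w_ft = train(w, finetune_data, lr=0.001, steps=500)

print(round(w, 2), round(w_ft, 2))   # 2.0 2.2
```

The point of the sketch is the starting condition: fine-tuning converges to the task optimum from nearby, which is why it needs far less data than training from zero.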
Key Intuition
A pretrained model already understands language; fine-tuning teaches it what to do with that understanding. Rather than learning from scratch, the model makes small adjustments to its existing representations, requiring far less data and compute than pretraining.
History/Origin
gpt-1 (Radford et al., 2018) established the pretrain-then-fine-tune paradigm for NLP, showing that a single pretrained model could be fine-tuned with minimal architectural modification across diverse tasks. bert made fine-tuning the standard approach, with task-specific heads added on top of pretrained representations. As models grew larger, full fine-tuning became expensive, motivating parameter-efficient methods like low-rank-adaptation (LoRA; see lora). Supervised fine-tuning (SFT) also became a key step in the rlhf pipeline, as demonstrated by instructgpt.
Relationship to Other Concepts
Fine-tuning builds on pretraining and is the primary mechanism for transfer-learning. low-rank-adaptation and other parameter-efficient methods (adapters, prefix tuning) reduce fine-tuning cost. In the alignment pipeline, SFT on demonstration data precedes rlhf or direct-preference-optimization. in-context-learning can be seen as an alternative to fine-tuning that requires no gradient updates.
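The parameter-efficiency argument can be made concrete with a sketch of the LoRA idea mentioned above: freeze the pretrained weight matrix W and train only two small low-rank factors, using W + BA at inference. Plain-Python matrices and the tiny sizes here are illustrative; real implementations apply this to transformer projection matrices.

```python
# Sketch of low-rank adaptation: instead of updating a full d_out x d_in
# matrix W, train B (d_out x r) and A (r x d_in) with r << min(d_out, d_in).
# Sizes below are illustrative.

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

d_out, d_in, r = 6, 8, 2

W = [[0.0] * d_in for _ in range(d_out)]   # frozen pretrained weight
B = [[0.0] * r for _ in range(d_out)]      # trainable; zero init, so the
A = [[0.1] * d_in for _ in range(r)]       # adapted model starts identical to W

delta = matmul(B, A)                       # low-rank update, rank <= r
W_adapted = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

full_params = d_out * d_in                 # what full fine-tuning would train
lora_params = d_out * r + r * d_in         # what LoRA trains
print(lora_params, full_params)            # 28 vs 48
```

The trainable-parameter count scales as r(d_out + d_in) rather than d_out * d_in, which is where the large savings on real model dimensions come from; initializing B to zero means the adapted weights equal the pretrained ones at step zero.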
Notable Results
GPT-1, fine-tuned with the same architecture across tasks, improved the state of the art on 9 of 12 benchmarks, demonstrating broad transferability. LoRA achieved comparable quality to full fine-tuning of GPT-3 175B while training only about 0.01% of parameters. InstructGPT showed that fine-tuning with human feedback data dramatically improved helpfulness and safety.
Open Questions
- When and why parameter-efficient fine-tuning underperforms full fine-tuning.
- How to fine-tune without catastrophic forgetting of pretrained knowledge.
- Optimal strategies for multi-task and continual fine-tuning.
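One common mitigation for the forgetting question above is to penalize drift from the pretrained weights during fine-tuning (an L2-to-initialization penalty, in the spirit of elastic weight consolidation with uniform importance). The one-parameter sketch below is illustrative, not a method from the source:

```python
# Sketch: fine-tune while penalizing distance from the pretrained weight,
#   loss = task_loss + lam * (w - w_pre)^2
# A one-parameter linear model keeps the effect easy to see.

def finetune(w_pre, data, lr, lam, steps):
    w = w_pre
    for _ in range(steps):
        for x, y in data:
            task_grad = 2 * (w * x - y) * x
            reg_grad = 2 * lam * (w - w_pre)
            w -= lr * (task_grad + reg_grad)
    return w

w_pre = 2.0                       # "pretrained" slope
data = [(1.0, 3.0)]               # task data pulls the slope toward 3.0

w_free = finetune(w_pre, data, lr=0.01, lam=0.0, steps=2000)
w_reg = finetune(w_pre, data, lr=0.01, lam=4.0, steps=2000)

# Without the penalty the weight moves fully to the task optimum (3.0);
# with it, the solution stays closer to the pretrained weight (2.2 here).
print(round(w_free, 2), round(w_reg, 2))   # 3.0 2.2
```

The penalty trades task fit against retention of pretrained knowledge via lam, which is exactly the tension the open question describes.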