Fine-Tuning

Also known as: Supervised fine-tuning, SFT, LoRA fine-tuning

Continuing to train a pre-trained model on your own data to specialise its behaviour, tone or domain knowledge.

Fine-tuning takes a pre-trained model and trains it further on a smaller, curated dataset. The most common form is supervised fine-tuning (SFT), where the model is shown input/output pairs and learns to imitate them. Reinforcement learning approaches (RLHF, DPO, RLAIF) further shape model behaviour using preference data.

In 2026, full fine-tuning of frontier models is rare — most teams use parameter-efficient methods like LoRA, which trains a small adapter on top of frozen weights, or prompt RAG instead. Fine-tuning is the right tool when you need a consistent format, a specific tone, or a narrow skill (classification, extraction) at lower latency and cost.

Hosted fine-tuning is available from OpenAI, Anthropic, Google and most open-weight providers (Together, Fireworks, Replicate) for popular models like Llama and Mistral.

Related terms

More to explore

Other wiki entries that touch on Fine-Tuning.