LoRA
Two small low-rank matrices that compose with frozen base weights. 99% of the accuracy of a full fine-tune, 1% of the VRAM, and the adapter file is small enough to ship as a build artifact. The first method to reach for, the one to beat before considering anything else.
Single-task or single-tenant adaptation, multi-tenant SaaS with per-customer adapters, anywhere you want to keep the base current with provider updates, anything where memory is tight.
When you need 10x more model capacity (full SFT proven necessary in benchmarks), when the LoRA signal is being washed out by the base (try DoRA), when reasoning is the goal (consider GRPO/RFT instead).