Proof of concept, narrow specialist, before-data shipping. The cheapest path is usually not fine-tuning.
- Method: Few-shot + DSPy / GEPA prompt optimization
- Hardware: Inference only
- Indicative cost: $0 to $100 in API spend
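For the method named in this tier, a few-shot prompt is simply labeled examples placed ahead of the new input. The following is a minimal sketch in plain Python, assuming a hypothetical ticket-classification task; the function name `build_few_shot_prompt`, the example data, and the prompt layout are illustrative and not taken from the source. Optimizers such as DSPy or GEPA automate the search over instructions and example selection; that step is not shown here.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble an instruction, labeled examples, and a new input into one prompt."""
    parts = [instruction.strip(), ""]
    for text, label in examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {label}")
        parts.append("")
    # Leave the final output blank for the model to complete.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical examples, for illustration only.
EXAMPLES = [
    ("My invoice shows a double charge.", "billing"),
    ("The app crashes when I open settings.", "bug"),
]

prompt = build_few_shot_prompt(
    "Classify each support ticket as billing, bug, or other.",
    EXAMPLES,
    "I was charged after cancelling.",
)
```

The resulting string can be sent to any hosted model, which is why this tier needs inference only and no training hardware.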
The right method changes with the data volume and the compute budget. We name four tiers, the method for each, the hardware that fits, and an honest cost range.
Each tier card carries a different brand gradient so the eye can scan across at a glance. The method, hardware, and indicative cost are the durable parts. Pricing moves quarterly, so we re-validate it for every engagement.
Proof of concept, narrow specialist, before-data shipping. The cheapest path is usually not fine-tuning.
Most enterprise fine-tunes land here: support tuning, style alignment, structured extraction, per-customer adapters. LoRA's sweet spot.
Multi-tenant SaaS with diverse customer data, deep domain adaptation, models that need cross-task generalization.
Continued pretraining for foreign vocabulary, full SFT on hard reasoning, GRPO/RFT runs that need thousands of rollouts, custom model builds.
GPU supply has shifted in 2025-2026. H100 remains the workhorse, H200 is broadly available, and B200 is shipping with ~2.5x H100 training performance.
Single-GPU LoRA on 7B base, QLoRA on 13B. Cheap iteration.
Workhorse. FP16 LoRA on 7B-13B, QLoRA on 70B. 8x H100 for FSDP SFT up to 70B.
New default for 70B+ FP16 fine-tunes on a single card. Broadly available across 24+ providers.
~2.5x H100 training perf. CPT, full SFT at 100B+, frontier RFT runs.
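The model sizes and precisions paired with each card above follow from simple weight-memory arithmetic. The sketch below estimates memory for the weights alone; the function names are hypothetical, and the 80 GB (H100) and 141 GB (H200) figures are assumed per-card capacities that should be checked against the actual SKU. Adapters, optimizer state, activations, and KV cache all need additional room, so treat the result as a lower bound.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


def fits(params_billion: float, bits_per_param: int, gpu_memory_gb: float) -> bool:
    """True if the weights alone fit on the card.

    Real runs also need room for adapters, optimizer state,
    activations, and KV cache, so this is a necessary condition only.
    """
    return weight_memory_gb(params_billion, bits_per_param) <= gpu_memory_gb


# Assumed per-card memory; verify against the SKU you rent.
H100_GB = 80
H200_GB = 141

fp16_7b = weight_memory_gb(7, 16)    # 7B weights in FP16
q4_70b = weight_memory_gb(70, 4)     # 70B weights quantized to 4-bit (QLoRA-style)
fp16_70b = weight_memory_gb(70, 16)  # 70B weights in FP16
```

Under these assumptions, 4-bit 70B weights take about 35 GB and fit on an 80 GB card, while FP16 70B weights take about 140 GB and only fit on a single card at the 141 GB class, with little headroom left for the rest of the training state.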
SFT, LoRA, QLoRA, DoRA, DPO, SimPO, ORPO, KTO, GRPO/RFT, distillation, model merging. Every named technique, and when it earns the build.
Sourcing, PII redaction (Presidio), synthetic data (Distilabel, Nemotron), DEITA quality scoring, MinHash + SemDedup, labeling vendors, feedback loops.
OpenAI RFT, Anthropic on Bedrock, Vertex, Azure Foundry, Databricks Mosaic, Together, Predibase, NeMo Customizer, Modal, Lambda. Side-by-side with our take.
Continued pretraining, SFT, preference optimization (DPO, SimPO, ORPO), reasoning distillation (R1 lineage), model merging (TIES, DARE). The full build pipeline.
EU AI Act (Article 25 substantial-modification trap), GDPR, HIPAA, FedRAMP, Colorado AI Act, India DPDP, China GenAI Measures. Region-by-region for tuned LLMs.