Kensink Labs
PlatformsDirect LLM · vendor-neutralProduction grade
FINE-TUNING · VENDORS

Twelve fine-tuning platforms. Ranked by what each actually does well.

Managed APIs, serverless multi-LoRA, BYO GPU clusters, on-prem NIM. The 2026 vendor landscape with pricing, residency, audit, and the one we'd pick for each shape of project.

OpenAIAnthropicGoogle VertexAWS BedrockAzureTogetherPredibaseModal
Vendors compared
12 (managed + BYO)
Default managed
Together AI or Predibase serverless
Default RFT
OpenAI RFT (o4-mini)
On-prem default
NVIDIA NeMo Customizer
[SIDE BY SIDE]

The matrix.

Pricing snapshot is mid-2026. Numbers move quarterly; we re-validate every engagement. The qualitative columns (base models, methods, residency) are the durable parts of the comparison.

Twelve platforms, side by side.

Pricing is mid-2026 and moves quarterly; the qualitative columns (bases, methods, residency) are the durable parts.

Vendor
Base models
Methods
Pricing
Deployment
Residency
OpenAI fine-tuning + RFT
gpt-4.1, 4.1-mini, 4.1-nano, gpt-4o, o4-mini (RFT)SFT + RFT (GRPO)$25/1M tokens SFT (gpt-4.1); $100/hr RFT, $5k/job capManaged in-platform endpointsUS (default), EU via Azure
Anthropic Claude (via Bedrock)
Claude 3 Haiku only (GA Nov 2024)SFTBedrock fine-tune pricing + Provisioned ThroughputAWS Bedrock, requires Provisioned ThroughputUS, EU regions
Google Vertex AI
Gemini 2.5 Pro / Flash / Flash-LiteSFT + preference tuning (DPO-style)Per training token + 1.5x base inference for tunedVertex AI endpointsUS, EU, asia-* regions
AWS Bedrock
Bedrock-supported models + Custom Model Import (any HF model)Managed SFT + Custom Model ImportPer token managed; $0.0785/min/CMU for importsBedrock endpoints, Provisioned Throughput for tunedAll AWS regions
Azure AI Foundry
GPT-4.1, 4.1-mini, o-series (RFT)SFT + RFT (mirrors OpenAI)Mirrors OpenAI; $100/hr RFT o4-miniStandard, Global Standard, Provisioned ThroughputAzure global regions
Databricks Mosaic AI
Llama 3, Mistral, DBRXFull SFT, LoRA, DPOServerless H100 with InfiniBand, ~10x lower than proprietary per DatabricksUnity Catalog governance + Model ServingDatabricks regions
Together AI
Our default
Open models up to 100BLoRA, full SFT, DPO$0.48 / $1.20 (16B); $1.50 / $3.75 (17-69B); $2.90 / $7.25 (70-100B) per 1M tokensServerless multi-LoRA, dedicated endpointsUS (default), EU on request
Predibase
Multi-tenant SaaS
Llama, Mistral, Qwen, Gemma, PhiLoRA, RFT, DPO, KTOServerless at base-model per-token; Turbo LoRA add-onLoRAX multi-adapter serving, VPCUS + VPC anywhere
NVIDIA NeMo Customizer
On-prem default
Llama, Mistral, Nemotron, customLoRA, P-tuning, full SFT, DPO, GRPOOn-prem (compute is yours)Kubernetes + NIM via NIM Operator 2.0Your data center
Modal
Any (BYO training script)Any (PyTorch, Unsloth, Axolotl)A10G $1.10/hr, H100 ~$3.95/hr, B200 clusters availableServerless Python, schedule 128 B200s in one lineUS (default), EU on request
Lambda Labs
Any (BYO training script)Any1-Click Clusters: $4.49/GPU-hr, 1-week minimum, no egress16 to 512 H100s, InfiniBand 400 Gb/sUS
HuggingFace TRL + AutoTrain
Any open base on the HubSFT, RM, DPO, GRPO (TRL v1.0)Open source; AutoTrain managed pay-per-useInference Endpoints or self-hostedHub-hosted (US/EU) or self-hosted

Highlighted rows are our default picks for the most common project shapes.

[OUR DEFAULT PICKS]

What we recommend by shape of project.

Predibase or Together

Multi-tenant SaaS, hundreds of customer adapters

Serverless multi-LoRA at base-model per-token pricing. LoRAX-backed, Turbo LoRA for speedup. Cheapest path to ship per-customer fine-tunes.

OpenAI RFT (o4-mini)

Reasoning fine-tune, math + code + tool use

$100/hr, $5k cap per job. Verifier in the loop. The fastest path to a measurable reasoning lift if your data fits the o-series.

NVIDIA NeMo Customizer + NIM

On-prem, regulated industry, sovereign weights

Kubernetes-native, NIM Operator 2.0 for serving, full method coverage (LoRA, SFT, DPO, GRPO). The enterprise on-prem default.

Modal + Unsloth + Axolotl

70B+ fine-tune on a tight budget

QLoRA on a single 48GB GPU via Unsloth (2x faster, 70% less VRAM). Modal's per-second billing matches the iteration loop.

AWS Bedrock Frankfurt (Claude 3 Haiku)

EU residency + Claude voice

Only path to fine-tuned Claude. EU region, Provisioned Throughput. Good when the use case specifically needs Claude.

HuggingFace TRL + Lambda 1-Click

Open ecosystem, full control

TRL v1.0 unifies SFT, RM, DPO, GRPO. Lambda 1-Click for the GPUs. The recipe we recommend when no vendor lock-in is acceptable.

[WHAT YOU GET]

What we leave at handoff.

1 audit
Vendor selection memo, signed
1 setup
Account, BAA/DPA, residency configured
1 run
First fine-tune, eval-gated, deployed
Exit
Weights exportable, no lock-in
[COMMON QUESTIONS]

What buyers ask before they sign.

We need EU residency. Who's the right choice?
Vertex AI europe-west and Mistral La Plateforme are the most mature with both regional residency and customer-managed keys. AWS Bedrock Frankfurt is solid for Claude 3 Haiku fine-tunes and Custom Model Import. Azure West Europe mirrors OpenAI. For self-hosted on EU soil, Modal and Lambda both offer EU regions on request.
OpenAI RFT or self-hosted GRPO?
OpenAI RFT is faster to start, the bill is capped ($5k per job), and it runs on closed o-series. Self-hosted GRPO requires a rollout server and verifier infra, can run on any open base, and stays in your VPC. Pick by data sensitivity and base-model preference. For most regulated industries, self-hosted wins.
Why do you recommend Predibase or Together by default?
Serverless multi-LoRA at base-model per-token pricing is the cheapest path for SaaS products with hundreds of customer adapters. Predibase ships Turbo LoRA (speculative decoding) and LoRAX (open source) for thousands of adapters per GPU. Together has the broadest open-model coverage and the cleanest DPO pricing.
Anthropic fine-tuning — when is Claude 3 Haiku the right answer?
Narrowly. Claude 3 Haiku is the only Claude model with fine-tuning, available only through Bedrock, text-only, 32k context. It's a good choice when you specifically want Claude's voice in a tuned model and Provisioned Throughput cost is acceptable. For most projects we tune an open base instead.
FINE-TUNING · VENDORS · KENSINK LABS

Bring the use case. We will pick the vendor.

Independent benchmark, residency-aware, BAA/DPA negotiated, deployment hardened. Sized to the scope, scoped to the audit, signed at the artifact.