Pick the merge method
TIES for general consolidation. DARE for noisy LoRAs. SLERP for combining two models. Task arithmetic for adding or subtracting behaviours.
Merging combines multiple fine-tunes into one model by averaging, trimming, or arithmetic on the deltas. Model soup averages, TIES trims and resolves sign conflicts, DARE drops and rescales delta parameters. Production use: stitch task-specific LoRAs into a single deployable.
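The three operations named above can be sketched on toy weight vectors. This is a minimal illustration, assuming every model is a flat list of floats fine-tuned from the same base; the function names are hypothetical, and real tools such as mergekit apply the same ideas per tensor.

```python
import random

def task_vector(finetuned, base):
    """Delta between a fine-tune and its base model."""
    return [f - b for f, b in zip(finetuned, base)]

def soup(models):
    """Model soup: uniform average of full weight vectors."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]

def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude, zero the rest."""
    k = max(1, round(len(delta) * density))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]

def ties_merge(base, finetuned_models, density=0.5):
    """TIES-style merge: trim each delta, elect a sign per parameter,
    then average only the deltas that agree with the elected sign."""
    deltas = [trim(task_vector(m, base), density) for m in finetuned_models]
    merged = []
    for b, column in zip(base, zip(*deltas)):
        total = sum(column)  # sign of the sum decides the elected direction
        # Exact ties (total == 0) fall to the negative side; an arbitrary choice.
        agreeing = [d for d in column if d != 0.0 and (d > 0) == (total > 0)]
        merged_delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + merged_delta)
    return merged

def dare(delta, drop_rate, rng):
    """DARE-style step: randomly drop delta entries, rescale the survivors
    by 1 / (1 - drop_rate) so the expected delta is unchanged."""
    scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else d * scale for d in delta]

if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [1.0, 0.2, -0.5, 0.0]
    model_b = [0.8, -0.3, 0.6, 0.0]
    print("soup:", soup([model_a, model_b]))
    print("ties:", ties_merge(base, [model_a, model_b], density=0.5))
    print("dare:", dare(task_vector(model_a, base), 0.5, random.Random(0)))
```

In the toy run, the third parameter shows the sign-conflict step: the two fine-tunes pull in opposite directions, and the TIES-style merge keeps only the delta on the winning side instead of averaging the two toward zero.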
If three task-specific fine-tunes need to coexist in one model (legal analysis + customer support + structured extraction), retraining a unified model is expensive. Merging combines the three into one set of weights in minutes, with no GPU training needed.
mergekit YAML config, one command, eval the merged checkpoint.
The YAML config lists the source models, their weights, and the merge method. Run mergekit-yaml on it. The output is a merged model checkpoint.
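As a rough sketch of that workflow, the helper below renders a TIES config in the layout mergekit's documentation uses (merge_method, base_model, a models list with per-model density and weight) and builds the mergekit-yaml command. The helper names, model paths, and the density and weight values are illustrative assumptions, not recommendations; check the field names against the mergekit version you have installed.

```python
from pathlib import Path

def render_ties_config(base_model, task_models, density=0.5, weight=1.0):
    """Render a mergekit-style TIES config as YAML text.
    density and weight are placeholder values, not tuned settings."""
    lines = ["merge_method: ties", f"base_model: {base_model}", "models:"]
    for model in task_models:
        lines += [
            f"  - model: {model}",
            "    parameters:",
            f"      density: {density}",
            f"      weight: {weight}",
        ]
    lines += ["parameters:", "  normalize: true", "dtype: float16"]
    return "\n".join(lines) + "\n"

def merge_command(config_path, output_dir):
    """Command line for mergekit's CLI: mergekit-yaml <config> <output dir>."""
    return ["mergekit-yaml", str(config_path), str(output_dir)]

if __name__ == "__main__":
    # Hypothetical local checkpoints, all fine-tuned from the same base.
    config = render_ties_config(
        "./base-model",
        ["./ft-legal", "./ft-support", "./ft-extraction"],
    )
    config_path = Path("merge.yml")
    config_path.write_text(config)
    print(config)
    print(" ".join(merge_command(config_path, "./merged-model")))
    # With mergekit installed, the merge itself could be launched with:
    #   subprocess.run(merge_command(config_path, "./merged-model"), check=True)
```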
Merging can degrade one task while leaving the others intact. Evaluate the merged model on every task, against the same golden sets used for the original fine-tunes.
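One way to make that check mechanical is a per-task regression gate. The sketch below assumes you already have a score per task for each original fine-tune and for the merged model. The task names, scores, and the 0.02 tolerance are made-up illustrations.

```python
def find_regressions(baseline_scores, merged_scores, tolerance=0.02):
    """Return {task: drop} for tasks where the merged model scores more than
    `tolerance` below the original fine-tune on the same golden set."""
    regressions = {}
    for task, baseline in baseline_scores.items():
        drop = baseline - merged_scores[task]
        if drop > tolerance:
            regressions[task] = round(drop, 4)
    return regressions

if __name__ == "__main__":
    # Hypothetical golden-set accuracies, for illustration only.
    baseline = {"legal": 0.90, "support": 0.85, "extraction": 0.95}
    merged = {"legal": 0.89, "support": 0.80, "extraction": 0.95}
    failures = find_regressions(baseline, merged)
    if failures:
        print("Merge rejected, regressions:", failures)
    else:
        print("Merge accepted")
```

Gating on each task separately, instead of on an average, keeps a large drop on one task from hiding behind gains on the others.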
Good fits: multi-skill consolidation, serving cost reduction (one model vs N adapters), and behaviour composition (combining a refusal-tuned model with a code-tuned model).
Avoid merging when the source fine-tunes are deeply incompatible (different bases, different vocabularies), or when per-task accuracy is the project's whole goal.
We merge when the cost of serving N adapters outweighs the accuracy the merge loses on any single task. TIES is our default method.