# Kensink Labs

> A small lab of senior engineers shipping production AI for SaaS companies. No framework lock-in, no orchestration vendors. Every page below is available as Markdown by appending `.md` to its URL.

## Pages

- [Build your next startup with AI Engineers - Kensink Labs](https://www.kensink.com/index.md): Hire professionals equipped with agentic AI to build your next startup project. Kensink can act as your interim CTO, team lead, or experienced manager on a technology-rich, efficient business project.

## Case studies

- [Cases](https://www.kensink.com/cases.md): Production AI and engineering systems we've shipped across banking, telecom, payments, LegalTech, marketplaces, and more. Real outcomes, measured.
- [3rdbell: First regional VOD platform, with enterprise DRM. Acquired.](https://www.kensink.com/cases/3rdbell.md): South Asia's first VOD streaming platform at high concurrency. Designed on AWS with a commercial streaming server, enterprise-grade DRM, media chunking, and content protection that satisfied regional rights holders. This led to a strategic acquisition.
- [Affidavit Mapp: Court-ready bank statements, days to minutes. 99.7% data integrity.](https://www.kensink.com/cases/affidavit-mapp.md): Family-law firms drown in bank statements. We built a secure OCR + LLM pipeline that turns raw PDFs into court-admissible reports in minutes, not days. It uses strict grounding, full chain-of-custody, and hits 99.7% data integrity measured against ground truth.
- [AIcoach.pw: AI as augmented team, not one-shot trick. Across the whole SDLC.](https://www.kensink.com/cases/aicoach-pw.md): Enterprise AI adoption usually starts as a 'pilot' and ends as a slide deck. AIcoach.pw is our practice for treating AI as an augmented team across the entire software development lifecycle: development, review, testing, security, ops. Built on the conviction that AI changes the team, not just the tooling.
- [AICoach: Onboarding that pays for itself by week one. +18 pt activation.](https://www.kensink.com/cases/aicoach.md): Static onboarding forms were killing AICoach's activation. We replaced them with an adaptive LLM flow that listens to the user's goal, experience, and constraints, and shipped the eval suite the team now owns. +18 percentage points on week-one activation, measured against the previous flow.
- [CWK Experience: Brand operations for founders, run by AI. Kaia, the brand's AI guide.](https://www.kensink.com/cases/cwk-experience.md): CWK's mission, 'the manager for entrepreneurs', needed a platform that integrated AI assistant orchestration, brand-content workflows, and a service-delivery model for non-technical operators. Kensink built it as their technology partner.
- [Foodpanda BD: Order pipeline across CRM, payments, and supply chain. Multi-system integration.](https://www.kensink.com/cases/foodpanda-bd.md): Foodpanda's Bangladesh expansion required CRM, supply-chain, and payment-gateway integration that fit local market constraints. We led the integration delivery as consultant, handling vendor coordination, payment-gateway local quirks, and the supply-chain logic the global stack didn't account for.
- [Grameenphone digital stack: Complete telco digital stack, now operated by Genex Infosys. Vendor delivery at MNO scale.](https://www.kensink.com/cases/grameenphone-stack.md): Grameenphone is Bangladesh's largest MNO. We led delivery of the complete digital solutions stack as enterprise solutions vendor. The same stack is now operated by Genex Infosys. Multi-year engagement, MNO-scale workloads, full vendor lifecycle from RFP through go-live and handover.
- [Khushi POS: Seven branches, one offline-first POS. Commercial POS hardware + sub-second sync.](https://www.kensink.com/cases/khushi-pos.md): Retail POS systems break in the worst place: at the checkout counter when the internet goes down. We built Khushi POS on commercial POS hardware with offline-first architecture, real-time inventory sync across seven branches, and a deployment story that doesn't require a tech team on site.
- [Krishok: Seasonal cattle marketplace, seven-day sprint. Cloudflare-only · zero servers.](https://www.kensink.com/cases/krishok-bd.md): Krishok.bd opened a seasonal cattle discovery platform for Eid ul Adha 2026 in Bangladesh. Aggregated public listings from four sources, hosted self-listings from farmers and brokers, and never touched a payment. Cloudflare-only architecture, BN + EN from day 1, seven-day sprint from architecture to launch.
- [LearnBuddy AI: An AI tutor that listens, then asks the kid to teach it back. Cloudflare-edge · multi-agent · BN + EN.](https://www.kensink.com/cases/learnbuddy.md): LearnBuddy AI is a cross-platform after-school learning app for children 6 to 14 built around the Feynman teach-back method. Kids learn a concept, explain it back to Buddy by voice, get a feedback report in under five seconds, and refine until mastery. Voice-first Explain It loop with Whisper, homework scanning via Vision OCR, a multi-agent supervisor + sub-agent prompt system running on Cloudflare AI Gateway, and a parent dashboard that finally answers 'did my kid actually learn anything tonight?' Fully edge-native, bilingual from day one, eval-gated before every release.
- [Mevrik: Tier-1 CX at billion-message scale. 99.9% uptime.](https://www.kensink.com/cases/mevrik.md): Multi-tenant Digital CX platform deployed inside the largest banks (Pubali Bank, City Bank PLC) and telcos (Grameenphone, Robi) in Bangladesh. Event-driven microservices on AWS EKS, Kafka backbone, billions of messages on the Grameenphone instance.
- [Nuforce: Payment-integrated B2B SaaS, shipped end-to-end. 200+ accounts on one ledger.](https://www.kensink.com/cases/nuforce.md): Field-service management built around three payment processors abstracted behind one unified ledger, with tokenisation, idempotent capture/refund, reconciliation, and dispute handling. We took it from architecture to 200+ B2B accounts in production.
- [OpenAPI Studio: End-to-end API visibility, security to automation. Built on a conviction.](https://www.kensink.com/cases/openapi-studio.md): Most teams treat APIs as the seam between systems. We think APIs are the system. OpenAPI Studio gives engineering teams end-to-end visibility across their API estate, from security posture to runtime observability to automated quality checks.
- [Shomadhan: South Asia's Angie's List, built and exited. Acquired.](https://www.kensink.com/cases/shomadhan.md): A pioneering hyper-local two-sided marketplace connecting service providers with consumers across South Asia. From architecture to launch to acquisition, Kensink owned the full lifecycle as CTO and lead architect.
- [SMB Supply Chain: Multi-tier supply chain, finally visible. ERP + WMS + CRM unified.](https://www.kensink.com/cases/smb-supply-chain.md): Export-oriented manufacturers run on five disconnected systems. We designed integration patterns across ERP, WMS, and CRM, then built the asset-tracking layer that solves the multi-tier delivery + parts-verification problem at the factory floor.
- [Sootcase: Group itineraries, finally collaborative. First GenAI travel planner shipped to both surfaces.](https://www.kensink.com/cases/sootcase.md): Group travel planning is a coordination disaster. Sootcase set out to fix it with GenAI: collaborative itineraries that adapt to everyone's constraints. We led architecture and delivery from concept to synchronized web + mobile launch.

## Field notes (blog)

- [Field notes](https://www.kensink.com/blog.md): Notes from the field. Engineering, eval, and the boring infrastructure of production AI.
- [Dispatch, don't paste. How CEOs brief engineers in the age of AI.](https://www.kensink.com/blog/dispatch-dont-paste.md): AI gives you the tip of the iceberg in thirty seconds. It is a beautiful tip, a competent technical sketch by any measure, and we use the same models to draft them. It is also about ten percent of what your project needs. Here is a friendly tour of the other ninety, and the K-Framework map we use to navigate it.
- [The Frontier Firm playbook: a twelve-week field guide.](https://www.kensink.com/blog/frontier-firm-playbook.md): How a 200-person company restructures its org chart around AI agents without firing anyone, without losing a year of credibility, and without buying a transformation deck from a consultancy.
- [How to calculate ROI for an LLM project.](https://www.kensink.com/blog/llm-roi.md): A CFO-friendly framework for sizing the payback on a direct-LLM agent before you commit a budget. Five inputs, one output, no consultants required.
- [RAG vs fine-tuning: which is right for you?](https://www.kensink.com/blog/rag-vs-fine-tuning.md): A short, opinionated decision tree. Most teams need RAG. A small minority benefit from fine-tuning. Here is how we tell them apart.
- [What an LLM is really doing. A field guide to the transformer, one sentence at a time.](https://www.kensink.com/blog/what-an-llm-is-really-doing.md): Strip away the mystique and a large language model does exactly one thing: it guesses the next token. Everything else is machinery built to make that guess good. We follow a single sentence through that machinery, stage by stage, explaining each part by the problem it solves.
- [Why most AI projects fail, and what we do differently.](https://www.kensink.com/blog/why-ai-projects-fail.md): Six failure patterns we see in every audit, ranked by how much money they cost, plus the cheap fixes that prevent them.

## Model guides

- [LLM Models We Integrate: Kensink Labs](https://www.kensink.com/models.md): Direct integration with the models that matter: OpenAI GPT, Anthropic Claude, Google Gemini, Llama, Mistral, embeddings, and speech-to-text.
- [Claude Integration Services: Kensink Labs](https://www.kensink.com/models/claude.md): Direct Anthropic Claude integration from a senior lab: evals, prompt governance, cost control, vendor-neutral abstraction. Full source ownership.
- [Claude Fable 5: Pricing, Benchmarks, and When to Use It](https://www.kensink.com/models/claude/fable-5.md): A senior lab's read on Claude Fable 5, Anthropic's new flagship tier above Opus: pricing, benchmarks vs Opus 4.8 and GPT-5.5, what changes in the API, and when the premium pays off.
- [Claude Opus 4.8: Pricing, Benchmarks, and What It Changes](https://www.kensink.com/models/claude/opus-4-8.md): A senior lab's read on Claude Opus 4.8: pricing, benchmarks vs Opus 4.7 and GPT-5.5, alignment improvements, dynamic workflows, fast mode, and what changes for production agents.
- [DeepSeek Integration Services: Kensink Labs](https://www.kensink.com/models/deepseek.md): DeepSeek open-weight model integration from a senior lab: eval-tested, self-hostable, cost-aware. Full source ownership.
- [Embeddings & Semantic Search: Kensink Labs](https://www.kensink.com/models/embeddings.md): Embedding pipelines from a senior lab: model selection, chunking, vector storage, eval-tuned for RAG and search. Full source ownership.
- [Sakana Fugu Integration Services: Kensink Labs](https://www.kensink.com/models/fugu.md): Direct Sakana Fugu integration from a senior lab: a multi-agent orchestration model (Fugu and Fugu Ultra) behind a vendor-neutral abstraction, eval-gated, with a fallback you control. Full source ownership.
- [Fugu Ultra (Sakana AI): Pricing, Benchmarks, and How the Orchestration Works](https://www.kensink.com/models/fugu/fugu-ultra.md): A senior lab's read on Sakana's Fugu Ultra: a multi-agent system delivered as one model, benchmarks vs Claude and GPT, $5 / $30 pricing, the proprietary routing trade-off, and frontier capability without single-vendor or export-control risk.
- [Fugu (Sakana AI): Pricing, Benchmarks, and How the Orchestration Works](https://www.kensink.com/models/fugu/fugu.md): A senior lab's read on Sakana's Fugu: a low-latency multi-agent system delivered as one model, benchmarks vs Claude and GPT, rate-tracking pricing, the proprietary-pool trade-off, and frontier results without single-vendor risk.
- [Google Gemini Integration: Kensink Labs](https://www.kensink.com/models/gemini.md): Direct Google Gemini integration from a senior lab: multimodal, large-context, eval-tested, vendor-neutral. Full source ownership.
- [Kimi (Moonshot AI) Integration Services: Kensink Labs](https://www.kensink.com/models/kimi.md): Direct Moonshot Kimi integration from a senior lab: open-weight models, frontier agentic coding, low cost, hosted or self-hosted, eval-gated and vendor-neutral. Full source ownership.
- [Kimi K2.7: Pricing, Benchmarks, and the Cursor Composer Debate](https://www.kensink.com/models/kimi/k2-7.md): A senior lab's read on Moonshot's Kimi K2.7: open weights, agentic-coding benchmarks vs Claude and GPT, hosted pricing a fraction of closed leaders, and the Cursor Composer open-weight base controversy.
- [Llama & Open-Weight LLM Hosting: Kensink Labs](https://www.kensink.com/models/llama.md): Self-hosted Llama and open-weight models from a senior lab: deployment, tuning, eval-tested, data residency. Full source ownership.
- [Mistral Integration Services: Kensink Labs](https://www.kensink.com/models/mistral.md): Direct Mistral integration from a senior lab: efficient models, eval-tested, hosted or self-run. Full source ownership.
- [OpenAI GPT Integration Services: Kensink Labs](https://www.kensink.com/models/openai-gpt.md): Direct OpenAI GPT integration from a senior lab: structured output, evals, cost control, vendor-neutral abstraction. Full source ownership.
- [Speech-to-Text & Whisper Services: Kensink Labs](https://www.kensink.com/models/whisper-speech.md): Production transcription from a senior lab: Whisper, diarization, timestamps, wired into your LLM pipeline. Full source ownership.

## LLM engineering

- [Direct LLM Engineering: Eight services on the K-Framework](https://www.kensink.com/llm.md): Senior engineers shipping direct-LLM systems. No LangChain, no LlamaIndex, no framework lock-in. Eight engagements cover enterprise LLM, on-premise, evals, fine-tuning, RAG, agents, observability, and structured output. Eight weeks each, fixed price, eval suite at handoff.
- [Production LLM Agents: typed tools, guardrails, traces](https://www.kensink.com/llm/agents.md): Tool-using agents with schema-validated function calling, bounded retry budgets, per-tool rate limits, OpenTelemetry traces. Direct LLM API, no agent framework lock-in.
- [Enterprise LLM Deployment: SSO, RBAC, audit, PII](https://www.kensink.com/llm/enterprise.md): Production LLM for regulated industries. SSO + RBAC + audit trails + PII redaction enforced at the proxy. Vendor-neutral abstraction. Eight weeks to governance-ready production.
- [LLM Model Evaluation: golden sets, hard assertions, drift detection](https://www.kensink.com/llm/evaluation.md): Eval-first LLM development. Golden sets, CI gates, LLM-as-judge soft scoring, production drift detection. Eight weeks to a defensible release gate for every prompt and model.
- [Fine-tuning · LoRA, DPO, GRPO, custom models](https://www.kensink.com/llm/fine-tuning.md): Enterprise fine-tuning hub: LoRA / QLoRA / DoRA / full SFT, preference optimization (DPO, SimPO, ORPO, KTO), reinforcement fine-tuning (GRPO/RFT), continued pretraining, model merging. Data pipelines, vendor matrix, scale playbooks, compliance for every region.
- [Fine-tuning by data + compute scale: four named playbooks](https://www.kensink.com/llm/fine-tuning/by-scale.md): From under 1k examples to over 1M. Single A10G to 128 B200. Indicative cost, recommended method, hardware tier per scale. The architecture of the fine-tune changes with the data, not the other way around.
- [Fine-tuning compliance: EU AI Act, GDPR, HIPAA, Colorado AI Act, DPDP, China GenAI](https://www.kensink.com/llm/fine-tuning/compliance.md): Region-by-region compliance for fine-tuned LLMs. EU AI Act (Article 25 substantial modification, GPAI training-data summary), GDPR Article 17 erasure, US state laws (Colorado, California AB 2013, NYC LL144), India DPDP, China GenAI Measures, Korea AI Basic Act. The 20-control checklist.
- [Custom model build: CPT, SFT, DPO, reasoning distillation, merging](https://www.kensink.com/llm/fine-tuning/custom-models.md): Build a custom enterprise LLM from a frontier-grade open base. Continued pretraining, SFT, preference optimization (DPO / SimPO / ORPO), R1-style reasoning distillation, model merging. The full pipeline from base to signed model card.
- [Fine-tuning data pipeline: sourcing, PII, synthetic, dedup, labelling](https://www.kensink.com/llm/fine-tuning/data-pipeline.md): The data engineering depth for enterprise fine-tuning. Sourcing strategies, PII redaction (Presidio), synthetic data (Distilabel, Nemotron), DEITA quality scoring, MinHash + SemDedup, labelling vendors (Surge, Scale, Argilla), feedback loops, dataset versioning.
- [Fine-tuning methods: LoRA, QLoRA, SFT, DPO, GRPO, DoRA](https://www.kensink.com/llm/fine-tuning/methods.md): Every named fine-tuning technique with engineering depth: LoRA, QLoRA, DoRA, full SFT, DPO, SimPO, ORPO, KTO, GRPO/RFT, continued pretraining, distillation, model merging. When each one earns the build.
- [Continued pretraining (CPT) for LLMs: domain-adaptive pretraining](https://www.kensink.com/llm/fine-tuning/methods/cpt.md): Continued pretraining for foreign vocabulary, new tokenization, and deep domain language. When SFT cannot fix what the base never saw.
- [LLM distillation: small students from frontier teachers](https://www.kensink.com/llm/fine-tuning/methods/distillation.md): Reasoning distillation (R1 lineage), output distillation, custom-model distillation pipelines. The 2025 breakout method for cheap, fast specialist models.
- [DoRA fine-tuning: weight-decomposed LoRA](https://www.kensink.com/llm/fine-tuning/methods/dora.md): DoRA decomposes weights into magnitude and direction. Up to +4.4% over LoRA at the same trainable parameter count. Drop-in replacement in PEFT and Unsloth.
- [DPO fine-tuning: direct preference optimization](https://www.kensink.com/llm/fine-tuning/methods/dpo.md): Direct Preference Optimization for alignment without the PPO loop. Reference SFT, preference pairs, classification loss. The 2026 production workhorse.
- [GRPO / Reinforcement Fine-Tuning: reasoning fine-tunes for LLMs](https://www.kensink.com/llm/fine-tuning/methods/grpo-rft.md): Group Relative Policy Optimization (DeepSeek R1) and OpenAI's RFT API for verifiable-reward fine-tuning. Math, code, structured extraction, tool use. When SFT cannot teach the model to reason.
- [KTO fine-tuning: preference learning from thumbs](https://www.kensink.com/llm/fine-tuning/methods/kto.md): Kahneman-Tversky Optimization trains on single-response binary labels. The right method when production feedback is thumbs, not pairwise comparisons.
- [LoRA fine-tuning: the 2026 default, by Kensink Labs](https://www.kensink.com/llm/fine-tuning/methods/lora.md): Low-rank adaptation: 99% of the accuracy of a full fine-tune, 1% of the VRAM. Rank 16, alpha 32, all-linear is our 2026 default. When LoRA earns the build and when it does not.
- [LLM model merging: TIES, DARE, model soup](https://www.kensink.com/llm/fine-tuning/methods/model-merging.md): Combine fine-tunes by weight arithmetic. TIES, DARE, SLERP, task arithmetic. Multi-skill consolidation in minutes, no GPU training needed.
- [ORPO fine-tuning: one-stage SFT + preference](https://www.kensink.com/llm/fine-tuning/methods/orpo.md): Odds-Ratio Preference Optimization merges SFT and preference learning into one run. Half the compute of SFT then DPO, close quality on clean data.
- [QLoRA fine-tuning: 4-bit base + LoRA on a single GPU](https://www.kensink.com/llm/fine-tuning/methods/qlora.md): QLoRA enables Llama 70B fine-tuning on a single 48GB GPU. NF4 + double-quantization + paged 8-bit AdamW. Our default when VRAM is the budget you do not have.
- [Full SFT fine-tuning: when LoRA falls short](https://www.kensink.com/llm/fine-tuning/methods/sft.md): Full supervised fine-tuning: every weight updates. The highest accuracy ceiling, the highest cost. When we benchmark past LoRA and full SFT earns the build.
- [SimPO fine-tuning: reference-free preference optimization](https://www.kensink.com/llm/fine-tuning/methods/simpo.md): SimPO drops the reference model and adds length normalization. +6.4 AlpacaEval 2 and +7.5 Arena-Hard over DPO at lower training memory.
- [Fine-tuning platforms: OpenAI RFT, Anthropic, Vertex, Bedrock, Together, Predibase, NeMo](https://www.kensink.com/llm/fine-tuning/platforms.md): Side-by-side of the 12 platforms that matter for production fine-tuning: OpenAI RFT, Anthropic on Bedrock, Vertex AI, Azure Foundry, Databricks Mosaic, Together AI, Predibase, NeMo Customizer, Modal, Lambda, CoreWeave, Unsloth.
- [LLM Observability & Cost: telemetry, drift, caps](https://www.kensink.com/llm/observability.md): Token + cost telemetry per user, per endpoint, per prompt version. Drift detection on production traffic. Per-tenant caps with graceful degradation. OpenTelemetry + Grafana.
- [On-Premise LLM Deployment: vLLM + Triton + your VPC](https://www.kensink.com/llm/on-premise.md): Self-hosted inference on Llama, Mistral, Qwen. vLLM + Triton + Kubernetes. GPU sizing, autoscaling, air-gapped where required. Eight weeks from weights to production.
- [RAG · Production Retrieval-Augmented Generation](https://www.kensink.com/llm/rag.md): Hub for production RAG: architectures (Naive, Advanced, Agentic, GraphRAG), vector databases (pgvector, Qdrant, Milvus, Vespa), retrieval pipeline (embeddings, chunking, hybrid, reranking), scale playbooks, and multimodal RAG.
- [RAG Architectures: Sketched, Benchmarked, Ranked by When Each Ships](https://www.kensink.com/llm/rag/architectures.md): Every published RAG pattern in production engineering use. Architecture sketches, eval methodology, deployment notes, and an honest verdict on when each one earns the build.
- [Adaptive RAG: Per-Query Routing for Cost Optimisation](https://www.kensink.com/llm/rag/architectures/adaptive.md): Adaptive RAG routes each query to the cheapest mode that can handle it. Classifier + multiple RAG modes. The 2026 production cost-optimisation pattern. Stack, deploy, classifier calibration.
- [Advanced RAG (Hybrid + Rerank): Stack, Deploy, Benchmarks](https://www.kensink.com/llm/rag/architectures/advanced.md): Production Advanced RAG: query rewrite, hybrid (pgvector + BM25) retrieval, reciprocal-rank fusion, cross-encoder rerank, citation-first generation. Stack, deploy steps, accuracy numbers, our take.
- [Agentic RAG: Production Patterns, Stack, Cost Control](https://www.kensink.com/llm/rag/architectures/agentic.md): Agentic RAG in production: planner + per-source retrievers + validator + synthesizer with hard budget and structured output. Stack, deploy, benchmarks, our verdict.
- [Branched RAG: Parallel Hypothesis Exploration](https://www.kensink.com/llm/rag/architectures/branched.md): Branched RAG: generate multiple interpretations, retrieve in parallel, compare and synthesize. For open-ended questions where multiple angles deserve answers. Stack, deploy, eval methodology.
- [Corrective RAG (CRAG): Retrieval Eval + Web Fallback](https://www.kensink.com/llm/rag/architectures/corrective.md): Production CRAG: lightweight retrieval evaluator + web-search fallback when documents are weak. Catches bad retrievals before they bleed through. Stack, deploy, benchmarks, our take.
- [GraphRAG: Multi-Hop Knowledge Graph Retrieval](https://www.kensink.com/llm/rag/architectures/graphrag.md): Production GraphRAG: entity + relationship extraction, knowledge graph build, community detection, graph traversal, path-aware synthesis. Stack, deploy, accuracy benchmarks, our take.
- [HyDE: Hypothetical Document Embeddings for Technical RAG](https://www.kensink.com/llm/rag/architectures/hyde.md): HyDE: have the LLM write a hypothetical answer, embed it, find real documents that match. Lifts recall on technical and jargon-heavy queries. Stack, deploy, eval methodology.
- [Iterative RAG: Multi-Round Retrieval for Research Workloads](https://www.kensink.com/llm/rag/architectures/iterative.md): Iterative RAG: retrieve, partial answer, identify gap, refine, repeat. The pattern for research questions that decompose into sub-questions. Stack, deploy, controller calibration.
- [Modular RAG: Swappable Components for Production Builds](https://www.kensink.com/llm/rag/architectures/modular.md): Modular RAG: named modules with typed contracts, swappable components, no framework lock-in. The engineering posture that keeps RAG builds maintainable past month six.
- [Naive RAG: The Baseline Every RAG Build Starts From](https://www.kensink.com/llm/rag/architectures/naive.md): Naive RAG explained: embed, fetch top-K, generate. The fastest thing that works, the baseline every more complex pattern earns its way past. Stack, deploy, when to keep it, when to outgrow it.
- [Self-RAG: Self-Critique RAG for High-Stakes Accuracy](https://www.kensink.com/llm/rag/architectures/self-rag.md): Production Self-RAG: retrieval critique + answer critique + calibrated refusal. Catches confident wrong answers in clinical, legal, financial, regulatory work. Stack, deploy, eval methodology.
- [RAG with Memory: Conversational Context for Production Chat](https://www.kensink.com/llm/rag/architectures/simple-with-memory.md): RAG plus conversation buffer for multi-turn chat. Pronouns resolve, follow-ups work, retention controls. Stack, deploy, privacy posture, our take.
- [Speculative RAG: Pre-Fetch Likely Follow-Ups](https://www.kensink.com/llm/rag/architectures/speculative.md): Speculative RAG anticipates follow-up queries and pre-fetches retrieval in the background. Cuts perceived latency on conversational workloads. Stack, deploy, predictor calibration.
- [RAG by corpus scale: proven designs from <100k to 1B+ chunks](https://www.kensink.com/llm/rag/by-scale.md): Four named RAG architectures by corpus size: under 100k chunks (pgvector + hybrid), 100k-10M (Qdrant + rerank), 10M-1B (Milvus / Vespa multi-stage), 1B+ (sharded distributed). The architecture changes with the scale.
- [Multimodal RAG: PDFs with tables, vision LLMs, ColPali, BGE-M3](https://www.kensink.com/llm/rag/multimodal.md): Multimodal RAG for documents that aren't plain text. Vision LLM extraction (Claude, GPT), multimodal embeddings (ColPali, BGE-M3), table-aware chunking, court-ready citations. The Affidavit Mapp shape.
- [RAG retrieval pipeline: embeddings, chunking, hybrid search, reranking](https://www.kensink.com/llm/rag/retrieval-pipeline.md): The four layers retrieval quality lives in: embedding model selection, chunking strategies (late chunking, contextual retrieval), hybrid search (vector + BM25 + RRF), and reranking (Cohere, BGE, ColBERT). What we run on every production RAG.
- [Vector databases for RAG: pgvector, Qdrant, Milvus, Weaviate, Vespa, LanceDB, Pinecone](https://www.kensink.com/llm/rag/vector-databases.md): Honest 2026 comparison of the seven vector databases that matter for production RAG. Selection matrix by corpus scale, hybrid native, latency, and ops cost. Why pgvector is our default and when we leave it.
- [LLM Structured Output: Zod schemas, validator loops](https://www.kensink.com/llm/structured-output.md): JSON schema enforcement on LLM output. Zod schemas mirror the API contract. Validator loops with bounded retries and repair prompts. Type-safe end-to-end.

## Founder hub

- [Kensink Labs Founders AI · Build your AI business with a lab that ships](https://www.kensink.com/founder.md): Your AI co-founder for the build. Scan your idea for free, then ship it with a ready-made engineering, marketing, or content team. Fixed scope, fixed fee, you own the IP.
- [Content Lab for founders](https://www.kensink.com/founder/content.md): A ready-made content crew for blog, docs, and long-form. A founder narrative people remember, content that teaches and ranks, and a publishing rhythm you can sustain.
- [AI Engineering Team for founders](https://www.kensink.com/founder/engineering.md): A ready-made engineering crew that builds, tests, and ships your AI product. MVPs, AI agents, and a codebase your future team can maintain. Fixed scope, fixed fee.
- [AI Marketing Team for founders](https://www.kensink.com/founder/marketing.md): A ready-made marketing crew that plans the launch, writes the copy, and reads the numbers. Launch plans, ad systems, and growth loops measured against real results.
- [Founder pricing](https://www.kensink.com/founder/pricing.md): A ladder built for founders. Start with a free AI readiness teardown, then a fixed-scope Launch Sprint, an embedded team, or full scale. Fixed scope, fixed fee, you own the IP.
- [AI Readiness Scanner](https://www.kensink.com/founder/readiness.md): Score your AI product idea in two minutes. See what is AI-ready, what is not, how to close the gap, and a roadmap to launch. Free, no sign-up to see your score.
- [Founder resources](https://www.kensink.com/founder/resources.md): Playbooks for building with AI: idea to MVP in a few weeks, an AI stack for a pre-seed startup, and when to hire versus when to embed. Plus a free readiness scan.
- [An AI stack for a pre-seed startup](https://www.kensink.com/founder/resources/an-ai-stack-for-a-pre-seed-startup.md): The boring, cheap, scalable stack we reach for when a founder has no infra and no time to manage it.
- [Idea to MVP in a few weeks](https://www.kensink.com/founder/resources/idea-to-mvp-in-a-few-weeks.md): How to scope an AI product down to the smallest thing worth shipping, and ship it before the idea goes stale.
- [When to hire vs when to embed](https://www.kensink.com/founder/resources/when-to-hire-vs-when-to-embed.md): A simple test for whether you should hire an AI team now, or embed one until you have traction.

## Industries

- [Industries: production AI by vertical](https://www.kensink.com/industries.md): Five industry verticals, six service items each, all built on the K-Framework. Recruiting AI, healthcare AI, fintech AI, manufacturing AI, edtech AI. Eight-week engagements, eval-gated at handoff.
- [EdTech AI: teach-back, voice, kid-safe tutoring](https://www.kensink.com/industries/edtech-ai.md): Production AI for K-12 platforms, after-school tutoring apps, and university copilots. Teach-back loops, voice-first STT/TTS, kid-safe guardrails, and a parent dashboard.
- [Fintech AI: fraud, KYC, audit-grade observability](https://www.kensink.com/industries/fintech-ai.md): Production AI for neobanks, payment platforms, lending, and regulated fintech. Fraud detection, KYC document AI, audit-grade observability, and on-prem deployment.
- [Healthcare AI: clinical summarization, evals, audit](https://www.kensink.com/industries/healthcare-ai.md): Production AI for hospitals, EHR add-ons, and clinical platforms. Clinical summarization, evidence-grounded copilots, on-prem deployment, and HIPAA-aware audit trails.
- [Manufacturing AI: predictive maintenance, vision, edge](https://www.kensink.com/industries/manufacturing-ai.md): Production AI for manufacturers and supply-chain operators. Predictive maintenance, vision defect detection, OT/IT bridge, and edge inference that survives 24/7 line speed.
- [Recruiting AI: production matching, evals, copilots](https://www.kensink.com/industries/recruiting-ai.md): AI matching for platforms that move millions of people into work. Vector retrieval, eval suites, multilingual career copilots, and the boring infrastructure behind it.

## K-Framework

- [The K-Framework: A layered map of AI development](https://www.kensink.com/k-framework.md): Kensink Labs' operating system for shipping AI products that survive production. Three pillars, sixteen layers, one feedback loop. Built from production incidents. The discipline that separates real systems from AI slop.
- [Algorithmic Fundamentals: K-Framework Layer A.03](https://www.kensink.com/k-framework/algorithmic-fundamentals.md): Validate the Algorithmic Fundamentals layer: defensible retrieval, rerank, and decoding choices recorded in ADRs, and a pipeline you can debug stage by stage. A CEO/CTO field guide.
- [Architectural Visibility: K-Framework Layer C.02](https://www.kensink.com/k-framework/architectural-visibility.md): Validate the Architectural Visibility layer: architecture-as-code, current service catalogs and data-flow maps, ADRs, and diagrams used in reviews. A CEO/CTO field guide.
- [Automated Rollback: K-Framework Layer B.05](https://www.kensink.com/k-framework/automated-rollback.md): Validate the Automated Rollback layer: gradual rollout, reversible migrations validated in CI, feature flags, and a rehearsed rollback. Recover from a bad deploy in minutes. A CEO/CTO field guide.
- [Automation Layer: K-Framework Layer B.02](https://www.kensink.com/k-framework/automation-layer.md): Validate the Automation Layer: PR-driven deploys, runbooks as code, parity staging, and synthetic monitoring of the user journey. A CEO/CTO field guide.
- [Code as Liability: K-Framework Layer A.05](https://www.kensink.com/k-framework/code-as-liability.md): Validate the Code as Liability layer: tests that gate merges, strict types, ADRs on non-trivial changes, and a codebase your team can extend without rewriting. A CEO/CTO field guide.
- [Critical Thinking: K-Framework Layer C.03](https://www.kensink.com/k-framework/critical-thinking.md): Validate the Critical Thinking layer: reviews that re-derive assumptions, inherited defaults challenged, and a team that can defend every key choice. A CEO/CTO field guide.
- [Data Strategy: K-Framework Layer A.02](https://www.kensink.com/k-framework/data-strategy.md): Validate the Data Strategy layer of an AI system: contract-first data, lineage from source to retrieval, schema-level PII gating, and reviewed retention. A CEO/CTO field guide.
- [Ethics & Safety: K-Framework Layer A.04](https://www.kensink.com/k-framework/ethics-safety.md): Validate the Ethics & Safety layer: safety evals that gate releases, red-team prompts in CI, PII detection on inputs and outputs, and pre-release InfoSec sign-off. A CEO/CTO field guide.
- [Evaluation Engine: K-Framework Layer B.03](https://www.kensink.com/k-framework/evaluation-engine.md): Validate the Evaluation Engine: an eval suite written before the feature, gated on every PR, with per-field scoring and a red-team set. The eval bar is the ship contract. A CEO/CTO field guide.
- [Intellectual Ownership: K-Framework Layer C.04](https://www.kensink.com/k-framework/intellectual-ownership.md): Validate the Intellectual Ownership layer: ADRs that name the decider and reasoning, a questioner in reviews, and decision-focused post-incident reviews. A CEO/CTO field guide.
- [Long-Term Vision: K-Framework Layer A.06](https://www.kensink.com/k-framework/long-term-vision.md): Validate the Long-Term Vision layer: direct-to-model choices, durable infrastructure, named long-horizon constraints, and a standing architecture review. A CEO/CTO field guide.
- [Mentorship Speed-Run: K-Framework Layer C.01](https://www.kensink.com/k-framework/mentorship-speed-run.md): Validate the Mentorship Speed-Run layer: pair-engineering by default, co-authored decisions, and a team that can defend and extend the system at handoff. A CEO/CTO field guide.
- [Model & Tooling: K-Framework Layer B.01](https://www.kensink.com/k-framework/model-tooling.md): Validate the Model & Tooling layer: tool selection as a scored decision doc, direct-to-model by default, and a known exit path for every dependency. A CEO/CTO field guide.
- [System Design: K-Framework Layer A.01](https://www.kensink.com/k-framework/system-design.md): How to validate the System Design layer of an AI build: reference architecture signed off before code, a failure mode per component, and NFRs gated in CI. A CEO/CTO field guide.
- [Token Economics: K-Framework Layer B.04](https://www.kensink.com/k-framework/token-economics.md): Validate the Token Economics layer: cost-per-intent metering, per-customer budgets, regression alerts on PRs, and a quarterly model-cost review. A CEO/CTO field guide.
- [Values & Purpose: K-Framework Layer C.05](https://www.kensink.com/k-framework/values-purpose.md): Validate the Values & Purpose layer: a charter naming user, moment, and outcome, re-read weekly and used to gate what ships. A CEO/CTO field guide.
## More

- [Kensink Labs: AI products on solid ground](https://www.kensink.com/_not-found.md): A small lab of senior engineers shipping production AI for SaaS companies. No framework lock-in.
- [Kensink Labs: AI products on solid ground](https://www.kensink.com/404.md): A small lab of senior engineers shipping production AI for SaaS companies. No framework lock-in.
- [About](https://www.kensink.com/about.md): A small lab of senior engineers shipping production AI. Founded by Niaz Islam. New York, est. 2024.
- [Ad Optimization](https://www.kensink.com/ad-optimization.md): Most businesses waste 70% of their ad budget. Simple fixes that double your results without increasing your spend.
- [AI Agents](https://www.kensink.com/ai-agents.md): Stop debugging frameworks built on quicksand. We engineer production-ready AI agents on a rock-solid foundation.
- [Contact](https://www.kensink.com/contact.md): Get in touch. Send a message or book a 15-min intro call. We reply within 24 hours.
- [Design System v2](https://www.kensink.com/design-system.md): Internal visual manual for Kensink Labs. Tokens, components, and rules that ship from src/components/ui.
- [Dispatch: Field notes from Kensink Labs](https://www.kensink.com/dispatch.md): A lab dispatch from Kensink Labs. Engineering notes, half-finished diagrams, and the occasional rant about file formats. Twice a month, free, no tracking pixels.
- [Enterprise Software Development](https://www.kensink.com/enterprise.md): Custom enterprise software built by senior engineers. Direct integration, eval-first, full source ownership. Multi-phase programs for B2B and regulated industries.
- [Expertise](https://www.kensink.com/expertise.md): Boring infrastructure, modern AI. Direct LLM integration, Postgres, pgvector, eval-first development. Six platforms, ten industries.
- [Frontier Firm Transformation](https://www.kensink.com/frontier-firm.md): We don't just build AI companies. We are one. The 12-week organizational transformation around AI.
- [Languages & Frameworks We Build In: Kensink Labs](https://www.kensink.com/languages.md): The languages and frameworks Kensink Labs builds production software in: React, Next.js, TypeScript, Python, React Native, Flutter, and more.
- [Django Development Services: Kensink Labs](https://www.kensink.com/languages/django.md): Django builds from a senior lab: clean app structure, typed Python, eval-tested, full source ownership at handoff.
- [FastAPI Development Services: Kensink Labs](https://www.kensink.com/languages/fastapi.md): Typed Python APIs with FastAPI: automatic OpenAPI, async performance, eval-tested AI integration, full source ownership.
- [Flutter Development Services: Kensink Labs](https://www.kensink.com/languages/flutter.md): Flutter apps with pixel-identical UI across platforms. Senior engineers, eval-tested, full source ownership at handoff.
- [Go (Golang) Development Services: Kensink Labs](https://www.kensink.com/languages/golang.md): Fast, concurrent Go services and infrastructure from a senior lab. Eval-tested, full source ownership at handoff.
- [Android Development Services (Kotlin): Kensink Labs](https://www.kensink.com/languages/kotlin-android.md): Native Android with Kotlin and Jetpack Compose. Senior engineers, eval-tested, full source ownership at handoff.
- [Laravel / PHP Development Services: Kensink Labs](https://www.kensink.com/languages/laravel-php.md): Laravel and PHP builds from a senior lab: clean structure, tested, full source ownership at handoff.
- [Next.js Development Services: Kensink Labs](https://www.kensink.com/languages/nextjs.md): Next.js App Router builds from a senior lab: server components, edge rendering, strong SEO, full source ownership. Problem to production in eight weeks.
- [Node.js Development Services: Kensink Labs](https://www.kensink.com/languages/nodejs.md): TypeScript-first Node.js backends from a senior lab. Small dependency surface, eval-tested, full source ownership.
- [Python Development Services: Kensink Labs](https://www.kensink.com/languages/python.md): Typed, tested Python for backends, data pipelines, and AI tooling. Senior engineers, eval-first, full source ownership at handoff.
- [React Native Development Services: Kensink Labs](https://www.kensink.com/languages/react-native.md): Cross-platform iOS and Android with React Native and Expo. Shared logic with web, eval-tested, full source ownership.
- [React Development Services: Kensink Labs](https://www.kensink.com/languages/react.md): Typed, server-first React from a senior lab. Maintainable component systems, eval-tested, with full source ownership at handoff.
- [iOS Development Services (Swift): Kensink Labs](https://www.kensink.com/languages/swift-ios.md): Native iOS with Swift and SwiftUI. Senior engineers, deep platform integration, eval-tested, full source ownership.
- [TypeScript Development Services: Kensink Labs](https://www.kensink.com/languages/typescript.md): Strict, end-to-end TypeScript from a senior lab. Shared contracts across the stack, eval-tested, full source ownership.
- [Vue.js Development Services: Kensink Labs](https://www.kensink.com/languages/vuejs.md): Typed, tested Vue.js from a senior lab. Maintainable apps, eval-tested, full source ownership at handoff.
- [Mobile App Development](https://www.kensink.com/mobile-app.md): Ship mobile apps you can maintain. Senior engineers, app-store hardened, full source ownership at handoff. We work in Flutter, React Native, Expo, Swift, or Kotlin, and we pick the right tool for the job. Eight weeks from concept to live.
- [MVP Development](https://www.kensink.com/mvp.md): Ship your MVP in eight weeks. User-validated features only. Built for scale from day one. Full source code ownership.
- [Application Design Patterns: Kensink Labs](https://www.kensink.com/patterns.md): Architecture patterns Kensink Labs builds with: RAG, multi-agent systems, event-driven, microservices, serverless, multi-tenant SaaS, CQRS, and headless.
- [CQRS & Event Sourcing: Kensink Labs](https://www.kensink.com/patterns/cqrs-event-sourcing.md): CQRS and event sourcing from a senior lab: auditability and domain clarity, applied where it earns its complexity. Full source ownership.
- [Event-driven Architecture: Kensink Labs](https://www.kensink.com/patterns/event-driven.md): Event-driven systems from a senior lab: queues, idempotency, observability, applied where decoupling pays. Full source ownership.
- [Headless & API-first Architecture: Kensink Labs](https://www.kensink.com/patterns/headless-api-first.md): API-first and headless architecture from a senior lab: versioned contracts, OpenAPI, multi-client. Full source ownership.
- [Microservices Architecture: Kensink Labs](https://www.kensink.com/patterns/microservices.md): Microservices from a senior lab: boundaries drawn for real needs, honest about monoliths. Eval-tested, full source ownership.
- [Multi-tenant SaaS Architecture: Kensink Labs](https://www.kensink.com/patterns/multi-tenant-saas.md): Multi-tenant SaaS architecture from a senior lab: isolation models, data-layer enforcement, tested boundaries. Full source ownership.
- [Serverless Architecture: Kensink Labs](https://www.kensink.com/patterns/serverless.md): Serverless architecture from a senior lab: scale-to-zero, edge-first, designed around the limits. Full source ownership.
- [Privacy Policy](https://www.kensink.com/privacy.md): How Kensink Labs collects, stores, and uses information from website visitors and clients.
- [AI ROI Calculator](https://www.kensink.com/resources/ai-roi-calculator.md): Estimate the payback on a direct-LLM agent project before committing budget. Five inputs, live output, methodology footnoted.
- [Frontier Firm Diagnostic](https://www.kensink.com/resources/frontier-diagnostic.md): Twelve-question diagnostic. Score your readiness to become a Frontier Firm across leadership, tech, process, and talent.
- [The Frontier Firm Playbook](https://www.kensink.com/resources/frontier-playbook.md): A 20-page field guide to the 12-week organizational AI transformation. Free download.
- [Direct LLM Integration Guide](https://www.kensink.com/resources/llm-integration-guide.md): A 15-page engineering guide to integrating directly with the LLM API. No frameworks.
- [What We Build, Project Types: Kensink Labs](https://www.kensink.com/services.md): The kinds of software Kensink Labs builds: web applications, marketing sites, SaaS products, e-commerce, APIs, internal tools, mobile, and enterprise systems.
- [API & Backend Development: Kensink Labs](https://www.kensink.com/services/api-backend-development.md): API and backend development from a senior lab: API-first design, typed, tested, observable. Full source ownership at handoff.
- [E-commerce Development: Kensink Labs](https://www.kensink.com/services/ecommerce-development.md): Custom e-commerce from a senior lab: storefront, checkout, reconciling payments, headless where it helps. Full source ownership.
- [Internal Tools & Dashboards: Kensink Labs](https://www.kensink.com/services/internal-tools-dashboards.md): Internal tools and dashboards from a senior lab: real auth, audit, and usable UI to retire the spreadsheet. Full source ownership.
- [Marketing Website Development: Kensink Labs](https://www.kensink.com/services/marketing-website.md): Fast, SEO-strong marketing sites from a senior lab: Next.js, server-rendered, conversion-focused. Full source ownership.
- [SaaS Product Development: Kensink Labs](https://www.kensink.com/services/saas-product-development.md): SaaS product development from a senior lab: multi-tenancy, billing, auth, and roles built in from day one. Full source ownership in eight weeks.
- [Web Application Development: Kensink Labs](https://www.kensink.com/services/web-application-development.md): Custom web application development from a senior lab: Next.js, TypeScript, Postgres, eval-tested, full source ownership in eight weeks.
- [Technologies & Infrastructure: Kensink Labs](https://www.kensink.com/technologies.md): The data, infrastructure, and integration technologies Kensink Labs builds on: PostgreSQL, pgvector, Cloudflare, AWS, Redis, Stripe, and more.
- [AWS Development Services: Kensink Labs](https://www.kensink.com/technologies/aws.md): Right-sized AWS architecture from a senior lab: infrastructure as code, cost visibility, no service sprawl. Full source ownership.
- [Cloudflare Workers Development Services: Kensink Labs](https://www.kensink.com/technologies/cloudflare-workers.md): Full-stack apps on Cloudflare Workers, D1, R2, KV, and Durable Objects. Edge-first design from a lab that has shipped it. Full source ownership.
- [Docker & Kubernetes Services: Kensink Labs](https://www.kensink.com/technologies/docker-kubernetes.md): Reproducible Docker builds and right-sized Kubernetes from a senior lab. No accidental complexity, full source ownership.
- [Search & Elasticsearch Development: Kensink Labs](https://www.kensink.com/technologies/elasticsearch-search.md): Relevant, fast search from a senior lab: Elasticsearch or Postgres full-text, right-sized to your data. Full source ownership.
- [GraphQL Development Services: Kensink Labs](https://www.kensink.com/technologies/graphql.md): Typed GraphQL APIs from a senior lab: schema design, caching, N+1 handling, full source ownership at handoff.
- [pgvector & Vector Search: Kensink Labs](https://www.kensink.com/technologies/pgvector.md): Vector similarity search in PostgreSQL with pgvector. Simpler RAG architecture from a senior lab, eval-tested, full source ownership.
- [PostgreSQL Development Services: Kensink Labs](https://www.kensink.com/technologies/postgresql.md): PostgreSQL schema design, search, and vectors from a senior lab. Tested migrations, observability, full source ownership at handoff.
- [Redis: The In-Memory Data Platform, Explained: Kensink Labs](https://www.kensink.com/technologies/redis.md): An in-depth guide to Redis: data structures, caching patterns, persistence, replication, clustering, and its role in modern AI software. Plus production Redis from a senior lab.
- [Stripe & Payments Integration: Kensink Labs](https://www.kensink.com/technologies/stripe-payments.md): Production payments from a senior lab: idempotent capture, refunds, webhooks, reconciliation, multi-processor abstraction. Full source ownership.
- [Real-time & WebSocket Development: Kensink Labs](https://www.kensink.com/technologies/websockets-realtime.md): Real-time features with WebSockets from a senior lab: consistency, reconnection, and scale designed in. Full source ownership.
- [Terms of Service](https://www.kensink.com/terms.md): Terms governing engagements with Kensink Labs and use of the kensink.com website.
- [Professional Website Audit: Technical, Content, UX, Conversion & AI-Readiness](https://www.kensink.com/website-audit.md): A diagnostic framework, not a checklist. We audit your site across seven dimensions (technical, discovery, content, UX, conversion, analytics, strategy), score every finding by business impact, and hand you a roadmap that ships fixes in the next 30 days.
- [AI-Readiness Audit: llms.txt, SSR Parity, Bot Allowlists, Citation Spot-Checks](https://www.kensink.com/website-audit/ai-readiness.md): Most audits are 5 years behind on AI. We audit schema coverage, server-side rendering parity, llms.txt, AI crawler allowlists (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) and your brand's citation presence across Perplexity, ChatGPT, Claude, and Gemini.
- [Content-Type Audit Rubrics: 13 Page Types, 13 Rubrics](https://www.kensink.com/website-audit/content-types.md): Homepages, pillars, case studies, products, pricing, landing pages, docs, about, contact, legal: each page type fails differently. Auditing them against the same rubric is malpractice. Here are the 13 we use.
- [Audit Framework: 14 Sub-Systems Across 7 Dimensions](https://www.kensink.com/website-audit/framework.md): The full diagnostic framework: crawlability, architecture, performance, security, semantic HTML, structured data, AI-readiness, mobile, accessibility, content, UX, backlinks, analytics, log files. Every check, every target state.
- [Audit Process: Prerequisites, 5-Day Cadence, Tooling Stack](https://www.kensink.com/website-audit/process.md): What access we need on day one, the baseline metrics we capture, the exact five-day schedule, and the tooling stack we run. So you can be ready when we start, and verify what you got when we finish.