Kensink Labs
RAG architecture · Direct LLM · no framework · 8-week engagement
RAG · RETRIEVAL-AUGMENTED GENERATION

Hybrid retrieval. Citations, not hallucinations.

Postgres + pgvector for vector search, BM25 for lexical search, fused with reciprocal-rank fusion. Citation-first answers: every claim links to its source chunk. No separate vector DB, no five-system synchronisation problem.

PostgreSQL · pgvector · TypeScript · OpenAI
Cycle: 8 weeks · corpus to citation
Stack: Postgres · pgvector · TypeScript
Output: Ingestion + retrieval + citation surface
Discipline: Citations on every answer, no exceptions
[WHY THIS EXISTS]

Most RAG demos fall apart at scale.

A naive cosine-similarity search over a 1M-chunk corpus retrieves plausible-but-wrong context, the LLM hallucinates confidently on top, and the user discovers the failure mode by getting fired for citing it. Hybrid retrieval + citations make the failure mode visible.

  • pgvector + BM25 hybrid: dense semantic search plus lexical exact-match
  • Reciprocal-rank fusion to merge the two rankings without needing comparable scores
  • Citation surface — every claim in the answer links to its source chunk
  • Chunking pipeline tuned to your corpus, not someone's blog post
[HOW WE BUILD IT]

Retrieval first, generation second.

01

Corpus + chunking audit

We characterise your corpus before we pick a chunker. PDFs with tables, code with comments, transcripts with timestamps — each gets a different strategy.

02

Postgres + pgvector

One database for your text, your embeddings, your metadata, your access controls. Skip the vector-DB-shaped tax.
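
To make the one-database point concrete, here is a minimal sketch of a tenant-scoped vector query built in TypeScript. It assumes a hypothetical `chunks` table with `id`, `content`, `tenant_id` and a pgvector `embedding` column; none of these names come from the engagement itself. The `<=>` operator is pgvector's cosine-distance operator, and the returned `{ text, values }` object is shaped so a Postgres client that accepts parameterised queries could run it.

```typescript
// Sketch only: table and column names (chunks, tenant_id, embedding) are hypothetical.
interface SqlQuery {
  text: string;
  values: unknown[];
}

function buildTenantVectorQuery(
  tenantId: string,
  queryEmbedding: number[],
  limit = 10,
): SqlQuery {
  // pgvector parses a text literal of the form '[0.1,0.2,0.3]' when cast to vector.
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;

  // The tenant filter lives in the same statement as the similarity search,
  // so rows from other tenants are never candidates for retrieval.
  const text = `
    SELECT id, content, embedding <=> $1::vector AS cosine_distance
    FROM chunks
    WHERE tenant_id = $2
    ORDER BY embedding <=> $1::vector
    LIMIT $3`;

  return { text, values: [vectorLiteral, tenantId, limit] };
}

const tenantQuery = buildTenantVectorQuery("tenant-a", [0.1, 0.2, 0.3], 5);
```

Because text, embeddings and tenant metadata sit in one table, the access rule is an ordinary WHERE clause rather than a second system that has to stay in sync with a vector store.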

03

Hybrid retrieval

BM25 finds the exact-term matches the embeddings miss; vector search finds the semantic matches BM25 misses. Reciprocal-rank fusion merges them.
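
The fusion step can be sketched in a few lines of TypeScript. Reciprocal-rank fusion scores each chunk as the sum of 1 / (k + rank) across the ranked lists it appears in, so it only needs positions, not comparable scores. The function name, the chunk ids and the constant k = 60 (a commonly used default) are assumptions for illustration, and each input list is assumed to contain a chunk id at most once.

```typescript
type ChunkId = string;

interface FusedResult {
  id: ChunkId;
  score: number;
}

// Fuse any number of ranked lists (best match first) into one ranking.
function reciprocalRankFusion(rankings: ChunkId[][], k = 60): FusedResult[] {
  const scores = new Map<ChunkId, number>();

  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }

  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists from the lexical and vector searches.
const bm25Ranking = ["c7", "c2", "c9"];
const vectorRanking = ["c2", "c4", "c7"];
const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
// fused order: c2, c7, c4, c9
```

Chunks that both retrievers return (c2 and c7 here) rise to the top, while a chunk found by only one retriever still survives lower in the list.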

04

Citation-first answers

The prompt requires the model to cite each claim. The UI renders inline citations. Users (and auditors) can verify every assertion.
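
One way to enforce that rule after generation is a small validator, sketched below in TypeScript. It assumes a hypothetical citation format in which the model writes a chunk id in square brackets, such as `[c2]`, before the sentence-ending punctuation. The sentence splitter is deliberately naive (it would split on decimals like "3.5"), so treat this as an illustration rather than a production check.

```typescript
interface Chunk {
  id: string;
  text: string;
}

interface CitationCheck {
  ok: boolean;
  uncitedSentences: string[];  // sentences with no citation marker
  unknownCitations: string[];  // cited ids that were never retrieved
}

function checkCitations(answer: string, retrieved: Chunk[]): CitationCheck {
  const knownIds = new Set(retrieved.map((chunk) => chunk.id));

  // Naive split: a run of non-terminator characters plus any trailing . ! or ?
  const rawSentences: string[] = answer.match(/[^.!?]+[.!?]*/g) ?? [];
  const sentences = rawSentences.map((s) => s.trim()).filter((s) => s.length > 0);

  const uncitedSentences: string[] = [];
  const unknownCitations: string[] = [];

  for (const sentence of sentences) {
    const markers: string[] = sentence.match(/\[[A-Za-z0-9_-]+\]/g) ?? [];
    if (markers.length === 0) {
      uncitedSentences.push(sentence);
      continue;
    }
    for (const marker of markers) {
      const id = marker.slice(1, -1); // strip the surrounding brackets
      if (!knownIds.has(id)) unknownCitations.push(id);
    }
  }

  const ok = uncitedSentences.length === 0 && unknownCitations.length === 0;
  return { ok, uncitedSentences, unknownCitations };
}

// Hypothetical retrieved context and model answer.
const retrievedChunks: Chunk[] = [
  { id: "c2", text: "Refunds are issued within 14 days of purchase." },
  { id: "c7", text: "Standard shipping takes three business days." },
];
const report = checkCitations(
  "Refunds are issued within 14 days [c2]. Shipping is free [c9]. Contact support for help.",
  retrievedChunks,
);
// report.ok is false: one sentence has no citation and [c9] was never retrieved.
```

A check like this lets the UI refuse to render, or visibly flag, an answer whose claims cannot be traced back to a retrieved chunk.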

[OUTCOMES AT HANDOFF]

What's live at week eight.

100%: Answers with inline citations
1 DB: Postgres for text + vectors + metadata
Hybrid: BM25 + pgvector with RRF fusion
Per-tenant: Access control on retrieval, not just generation
DIRECT LLM · APPLIED K

Bring the problem.
We’ll bring the build.

Eight weeks, fixed price, eval suite at handoff. Direct LLM engineering on top of the K-Framework. Two Q3 slots remain.