- If you had to ship one pattern tomorrow, which one?
- Advanced RAG: hybrid retrieval (pgvector + BM25 fused with RRF) + cross-encoder rerank (Cohere Rerank v3) + citation discipline. It's the 2026 production consensus, and the +17 pts of Recall@5 from reranking alone almost always justifies the added latency. Promote to Agentic RAG or GraphRAG only when the corpus or query shape demands it.
- When is GraphRAG worth the build cost?
- When answer quality depends on multi-hop reasoning across linked entities: clinical decision support, regulatory analysis, complex case files, multi-document Q&A in regulated domains. Microsoft and follow-on research report 6-8 point accuracy gains over flat RAG, reaching 81%+ in specialised domains. The cost is building and maintaining the graph, which is non-trivial.
- Aren't Self-RAG and Corrective RAG basically the same?
- Close but not identical. Self-RAG critiques its own answer and decides whether to retrieve again. Corrective RAG evaluates retrieval quality before generation, and falls back to web search or query rewrite if the retrieved docs are weak. In practice we often combine them with eval gates at both stages.
- Is HyDE still useful with modern embedding models?
- Yes, in specialist domains where the query vocabulary diverges from the document vocabulary — medical, legal, code. Always pair with rerank: HyDE expands the candidate set; rerank cleans it up. Without rerank, hallucinated hypotheses can pull retrieval off course.
- Do you ever ship Naive RAG to production?
- Only for very narrow, well-keyed corpora: a single product's FAQ, internal docs with consistent vocabulary, narrow-domain chatbots. Even then, we add reranking and citations on the second iteration. The cost of "upgrade later" is almost always lower than the cost of shipping a system that is quietly wrong.
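
The RRF fusion step named in the first answer can be sketched in a few lines. This is a minimal illustration, not our production code: the function name `rrf_fuse`, the document ids, and the two hit lists are hypothetical, and `k=60` is the constant commonly used for Reciprocal Rank Fusion rather than a value tuned for any particular corpus. Each document scores the sum of `1 / (k + rank)` over every ranking it appears in.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first (e.g. vector hits, BM25 hits).
    k: smoothing constant; 60 is a common default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties on doc id for deterministic output.
    fused = sorted(scores, key=lambda d: (-scores[d], d))
    return fused if top_n is None else fused[:top_n]


# Hypothetical results from the two retrievers, best-first.
vector_hits = ["d3", "d1", "d7"]
bm25_hits = ["d1", "d9", "d3"]

candidates = rrf_fuse([vector_hits, bm25_hits])
print(candidates)
```

Documents that both retrievers rank highly rise to the top, while documents found by only one retriever stay in the candidate set for the reranker to judge. Because RRF uses ranks rather than raw scores, it needs no normalisation between cosine similarity and BM25 scores.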