Retrieval

FAISS Index Types for Production RAG

29 January 2026·420 words·2 mins

IndexFlatIP does exact, brute-force search, which works for small corpora. For production with 100K+ vectors, you need an index that scales better. Here’s how to choose and implement one.

RAG for Knowledge-Intensive Tasks

24 September 2025·791 words·4 mins

LLM Engineering

Picture this: you’re asking an AI about cancer treatments. It sounds confident and gives you detailed answers. But here’s the problem: it just made up a medical study that doesn’t exist.

RAG with LangChain: Architecture, Code, and Metrics

2 August 2025·1240 words·6 mins

LLM Engineering

RAG is a design pattern, not a product. LangChain supports it out of the box. This guide shows a production-ready RAG setup in LangChain with architecture, retrieval choices, runnable code, evaluation metrics, and trade-offs from my client projects.

Reranking for Better RAG Retrieval

29 January 2025·513 words·3 mins

LLM Engineering

Bi-encoder retrieval is fast but imprecise. Cross-encoder reranking improves top-k precision at the cost of some latency. Here’s when and how to add it.

LightRAG as a LangChain Retriever

29 January 2025·376 words·2 mins

LLM Engineering

Want LightRAG’s lean retrieval with LangChain’s chain ecosystem? Here’s how to wrap LightRAG as a LangChain-compatible retriever.

BM25 Hybrid Search with LightRAG

29 January 2025·595 words·3 mins

LLM Engineering

Vector search misses keyword-heavy queries. BM25 misses semantic similarity. Combine both with hybrid search for better retrieval recall.
