Subhajit

Reranking for Better RAG Retrieval

29 January 2025·566 words·3 mins

Bi-encoder retrieval is fast but imprecise. Cross-encoder reranking improves top-k precision at the cost of some latency. Here’s when and how to add it.

TL;DR
- Bi-encoders are fast (embeddings precomputed) but miss query-document interaction.
- Cross-encoders are slower but far more accurate: they encode query + document together.
- Pattern: retrieve top-20 with bi-encoder, rerank to top-4 with cross-encoder.
- Start with ms-marco-MiniLM-L-6-v2 (80MB, fast, good accuracy).
- Skip reranking if latency budget < 200ms or bi-encoder recall is already high.

Bi-Encoder vs Cross-Encoder

Bi-encoder (used in vector search):
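As a rough illustration of the retrieve-then-rerank pattern from the TL;DR above, here is a minimal sketch. It assumes the top-20 candidates have already come back from the bi-encoder as plain strings. The names `rerank`, `load_cross_encoder_scorer`, and `overlap_scorer` are made up for this example and are not from the post. The real scorer relies on the `sentence-transformers` package and on the model being published under the `cross-encoder/` namespace; the toy scorer is only there so the sketch runs without downloading anything.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer takes (query, document) pairs and returns one relevance score per pair.
ScoreFn = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: ScoreFn,
    top_k: int = 4,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair jointly and keep the top_k best."""
    if not candidates:
        return []
    pairs = [(query, doc) for doc in candidates]
    scores = score_fn(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:top_k]]


def load_cross_encoder_scorer(
    model_name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
) -> ScoreFn:
    """Build a scorer backed by a cross-encoder (requires sentence-transformers)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return lambda pairs: model.predict(list(pairs))


def overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in scorer: fraction of query words that appear in the document."""
    scores = []
    for query, doc in pairs:
        query_words = set(query.lower().split())
        doc_words = set(doc.lower().split())
        scores.append(len(query_words & doc_words) / max(len(query_words), 1))
    return scores
```

In a real pipeline you would pass `load_cross_encoder_scorer()` instead of `overlap_scorer`, for example `rerank(query, top20_docs, load_cross_encoder_scorer(), top_k=4)`. Keeping the scorer pluggable also makes it easy to measure latency with and without reranking before committing to it.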

LightRAG as a LangChain Retriever

29 January 2025·594 words·3 mins

LLM Engineering

Want LightRAG’s lean retrieval with LangChain’s chain ecosystem? Here’s how to wrap LightRAG as a LangChain-compatible retriever, keeping retrieval explicit and fast while using LangChain for everything downstream.

TL;DR
- Implement BaseRetriever._get_relevant_documents to make any retriever LangChain-compatible.
- LightRAG’s FAISS retrieval slots straight into LangChain chains, LCEL, and agents.
- Use this pattern when migrating an existing LangChain pipeline to leaner retrieval incrementally.
- For full LangChain pipelines without constraints, the standard LangChain retriever is fine.

Why Combine LightRAG with LangChain

LightRAG gives you minimal, fast retrieval. LangChain gives you chains, agents, and tooling. Sometimes you want both:
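The wrapper idea from the TL;DR above might look roughly like the sketch below. It does not use LightRAG's actual API: the retrieval call is represented by a hypothetical `search_fn(query, k)` that returns `(text, score)` tuples, and the class name `SearchFnRetriever` is also made up for this example. The `except ImportError` branch only defines tiny stand-ins so the sketch runs where LangChain is not installed; with `langchain-core` present, the real `BaseRetriever` and `Document` are used.

```python
from typing import Callable, List, Tuple

try:
    from langchain_core.documents import Document
    from langchain_core.retrievers import BaseRetriever
except ImportError:
    # Minimal stand-ins so the sketch still runs without LangChain installed.
    class Document:
        def __init__(self, page_content: str, metadata: dict = None):
            self.page_content = page_content
            self.metadata = metadata or {}

    class BaseRetriever:
        def __init__(self, **fields):
            for name, value in fields.items():
                setattr(self, name, value)

        def invoke(self, query: str):
            return self._get_relevant_documents(query, run_manager=None)


class SearchFnRetriever(BaseRetriever):
    """Adapts a `search_fn(query, k) -> [(text, score), ...]` to a LangChain retriever."""

    # Hypothetical retrieval callable; swap in your own retrieval function here.
    search_fn: Callable[[str, int], List[Tuple[str, float]]]
    k: int = 4

    def _get_relevant_documents(self, query: str, *, run_manager=None) -> List[Document]:
        # Delegate to the underlying retriever, then convert hits to Documents.
        hits = self.search_fn(query, self.k)
        return [
            Document(page_content=text, metadata={"score": score})
            for text, score in hits
        ]
```

Once constructed, for example `SearchFnRetriever(search_fn=my_search, k=4)`, the object can be called with `.invoke("some query")` like any other retriever, which is what lets it slot into existing chains without changing the downstream code.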

BM25 Hybrid Search with LightRAG

29 January 2025·643 words·4 mins

LLM Engineering

Vector search misses keyword-heavy queries. BM25 misses semantic similarity. Combine both with hybrid search for better retrieval recall.

TL;DR
- Vector search (FAISS): great for semantic/paraphrase queries, bad for exact codes or IDs.
- BM25: great for keyword/exact matches, bad for synonyms and paraphrases.
- Hybrid with RRF: combines both rank lists, so no score normalization is needed.
- Start with vector_weight=0.5. Lower it if users search exact product codes frequently.

Why Hybrid Search

Pure vector search struggles with:
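To make the RRF idea from the TL;DR above concrete, here is a small sketch of weighted Reciprocal Rank Fusion over two ranked lists of document IDs. The function name `weighted_rrf`, the constant `k=60` (a commonly used default), and the way `vector_weight` splits the contribution between the two lists are assumptions for illustration; the post's own implementation may weight things differently.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(
    vector_ranked: Sequence[str],
    bm25_ranked: Sequence[str],
    vector_weight: float = 0.5,
    k: int = 60,
    top_n: int = 4,
) -> List[str]:
    """Fuse two ranked lists of document IDs with weighted Reciprocal Rank Fusion."""
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be between 0 and 1")

    fused: Dict[str, float] = defaultdict(float)
    weighted_lists = (
        (vector_weight, vector_ranked),
        (1.0 - vector_weight, bm25_ranked),
    )
    for weight, ranking in weighted_lists:
        # Only the rank position matters, so raw scores never need normalizing.
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += weight / (k + rank)

    return sorted(fused, key=lambda doc_id: fused[doc_id], reverse=True)[:top_n]
```

Because only rank positions enter the formula, FAISS distances and BM25 scores can stay on their own scales. Lowering `vector_weight` shifts the fused order toward the BM25 list, which matches the advice for exact product-code searches.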