Want LightRAG’s lean retrieval with LangChain’s chain ecosystem? Here’s how to wrap LightRAG as a LangChain-compatible retriever — keeping retrieval explicit and fast while using LangChain for everything downstream.
## TL;DR
- Implement `BaseRetriever._get_relevant_documents` to make any retriever LangChain-compatible.
- LightRAG’s FAISS retrieval slots straight into LangChain chains, LCEL, and agents.
- Use this pattern when migrating an existing LangChain pipeline to leaner retrieval incrementally.
- For full LangChain pipelines without constraints, the standard LangChain retriever is fine.
## Why Combine LightRAG with LangChain
LightRAG gives you minimal, fast retrieval. LangChain gives you chains, agents, and tooling. Sometimes you want both:
- Use LightRAG’s tight FAISS retrieval for speed and predictable latency
- Plug into LangChain chains for downstream processing (prompts, parsers, memory)
- Keep retrieval explicit while using LangChain’s callbacks, tracing, and streaming
## Implementing the Retriever
LangChain’s `BaseRetriever` requires implementing `_get_relevant_documents`. Here’s a complete wrapper:

```python
from typing import List

import faiss
import numpy as np
from openai import OpenAI

from langchain_core.retrievers import BaseRetriever
from langchain_core.documents import Document
from langchain_core.callbacks import CallbackManagerForRetrieverRun


class LightRAGRetriever(BaseRetriever):
    """LangChain retriever backed by LightRAG's FAISS index."""

    index: faiss.IndexFlatIP
    texts: List[str]
    sources: List[str]
    k: int = 4
    embedding_model: str = "text-embedding-3-small"

    class Config:
        arbitrary_types_allowed = True

    def _embed(self, text: str) -> np.ndarray:
        client = OpenAI()
        resp = client.embeddings.create(input=[text], model=self.embedding_model)
        vec = np.array([resp.data[0].embedding]).astype("float32")
        faiss.normalize_L2(vec)  # unit-normalize so inner product == cosine similarity
        return vec

    def _get_relevant_documents(
        self,
        query: str,
        *,
        run_manager: CallbackManagerForRetrieverRun,
    ) -> List[Document]:
        q = self._embed(query)
        scores, indices = self.index.search(q, self.k)
        docs = []
        for j, i in enumerate(indices[0]):
            docs.append(Document(
                page_content=self.texts[i],
                metadata={"source": self.sources[i], "score": float(scores[0][j])},
            ))
        return docs
```

## Building and Using the Retriever
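One detail worth calling out before building the index: both the corpus and query vectors are L2-normalized so that the inner-product index (`IndexFlatIP`) effectively ranks by cosine similarity. A pure-NumPy sketch of that equivalence (illustrative vectors, no FAISS required):

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    # Same effect as faiss.normalize_L2: scale each row to unit length
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

a = np.array([[3.0, 4.0]])
b = np.array([[6.0, 8.0]])

# Cosine similarity from the definition: dot(a, b) / (|a| * |b|)
cosine = float(a @ b.T) / (np.linalg.norm(a) * np.linalg.norm(b))

# Inner product of the unit-normalized vectors
inner_of_normalized = float(l2_normalize(a) @ l2_normalize(b).T)

assert np.isclose(cosine, inner_of_normalized)
```

With unit vectors, ranking by inner product and ranking by cosine similarity are identical, which is why an inner-product index is a valid choice here.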
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Build FAISS index (from LightRAG)
def build_lightrag_index(pairs):
    client = OpenAI()
    texts = [t for _, t in pairs]    # pairs are (source, text) tuples
    sources = [s for s, _ in pairs]
    vecs = client.embeddings.create(
        input=texts,
        model="text-embedding-3-small",
    ).data
    X = np.array([v.embedding for v in vecs]).astype("float32")
    faiss.normalize_L2(X)
    idx = faiss.IndexFlatIP(X.shape[1])
    idx.add(X)
    return idx, texts, sources

# Create retriever
idx, texts, sources = build_lightrag_index(corpus_pairs)
retriever = LightRAGRetriever(index=idx, texts=texts, sources=sources, k=4)

# Use in a LangChain chain
prompt = ChatPromptTemplate.from_template(
    "Answer from context only.\n\nContext: {context}\n\nQuestion: {question}"
)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini", temperature=0)
    | StrOutputParser()
)

answer = chain.invoke("What is the return policy?")
```

## Async Support
For async LangChain chains (e.g., FastAPI endpoints), override `_aget_relevant_documents`:
```python
import asyncio

from langchain_core.callbacks import AsyncCallbackManagerForRetrieverRun


class LightRAGRetriever(BaseRetriever):
    # ... (same as above)

    async def _aget_relevant_documents(
        self,
        query: str,
        *,
        run_manager: AsyncCallbackManagerForRetrieverRun,
    ) -> list[Document]:
        # Run sync embed+search in a thread pool to avoid blocking the event loop
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None,
            lambda: self._get_relevant_documents(query, run_manager=run_manager.get_sync()),
        )
```

## Streaming with LCEL
The retriever slots into LCEL streaming chains without modification:

```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

streaming_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini", streaming=True)
    | StrOutputParser()
)

for chunk in streaming_chain.stream("What is the return policy?"):
    print(chunk, end="", flush=True)
```

## Performance Notes
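The latency figures below are ballpark numbers; to measure on your own corpus, a small stdlib helper can average wall-clock latency over repeated calls (`time_retrieval` is a hypothetical helper for illustration, not part of LightRAG or LangChain):

```python
import time

def time_retrieval(fn, query: str, n: int = 20) -> float:
    """Average wall-clock latency of fn(query), in milliseconds."""
    fn(query)  # warm-up call (client init, index paging)
    t0 = time.perf_counter()
    for _ in range(n):
        fn(query)
    return (time.perf_counter() - t0) / n * 1000.0

# e.g. time_retrieval(retriever.invoke, "What is the return policy?")
```

Note that timings measured this way include the embedding API round-trip, which typically dominates the FAISS search itself.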
LightRAG’s FAISS retrieval typically returns in ~5–20 ms, versus ~15–40 ms for a stock LangChain retriever over the same index. The tradeoff:
| | LightRAG Retriever | LangChain Default Retriever |
|---|---|---|
| Retrieval latency | 5–20 ms | 15–40 ms |
| Dependencies | `faiss-cpu`, `openai` | `langchain-community` + vector store |
| Configurability | Full control | Abstracted |
| Cold start | Faster | Slower |
## When to Use This Pattern
Use LightRAG + LangChain when:
- You need LangChain’s tracing/callbacks but want lean retrieval
- Your team uses LangChain for other parts of the pipeline
- You want to gradually migrate from LangChain to pure LightRAG without a big-bang rewrite
Stick with pure LightRAG if you don’t need LangChain’s abstractions. See the main LightRAG guide for the standalone approach.
