
LightRAG as a LangChain Retriever

Subhajit Bhar
I build production-grade document extraction pipelines for businesses that process invoices, lab reports, contracts, and other document types at scale.

Want LightRAG’s lean retrieval with LangChain’s chain ecosystem? Here’s how to wrap LightRAG as a LangChain-compatible retriever — keeping retrieval explicit and fast while using LangChain for everything downstream.

TL;DR

  • Implement BaseRetriever._get_relevant_documents to make any retriever LangChain-compatible.
  • LightRAG’s FAISS retrieval slots straight into LangChain chains, LCEL, and agents.
  • Use this pattern when migrating an existing LangChain pipeline to leaner retrieval incrementally.
  • For full LangChain pipelines without constraints, the standard LangChain retriever is fine.

Why Combine LightRAG with LangChain

LightRAG gives you minimal, fast retrieval. LangChain gives you chains, agents, and tooling. Sometimes you want both:

  • Use LightRAG’s tight FAISS retrieval for speed and predictable latency
  • Plug into LangChain chains for downstream processing (prompts, parsers, memory)
  • Keep retrieval explicit while using LangChain’s callbacks, tracing, and streaming

Implementing the Retriever

LangChain’s BaseRetriever requires implementing _get_relevant_documents. Here’s a complete wrapper:

from typing import List
import faiss
import numpy as np
from openai import OpenAI
from langchain_core.retrievers import BaseRetriever
from langchain_core.documents import Document
from langchain_core.callbacks import CallbackManagerForRetrieverRun


class LightRAGRetriever(BaseRetriever):
    """LangChain retriever backed by LightRAG's FAISS index."""
    
    index: faiss.IndexFlatIP
    texts: List[str]
    sources: List[str]
    k: int = 4
    embedding_model: str = "text-embedding-3-small"
    
    class Config:
        arbitrary_types_allowed = True
    
    def _embed(self, text: str) -> np.ndarray:
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.embeddings.create(input=[text], model=self.embedding_model)
        vec = np.array([resp.data[0].embedding]).astype("float32")
        faiss.normalize_L2(vec)
        return vec
    
    def _get_relevant_documents(
        self, 
        query: str, 
        *, 
        run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        q = self._embed(query)
        D, I = self.index.search(q, self.k)
        
        docs = []
        for j, i in enumerate(I[0]):
            if i == -1:
                continue  # FAISS pads with -1 when the index holds fewer than k vectors
            docs.append(Document(
                page_content=self.texts[i],
                metadata={"source": self.sources[i], "score": float(D[0][j])}
            ))
        return docs
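A note on `_embed`: `faiss.normalize_L2` rescales each row to unit length in place, which is what makes the inner-product index behave like cosine similarity. A pure-NumPy sketch of the same rescaling (toy vector, not the real embeddings):

```python
import numpy as np

def normalize_l2(x: np.ndarray) -> np.ndarray:
    # Divide each row by its Euclidean norm -- the same rescaling
    # faiss.normalize_L2 applies in place
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

v = np.array([[3.0, 4.0]], dtype="float32")
u = normalize_l2(v)  # -> [[0.6, 0.8]], a unit-length vector
```

After this step, the inner product of two embeddings equals their cosine similarity, so `IndexFlatIP` scores land in [-1, 1].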

Building and Using the Retriever

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Build FAISS index (from LightRAG)
def build_lightrag_index(pairs):
    client = OpenAI()
    texts = [t for _, t in pairs]
    sources = [s for s, _ in pairs]
    
    vecs = client.embeddings.create(
        input=texts, 
        model="text-embedding-3-small"
    ).data
    X = np.array([v.embedding for v in vecs]).astype("float32")
    faiss.normalize_L2(X)
    
    idx = faiss.IndexFlatIP(X.shape[1])
    idx.add(X)
    return idx, texts, sources

# Create retriever
idx, texts, sources = build_lightrag_index(corpus_pairs)
retriever = LightRAGRetriever(index=idx, texts=texts, sources=sources, k=4)

# Use in LangChain chain
prompt = ChatPromptTemplate.from_template(
    "Answer from context only.\n\nContext: {context}\n\nQuestion: {question}"
)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini", temperature=0)
    | StrOutputParser()
)

answer = chain.invoke("What is the return policy?")
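For intuition about what happens inside the retriever when the chain runs: `IndexFlatIP.search` is an exhaustive inner-product scan followed by a top-k sort. A NumPy sketch of that computation on toy vectors (an illustrative stand-in, not the FAISS code path):

```python
import numpy as np

def top_k_inner_product(index_matrix: np.ndarray, query: np.ndarray, k: int):
    # Exhaustive scan: inner product of the query against every indexed row,
    # then keep the k highest scores -- what IndexFlatIP.search computes
    scores = index_matrix @ query.ravel()
    order = np.argsort(-scores)[:k]
    return scores[order], order  # analogous to faiss's (D, I) for one query

X = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]], dtype="float32")
q = np.array([[1.0, 0.0]], dtype="float32")
D, I = top_k_inner_product(X, q, k=2)  # -> scores [1.0, 0.7], indices [0, 2]
```

This is why `IndexFlatIP` is exact but scales linearly with corpus size: every query touches every vector.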

Async Support

For async LangChain chains (e.g., FastAPI endpoints), override _aget_relevant_documents:

import asyncio
from langchain_core.callbacks import AsyncCallbackManagerForRetrieverRun

class LightRAGRetriever(BaseRetriever):
    # ... (same as above)

    async def _aget_relevant_documents(
        self,
        query: str,
        *,
        run_manager: AsyncCallbackManagerForRetrieverRun,
    ) -> list[Document]:
        # Run sync embed+search in a thread pool to avoid blocking the event loop
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None,
            lambda: self._get_relevant_documents(query, run_manager=run_manager.get_sync()),
        )
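Stripped of the LangChain machinery, the offloading pattern above is plain `asyncio`. A self-contained sketch, with a toy blocking function standing in for the embed-and-search step:

```python
import asyncio
import time

def blocking_retrieve(query: str) -> list[str]:
    time.sleep(0.01)  # stands in for the synchronous embed + FAISS search
    return [f"doc for {query}"]

async def aretrieve(query: str) -> list[str]:
    # Same offloading pattern as _aget_relevant_documents above:
    # hand the blocking call to the default thread pool so the event
    # loop stays free to serve other requests
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_retrieve, query)

docs = asyncio.run(aretrieve("return policy"))  # -> ["doc for return policy"]
```

Without this, a synchronous FAISS search inside an async endpoint would stall every other in-flight request for the duration of the call.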

Streaming with LCEL

The retriever slots into LCEL streaming chains without modification:

from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

streaming_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini", streaming=True)
    | StrOutputParser()
)

for chunk in streaming_chain.stream("What is the return policy?"):
    print(chunk, end="", flush=True)

Performance Notes

On the same index, LightRAG’s FAISS retrieval typically completes in ~5–20ms, versus ~15–40ms for a default LangChain retriever. The tradeoff:

|                   | LightRAG Retriever | LangChain Default Retriever        |
|-------------------|--------------------|------------------------------------|
| Retrieval latency | 5–20ms             | 15–40ms                            |
| Dependencies      | faiss-cpu, openai  | langchain-community + vector store |
| Configurability   | Full control       | Abstracted                         |
| Cold start        | Faster             | Slower                             |
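These latencies depend on corpus size, embedding dimensionality, and hardware, so measure on your own index before deciding. One way to do that, sketched with a brute-force NumPy scan as a stand-in for the FAISS call (the corpus size and dimensions here are arbitrary):

```python
import time
import numpy as np

def median_search_ms(index_matrix, query, k=4, runs=25):
    # Median wall-clock time of one exhaustive top-k scan, in milliseconds
    timings = []
    top = None
    for _ in range(runs):
        t0 = time.perf_counter()
        scores = index_matrix @ query.ravel()
        top = np.argsort(-scores)[:k]
        timings.append((time.perf_counter() - t0) * 1000.0)
    return float(np.median(timings)), top

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 1536)).astype("float32")  # toy corpus
q = X[0:1]                                                 # query = first row
median_ms, top = median_search_ms(X, q)
```

Use the median rather than the mean: a single cold-cache run can skew the average badly.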

When to Use This Pattern

Use LightRAG + LangChain when:

  • You need LangChain’s tracing/callbacks but want lean retrieval
  • Your team uses LangChain for other parts of the pipeline
  • You want to gradually migrate from LangChain to pure LightRAG without a big-bang rewrite

Stick with pure LightRAG if you don’t need LangChain’s abstractions. See the main LightRAG guide for the standalone approach.
