
Build a powerful RAG workflow using LangGraph and Elasticsearch

In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses.

Elasticsearch has native integrations with industry-leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics, or on building prod-ready apps with the Elastic Vector Database.

To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now.

The LangGraph Retrieval Agent Template is a starter project developed by LangChain to facilitate the creation of retrieval-based question-answering systems using LangGraph in LangGraph Studio. This template is pre-configured to integrate seamlessly with Elasticsearch, enabling developers to rapidly build agents that can index and retrieve documents efficiently.

This blog focuses on running and customizing the LangGraph Retrieval Agent Template using LangGraph Studio and the LangGraph CLI. The template provides a framework for building retrieval-augmented generation (RAG) applications, leveraging various retrieval backends such as Elasticsearch.

We will walk you through setting up, configuring the environment, and executing the template efficiently with Elastic while customizing the agent flow.

Prerequisites

Before proceeding, ensure you have the following installed:

  • Elasticsearch Cloud deployment or on-prem Elasticsearch deployment (or create a 14-day Free Trial on Elastic Cloud) - Version 8.0.0 or higher
  • Python 3.9+
  • Access to an LLM provider such as Cohere (used in this guide), OpenAI, or Anthropic/Claude

Creating the LangGraph app

1. Install the LangGraph CLI
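The CLI is installed from PyPI; the inmem extra lets it run the development server locally:

```bash
pip install -U "langgraph-cli[inmem]"
```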

2. Create LangGraph app from retrieval-agent-template
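Run langgraph new to scaffold the project; the directory name below is just an example:

```bash
langgraph new my-retrieval-agent
```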

You will be presented with an interactive menu that will allow you to choose from a list of available templates. Select 4 for Retrieval Agent and 1 for Python, as shown below:

  • Troubleshooting: If you encounter the error "urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)>", run Python's Install Certificates command to resolve the issue, as shown below.
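On macOS, the command ships with the python.org installer; adjust the version directory to match your Python install:

```bash
/Applications/Python\ 3.12/Install\ Certificates.command
```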

3. Install dependencies

In the root of your new LangGraph app, create a virtual environment and install the dependencies in edit mode so your local changes are used by the server:
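A typical sequence, assuming a standard venv setup and a pyproject.toml in the project root:

```bash
cd my-retrieval-agent
python -m venv .venv
source .venv/bin/activate
pip install -e .
```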

Setting up the environment

1. Create a .env file

The .env file holds API keys and configurations so the app can connect to your chosen LLM and retrieval provider. Generate a new .env file by duplicating the example configuration:
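Assuming the template ships a .env.example in the project root:

```bash
cp .env.example .env
```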

2. Configure the .env file

The .env file comes with a set of default configurations. You can update it by adding the necessary API keys and values based on your setup. Any keys that aren't relevant to your use case can be left unchanged or removed.

  • Example .env file (using Elastic Cloud and Cohere)

Below is a sample .env configuration for using Elastic Cloud as the retrieval provider and Cohere as the LLM, as demonstrated in this blog:
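A minimal sketch of such a file; the variable names follow the template's .env.example, and every value below is a placeholder to replace with your own credentials:

```bash
# Retrieval provider: Elastic Cloud
ELASTICSEARCH_URL=https://your-deployment.es.us-central1.gcp.cloud.es.io:443
ELASTICSEARCH_API_KEY=your-elasticsearch-api-key

# LLM provider: Cohere (used here for both embeddings and response generation)
COHERE_API_KEY=your-cohere-api-key
```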

Note: While this guide uses Cohere for both response generation and embeddings, you're free to use other LLM providers such as OpenAI or Anthropic, or even a local LLM, depending on your use case. Make sure that each key you intend to use is present and correctly set in the .env file.

3. Update configuration file - configuration.py

After setting up your .env file with the appropriate API keys, the next step is to update your application’s default model configuration. Updating the configuration ensures the system uses the services and models you’ve specified in your .env file.

Navigate to the configuration file:
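In the generated project, it lives inside the graph package:

```bash
src/retrieval_graph/configuration.py
```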

The configuration.py file contains the default model settings used by the retrieval agent for three main tasks:

  • Embedding model – converts documents into vector representations
  • Query model – processes the user’s query into a vector
  • Response model – generates the final response

By default, the code uses models from OpenAI (e.g., openai/text-embedding-3-small) and Anthropic (e.g., anthropic/claude-3-5-sonnet-20240620 and anthropic/claude-3-haiku-20240307).

In this blog, we switch to Cohere models. If you're already using OpenAI or Anthropic, no changes are needed.

Example changes (using Cohere):

Open configuration.py and modify the model defaults as shown below:
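A sketch of the changed defaults; the field layout mirrors the template's dataclass-based configuration, while the Cohere model IDs are examples you can swap for any models available to your account:

```python
from dataclasses import dataclass, field


@dataclass(kw_only=True)
class IndexConfiguration:
    # Embedding model, in "provider/model-name" form
    embedding_model: str = field(
        default="cohere/embed-english-v3.0",
        metadata={"description": "Name of the embedding model to use."},
    )


@dataclass(kw_only=True)
class Configuration(IndexConfiguration):
    # Model that generates the final answer
    response_model: str = field(
        default="cohere/command-r-plus",
        metadata={"description": "The language model used for generating responses."},
    )
    # Model that processes and refines the user's query
    query_model: str = field(
        default="cohere/command-r-plus",
        metadata={"description": "The language model used for processing queries."},
    )
```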

Running the Retrieval Agent with LangGraph CLI

1. Launch LangGraph server
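From the project root, with the virtual environment active:

```bash
langgraph dev
```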

This will start up the LangGraph API server locally. If this runs successfully, you should see something like:
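Output along these lines (the port and Studio URL may differ on your machine):

```
Ready!
- API: http://127.0.0.1:2024
- Docs: http://127.0.0.1:2024/docs
- LangGraph Studio Web UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
```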

Open the Studio UI URL printed in the terminal.

There are two graphs available:

  • Retrieval Graph: Retrieves data from Elasticsearch and responds to queries using an LLM.
  • Indexer Graph: Indexes documents into Elasticsearch, generating embeddings with the configured embedding model.

2. Configuring the Indexer Graph

  • Open the Indexer Graph.
  • Click Manage Assistants.
    • Click on 'Add New Assistant', enter the user details as specified, and then close the window.

3. Indexing sample documents

  • Index the following sample documents, which represent a hypothetical quarterly report for the organization NoveTech:
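The original post's exact documents aren't reproduced here; any short text snippets work. A hypothetical example, with all company figures invented, in the list-of-documents shape the indexer input expects:

```json
[
  {"page_content": "NoveTech Q4 revenue grew 12% quarter over quarter to $120M, driven by strong demand in the APAC region."},
  {"page_content": "NoveTech's operating margin improved to 18% in Q4, up from 15% in Q3, as infrastructure costs declined."},
  {"page_content": "For next quarter, NoveTech projects 10% revenue growth and plans to launch two AI-powered analytics products."}
]
```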

Once the documents are indexed, you will see a delete message in the thread, as shown below.

4. Running the Retrieval Graph

  • Switch to the Retrieval Graph.
  • Enter the following search query:
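The original query isn't reproduced here; any question answerable from your indexed documents works, for example (matching the hypothetical NoveTech documents above):

```
What was NoveTech's revenue growth in Q4?
```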

The system will return relevant documents and provide an exact answer based on the indexed data.

Customize the Retrieval Agent

To enhance the user experience, we introduce a customization step in the Retrieval Graph to predict the next three questions a user might ask. This prediction is based on:

  • Context from the retrieved documents
  • Previous user interactions
  • Last user query

The following code changes are required to implement the Query Prediction feature:

1. Update graph.py

  • Add a predict_query function.
  • Modify the respond function to return a response object instead of just a message.
  • Update the graph structure to add a new node and edge for predict_query.

All three changes are sketched together below.
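A condensed sketch of the three changes, assuming the module layout of the Python template; helper names such as load_chat_model, format_docs, and Configuration.from_runnable_config come from the template, and exact signatures may differ in your copy:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableConfig

from retrieval_graph.configuration import Configuration
from retrieval_graph.state import State
from retrieval_graph.utils import format_docs, load_chat_model


# Change 2: respond now also stores the raw answer text in state,
# so that predict_query can condition on it.
async def respond(state: State, *, config: RunnableConfig) -> dict:
    configuration = Configuration.from_runnable_config(config)
    model = load_chat_model(configuration.response_model)
    prompt = ChatPromptTemplate.from_messages(
        [("system", configuration.response_system_prompt), ("placeholder", "{messages}")]
    )
    message_value = await prompt.ainvoke(
        {"messages": state.messages, "retrieved_docs": format_docs(state.retrieved_docs)},
        config,
    )
    response = await model.ainvoke(message_value, config)
    return {"messages": [response], "response": response.content}


# Change 1: new node that predicts the next three questions from the
# retrieved documents, the conversation so far, and the latest answer.
async def predict_query(state: State, *, config: RunnableConfig) -> dict:
    configuration = Configuration.from_runnable_config(config)
    model = load_chat_model(configuration.response_model)
    prompt = ChatPromptTemplate.from_messages(
        [("system", configuration.predict_next_question_prompt), ("placeholder", "{messages}")]
    )
    message_value = await prompt.ainvoke(
        {
            "messages": state.messages,
            "response": state.response,
            "retrieved_docs": format_docs(state.retrieved_docs),
        },
        config,
    )
    predictions = await model.ainvoke(message_value, config)
    return {"next_questions": predictions.content}


# Change 3: wire the new node into the graph after respond
# (builder is the StateGraph constructed earlier in graph.py).
builder.add_node(predict_query)
builder.add_edge("respond", "predict_query")
```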

2. Update prompts.py

  • Craft prompt for Query Prediction in prompts.py:
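A hypothetical prompt; the placeholder names must match the variables that predict_query passes into the prompt template:

```python
PREDICT_NEXT_QUESTION_PROMPT = """Based on the user's previous questions, the latest \
answer, and the retrieved documents below, predict the three questions the user is \
most likely to ask next. Return only the three questions, one per line.

Latest answer:
{response}

Retrieved documents:
{retrieved_docs}"""
```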

3. Update configuration.py

  • Add predict_next_question_prompt:
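A sketch of the added field on the Configuration dataclass, defaulting to the new prompt:

```python
from dataclasses import field

from retrieval_graph import prompts


class Configuration(IndexConfiguration):
    ...
    predict_next_question_prompt: str = field(
        default=prompts.PREDICT_NEXT_QUESTION_PROMPT,
        metadata={
            "description": "The system prompt used for predicting the next three user questions."
        },
    )
```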

4. Update state.py

  • Add the following attributes:
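A sketch of the two new attributes, assuming the dataclass-based State in the template:

```python
from dataclasses import dataclass, field


@dataclass(kw_only=True)
class State(InputState):
    ...
    # Raw answer text produced by the respond node
    response: str = field(default="")
    # The three follow-up questions produced by predict_query
    next_questions: str = field(default="")
```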

5. Re-run the Retrieval Graph

  • Enter the same search query again.

The system will process the input and predict three related questions that users might ask, as shown below.

Conclusion

Integrating the Retrieval Agent template within LangGraph Studio and CLI provides several key benefits:

  • Accelerated development: The template and visualization tools streamline the creation and debugging of retrieval workflows, reducing development time.
  • Seamless deployment: Built-in support for APIs and auto-scaling ensures smooth deployment across environments.
  • Easy updates: Modifying workflows, adding new functionalities, and integrating additional nodes is simple, making it easier to scale and enhance the retrieval process.
  • Persistent memory: The system retains agent states and knowledge, improving consistency and reliability.
  • Flexible workflow modeling: Developers can customize retrieval logic and communication rules for specific use cases.
  • Real-time interaction and debugging: The ability to interact with running agents allows for efficient testing and issue resolution.

By leveraging these features, organizations can build powerful, efficient, and scalable retrieval systems that enhance data accessibility and user experience.

The full source code for this project is available on GitHub.
