In this article, you will learn how to leverage LlamaIndex Workflows with Elasticsearch to quickly build a self-filtering search application using an LLM.
LlamaIndex Workflows propose a different approach to the problem of splitting tasks across different agents by introducing a steps-and-events architecture. This simplifies the design compared to similar DAG-based (Directed Acyclic Graph) methodologies like LangGraph. If you want to read more about agents in general, I recommend reading this article.


Image Source: https://www.llamaindex.ai/blog/introducing-workflows-beta-a-new-way-to-create-complex-ai-applications-with-llamaindex
One of the main features of LlamaIndex Workflows is how easily you can create loops during execution. Loops can help us with autocorrection tasks, since we can repeat a step until we get the expected result or reach a given number of retries.
To test this feature, we'll build a flow that generates Elasticsearch queries from the user's question using an LLM, with an autocorrect mechanism in case the generated query is not valid. If, after a given number of attempts, the LLM cannot generate a valid query, we'll change the model and keep trying until timeout.
To optimize resources, we can run the first query generation with a faster and cheaper model and, if generation still fails, switch to a more expensive one.

Understanding steps and events
A step is an action that runs as a code function. It receives an event together with a context, which can be shared by all steps. There are two base event types: StartEvent, which is a flow-initiating event, and StopEvent, which stops the workflow's execution.

A Workflow is a class that contains all the steps and interactions and puts them all together.
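As a minimal illustration before we build the real thing (the EchoFlow class and its message field are made up for this example), a workflow with a single step could look like this:

```python
from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step


class EchoFlow(Workflow):
    @step
    async def echo(self, ev: StartEvent) -> StopEvent:
        # A step receives an event and returns the next event in the flow;
        # returning a StopEvent ends the workflow.
        return StopEvent(result=f"You said: {ev.message}")


# result = await EchoFlow(timeout=10).run(message="Hello!")
```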
We'll create a Workflow to receive the user's request, expose the mappings and possible fields to filter on, generate the query, and then loop to fix an invalid query. A query could be invalid for Elasticsearch because it is not valid JSON or because it contains syntax errors.
To show you how this works, we’ll use a practical case of searching for hotel rooms with a workflow to extract values to create queries based on the user’s search.
The complete example is available in this Notebook.
Steps
- Install dependencies and import packages
- Prepare data
- LlamaIndex Workflows
- Execute workflow tasks
1. Install dependencies and import packages
We'll use the mistral-saba-24b and llama3-70b Groq models, so besides elasticsearch and llama-index, we'll need the llama-index-llms-groq package to handle the interaction with the LLMs.
Groq is an inference service that lets us use different openly available models from providers like Meta, Mistral, and OpenAI. In this example, we'll use its free tier. You can get the API key that we'll use later here.
Let’s proceed to install the required dependencies: Elasticsearch, the LlamaIndex core library, and the LlamaIndex Groq LLM’s package.
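If you're following along in a notebook, the installation could look like this (package versions are left unpinned here for simplicity):

```python
%pip install elasticsearch llama-index llama-index-llms-groq -q
```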
We start by importing the dependencies to handle environment variables (os) and to manage JSON. After that, we import the Elasticsearch client together with the bulk helper, which we'll use to index documents via the bulk API. We finish by importing the Groq class from LlamaIndex to interact with the model, and the components needed to create our workflow.
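Based on that description, the imports could look like this:

```python
import json
import os
from getpass import getpass

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

from llama_index.core.workflow import (
    Context,
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)
from llama_index.llms.groq import Groq
```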
2. Prepare data
Setup keys
We set the environment variables needed for Groq and Elasticsearch. The getpass library allows us to enter them via a prompt without echoing them.
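A minimal sketch (the environment variable names here are placeholders; use whatever your setup expects):

```python
os.environ["GROQ_API_KEY"] = getpass("Groq API key: ")
os.environ["ELASTICSEARCH_ENDPOINT"] = getpass("Elasticsearch endpoint: ")
os.environ["ELASTICSEARCH_API_KEY"] = getpass("Elasticsearch API key: ")
```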
Elasticsearch client
The Elasticsearch client handles the connection with Elasticsearch and allows us to interact with Elasticsearch using the Python library.
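For example, assuming an endpoint plus API key setup as above:

```python
es_client = Elasticsearch(
    os.environ["ELASTICSEARCH_ENDPOINT"],
    api_key=os.environ["ELASTICSEARCH_API_KEY"],
)
```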
Ingesting data to Elasticsearch
We are going to create an index with hotel rooms as an example:
Mappings
We'll use text-type fields for the properties where we want to run full-text queries, "keyword" for those where we want to apply filters or sorting, and "byte"/"integer" for numbers.
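A sketch of the index creation following those rules (the index name and field names are assumptions for this example; the full mappings are in the notebook):

```python
INDEX_NAME = "hotel-rooms"  # index name assumed for this sketch

mappings = {
    "properties": {
        "room_name": {"type": "text"},        # full-text search
        "description": {"type": "text"},      # full-text search
        "features": {"type": "keyword"},      # exact values for filtering
        "price_per_night": {"type": "integer"},
        "beds": {"type": "byte"},
    }
}

es_client.indices.create(index=INDEX_NAME, mappings=mappings)
```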
Ingesting documents to Elasticsearch
Let’s ingest some hotel rooms and amenities so users can ask questions that we can turn into Elasticsearch queries against the documents.
We parse the JSON documents into a bulk Elasticsearch request.
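The sample documents below are illustrative (the real dataset lives in the notebook); the bulk helper takes care of sending them in a single request:

```python
documents = [
    {
        "room_name": "Deluxe Suite",
        "description": "Spacious suite with city views and a modern bathroom.",
        "features": ["wifi", "smart tv", "jacuzzi"],
        "price_per_night": 250,
        "beds": 1,
    },
    {
        "room_name": "Standard Double",
        "description": "Comfortable double room with a work desk.",
        "features": ["wifi", "smart tv"],
        "price_per_night": 120,
        "beds": 2,
    },
]


def build_actions(docs):
    # Wrap each JSON document in a bulk index action for the helper.
    for doc in docs:
        yield {"_index": INDEX_NAME, "_source": doc}


bulk(es_client, build_actions(documents))
```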
3. LlamaIndex Workflows
We need to create a class with the functions required to send the Elasticsearch mappings to the LLM, run the query, and handle errors.
Workflow prompts
The EXTRACTION_PROMPT provides the user's question and the index mappings to the LLM so it can return an Elasticsearch query. Then, the REFLECTION_PROMPT helps the LLM make corrections in case of errors by providing the output from the EXTRACTION_PROMPT, plus the error caused by the query.
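The exact prompt wording lives in the notebook; an approximation could be:

```python
EXTRACTION_PROMPT = """
Context information with the index mappings is below:
{mappings}

Given the mappings and no prior knowledge, return a valid Elasticsearch
query that answers the user's question. Return raw JSON only, with no
extra text or Markdown formatting.

Question: {passage}
"""

REFLECTION_PROMPT = """
You already created this output previously:
{wrong_answer}

This caused the following Elasticsearch error:
{error}

Try again. Return only a corrected query as raw JSON, with no extra text.
"""
```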
Workflow events
We create classes to handle the extraction and query-validation events:
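The event class names match the workflow output shown later; the fields are an assumption of what each event needs to carry:

```python
class ExtractionDone(Event):
    output: str   # the query string generated by the LLM
    passage: str  # the original user question


class ValidationErrorEvent(Event):
    error: str         # the error message returned by Elasticsearch
    wrong_output: str  # the invalid query that caused the error
    passage: str       # the original user question
```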
Workflow
Now, let's put everything together. We first need to set the maximum number of attempts before changing the model to 3.
Then, we do an extraction using the model configured in the workflow. We check whether the incoming event is a StartEvent; if so, we capture the model and the question (passage).
Afterward, we run the validation step, that is, we try to run the extracted query in Elasticsearch. If there are no errors, we generate a StopEvent and stop the flow. Otherwise, we issue a ValidationErrorEvent and repeat step 1, providing the error so the LLM can try to correct the query, and then return to the validation step. If there is still no valid query after 3 attempts, we change the model and repeat the process until we reach the timeout parameter of 60 seconds of running time.
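The notebook contains the full implementation; below is a condensed sketch under the assumptions made so far (the EsQueryFlow name, the context keys, and the exact retry bookkeeping are illustrative):

```python
FALLBACK_MODEL = "llama3-70b-8192"  # more expensive model used as fallback


class EsQueryFlow(Workflow):
    max_retries: int = 3  # attempts per model before switching

    @step
    async def extract(
        self, ctx: Context, ev: StartEvent | ValidationErrorEvent
    ) -> ExtractionDone:
        if isinstance(ev, StartEvent):
            # First pass: capture the model and the user's question.
            await ctx.set("model", ev.model)
            await ctx.set("retries", 0)
            passage, reflection = ev.passage, ""
        else:
            # Retry pass: feed the previous error back to the LLM.
            passage = ev.passage
            retries = await ctx.get("retries") + 1
            await ctx.set("retries", retries)
            model = await ctx.get("model")
            if retries >= self.max_retries and model != FALLBACK_MODEL:
                print(f"Max retries for model {model} reached, changing model")
                await ctx.set("model", FALLBACK_MODEL)
                await ctx.set("retries", 0)
            reflection = REFLECTION_PROMPT.format(
                wrong_answer=ev.wrong_output, error=ev.error
            )

        model = await ctx.get("model")
        print(f"=== EXTRACT STEP ===\nMODEL: {model}")
        prompt = EXTRACTION_PROMPT.format(
            mappings=json.dumps(mappings), passage=passage
        )
        response = await Groq(model=model).acomplete(prompt + reflection)
        return ExtractionDone(output=str(response), passage=passage)

    @step
    async def validate(
        self, ev: ExtractionDone
    ) -> StopEvent | ValidationErrorEvent:
        print("=== VALIDATE STEP ===")
        try:
            query = json.loads(ev.output)
            results = es_client.search(index=INDEX_NAME, body=query)
            print("Elasticsearch results:", results["hits"]["hits"])
        except Exception as e:
            # Invalid JSON or a query syntax error: loop back to extraction.
            return ValidationErrorEvent(
                error=str(e), wrong_output=ev.output, passage=ev.passage
            )
        return StopEvent(result=str(results["hits"]["hits"]))
```

If the fallback model also keeps failing, the loop simply continues until the workflow's timeout fires, which matches the "keep trying until timeout" behavior described above.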
4. Execute workflow tasks
We will make the following search: Rooms with smart TV, wifi, jacuzzi and price per night less than 300. We'll start with the mistral-saba-24b model and switch to llama3-70b-8192 if needed, following our flow.
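Running the sketch from the previous section could look like this (notebooks allow top-level await; in a plain script, wrap the call in asyncio.run):

```python
workflow = EsQueryFlow(timeout=60, verbose=True)

result = await workflow.run(
    model="mistral-saba-24b",
    passage="Rooms with smart TV, wifi, jacuzzi and price per night less than 300",
)
print(result)
```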

Results
(Formatted for readability)
```
=== EXTRACT STEP ===
MODEL: mistral-saba-24b
OUTPUT:
Step extract produced event ExtractionDone
Running step validate
=== VALIDATE STEP ===
Max retries for model mistral-saba-24b reached, changing model
Elasticsearch results:
Step validate produced event ValidationErrorEvent
Running step extract
=== EXTRACT STEP ===
MODEL: llama3-70b-8192
OUTPUT:
Step extract produced event ExtractionDone
Running step validate
=== VALIDATE STEP ===
Elasticsearch results:
Step validate produced event StopEvent
```
In the example above, the query failed because the mistral-saba-24b model returned it in Markdown format, adding ```json at the beginning and ``` at the end. In contrast, the llama3-70b-8192 model returned the query directly in JSON format. Based on our needs, we can capture, validate, and test different errors, or build fallback mechanisms after a number of attempts.
Conclusion
LlamaIndex Workflows offer an interesting alternative for developing agentic flows using events and steps. With only a few lines of code, we managed to create a system that can autocorrect itself and switch between models.
How could we improve this flow?
- Along with the mappings, we can send the LLM the possible exact values for the filters, reducing the number of no-result queries caused by misspelled filters. To do so, we can run a terms aggregation on the features field and show the results to the LLM (see the sketch after this list).
- Adding code corrections for common issues, like the Markdown issue we had, to improve the success rate.
- Adding a way to handle valid queries that yield no results. For example, removing one of the filters and trying again in order to make suggestions to the user. An LLM could be helpful in choosing which filters to remove based on the context.
- Adding more context to the prompt, like user preferences or previous searches, so that we can provide customized suggestions together with the Elasticsearch results.
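For the first idea, a hypothetical sketch of the terms aggregation (reusing es_client and INDEX_NAME from earlier; the aggregation name feature_values is made up):

```python
# Collect the distinct values of the "features" field so we can show
# the LLM the exact filter values that actually exist in the index.
agg_response = es_client.search(
    index=INDEX_NAME,
    size=0,  # we only want the aggregation, not the documents
    aggs={"feature_values": {"terms": {"field": "features", "size": 50}}},
)
feature_values = [
    bucket["key"]
    for bucket in agg_response["aggregations"]["feature_values"]["buckets"]
]
```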
Would you like to try one of these?