
Building an agentic RAG assistant with JavaScript, Mastra and Elasticsearch

Learn how to build AI agents in the JavaScript ecosystem


This idea came to me while in the midst of a heated, high-stakes fantasy basketball league. I wondered: Could I build an AI agent that would help me dominate my weekly matchups? Absolutely!

In this post, we’ll explore how to build an agentic RAG assistant using Mastra and a lightweight JavaScript web application to interact with it. By connecting this agent to Elasticsearch, we give it access to structured player data and the capability to run real-time statistical aggregations, in order to give you recommendations grounded in player statistics. Head over to the GitHub repo to follow along; the README provides instructions on how to clone and run the application on your own.

Here’s what it should look like when it’s all put together:

Note: This blog post builds upon “Building AI Agents with AI SDK and Elastic”. If you’re new to AI agents in general and what they could be used for, start there.

Architecture overview

At the core of the system is a large language model (LLM), which acts as the agent’s reasoning engine (the brain). It interprets user input, decides which tools to call, and orchestrates the steps needed to generate a relevant response.

The agent itself is scaffolded by Mastra, an agent framework in the JavaScript ecosystem. Mastra wraps the LLM with backend infrastructure, exposes it as an API endpoint, and provides an interface for defining tools, system prompts and agent behavior.

On the frontend, we use Vite to quickly scaffold a React web application that provides a chat interface for sending queries to the agent and receiving its responses.

Finally, we have Elasticsearch, which stores player statistics and matchup data that the agent can query and aggregate.

Background

Let’s go over a few fundamental concepts:

What is agentic RAG?

AI agents can interact with other systems, operate independently, and perform actions based on their defined parameters. Agentic RAG combines the autonomy of an AI Agent with the principles of retrieval augmented generation, enabling an LLM to choose what tools to call and which data to use as context to generate a response. Read more about RAG here.

Choosing a framework: why go beyond AI-SDK?

There are many AI agent frameworks available and you’ve probably heard of the more popular ones like CrewAI, AutoGen and LangGraph. Most of these frameworks share a common set of functionalities, including support for different models, tool usage, and memory management.

Here is a framework comparison sheet by Harrison Chase (CEO of LangChain).

What piqued my interest with Mastra is that it’s a JavaScript-first framework built for full-stack developers to easily integrate agents into their ecosystem. Vercel’s AI-SDK also does most of this, but where Mastra shines is when your projects include more complex agent workflows. Mastra enhances the base patterns set by the AI-SDK and in this project, we’ll be using them in tandem.

Frameworks and model choice considerations

While these frameworks can help you build AI agents quickly, there are some drawbacks to consider. As with any framework or abstraction layer, you lose a bit of control: if the LLM doesn’t use the tools correctly or does something you don’t want it to, the abstraction makes it harder to debug. Still, in my opinion, this tradeoff is worth the ease and speed you get when building, especially because these frameworks are gaining momentum and are constantly being iterated on.

Again, these frameworks are model agnostic, meaning you can plug and play different models. Remember that models vary in the datasets they were trained on and, in turn, in the responses they give; some models don’t even support tool calling. So it is possible to switch and test out different models to see which one gives you the best responses, but keep in mind you will most likely have to rewrite the system prompt for each one. For example, using Llama 3.3 instead of GPT-4o involves a lot more prompting and specific instructions to get the response you want.

NBA fantasy basketball

Fantasy basketball involves starting a league with a group of your friends (warning: depending on how competitive your group is, it could affect the status of your friendships), usually with some money at stake. Each of you then drafts a team of 10 players, and each week your team is matched up against another friend’s team. Your overall score depends on how each of your players performs in their real-world games that week.

If a player on your team gets injured, suspended, etc., there is a list of free agent players available to add to your team. This is where a lot of the hard thinking in fantasy sports occurs because you only have a limited number of pickups and everyone is constantly on the hunt to pick up the best player.

This is where our NBA AI assistant will shine, especially in situations where you quickly have to decide which player to pick up. Instead of having to manually look up how a player performs against a specific opponent, the assistant can find that data quickly and compare averages to give you an informed recommendation.

Now that you know some basics about agentic RAG and NBA fantasy basketball, let’s see it in practice.

Building the project

If you get stuck at any point or don’t want to build it from scratch, please refer to the repo.

What we’ll cover

  1. Scaffolding the project:
    1. Backend (Mastra): Use npx create-mastra@latest to scaffold the backend and define the agent logic.
    2. Frontend (Vite + React): Use npm create vite@latest to build the frontend chat interface to interact with the agent.
  2. Setting up environment variables
    1. Install dotenv to manage environment variables.
    2. Create an .env file and provide the required variables.
  3. Setting up Elasticsearch
    1. Spin up an Elasticsearch cluster (either locally or on cloud).
    2. Install the official Elasticsearch client.
    3. Ensure environment variables are accessible.
    4. Establish connection to the client.
  4. Bulk ingesting NBA data into Elasticsearch
    1. Create an index with the appropriate mappings to enable aggregations.
    2. Bulk ingest player game statistics from a CSV file into an Elasticsearch index.
  5. Define Elasticsearch Aggregations
    1. Query to calculate historical averages against a specific opponent.
    2. Query to calculate season averages against a specific opponent.
  6. Player comparison utility file
    1. Consolidates helper functions and Elasticsearch aggregations.
  7. Building the agent
    1. Add the agent definition and system prompt.
    2. Install zod and define tools.
    3. Add middleware setup to handle CORS.
  8. Integrating the frontend
    1. Using AI-SDK’s useChat to interact with the agent.
    2. Create the UI to hold properly formatted conversations.
  9. Running the application
    1. Start both the backend (Mastra server) and frontend (React app).
    2. Sample queries and usage.
  10. What’s next: Making the agent more intelligent
    1. Adding semantic search capabilities to enable more insightful recommendations.
    2. Enable dynamic querying by moving the search logic to the Elasticsearch MCP (Model Context Protocol) server.

Prerequisites

  • Node.js and npm: Both the backend and the frontend run on Node. Make sure you have Node 18+ and npm v9+ installed (npm comes bundled with Node 18+).
  • Elasticsearch cluster: An active Elasticsearch cluster, either locally or on cloud.
  • OpenAI API Key: Generate one on the API keys page in OpenAI's developer portal.

Project structure
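
By the end, the layout will look roughly like this (simplified; see the repo for the exact structure):

```
nba-ai-assistant-js/
├── backend/
│   ├── data/
│   │   ├── sample_player_game_stats.csv
│   │   └── playerAndTeamInfo.js
│   ├── lib/
│   │   ├── elasticClient.js
│   │   ├── playerDataIngestion.js
│   │   ├── elasticAggs.js
│   │   └── comparePlayers.js
│   ├── src/
│   │   └── mastra/
│   │       ├── agents/index.ts
│   │       ├── tools/index.ts
│   │       └── index.ts
│   └── .env
└── frontend/
    └── src/
        ├── App.jsx
        └── components/ChatUI.jsx
```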

Step 1: Scaffolding the project

  1. First, create the directory nba-ai-assistant-js and navigate inside using:
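
```bash
mkdir nba-ai-assistant-js
cd nba-ai-assistant-js
```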

Backend:

  1. Use the Mastra create tool with the command:
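
```bash
npx create-mastra@latest
```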

2. You should see a few prompts in your terminal. For the first one, we’ll name the project backend:

3. Next, we’ll keep the default structure for storing the Mastra files, so input src/.

4. Then, we’ll choose OpenAI as our default LLM provider.

5. Finally, it will ask for your OpenAI API key. For now, we’ll choose the option to skip and provide it later in a .env file.

Frontend:

  1. Navigate back to the root directory and run the Vite create tool using this command: npm create vite@latest frontend -- --template react

This should create a lightweight React app named frontend with a specific template for React.

If all goes well, inside your project directory, you should be looking at a backend directory that holds the Mastra code and a frontend directory with your React app.

Step 2: Setting up environment variables

  1. To manage sensitive keys, we’ll use the dotenv package to load our environment variables from the .env file. Navigate to the backend directory and install dotenv:
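
```bash
npm install dotenv
```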

2. An example.env file is provided in the backend directory with the appropriate variables to fill in. If you create your own, be sure to include the following variables:
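
For reference, the variables used throughout this post look like this (use your own values; the exact names just need to match what your code reads):

```
OPENAI_API_KEY=your-openai-api-key
ELASTICSEARCH_ENDPOINT=https://your-deployment.es.example.cloud:443
ELASTICSEARCH_API_KEY=your-encoded-api-key
```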

Note: Make sure this file is excluded from your version control by adding .env to .gitignore.

Step 3: Setting up Elasticsearch

First, you need an active Elasticsearch cluster. There are two options:

  • Option A: Use Elasticsearch Cloud
    • Sign up for Elastic Cloud
    • Create a new deployment
    • Get your endpoint URL and API key (encoded)
  • Option B: Run Elasticsearch locally
    • Install and run Elasticsearch locally
    • Use http://localhost:9200 as your endpoint
    • Generate an API key

Installing the Elasticsearch client on the backend:

  1. First, install the official Elasticsearch client in your backend directory:
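
```bash
npm install @elastic/elasticsearch
```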
  2. Then create a directory lib to hold reusable functions and navigate into it:
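
```bash
mkdir lib
cd lib
```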
  3. Inside, create a new file called elasticClient.js. This file will initialize the Elasticsearch client and expose it for use across your project.

4. Since we’re using ECMAScript modules (ESM), __dirname and __filename aren’t available. To ensure your environment variables are correctly loaded from the .env file in the backend folder, add this setup to the top of your file:
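
Something like the following works, assuming your .env file sits in the backend root, one level above lib/:

```javascript
import path from 'path';
import { fileURLToPath } from 'url';
import dotenv from 'dotenv';

// Reconstruct __filename and __dirname, which aren't available in ESM
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

// Load environment variables from backend/.env
dotenv.config({ path: path.resolve(__dirname, '../.env') });
```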

5. Now, initialize the Elasticsearch client using your environment variables and check the connection:
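
A sketch of the client setup, using the variable names from the .env example above:

```javascript
import { Client } from '@elastic/elasticsearch';

export const elasticClient = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});

// Quick connection check: logs the cluster name if the credentials work
const info = await elasticClient.info();
console.log(`Connected to Elasticsearch cluster: ${info.cluster_name}`);
```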

Now, we can import this client instance into any file that needs to interact with your Elasticsearch cluster.

Step 4: Bulk ingesting NBA data into Elasticsearch

Dataset:

For this project, we’ll reference the datasets available in the backend/data directory in the repo. Our NBA assistant will use this data as its knowledge base for running statistical comparisons and generating recommendations.

  • sample_player_game_stats.csv - Sample player game statistics (e.g., points, rebounds, steals) per game, per player, over their entire NBA career. We’ll use this dataset to perform aggregations. (Note: This is mock data, pre-generated for demo purposes and not sourced from official NBA sources.)
  • playerAndTeamInfo.js - Substitute for player and team metadata that would normally be provided by an API call so the agent can match player and team names to IDs. Since we are using sample data, we don’t want the overhead of fetching from an external API, so we hardcoded some values the agent can reference.

Implementation:

  1. While in the backend/lib directory, create a file named playerDataIngestion.js.
  2. Set up imports, resolve the CSV file path and set up parsing. Again, since we’re using ESM, we need to reconstruct __dirname to resolve the path to the sample CSV. Also, we’ll import Node.js’s built-in modules, fs and readline, to parse through the given CSV file line by line.
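
Roughly, assuming the CSV lives in backend/data as described above (the index name is my choice for this walkthrough):

```javascript
import fs from 'fs';
import readline from 'readline';
import path from 'path';
import { fileURLToPath } from 'url';
import { elasticClient } from './elasticClient.js';

// Reconstruct __dirname (not available in ESM) to resolve the CSV path
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const csvFilePath = path.resolve(__dirname, '../data/sample_player_game_stats.csv');

const INDEX_NAME = 'nba-player-game-stats';
```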

This sets you up to efficiently read and parse the CSV when we get to the bulk ingestion step.

3. Create an index with the appropriate mapping. While Elasticsearch can automatically infer field types with dynamic mapping, we want to be explicit here so that each stat gets treated as a numerical field. This is important because we’ll use these fields for aggregations later on. We also want to use the type float for stats like points, rebounds, etc., to make sure we include decimal values. Finally, we want to add the mapping property dynamic: 'strict' so that Elasticsearch doesn’t dynamically map unrecognized fields. 
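
A sketch of the index creation; the stat fields mirror the ones the aggregations use later, so adjust them to match your CSV’s columns:

```javascript
async function createIndex() {
  await elasticClient.indices.create({
    index: INDEX_NAME,
    mappings: {
      dynamic: 'strict', // reject documents containing unrecognized fields
      properties: {
        player_id: { type: 'keyword' },
        player_name: { type: 'keyword' },
        opponent_team_id: { type: 'keyword' },
        game_date: { type: 'date' },
        points: { type: 'float' },
        rebounds: { type: 'float' },
        assists: { type: 'float' },
        steals: { type: 'float' },
        blocks: { type: 'float' },
        fg_percentage: { type: 'float' },
      },
    },
  });
  console.log(`Created index: ${INDEX_NAME}`);
}
```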

4. Add the function to bulk ingest the CSV data into your Elasticsearch index. Inside, we skip the header line, then split each line by commas, clean up the values, and make sure they’re the proper type before building each document object. Next, we push the documents into the bulkBody array along with their index info, which will serve as the payload for the bulk ingestion into Elasticsearch.
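
A sketch of that parsing logic, split into a helper for readability; the column order in the destructuring below is an assumption, so match it to the header of the sample CSV:

```javascript
async function buildBulkBody() {
  // Read the CSV line by line without loading the whole file into memory
  const rl = readline.createInterface({
    input: fs.createReadStream(csvFilePath),
    crlfDelay: Infinity,
  });

  const bulkBody = [];
  let isHeader = true;

  for await (const line of rl) {
    // Skip the CSV header row
    if (isHeader) {
      isHeader = false;
      continue;
    }

    // Split on commas and trim whitespace from each value
    const [playerId, playerName, opponentTeamId, gameDate, points, rebounds, assists, steals, blocks, fgPercentage] =
      line.split(',').map((value) => value.trim());

    // Each document is preceded by an action line naming the target index
    bulkBody.push({ index: { _index: INDEX_NAME } });
    bulkBody.push({
      player_id: playerId,
      player_name: playerName,
      opponent_team_id: opponentTeamId,
      game_date: gameDate,
      points: parseFloat(points),
      rebounds: parseFloat(rebounds),
      assists: parseFloat(assists),
      steals: parseFloat(steals),
      blocks: parseFloat(blocks),
      fg_percentage: parseFloat(fgPercentage),
    });
  }

  return bulkBody;
}
```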

5. Then, we can use Elasticsearch’s Bulk API with elasticClient.bulk() to ingest multiple documents in a single request. The error handling below is structured to give you a count of how many documents failed to be ingested and how many were successful.
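
Something along these lines:

```javascript
async function bulkIngestCsv() {
  const bulkBody = await buildBulkBody();

  // Send all documents to Elasticsearch in a single Bulk API request
  const result = await elasticClient.bulk({ operations: bulkBody, refresh: true });

  if (result.errors) {
    const failed = result.items.filter((item) => item.index && item.index.error);
    console.error(`${failed.length} documents failed to ingest`);
    console.log(`${result.items.length - failed.length} documents ingested successfully`);
  } else {
    console.log(`Successfully ingested ${result.items.length} documents`);
  }
}
```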

6. Run the main() function below to sequentially run the createIndex() and bulkIngestCsv() functions.
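
```javascript
async function main() {
  await createIndex();
  await bulkIngestCsv();
}

main().catch(console.error);
```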

If you see a console log saying the bulk ingestion was successful, perform a quick check on your Elasticsearch index to see if the documents were indeed successfully ingested.
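
For example, a quick document count (either in Kibana’s Dev Tools or with the client):

```javascript
const { count } = await elasticClient.count({ index: INDEX_NAME });
console.log(`Documents in ${INDEX_NAME}: ${count}`);
```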

Step 5: Defining Elasticsearch aggregations and consolidating

These are the main functions that the AI agent’s tools will use to compare players’ statistics against each other.

  1. Navigate to the backend/lib directory and create a file called elasticAggs.js.

2. Add the query below to calculate historical averages for a player against a specific opponent. This query uses a bool filter with two conditions: one matching player_id and another matching opponent_team_id, to retrieve only the relevant games. We don’t need any documents returned, only the aggregations, so we set size: 0. Under the aggs block, we run multiple metric aggregations in parallel on fields like points, rebounds, assists, steals, blocks and fg_percentage to calculate their average values. LLMs can be hit or miss with calculations, and this offloads that process to Elasticsearch, ensuring our NBA AI assistant has access to accurate data.
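
A sketch of that query; the function and aggregation names here are my own:

```javascript
import { elasticClient } from './elasticClient.js';

const INDEX_NAME = 'nba-player-game-stats';

export async function getHistoricalAveragesVsOpponent(playerId, opponentTeamId) {
  const result = await elasticClient.search({
    index: INDEX_NAME,
    size: 0, // we only need the aggregations, not the hits
    query: {
      bool: {
        filter: [
          { term: { player_id: playerId } },
          { term: { opponent_team_id: opponentTeamId } },
        ],
      },
    },
    aggs: {
      avg_points: { avg: { field: 'points' } },
      avg_rebounds: { avg: { field: 'rebounds' } },
      avg_assists: { avg: { field: 'assists' } },
      avg_steals: { avg: { field: 'steals' } },
      avg_blocks: { avg: { field: 'blocks' } },
      avg_fg_percentage: { avg: { field: 'fg_percentage' } },
    },
  });

  return result.aggregations;
}
```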

3. To calculate the season averages for a player against a specific opponent, we’ll use virtually the same query as the historical one. The only difference in this query is that the bool filter has an additional condition for game_date. The field game_date has to fall within the range of the current NBA season. In this case, the range is between 2024-10-01 and 2025-06-30. This extra condition below ensures that the aggregations that follow will isolate just the games from this season.
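
The only change is one extra clause in the filter array:

```javascript
filter: [
  { term: { player_id: playerId } },
  { term: { opponent_team_id: opponentTeamId } },
  // Only include games from the current NBA season
  { range: { game_date: { gte: '2024-10-01', lte: '2025-06-30' } } },
],
```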

Step 6: Player comparison utility

To keep our code modular and maintainable, we’ll create a utility file that consolidates metadata helper functions and Elasticsearch aggregations. This will power the main tool used by the agent. More on that later:

  1. Create a new file comparePlayers.js in the backend/lib directory.

2. Add the function below to consolidate metadata helpers and Elasticsearch aggregation logic into a single function that powers the main tool used by the agent.
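
A simplified sketch of what that consolidation could look like; the helper names imported from playerAndTeamInfo.js, the function signature, and a matching getSeasonAveragesVsOpponent export from elasticAggs.js are all assumptions:

```javascript
import { getPlayerByName, getTeamByName } from '../data/playerAndTeamInfo.js';
import {
  getHistoricalAveragesVsOpponent,
  getSeasonAveragesVsOpponent,
} from './elasticAggs.js';

// Resolve both players and the opponent to IDs, then run the historical
// and current-season aggregations for each player against that opponent.
export async function comparePlayers(playerNameA, playerNameB, opponentTeamName) {
  const opponent = getTeamByName(opponentTeamName);
  if (!opponent) throw new Error(`Unknown team: ${opponentTeamName}`);

  return Promise.all(
    [playerNameA, playerNameB].map(async (name) => {
      const player = getPlayerByName(name);
      if (!player) throw new Error(`Unknown player: ${name}`);

      return {
        player: player.name,
        historicalVsOpponent: await getHistoricalAveragesVsOpponent(player.id, opponent.id),
        seasonVsOpponent: await getSeasonAveragesVsOpponent(player.id, opponent.id),
      };
    })
  );
}
```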

Step 7: Building the agent

Now that you’ve created the frontend and backend scaffolding, ingested NBA game data, and established a connection to Elasticsearch, we can start to put all the pieces together to build the agent.

Defining the agent

Navigate to the index.ts file within the backend/src/mastra/agents directory and add the agent definition. You can specify fields like:

  • Name: Give your agent a name that will be used as a reference when called on the frontend.
  • Instructions/system prompt: A system prompt gives the LLM the initial context and rules to follow during the interaction. It’s similar to the prompt users will send through the chat box, but this one is given before any user input. Again, this will change depending on the model you choose.
  • Model: Which LLM to use (Mastra supports OpenAI, Anthropic, local models, etc.).
  • Tools: A list of tool functions the agent can call.
  • Memory: (Optional) if we want the agent to remember conversation history, etc. For simplicity, we can start without persistent memory, though Mastra supports it.
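
Putting those fields together, a minimal definition might look like this; the name, prompt, and model choice here are illustrative:

```javascript
import { openai } from '@ai-sdk/openai';
import { Agent } from '@mastra/core/agent';
import { playerComparisonTool } from '../tools';

export const nbaAssistant = new Agent({
  name: 'nbaAssistant',
  instructions: `You are an NBA fantasy basketball assistant.
Use the player comparison tool to ground every recommendation in real stats,
and format your responses in markdown.`,
  model: openai('gpt-4o'),
  tools: { playerComparisonTool },
});
```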


Defining tools

  1. Navigate to the index.ts file within the backend/src/mastra/tools directory.
  2. Install Zod using the command:
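
```bash
npm install zod
```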
  3. Add tool definitions. Note that we import the function from the comparePlayers.js file as the main function the agent will use when calling this tool. Using Mastra’s createTool() function, we will register our playerComparisonTool. The fields include:
  • id: A unique identifier for the tool, used to reference it elsewhere.
  • input schema: To define the shape of the tool’s input, Mastra uses Zod, a TypeScript schema validation library. Zod makes sure the agent provides correctly structured input and prevents the tool from executing if the input structure doesn’t match.
  • description: This is a natural language description to help the agent understand when to call and use the tool.
  • execute: The logic that runs when the tool is called. In our case, we are using an imported helper function to return performance stats.
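
A sketch of the tool definition; the schema fields and the import path are assumptions based on the comparison function from Step 6:

```javascript
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import { comparePlayers } from '../../../lib/comparePlayers.js';

export const playerComparisonTool = createTool({
  id: 'player-comparison',
  description:
    'Compares two NBA players by aggregating their historical and current-season averages against a given opponent.',
  inputSchema: z.object({
    playerA: z.string().describe('Full name of the first player'),
    playerB: z.string().describe('Full name of the second player'),
    opponentTeam: z.string().describe('Name of the opponent team to compare against'),
  }),
  // context holds the input after Zod validation
  execute: async ({ context }) => {
    return await comparePlayers(context.playerA, context.playerB, context.opponentTeam);
  },
});
```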

Adding middleware to handle CORS

Add middleware in the Mastra server to handle CORS. They say there are three things in life you can’t avoid: death, taxes, and for web devs it’s CORS. In short, Cross-Origin Resource Sharing is a browser security feature that blocks the frontend from making requests to a backend running on a different domain or port. Even though we run both the backend and frontend on localhost, they use different ports, triggering the CORS policy. We need to add the middleware specified in the Mastra docs so that our backend allows those requests from the frontend.

  1. Navigate to the index.ts file within the backend/src/mastra directory and add the config for CORS:
  • origin: ['http://localhost:5173']
    • Allows requests from only this address (Vite default address)
  • allowMethods: ["GET", "POST"]
    • HTTP methods that are allowed. Most of the time, it will be using POST.
  • allowHeaders: ["Content-Type", "Authorization", "x-mastra-client-type", "x-highlight-request", "traceparent"]
    • These decide which custom headers can be used in requests
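
In context, the server config looks something like this (assuming the agent from the previous step is exported as nbaAssistant):

```javascript
import { Mastra } from '@mastra/core';
import { nbaAssistant } from './agents';

export const mastra = new Mastra({
  agents: { nbaAssistant },
  server: {
    cors: {
      origin: ['http://localhost:5173'],
      allowMethods: ['GET', 'POST'],
      allowHeaders: [
        'Content-Type',
        'Authorization',
        'x-mastra-client-type',
        'x-highlight-request',
        'traceparent',
      ],
    },
  },
});
```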

Step 8: Integrating the frontend

This React component provides a simple chat interface that connects to the Mastra AI agent using the useChat() hook from @ai-sdk/react. We are also going to use this hook to display token usage, tool calls and to render the conversation. In the system prompt above, we also ask the agent to output the response in markdown, so we’ll use react-markdown to properly format the response.

  1. While in the frontend directory, install the @ai-sdk/react package to use the useChat() hook.
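
```bash
npm install @ai-sdk/react
```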

2. While in the same directory, install React Markdown so we can properly format the response the agent generates.
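
```bash
npm install react-markdown
```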

3. Implement useChat(). This hook will manage the interaction between your frontend and your AI agent backend. It handles message state, user input, status and gives you lifecycle hooks for observability purposes. The options we pass in include:

  • api: This defines the endpoint of your Mastra AI Agent. It defaults to port 4111 and we also want to add the route that supports streaming responses.
  • onToolCall: This executes anytime the agent calls a tool; we are using it to track which tools our agent is calling.
  • onFinish: This executes after the agent completes a full response. Even though we enabled streaming, onFinish will still be run after the full message is received and not after each chunk. Here, we are using it to track our token usage. This can be helpful when monitoring LLM costs and optimizing them.
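
Wired up, it looks roughly like this; the agent ID in the URL must match the name the agent was registered under on the backend:

```javascript
import { useChat } from '@ai-sdk/react';

const { messages, input, handleInputChange, handleSubmit, status } = useChat({
  // Mastra serves each agent's streaming endpoint on port 4111 by default
  api: 'http://localhost:4111/api/agents/nbaAssistant/stream',
  onToolCall: ({ toolCall }) => {
    // Track which tools the agent is calling
    console.log('Tool called:', toolCall.toolName);
  },
  onFinish: (message, { usage }) => {
    // Track token usage once the full response has arrived
    console.log('Token usage:', usage);
  },
});
```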

4. Finally, head over to the ChatUI.jsx component in the frontend/components directory to create the UI that holds our conversation, and wrap the agent’s response in a ReactMarkdown component so it’s properly formatted.
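
A minimal sketch of the render:

```javascript
import ReactMarkdown from 'react-markdown';

function ChatUI({ messages }) {
  return (
    <div className="chat">
      {messages.map((message) => (
        <div key={message.id} className={`message ${message.role}`}>
          {/* Render the agent's markdown response as formatted HTML */}
          <ReactMarkdown>{message.content}</ReactMarkdown>
        </div>
      ))}
    </div>
  );
}

export default ChatUI;
```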

Step 9: Running the application

Congrats! You are now ready to run the application. Follow these steps to start both the backend and frontend.

  1. In a terminal window, starting from the root directory, navigate to the backend directory and start the Mastra server:
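
```bash
cd backend
npm run dev
```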

2. In another terminal window, starting from the root directory, navigate to the frontend directory and start the React app:
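
```bash
cd frontend
npm run dev
```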

3. Head over to your browser and navigate to:

http://localhost:5173

You should be able to see the chat interface. Try out these sample prompts:

  • "Compare LeBron James and Stephen Curry"
  • "Who should I pick between Jayson Tatum and Luka Doncic?"

What’s next: Making the agent more intelligent

To make the assistant more agentic and the recommendations more insightful, I’ll be adding a few key upgrades in the next iteration.

Semantic search for NBA news

There are a ton of factors that can affect player performance, many of which don’t show up in raw stats. Things like injury reports, lineup changes, or even post-game analysis can only be found in news articles. To capture this additional context, I’ll be adding semantic search capabilities so the agent can retrieve relevant NBA articles and factor that narrative into its recommendations.

Dynamic search with the Elasticsearch MCP server

MCP (Model Context Protocol) is quickly becoming the standard for how agents connect to data sources. I’ll be migrating the search logic into the Elasticsearch MCP server, which allows the agent to dynamically build queries rather than relying on predefined search functions we provide. This enables us to use more natural language workflows and reduces the need to manually write every single search query. Learn more about the Elasticsearch MCP server and the current state of the ecosystem here.

These changes are already in progress, so stay tuned!

Conclusion

In this blog, we built an agentic RAG assistant that provides tailored recommendations for your fantasy basketball team using JavaScript, Mastra and Elasticsearch. We covered:

  • Agentic RAG fundamentals and how combining the autonomy of an AI agent with the tools to effectively use RAG can lead to more nuanced and dynamic agents.
  • Elasticsearch and how its data storage capabilities and powerful native aggregations make it a great partner as a knowledge base for an LLM.
  • The Mastra framework and how it simplifies building these agents for developers in the JavaScript ecosystem.

Whether you’re a basketball fanatic, exploring how to build AI agents, or both (like me), I hope this blog gave you some building blocks to get started. The full repo is available on GitHub; feel free to clone and tinker. Now, go win that fantasy league!
