Quickly create RAG apps with Vertex AI Gemini models and Elasticsearch playground

In this blog, we will connect Elasticsearch to Google’s Gemini 1.5 chat model using Elastic’s Playground and Vertex AI API. The addition of Gemini models to Playground enables Google Cloud developers to quickly ground LLMs, test retrieval, tune chunking, and ship gen AI search apps to prod with Elastic.

You will need an Elasticsearch cluster up and running. We will use a Serverless Project on Elastic Cloud. If you don’t have an account, you can sign up for a free trial.
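If you want to sanity-check the connection before continuing, a minimal sketch with the official Elasticsearch Python client looks like the following; the endpoint URL and API key are placeholders for your own project’s values:

```python
# pip install elasticsearch
from elasticsearch import Elasticsearch

# Placeholder endpoint and API key -- copy the real values from your
# Elastic Cloud / Serverless project page.
client = Elasticsearch(
    "https://my-project.es.us-central1.gcp.elastic.cloud:443",
    api_key="YOUR_API_KEY",
)

# A successful response confirms the cluster is reachable and the key is valid.
print(client.info())
```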

You will also need a Google Cloud account with Vertex AI enabled. If you don’t have a Google Cloud account, you can sign up for a free trial.

Configuring Vertex AI

First, we will configure a Vertex AI service account, which will allow us to make API calls securely from Elasticsearch to the Gemini model. You can follow the detailed instructions on Google Cloud’s doc page here, but we will cover the main points.

Go to the Create Service Account section of the Google Cloud console. There, select the project which has Vertex AI enabled.

Next, give your service account a name and, optionally, a description. Click “Create and Continue”.

Set the access controls for your project. For this blog, we used the “Vertex AI User” role, but you need to ensure your access controls are appropriate for your project and account.

Click Done.

The final step in Google Cloud is to create a key for the service account and download it in JSON format.

Click “KEYS” in your service account, then “ADD KEY” and “Create New”.

Ensure you select “JSON” as the key type, then click “CREATE”.

The key will be created and automatically downloaded to your computer. We will need this key in the next section.
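Keep this file somewhere safe and never commit it to source control. If you want to confirm the key is usable before the next section, a small sketch with the google-auth library is shown here; the file name is a placeholder for whatever the console downloaded:

```python
# pip install google-auth
import json

from google.oauth2 import service_account

# Placeholder path to the JSON key downloaded above.
KEY_PATH = "vertex-ai-service-account.json"

# The file should be well-formed JSON containing (among other fields)
# the service account email and the project ID.
with open(KEY_PATH) as f:
    info = json.load(f)
print(info["client_email"], info["project_id"])

# Building a Credentials object confirms google-auth accepts the key.
credentials = service_account.Credentials.from_service_account_file(
    KEY_PATH,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
print("Loaded credentials for:", credentials.service_account_email)
```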

Connect to your LLM from Playground

With Google Cloud configured, we can continue configuring the Gemini LLM connection in Elastic’s Playground.

This blog assumes you already have data in Elasticsearch that you want to use with Playground. If not, follow the Search Labs blog “Playground: Experiment with RAG applications with Elasticsearch in minutes” to get started.
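If you just want something quick to experiment with, a rough sketch for indexing a couple of sample documents with the Python client is shown below. It assumes a deployment that supports the semantic_text field type (Serverless or 8.15+); the index and field names are arbitrary:

```python
from elasticsearch import Elasticsearch

client = Elasticsearch(
    "https://my-project.es.us-central1.gcp.elastic.cloud:443",  # placeholder
    api_key="YOUR_API_KEY",                                     # placeholder
)

# semantic_text fields are chunked and embedded automatically; on recent
# versions the default inference endpoint is Elastic's ELSER model, which is
# what Playground uses for semantic retrieval later on.
client.indices.create(
    index="my-docs",
    mappings={"properties": {
        "title": {"type": "text"},
        "content": {"type": "semantic_text"},
    }},
)

docs = [
    {"title": "Service accounts",
     "content": "A service account lets an application call Google Cloud APIs securely."},
    {"title": "Playground",
     "content": "Playground lets you test retrieval and prompts against your own data."},
]
for i, doc in enumerate(docs):
    client.index(index="my-docs", id=str(i), document=doc)
```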

In Kibana, select Playground from the side navigation menu. In Serverless, this is under the “Build” heading. When it opens for the first time, you can select “Connect to an LLM”.

Select “Google Gemini”:

Fill out the form to complete the configuration.

Open the JSON credentials file you downloaded in the previous section, copy the complete JSON, and paste it into the “Credentials JSON” section. Then click “Save”.

It’s Playground Time!

Elastic’s Playground allows you to experiment with RAG context settings and system prompts before integrating them into your application code.

By changing these settings while chatting with the model, you can see which combination produces the best responses for your application.

Additionally, configure which fields in your Elasticsearch data are searched to add context to your chat completion request. Adding context will help ground the model and provide more accurate responses.

This step uses Elastic’s built-in ELSER sparse embedding model to retrieve context via semantic search; that context is then passed to the Gemini model.
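Conceptually, the retrieval step is similar to running a semantic search query against your index yourself. A rough sketch, reusing the hypothetical my-docs index and semantic_text content field from earlier, would be:

```python
from elasticsearch import Elasticsearch

client = Elasticsearch(
    "https://my-project.es.us-central1.gcp.elastic.cloud:443",  # placeholder
    api_key="YOUR_API_KEY",                                     # placeholder
)

# The top-scoring passages become the grounding context that is sent to the
# Gemini model alongside the user's question.
response = client.search(
    index="my-docs",
    query={
        "semantic": {
            "field": "content",
            "query": "How do I create a Vertex AI service account?",
        }
    },
    size=3,
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```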

That’s it (for now)

Conversational search is an exciting area where developers are using powerful large language models, such as those offered by Google Vertex AI, to build new experiences. Playground simplifies the process of prototyping and tuning, enabling you to ship your apps more quickly.

Explore more ideas to build with Elasticsearch and Google Vertex AI, and happy searching!

Ready to try this out on your own? Start a free trial.

Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running!
