This is a cache of https://www.elastic.co/search-labs/blog/alibaba-cloud-ai-embeddings-reranking. It is a snapshot of the page at 2025-03-03T00:35:32.670+0000.
Embeddings and reranking with Alibaba Cloud AI Service - Elasticsearch Labs

Embeddings and reranking with Alibaba Cloud AI Service

Using Alibaba Cloud AI Service features with Elastic.

In this article, we'll cover how to integrate Alibaba Cloud AI features with Elasticsearch to improve relevance in semantic searches.

Alibaba Cloud AI Search is a solution that integrates advanced AI features with Elasticsearch tools, using the Qwen LLM family to provide models for inference and classification. In this article, we'll use descriptions of novels and plays written by the same author to test the Alibaba reranking and sparse embedding endpoints.

Steps

  1. Configure Alibaba Cloud AI
  2. Create Elasticsearch mappings
  3. Index data into Elasticsearch
  4. Query data
  5. Bonus: Answering questions with completion

Configure Alibaba Cloud AI

Alibaba Cloud AI reranking and embeddings

Through the Elasticsearch open inference API, Alibaba Cloud AI Search offers several services. In this example, we'll use the descriptions of popular books and plays by Agatha Christie to test the Alibaba Cloud embeddings and reranking endpoints in semantic search.

The Alibaba Cloud AI reranking endpoint is a semantic reranking functionality. This type of reranking uses a machine learning model to reorder search results based on their semantic similarity to a query. This allows you to use out-of-the-box semantic search capabilities on existing full-text search indices.

The sparse embedding endpoint generates sparse embeddings: vectors in which most values are zero, so the few non-zero values highlight the most relevant information.
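To make the idea of a sparse embedding concrete, here is a small sketch that stores only the non-zero dimensions as token-to-weight pairs and scores a document against a query with a dot product. The weights are invented for illustration and are not output from the Alibaba model; `sparse_dot` is a hypothetical helper.

```python
# Illustrative only: a sparse vector keeps just its non-zero dimensions,
# here as a mapping from token to weight. All weights are made up.

def sparse_dot(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product over the tokens that both sparse vectors share."""
    # Iterate over the smaller vector so the work scales with its size.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[token] for token, weight in a.items() if token in b)

query_vec = {"novel": 1.0, "christie": 2.0}
doc_vec = {"novel": 0.5, "detective": 1.5, "christie": 1.0}

score = sparse_dot(query_vec, doc_vec)
print(score)  # 1.0 * 0.5 + 2.0 * 1.0 = 2.5
```

Tokens that appear in only one of the two vectors contribute nothing to the score, which is why the zero entries never need to be stored.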

Get Alibaba Cloud API Key

We need a valid API Key to integrate Alibaba with Elasticsearch. To get it, follow these steps:

  1. Access the Alibaba Cloud portal from the Service Plaza section.
  2. In the left menu, go to API Keys.
  3. Generate a new API Key.

Configure Alibaba Endpoints

We'll first configure the sparse embedding endpoint to transform the text descriptions into semantic vectors:

Embeddings endpoint:
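A minimal sketch of what this request could look like, written as a Python helper that builds the method, path, and JSON body for the Elasticsearch inference API. The endpoint name `alibabacloud_ai_search_sparse`, the model ID, and the `<api_key>` and `<host>` placeholders are assumptions to be replaced with the values from your own Alibaba Cloud workspace.

```python
import json

def sparse_embedding_endpoint(api_key: str, host: str, workspace: str = "default"):
    """Return (method, path, body) for creating a sparse embedding endpoint."""
    inference_id = "alibabacloud_ai_search_sparse"  # hypothetical endpoint name
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": "ops-text-sparse-embedding-001",  # assumed model ID
            "host": host,
            "workspace": workspace,
        },
    }
    return "PUT", f"_inference/sparse_embedding/{inference_id}", body

# Placeholders only: substitute your real API key and workspace host.
method, path, body = sparse_embedding_endpoint("<api_key>", "<host>")
print(method, path)
print(json.dumps(body, indent=2))
```

The printed method, path, and body can be pasted into Kibana Dev Tools or sent with any HTTP client.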

We'll then configure the rerank endpoint to reorder the results.

Rerank Endpoint:
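The rerank endpoint can be created in the same way, with the task type in the path set to `rerank`. The sketch below assumes a hypothetical endpoint name, `alibabacloud_ai_search_rerank`, and an assumed reranker model ID; check the Alibaba Cloud console for the model IDs available to you.

```python
import json

def rerank_endpoint(api_key: str, host: str, workspace: str = "default"):
    """Return (method, path, body) for creating a rerank endpoint."""
    inference_id = "alibabacloud_ai_search_rerank"  # hypothetical endpoint name
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": "ops-bge-reranker-larger",  # assumed model ID
            "host": host,
            "workspace": workspace,
        },
    }
    return "PUT", f"_inference/rerank/{inference_id}", body

# Placeholders only: substitute your real API key and workspace host.
method, path, body = rerank_endpoint("<api_key>", "<host>")
print(method, path)
print(json.dumps(body, indent=2))
```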

Now that the endpoints are configured, we can prepare the Elasticsearch index.

Create Elasticsearch mappings

Let's configure the mappings. We need to store both the description texts and the model-generated vectors.

We'll use the following properties:

  • semantic_description: to store the embeddings generated by the model and run semantic searches.
  • description: a "text" field that stores the descriptions of the novels and plays for full-text search.

We'll add the copy_to parameter to description so that its content is also copied into semantic_description, making both the text and the semantic field available for hybrid search:
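A sketch of mappings that match this description, built as a Python dictionary. The index name `arts` and the inference endpoint ID are hypothetical; the field names and the use of `copy_to` follow the properties listed above.

```python
import json

def build_index_request(index: str, inference_id: str):
    """Return (method, path, body) for creating the index with its mappings."""
    body = {
        "mappings": {
            "properties": {
                # Embeddings are generated by the referenced inference endpoint.
                "semantic_description": {
                    "type": "semantic_text",
                    "inference_id": inference_id,
                },
                # Plain text for full-text search; copy_to also feeds
                # the semantic field at index time.
                "description": {
                    "type": "text",
                    "copy_to": "semantic_description",
                },
            }
        }
    }
    return "PUT", index, body

# "arts" and the endpoint ID are hypothetical names.
method, path, body = build_index_request("arts", "alibabacloud_ai_search_sparse")
print(method, path)
print(json.dumps(body, indent=2))
```

With this layout, documents only need to supply description; the semantic field is populated from it.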

With the mappings ready, we can now index the data.

Index data into Elasticsearch

Here's the dataset with the descriptions that we'll use for this example. We'll index it using the Elasticsearch Bulk API.

Note that the first two documents, “Black Coffee” and “The Mousetrap”, are plays, while the others are novels.
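The Bulk API expects newline-delimited JSON: an action line followed by the document itself, with a trailing newline at the end of the body. The sketch below shows one way to build that payload. The index name `arts` is hypothetical, and the descriptions are placeholders rather than the article's actual dataset.

```python
import json

def to_bulk_ndjson(index: str, docs: list[dict]) -> str:
    """Serialize documents into the NDJSON body expected by POST _bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document source
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

# Placeholder documents: replace the descriptions with the real dataset.
docs = [
    {"name": "Black Coffee", "description": "<description of the play>"},
    {"name": "The Murder of Roger Ackroyd", "description": "<description of the book>"},
]

payload = to_bulk_ndjson("arts", docs)
print(payload)
```

Only description is sent per document here; no embedding needs to be supplied by the client when the semantic field is filled through copy_to.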

Query data

To compare results, we'll run different types of queries: first a semantic query, then reranking, and finally both combined. We'll use the same question each time, "Which novel was written by Agatha Christie?", expecting to get the three documents that explicitly say "novel", plus the one that says "book". The two plays should be the last results.

We'll begin by querying the semantic_text field to ask: "Which novel was written by Agatha Christie?" Let's see what happens:
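A sketch of such a query using the `semantic` query type against the semantic_description field. The index name `arts` is a hypothetical placeholder.

```python
import json

QUESTION = "Which novel was written by Agatha Christie?"

def semantic_search_request(index: str, field: str, question: str):
    """Return (method, path, body) for a semantic query on a semantic_text field."""
    body = {"query": {"semantic": {"field": field, "query": question}}}
    return "GET", f"{index}/_search", body

# "arts" is a hypothetical index name.
method, path, body = semantic_search_request("arts", "semantic_description", QUESTION)
print(method, path)
print(json.dumps(body, indent=2))
```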

Response:

In this case, the response prioritized most of the novels, but the document that says “book” appears last. We can refine the results further with reranking.

Refining results with Reranking

Here, we'll use an _inference/rerank request to assess the documents we got in the first query and improve their ranking.
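A rerank inference request takes the query and the list of texts to score. The sketch below assumes a hypothetical endpoint name, `alibabacloud_ai_search_rerank`, and uses placeholder passages where the descriptions returned by the first query would go.

```python
import json

QUESTION = "Which novel was written by Agatha Christie?"

def rerank_request(inference_id: str, question: str, passages: list[str]):
    """Return (method, path, body) for scoring passages against a question."""
    body = {"query": question, "input": passages}
    return "POST", f"_inference/rerank/{inference_id}", body

# Placeholders: use the description text of each hit from the semantic query.
passages = ["<description of hit 1>", "<description of hit 2>"]

method, path, body = rerank_request("alibabacloud_ai_search_rerank", QUESTION, passages)
print(method, path)
print(json.dumps(body, indent=2))
```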

Response:

The response here shows that both plays are now at the bottom of the results.

Semantic search and reranking endpoint combined

Using a retriever, we'll combine the semantic query and reranking in just one step:
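One way to express this is a `text_similarity_reranker` retriever that wraps a `standard` retriever running the semantic query. In the sketch below, the index name, the rerank endpoint ID, and the `rank_window_size` value are assumptions.

```python
import json

QUESTION = "Which novel was written by Agatha Christie?"

def rerank_retriever_request(index: str, inference_id: str, question: str,
                             rank_window_size: int = 10):
    """Return (method, path, body) for a semantic query reranked in one search."""
    body = {
        "retriever": {
            "text_similarity_reranker": {
                # First stage: retrieve candidates with the semantic query.
                "retriever": {
                    "standard": {
                        "query": {
                            "semantic": {
                                "field": "semantic_description",
                                "query": question,
                            }
                        }
                    }
                },
                # Second stage: rerank candidates on their description text.
                "field": "description",
                "inference_id": inference_id,
                "inference_text": question,
                "rank_window_size": rank_window_size,  # assumed window size
            }
        }
    }
    return "GET", f"{index}/_search", body

# "arts" and the endpoint ID are hypothetical names.
method, path, body = rerank_retriever_request(
    "arts", "alibabacloud_ai_search_rerank", QUESTION
)
print(method, path)
print(json.dumps(body, indent=2))
```

The window size controls how many first-stage hits are sent to the reranker, so it should be at least as large as the number of results you want back.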

Response:

The results here differ from those of the semantic query. The document that says “book” rather than "novel" (The Murder of Roger Ackroyd) now appears higher than in the first semantic search. Both plays are still the last results, just as with reranking.

Bonus: Answering questions with completion

With embeddings and reranking we can satisfy a search query, but the user still sees a list of search results rather than the actual answer.

With the examples provided, we are one step away from a RAG implementation, in which we pass the top results and the question to an LLM to get the answer.

Fortunately, Alibaba Cloud AI Service also provides a completion endpoint we can use for this purpose.

Let’s create the endpoint:

Completion Endpoint:
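A sketch of the request for creating a completion endpoint, following the same pattern as the other inference endpoints. The endpoint name `alibabacloud_ai_search_completion` and the Qwen model ID are assumptions, as are the `<api_key>` and `<host>` placeholders.

```python
import json

def completion_endpoint(api_key: str, host: str, workspace: str = "default"):
    """Return (method, path, body) for creating a completion endpoint."""
    inference_id = "alibabacloud_ai_search_completion"  # hypothetical endpoint name
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": "ops-qwen-turbo",  # assumed Qwen model ID
            "host": host,
            "workspace": workspace,
        },
    }
    return "PUT", f"_inference/completion/{inference_id}", body

# Placeholders only: substitute your real API key and workspace host.
method, path, body = completion_endpoint("<api_key>", "<host>")
print(method, path)
print(json.dumps(body, indent=2))
```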

And now, send the results and question from the previous query:

Query
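A sketch of how the question and the top results could be combined into a single prompt and sent to the completion endpoint. The prompt wording, the `build_prompt` helper, and the endpoint name are illustrative assumptions, and the passages are placeholders for the descriptions returned by the previous query.

```python
import json

QUESTION = "Which novel was written by Agatha Christie?"

def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def completion_request(inference_id: str, prompt: str):
    """Return (method, path, body) for a completion inference call."""
    return "POST", f"_inference/completion/{inference_id}", {"input": prompt}

# Placeholders: use the top-ranked descriptions from the previous search.
passages = ["<description of top hit 1>", "<description of top hit 2>"]

prompt = build_prompt(QUESTION, passages)
method, path, body = completion_request("alibabacloud_ai_search_completion", prompt)
print(method, path)
print(json.dumps(body, indent=2))
```

Restricting the model to the supplied context keeps the answer grounded in the documents that the search actually returned.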

Response

Conclusion

Integrating Alibaba Cloud AI Search with Elasticsearch allows us to easily access completion, embedding, and reranking models to incorporate them into our search pipeline.

We can use the reranking and embedding endpoints separately, or together with the help of a retriever.

We can also add the completion endpoint to round out an end-to-end RAG implementation.
