Generating filters and facets using ML - Elasticsearch Labs

Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach.

Filters and facets are mechanisms used to refine search results, helping users find relevant content or products more quickly. In the classical approach, rules are manually defined. For example, in a movie catalog, attributes such as genre are pre-defined for use in filters and facets. On the other hand, with AI models, new attributes can be automatically extracted from the characteristics of movies, making the process more dynamic and personalized. In this blog, we explore the pros and cons of each method, highlighting their applications and challenges.

Filters and facets

Before we begin, let's define what filters and facets are. Filters are predefined attributes used to restrict a set of results. In a marketplace, for example, filters are available even before a search is performed. The user can select a category, such as "Video games", before searching for "PS5", refining the search to a more specific subset instead of the entire database. This significantly increases the chances of obtaining more relevant results.

Facets work similarly to filters but are only available after the search is performed. In other words, the search returns results, and based on them, a new list of refinement options is generated. For example, when searching for a PS5 console, facets such as storage capacity, shipping cost, and color may be displayed to help users choose the ideal product.
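The difference can be sketched in a few lines of Python. This is a toy in-memory illustration, not how a search engine implements it: the catalog, the `search` helper, and the `facets` helper are all hypothetical names introduced here.

```python
from collections import Counter

# Hypothetical catalog; names and values are illustrative only.
CATALOG = [
    {"name": "PS5 Console", "category": "Video games", "color": "White", "storage": "1TB"},
    {"name": "PS5 Console Digital", "category": "Video games", "color": "Black", "storage": "825GB"},
    {"name": "PS5 Controller Skin", "category": "Accessories", "color": "White", "storage": None},
]

def search(query, category=None):
    """Apply the pre-selected filter first, then match the query text."""
    candidates = [p for p in CATALOG if category is None or p["category"] == category]
    return [p for p in candidates if query.lower() in p["name"].lower()]

def facets(results, fields):
    """Count the values present in the results to build refinement options."""
    return {
        field: dict(Counter(p[field] for p in results if p[field] is not None))
        for field in fields
    }

# The filter ("Video games") exists before the search; the facets exist only after it.
hits = search("ps5", category="Video games")
options = facets(hits, ["color", "storage"])
print(options)
```

The filter narrows the candidate set up front, while the facet counts depend entirely on which documents the search returned.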

Now that we have defined filters and facets, let's discuss the impact of the classical and Machine Learning (ML)-based approaches on their implementation and usage. Each method has advantages and challenges that influence search efficiency.

Classical approach

In this approach, filters and facets are manually defined based on predefined rules. This means that the attributes available for refining the search are fixed and planned in advance, considering the catalog structure and user needs.

For example, in a marketplace, categories such as "Electronics" or "Fashion" may have specific filters like brand, format and price range. These rules are created statically, ensuring consistency in the search experience but requiring manual adjustments whenever new products or categories emerge.

Although this approach provides predictability and control over the displayed filters and facets, it can be limited when new trends arise that demand dynamic refinement.

Pros:

  • Predictability and control: Since filters and facets are manually defined, management becomes easier.
  • Low complexity: No need to train models.
  • Ease of maintenance: As rules are predefined, adjustments and corrections can be made quickly.

Cons:

  • Reindexing required for new filters: Whenever a new attribute needs to be used as a filter, the entire dataset must be reindexed to ensure that documents contain this information.
  • Lack of dynamic adaptation: Filters are static and do not automatically adjust to changes in user behavior.

Implementation of filters/facets – Classical approach

In Kibana Dev Tools, we will create a demonstration of filters/facets using the classical approach.

First, we define the mapping to structure the index:
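Such a mapping might look like the sketch below, written as a Python dict and rendered as a Console-style request. The index name `videogames` comes from the reindex step later in this post; the `name` and `description` fields are assumptions added for illustration.

```python
import json

# Index name taken from the reindex step later in the post.
INDEX_NAME = "videogames"

# brand/storage as keyword and price as float follow the text.
# `name` and `description` are assumed fields, not confirmed by the post.
mapping_body = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "description": {"type": "text"},
            "brand": {"type": "keyword"},
            "storage": {"type": "keyword"},
            "price": {"type": "float"},
        }
    }
}

# Console-style request as it could be pasted into Kibana Dev Tools.
console_request = f"PUT {INDEX_NAME}\n{json.dumps(mapping_body, indent=2)}"
print(console_request)
```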

The brand and storage fields are set as keyword, allowing them to be used directly in aggregations (facets). The price field is of type float, enabling the creation of price ranges.

In the next step, the product data will be indexed:
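One way to do this is with the Bulk API, whose body is newline-delimited JSON: an action line followed by the document. The sketch below builds such a body in Python. The products and prices are hypothetical sample values, and `to_bulk_ndjson` is a helper name introduced here.

```python
import json

# Hypothetical sample products; values are illustrative, not real catalog data.
products = [
    {"name": "PlayStation 5", "brand": "Sony", "storage": "1TB", "price": 499.99},
    {"name": "PlayStation 5 Digital Edition", "brand": "Sony", "storage": "825GB", "price": 399.99},
    {"name": "Xbox Series X", "brand": "Microsoft", "storage": "1TB", "price": 479.99},
]

def to_bulk_ndjson(index, docs):
    """Build a Bulk API body: one action line plus one source line per document."""
    lines = []
    for doc_id, doc in enumerate(docs, start=1):
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

bulk_body = to_bulk_ndjson("videogames", products)
print("POST _bulk")
print(bulk_body, end="")
```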

Now, let's retrieve classic facets by grouping the results by brand, storage, and price range. The query sets size: 0 because the goal is to retrieve only the aggregation results, not the documents that match the query.
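A request along these lines could use two `terms` aggregations and one `range` aggregation. In the sketch below, the aggregation names and the price boundaries are assumptions chosen for illustration; in a `range` aggregation, `from` is inclusive and `to` is exclusive.

```python
import json

# Aggregation names and price boundaries are hypothetical.
facet_query = {
    "size": 0,  # return aggregation results only, no hits
    "aggs": {
        "brand_facet": {"terms": {"field": "brand"}},
        "storage_facet": {"terms": {"field": "storage"}},
        "price_ranges": {
            "range": {
                "field": "price",
                "ranges": [
                    {"key": "under_300", "to": 300},
                    {"key": "300_to_500", "from": 300, "to": 500},
                    {"key": "over_500", "from": 500},
                ],
            }
        },
    },
}

console_request = "GET videogames/_search\n" + json.dumps(facet_query, indent=2)
print(console_request)
```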

The response will include counts for Brand, Storage, and Price, helping to create filters and facets.
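Aggregation results come back as lists of buckets, each with a `key` and a `doc_count`. As a rough sketch, a front end could flatten them into facet options as shown below. The sample response is hand-written in that bucket shape with made-up counts, and `buckets_to_facets` is a hypothetical helper.

```python
def buckets_to_facets(response):
    """Map each aggregation name to (value, count) pairs, skipping empty buckets."""
    facets = {}
    for name, agg in response.get("aggregations", {}).items():
        facets[name] = [
            (bucket["key"], bucket["doc_count"])
            for bucket in agg["buckets"]
            if bucket["doc_count"] > 0
        ]
    return facets

# Hand-written response in the usual bucket shape; the counts are made up.
sample_response = {
    "aggregations": {
        "brand_facet": {
            "buckets": [
                {"key": "Sony", "doc_count": 2},
                {"key": "Microsoft", "doc_count": 1},
            ]
        },
        "price_ranges": {
            "buckets": [
                {"key": "under_300", "to": 300.0, "doc_count": 0},
                {"key": "300_to_500", "from": 300.0, "to": 500.0, "doc_count": 3},
            ]
        },
    }
}

facet_options = buckets_to_facets(sample_response)
print(facet_options)
```

Dropping buckets with a count of zero keeps the UI from offering refinements that would return no results.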

Machine learning/AI-based approach

In this approach, Machine Learning (ML) models, including Artificial Intelligence (AI) techniques, analyze data attributes to generate relevant filters and facets. Instead of relying on predefined rules, ML/AI leverages indexed data characteristics. This enables the dynamic discovery of new facets and filters.

Pros:

  • Automatic updates: New filters and facets are generated automatically, without the need for manual adjustments.
  • Discovery of new attributes: It can identify previously unconsidered data characteristics as filters, enriching the search experience.
  • Reduced manual effort: The team does not need to constantly define and update filtering rules as AI learns from available data.

Cons:

  • Maintenance complexity: The use of models may require pre-validation to ensure the consistency of the generated filters.
  • Requires ML and AI expertise: The solution demands qualified professionals to fine-tune and monitor model performance.
  • Risk of irrelevant filters: If the model is not well-calibrated, it may generate facets that are not useful for users.
  • Cost: The use of ML and AI may require third-party services, increasing operational costs.

It's worth noting that even with a well-calibrated model and a well-crafted prompt, the generated facets should still go through a review step. This validation can be manual or based on moderation rules, ensuring that the content is appropriate and safe. While not necessarily a drawback, it is an important consideration to ensure the quality and suitability of the facets before they are made available to users.
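A rule-based review step could be as simple as the sketch below, which normalizes facet names and drops values that are empty, too long, or contain blocked terms. The blocklist, the length limit, and the `review_facets` helper are all assumptions for illustration, not rules from this post.

```python
import re

# Hypothetical moderation settings.
BLOCKED_TERMS = {"cheap", "best", "amazing"}
MAX_VALUE_LENGTH = 30

def review_facets(generated):
    """Return only the generated facets that pass basic moderation rules."""
    approved = {}
    for name, value in generated.items():
        # Normalize the facet name to snake_case so variants collapse to one key.
        key = re.sub(r"[^a-z0-9]+", "_", str(name).strip().lower()).strip("_")
        text = str(value).strip()
        if not key or not text:
            continue  # drop empty names or values
        if len(text) > MAX_VALUE_LENGTH:
            continue  # long values are likely prose, not facet values
        if any(word in BLOCKED_TERMS for word in text.lower().split()):
            continue  # drop marketing language
        approved[key] = text
    return approved

candidate = {
    "Storage Capacity": " 1TB ",
    "Quality": "best console ever",
    "": "orphan value",
    "Notes": "x" * 40,
}
print(review_facets(candidate))
```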

Implementation of filters/facets – AI approach

In this demonstration, we will use an AI model to automatically analyze product characteristics and suggest relevant attributes. With a well-structured prompt, we extract information from the catalog and transform it into filters and facets. Below, we present each step of the process.

Initially, we will use the Inference API to register an endpoint for integration with an ML service. Below is an example of integration with OpenAI's service.
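Registering a completion endpoint for the OpenAI service could look roughly like this. The endpoint id `openai_chat_completions` and the model id are assumptions, and the API key is left as a placeholder.

```python
import json

# Hypothetical endpoint id; any unique name would do.
ENDPOINT_ID = "openai_chat_completions"

inference_body = {
    "service": "openai",
    "service_settings": {
        "api_key": "<your-api-key>",  # placeholder, never commit a real key
        "model_id": "gpt-4o-mini",    # assumed model; the post does not name one
    },
}

# The task type (completion) is part of the request path.
console_request = (
    f"PUT _inference/completion/{ENDPOINT_ID}\n"
    + json.dumps(inference_body, indent=2)
)
print(console_request)
```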

Now, we define the pipeline to execute the prompt and obtain the new filters generated by the model.
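One possible shape for such a pipeline is sketched below: a script processor builds the prompt, an inference processor calls the endpoint, a json processor parses the answer into `dynamic_facets`, and a remove processor cleans up. The pipeline name `generate_filter_ai` appears later in this post; the endpoint id, the prompt wording, and the temporary `prompt` and `ai_response` fields are assumptions, and the sketch presumes the model answers with valid JSON.

```python
import json

PIPELINE_ID = "generate_filter_ai"       # name used in the reindex step of the post
ENDPOINT_ID = "openai_chat_completions"  # hypothetical inference endpoint id

# Hypothetical prompt wording.
PROMPT = (
    "Extract product attributes useful as search facets from the description below. "
    "Answer only with a flat JSON object of attribute names and values. Description: "
)

pipeline_body = {
    "processors": [
        # 1. Build the prompt from the document's description (Painless script).
        {
            "script": {
                "source": "ctx.prompt = params.instructions + ctx.description",
                "params": {"instructions": PROMPT},
            }
        },
        # 2. Send the prompt to the completion endpoint.
        {
            "inference": {
                "model_id": ENDPOINT_ID,
                "input_output": {"input_field": "prompt", "output_field": "ai_response"},
            }
        },
        # 3. Parse the model's JSON answer into a structured field.
        {"json": {"field": "ai_response", "target_field": "dynamic_facets"}},
        # 4. Drop the temporary fields.
        {"remove": {"field": ["prompt", "ai_response"]}},
    ]
}

console_request = f"PUT _ingest/pipeline/{PIPELINE_ID}\n" + json.dumps(pipeline_body, indent=2)
print(console_request)
```

If the model returns something other than valid JSON, the json processor fails for that document, so a production pipeline would likely need failure handling on top of this.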

Next, we run a simulation of this pipeline for the "PlayStation 5" product, with the following description:

Stunning Gaming: Marvel at stunning graphics and experience the features of the new PS5.

Breathtaking Immersion: Discover a deeper gaming experience with support for haptic feedback, adaptive triggers, and 3D Audio technology.

Slim Design: With the PS5 Digital Edition, gamers get powerful gaming technology in a sleek, compact design.

1TB of Storage: Have your favorite games ready and waiting for you to play with 1TB of built-in SSD storage.

Backward Compatibility and Game Boost: The PS5 console can play over 4,000 PS4 games. With Game Boost, you can even enjoy faster, smoother frame rates in some of the best PS4 console games.
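A simulation request for this product could be sketched as follows, using the ingest pipeline simulate endpoint. The description is shortened here to two of the sentences above, and the `name` field is an assumption about the document structure.

```python
import json

PIPELINE_ID = "generate_filter_ai"  # pipeline name used later in the post

# Shortened version of the product description quoted above.
description = " ".join([
    "Slim Design: With the PS5 Digital Edition, gamers get powerful gaming "
    "technology in a sleek, compact design.",
    "1TB of Storage: Have your favorite games ready and waiting for you to "
    "play with 1TB of built-in SSD storage.",
])

# The simulate API takes a list of documents, each wrapped in `_source`.
simulate_body = {
    "docs": [
        {"_source": {"name": "PlayStation 5", "description": description}}
    ]
}

console_request = (
    f"POST _ingest/pipeline/{PIPELINE_ID}/_simulate\n"
    + json.dumps(simulate_body, indent=2)
)
print(console_request)
```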

Let's observe the prompt output generated from this simulation.

Now a new field, dynamic_facets, will be added to the new index to store the facets generated by the AI.
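As a sketch, the new index could reuse the classical mapping and add `dynamic_facets`. The `flattened` type is an assumption made here because the generated attribute names are not known in advance; it indexes every leaf value as a keyword. The `name` and `description` fields are likewise assumed.

```python
import json

NEW_INDEX = "videogames_1"  # destination index named in the reindex step

new_index_body = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},         # assumed field
            "description": {"type": "text"},  # assumed field
            "brand": {"type": "keyword"},
            "storage": {"type": "keyword"},
            "price": {"type": "float"},
            # Assumption: flattened keeps arbitrary AI-generated keys in one field
            # without adding a new mapping entry per attribute.
            "dynamic_facets": {"type": "flattened"},
        }
    }
}

console_request = f"PUT {NEW_INDEX}\n" + json.dumps(new_index_body, indent=2)
print(console_request)
```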

Using the Reindex API, we will reindex the videogames index to videogames_1, applying the generate_filter_ai pipeline during the process. This pipeline will automatically generate dynamic facets during indexing.
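The reindex request attaches the pipeline through the `pipeline` setting of the destination. A minimal sketch, using the index and pipeline names mentioned above:

```python
import json

reindex_body = {
    "source": {"index": "videogames"},
    "dest": {
        "index": "videogames_1",
        # Every copied document passes through this ingest pipeline.
        "pipeline": "generate_filter_ai",
    },
}

console_request = "POST _reindex\n" + json.dumps(reindex_body, indent=2)
print(console_request)
```

Because each document triggers a call to the external model, reindexing a large catalog this way can be slow and may incur per-request costs.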

Now, we will run a search and get the new filters:

Results:

To illustrate how the facets could be used, below is a simple front end:

The UI code presented is here.

Conclusion

Both approaches to creating filters and facets have their benefits and points of concern. The classic approach, based on manual rules, offers control and lower costs but requires constant updates and does not dynamically adapt to new products or features.

On the other hand, the AI and Machine Learning-based approach automates facet extraction, making the search more flexible and allowing the discovery of new attributes without manual intervention. However, this approach can be more complex to implement and maintain, requiring calibration to ensure consistent results.

The choice between the classic and AI-based approaches depends on the needs and complexity of the business. For simpler scenarios, where data attributes are stable and predictable, the classic approach can be more efficient and easier to maintain, avoiding unnecessary costs with infrastructure and AI models. On the other hand, the use of ML/AI to extract facets can add significant value, improving the search experience and making filtering more intelligent.

The important thing is to evaluate whether automation justifies the investment or whether a more traditional solution already meets the business needs effectively.
