Ensuring business rules work seamlessly with semantic search - Elasticsearch Labs

Harness the power of query rules combined with semantic search and rerankers.

Did you know that query rules work seamlessly with semantic search? With query rules as a retriever, it’s never been easier to combine semantic search and complex logic such as RRF or semantic reranking with the power of query rules.

Query rules are an important tool in our relevance toolbox.

Introducing the query rules retriever

The rule retriever is what’s known as a compound retriever, which allows chaining complex behavior in a retriever tree where the order of operations matters.

Like the rule query, the rule retriever works on already-defined query rulesets. You can create a ruleset using the query rules CRUD API.

What does this look like in practice? Here’s an example of a simple query ruleset that pins the document with the ID id1 when the query_string parameter exactly matches puggles:

PUT /_query_rules/my-ruleset
{
  "rules": [
    {
      "rule_id": "rule1",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query_string",
          "values": [ "puggles" ]
        }
      ],
      "actions": {
        "ids": [
          "id1"
        ]
      }
    }
  ]
}
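To make the matching behavior concrete, here is a minimal Python sketch of how a ruleset like the one above could be evaluated against match criteria. It is a toy model, not Elasticsearch's implementation: it only handles the exact criteria type, it assumes a rule applies only when all of its criteria are satisfied, and the function names matching_rules and pinned_ids are hypothetical helpers invented for this illustration.

```python
from typing import Any

# The ruleset body from the request above, as a plain Python dict.
RULESET: dict[str, Any] = {
    "rules": [
        {
            "rule_id": "rule1",
            "type": "pinned",
            "criteria": [
                {"type": "exact", "metadata": "query_string", "values": ["puggles"]}
            ],
            "actions": {"ids": ["id1"]},
        }
    ]
}


def matching_rules(ruleset: dict[str, Any], match_criteria: dict[str, str]) -> list[dict[str, Any]]:
    """Toy matcher (hypothetical helper): keep rules whose criteria are all satisfied.

    Only the "exact" criteria type is modeled in this sketch.
    """
    matched = []
    for rule in ruleset["rules"]:
        satisfied = all(
            criterion["type"] == "exact"
            and match_criteria.get(criterion["metadata"]) in criterion["values"]
            for criterion in rule["criteria"]
        )
        if satisfied:
            matched.append(rule)
    return matched


def pinned_ids(ruleset: dict[str, Any], match_criteria: dict[str, str]) -> list[str]:
    """Collect the document ids that matching "pinned" rules would pin."""
    return [
        doc_id
        for rule in matching_rules(ruleset, match_criteria)
        if rule["type"] == "pinned"
        for doc_id in rule["actions"]["ids"]
    ]


print(pinned_ids(RULESET, {"query_string": "puggles"}))  # ['id1']
print(pinned_ids(RULESET, {"query_string": "pugs"}))     # []
```

The second call returns nothing because pugs is not an exact match for puggles, so rule1 is not applied.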

Next, here is a simple example of a rule retriever that will match on this ruleset:

POST my-index/_search
{
  "retriever": {
    "rule": {
      "match_criteria": {
        "query_string": "puggles"
      },
      "ruleset_ids": [
        "my-ruleset"
      ],
      "retriever": {
        "standard": {
          "query": {
            "query_string": {
              "query": "puggles"
            }
          }
        }
      }
    }
  }
}

In this case, we’re simply defining a standard sub-retriever that’s a simple query_string query. This is very similar to how the rule query works today by specifying an organic query. The retriever will return a list of search results with matching rule(s) applied.
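If you build search requests in application code, the same body can be assembled from small helpers. The sketch below uses only plain Python dicts and the standard library; the helper names standard_retriever and rule_retriever are made up for this example and are not part of any Elastic client library.

```python
import json
from typing import Any


def standard_retriever(query: dict[str, Any]) -> dict[str, Any]:
    """Wrap a query in a standard retriever (hypothetical helper)."""
    return {"standard": {"query": query}}


def rule_retriever(
    match_criteria: dict[str, Any],
    ruleset_ids: list[str],
    retriever: dict[str, Any],
) -> dict[str, Any]:
    """Wrap a sub-retriever in a rule retriever (hypothetical helper)."""
    return {
        "rule": {
            "match_criteria": match_criteria,
            "ruleset_ids": ruleset_ids,
            "retriever": retriever,
        }
    }


# Same request body as the example above.
body = {
    "retriever": rule_retriever(
        match_criteria={"query_string": "puggles"},
        ruleset_ids=["my-ruleset"],
        retriever=standard_retriever({"query_string": {"query": "puggles"}}),
    )
}

print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of POST my-index/_search with whichever HTTP or Elasticsearch client you already use.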

Semantic search and query rules

The simple example doesn’t show the true power of query rules: applying business rules on top of semantic search. This can help to return results important to promotional campaigns or simply “fix” specific queries where semantic search doesn’t return the results we want.

We can use the same retriever framework to apply query rules to semantic search by specifying semantic queries under the standard retriever. Here is an example using the semantic query:

POST my-index/_search
{
  "retriever": {
    "rule": {
      "match_criteria": {
        "query_string": "puggles"
      },
      "ruleset_ids": [ "my-ruleset" ],
      "retriever": {
        "standard": {
          "query": {
            "semantic": {
              "field": "semantic_field",
              "query": "what is the best pug mix?"
            }
          }
        }
      }
    }
  }
}

Similarly, sparse_vector and knn queries will seamlessly work when defined as standard retrievers with the query rules retriever.

Reranking and query rules

You can combine RRF with query rules by nesting the rrf retriever under the rule retriever, for example:

POST my-index/_search
{
  "retriever": {
    "rule": {
      "match_criteria": {
        "query_string": "puggles"
      },
      "ruleset_ids": [
        "my-ruleset"
      ],
      "retriever": {
        "rrf": {
          "retrievers": [
            {
              "standard": {
                "query": {
                  "semantic": {
                    "field": "semantic_field",
                    "query": "what is the best pug mix?"
                  }
                }
              }
            },
            {
              "standard": {
                "query": {
                  "query_string": {
                    "query": "puggles"
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}

Important note: order matters here. Technically, nothing stops you from running RRF on top of the rule retriever, but this will not work as intended because of the order of operations in the retriever tree. To ensure that all rules are applied as intended, the rule retriever must always be the outermost (top-level) retriever.
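One way to guard against this mistake in application code is to check the retriever tree before sending it. The following Python sketch walks a retriever body and reports whether a rule retriever appears anywhere below the top level. It assumes sub-retrievers live under the retriever and retrievers keys, as in the examples in this post; child_retrievers and has_nested_rule are hypothetical helper names, not an Elasticsearch API.

```python
from typing import Any, Iterator

Retriever = dict[str, Any]


def child_retrievers(retriever: Retriever) -> Iterator[Retriever]:
    """Yield the sub-retrievers nested directly inside a retriever object."""
    for params in retriever.values():
        if "retriever" in params:
            yield params["retriever"]
        yield from params.get("retrievers", [])


def has_nested_rule(retriever: Retriever) -> bool:
    """Return True if a "rule" retriever appears anywhere below the top level."""
    stack = list(child_retrievers(retriever))
    while stack:
        node = stack.pop()
        if "rule" in node:
            return True
        stack.extend(child_retrievers(node))
    return False


lexical = {"standard": {"query": {"query_string": {"query": "puggles"}}}}
semantic = {
    "standard": {
        "query": {"semantic": {"field": "semantic_field", "query": "what is the best pug mix?"}}
    }
}

# Recommended shape: the rule retriever wraps RRF.
rule_on_top = {
    "rule": {
        "match_criteria": {"query_string": "puggles"},
        "ruleset_ids": ["my-ruleset"],
        "retriever": {"rrf": {"retrievers": [semantic, lexical]}},
    }
}

# Shape to avoid: RRF wraps the rule retriever.
rrf_on_top = {
    "rrf": {
        "retrievers": [
            {
                "rule": {
                    "match_criteria": {"query_string": "puggles"},
                    "ruleset_ids": ["my-ruleset"],
                    "retriever": lexical,
                }
            },
            semantic,
        ]
    }
}

print(has_nested_rule(rule_on_top))  # False
print(has_nested_rule(rrf_on_top))   # True
```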

Similarly, you can combine query rules with semantic reranking. Here’s an example using our Elastic reranker:

POST my-index/_search
{
  "retriever": {
    "rule": {
      "match_criteria": {
        "query_string": "puggles"
      },
      "ruleset_ids": [ "my-ruleset" ],
      "retriever": {
        "text_similarity_reranker": {
          "retriever": {
            "standard": {
              "query": {
                "semantic": {
                  "field": "semantic_field",
                  "query": "what is the best pug mix?"
                }
              }
            }
          },
          "field": "text_field",
          "inference_id": "elastic-rerank-endpoint",
          "inference_text": "what is the best pug mix?"
        }
      }
    }
  }
}

Bringing it all together, here’s an example of how you could combine semantic, sparse_vector, knn and lexical text search queries together with RRF and semantic reranking, and apply query rules on top of them:

POST my-index/_search
{
  "retriever": {
    "rule": {
      "match_criteria": {
        "query_string": "puggles"
      },
      "ruleset_ids": [ "my-ruleset" ],
      "retriever": {
        "text_similarity_reranker": {
          "retriever": {
            "rrf": {
              "retrievers": [
                {
                  "standard": {
                    "query": {
                      "sparse_vector": {
                        "field": "sparse_field",
                        "inference_id": "elser-endpoint",
                        "query": "what is the best pug mix?"
                      }
                    }
                  }
                },
                {
                  "standard": {
                    "query": {
                      "knn": {
                        "field": "dense_field",
                        "query_vector": [ 1, 2, 3 ],
                        "k": 10,
                        "num_candidates": 100
                      }
                    }
                  }
                },
                {
                  "standard": {
                    "query": {
                      "semantic": {
                        "field": "semantic_field",
                        "query": "what is the best pug mix?"
                      }
                    }
                  }
                },
                {
                  "standard": {
                    "query": {
                      "query_string": {
                        "query": "puggles"
                      }
                    }
                  }
                }
              ]
            }
          },
          "field": "text_field",
          "inference_id": "elastic-rerank-endpoint",
          "inference_text": "what is the best pug mix?"
        }
      }
    }
  }
}
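Deeply nested bodies like this one are easy to get wrong by hand, so it can help to compose them from small functions where the nesting order is explicit. The Python sketch below builds a shortened version of the request above (two of the four sub-retrievers) and confirms the order of the tree. All helper names (standard, rrf, rerank, rule, nesting_order) are hypothetical and exist only for this illustration.

```python
from typing import Any

Retriever = dict[str, Any]


def standard(query: dict[str, Any]) -> Retriever:
    return {"standard": {"query": query}}


def rrf(*retrievers: Retriever) -> Retriever:
    return {"rrf": {"retrievers": list(retrievers)}}


def rerank(retriever: Retriever, field: str, inference_id: str, inference_text: str) -> Retriever:
    # The reranker settings sit next to the wrapped retriever, inside the reranker.
    return {
        "text_similarity_reranker": {
            "retriever": retriever,
            "field": field,
            "inference_id": inference_id,
            "inference_text": inference_text,
        }
    }


def rule(retriever: Retriever, match_criteria: dict[str, Any], ruleset_ids: list[str]) -> Retriever:
    return {
        "rule": {
            "match_criteria": match_criteria,
            "ruleset_ids": ruleset_ids,
            "retriever": retriever,
        }
    }


def nesting_order(retriever: Retriever) -> list[str]:
    """List retriever types from the outermost one down the single-child chain."""
    order = []
    while True:
        ((kind, params),) = retriever.items()
        order.append(kind)
        if "retriever" not in params:
            return order
        retriever = params["retriever"]


question = "what is the best pug mix?"
body = {
    "retriever": rule(
        rerank(
            rrf(
                standard({"semantic": {"field": "semantic_field", "query": question}}),
                standard({"query_string": {"query": "puggles"}}),
            ),
            field="text_field",
            inference_id="elastic-rerank-endpoint",
            inference_text=question,
        ),
        match_criteria={"query_string": "puggles"},
        ruleset_ids=["my-ruleset"],
    )
}

print(nesting_order(body["retriever"]))  # ['rule', 'text_similarity_reranker', 'rrf']
```

Reading the call from the outside in mirrors the retriever tree: rules are applied on top of reranked results, which are in turn computed over the RRF-merged results.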

Combining rule types

Query rules aren’t just for pinned documents anymore! In Elasticsearch 8.16 we introduced a new rule type, exclude. This allows you to specify documents that you never want to be returned in search results, as well as documents you want to pin at the top of search results.

Use cases for the exclude rule include but are not limited to:

  • Fixing relevance issues in specific queries by removing results that aren’t helpful or relevant to the query
  • Temporarily suppressing documents that shouldn’t appear in any search results until a certain time

Here’s an example of a query ruleset that includes both pinned and exclude rules:

PUT /_query_rules/my-ruleset
{
  "rules": [
    {
      "rule_id": "rule1",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query_string",
          "values": [ "puggles" ]
        }
      ],
      "actions": {
        "ids": [
          "id1"
        ]
      }
    },
    {
      "rule_id": "rule2",
      "type": "exclude",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query_string",
          "values": [ "chiweenies" ]
        }
      ],
      "actions": {
        "ids": [
          "id2"
        ]
      }
    }
  ]
}

Rules are applied based on match criteria, so it’s possible for a rule retriever to match on both pinned and excluded documents in the same query.
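As a rough mental model of what happens when both rule types match, the Python sketch below applies a set of pinned and excluded ids to a list of organic results. It is a simplification under stated assumptions, not Elasticsearch's implementation: it assumes excluded documents are dropped, pinned documents are placed first in their listed order whether or not the organic results contained them, and no document is both pinned and excluded (the sketch does not try to model how that conflict is resolved). The function name apply_rules is hypothetical.

```python
def apply_rules(organic_ids: list[str], pinned: list[str], excluded: list[str]) -> list[str]:
    """Toy model: pinned ids first, then organic results minus excluded and pinned ids."""
    skip = set(excluded) | set(pinned)
    remaining = [doc_id for doc_id in organic_ids if doc_id not in skip]
    return list(pinned) + remaining


# Hypothetical organic ranking returned by the sub-retriever.
organic = ["id2", "id3", "id1", "id4"]

print(apply_rules(organic, pinned=["id1"], excluded=["id2"]))  # ['id1', 'id3', 'id4']
```

Here id1 moves to the top, id2 disappears, and the remaining documents keep their organic order.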

Try it out yourself

The rule retriever is exceptionally powerful when combined with semantic search and reranking strategies, because it offers fine-tuned control over search results while harnessing the power of semantic search. The rule retriever is already available in our Serverless offering, and will be available for Stack versions starting with 8.17.0.

Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now.

Ready to build state of the art search experiences?

Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want.