This is a cache of https://www.elastic.co/search-labs/blog/esql-ruby-helper-elasticsearch. It is a snapshot of the page at 2024-10-29T00:39:32.264+0000.
How to use the ES|QL Helper in the Elasticsearch Ruby Client - Search Labs

How to use the ES|QL Helper in the Elasticsearch Ruby Client

Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results.

Introduction

The Elasticsearch Ruby client can be used to craft EQ|QL queries and make it easier to work with the data returned from esql.query. ES|QL allows developers to filter, transform, and analyze data stored in Elasticsearch via queries. It uses "pipes" (|) to work with the data step by step.

The esql.query API has been supported in the Elasticsearch Ruby Client since it was available as experimental in version 8.11.0.

You can execute an ES|QL request with the following code:

client = Elasticsearch::Client.new
query = <<ESQL
        FROM sample_data
        | EVAL duration_ms = ROUND(event.duration / 1000000.0, 1)
ESQL

response = client.esql.query(body: { query: query })

The default response is parsed from JSON (you can also get a CSV or text by passing in the format parameter), and it looks like this:

puts response

{"columns"=>[
  {"name"=>"@timestamp", "type"=>"date"},
  {"name"=>"client.ip", "type"=>"ip"},
  {"name"=>"event.duration", "type"=>"long"},
  {"name"=>"message", "type"=>"keyword"},
  {"name"=>"duration_ms", "type"=>"double"}
],
"values"=>[
  ["2023-10-23T12:15:03.360Z", "172.21.2.162", 3450233, "Connected to 10.1.0.3", 3.5],
  ["2023-10-23T12:27:28.948Z", "172.21.2.113", 2764889, "Connected to 10.1.0.2", 2.8],
  ["2023-10-23T13:33:34.937Z", "172.21.0.5", 1232382, "Disconnected", 1.2],
  ["2023-10-23T13:51:54.732Z", "172.21.3.15", 725448, "Connection error", 0.7],
  ["2023-10-23T13:52:55.015Z", "172.21.3.15", 8268153, "Connection error", 8.3],
  ["2023-10-23T13:53:55.832Z", "172.21.3.15", 5033755, "Connection error", 5.0],
  ["2023-10-23T13:55:01.543Z", "172.21.3.15", 1756467, "Connected to 10.1.0.1", 1.8]
]}

ES|QL Helper

In Elasticsearch Ruby v8.13.0, the client introduced the ES|QL Helper for the esql.query API. Instead of the default response, the helper returns an array of hashes with the columns as keys and the respective values instead of the default JSON value.

Additionally, you can iterate through the response values and transform the data in by passing in a Hash of column => Proc values. You could use this for example to convert a @timestamp column value into a DateTime object. We'll take a look at how to use this with example data.

Setup and Ingesting data

For this example, we're using the JSON dump from TheGamesDB, a community driven crowd-sourced games information website. Once we've downloaded the JSON file, we can ingest it into Elasticsearch by using another Helper form the Ruby client, the Bulk Helper.

The data includes the list of all games in the database within the data.games keys. It also includes platforms and box art information, but for the purpose of this example, we're only going to use the games data. The BulkHelper provides a way to ingest a JSON file directly into Elasticsearch.

To use the helper, we need to require it in our code, and instantiate it with a client and an index on which to perform the bulk action (we can change the index later on an already instantiated helper). We can use ingest_json and pass in the JSON file, the keys where it can find the data, and slice to separate the documents in batches before sending them to Elasticsearch:

require 'elasticsearch/helpers/bulk_helper'
file = './database-latest.json'
index = 'videogames'

bulk_helper = Elasticsearch::Helpers::BulkHelper.new(client, index)
bulk_helper.ingest_json(file, keys: ['data', 'games'], slice: 100)

This will ingest all the game titles with their respective information into the videogames index.

Using the ES|QL Helper

With the data loaded, we can now query it with ES|QL:

require 'elasticsearch/helpers/esql_helper'

query = <<~ESQL
  FROM videogames
  | WHERE game_title LIKE "*Turrican*"
  | LIMIT 100
  | SORT release_date
ESQL

If we run this query with the esql.query API directly, we'll get the columns/values result:

response = client.esql.query(body: { query: query })
response['columns']
[
  {"name"=>"alternates", "type"=>"text"},
  {"name"=>"alternates.keyword", "type"=>"keyword"},
  {"name"=>"coop", "type"=>"text"},
  {"name"=>"coop.keyword", "type"=>"keyword"},
  {"name"=>"country_id", "type"=>"long"},
  {"name"=>"developers", "type"=>"long"},
  {"name"=>"game_title", "type"=>"text"},
 ...
]

response['values']
[
  [nil, nil, "No", "No", 0, 6970, "Turrican",  ...],
  [nil, nil, "No", "No", 0, nil, "Turrican II: The Final Fight", ...]

The helper however, returns an Array of Hashes with the columns as keys and the respective values. So we can work with the response, and access the value for each Hash in the Array with the name of a column as the key:

response = Elasticsearch::Helpers::ESQLHelper.query(client, query)
[
  {
    "alternates"=>nil,
    "alternates.keyword"=>nil,
    "coop"=>"No",
    "coop.keyword"=>"No",
    "country_id"=>0,
    "developers"=>6970,
    "game_title"=>"Turrican",
    ...
  },
  ...
]

response.map { |game| "#{game['game_title']} - πŸ“… #{game['release_date']}" }
[
  "Turrican - πŸ“… 1990-01-01T00:00:00.000Z",
  "Turrican II: The Final Fight - πŸ“… 1990-01-01T00:00:00.000Z",
  ...
]

The ESQLHelper also provides the ability to transform the data in the response. We can do this by passing in a Hash of column => Proc values. For example, let's say we want to format the release date in this previous query to show a more human friendly date. We can run this:

parser = proc do |t|
  DateTime.parse(t).strftime('%B %d, %Y')
end

response = Elasticsearch::Helpers::ESQLHelper.query(client, query, parser: { 'release_date' => parser })

If we run the same code from before, we'll get this result:

response.map { |game| "#{game['game_title']} - πŸ“… #{game['release_date']}" }
[
  "Turrican - πŸ“… January 01, 1990",
  "Turrican II: The Final Fight - πŸ“… January 01, 1990",
  ...
]

You can pass in as many Procs as there are columns in the response. For example, the data includes a youtube field, where sometimes the url for a video on YouTube is stored, other times just the video hash (e.g. U4bKxcV5hsg). The url for a YouTube video follows the convention https://youtube.com/watch?v=VIDEOHASH. So we could also add a parser to prepend the url to the values that only include the hash:

parser = {
  'release_date' => proc { |t| DateTime.parse(t).strftime('%B %d, %Y') },
  'youtube' => proc do |video|
    if video =~ URI::DEFAULT_PARSER.make_regexp
      video
    elsif video.nil? || video.empty?
      nil
    else
      "https://www.youtube.com/watch?v=#{video}"
    end
  end
}
# We run the query again and pass the new parser to the helper:
response = Elasticsearch::Helpers::ESQLHelper.query(client, query, parser: parser)

If we then run response.map { |a| a['youtube'] }.compact, we'll get just the urls for YouTube videos for the videogames we're looking for.

Conclusion

As you can see, the ESQLHelper class can make it easier to work with the data returned from esql.query. You can learn more about the Elasticsearch Ruby Client and its helpers in theΒ official documentation. And if you have any feedback, questions or requests, don't hesitate to create a new issue in the client's repository.

Ready to try this out on your own? Start a free trial.

Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running!

Ready to build state of the art search experiences?

Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want.

Try it yourself