Introduction
The Elasticsearch Ruby client can be used to craft EQ|QL queries and make it easier to work with the data returned from esql.query
. ES|QL allows developers to filter, transform, and analyze data stored in Elasticsearch via queries. It uses "pipes" (|
) to work with the data step by step.
The esql.query
API has been supported in the Elasticsearch Ruby Client since it was available as experimental in version 8.11.0.
You can execute an ES|QL request with the following code:
client = Elasticsearch::Client.new
query = <<ESQL
FROM sample_data
| EVAL duration_ms = ROUND(event.duration / 1000000.0, 1)
ESQL
response = client.esql.query(body: { query: query })
The default response is parsed from JSON (you can also get a CSV or text by passing in the format
parameter), and it looks like this:
puts response
{"columns"=>[
{"name"=>"@timestamp", "type"=>"date"},
{"name"=>"client.ip", "type"=>"ip"},
{"name"=>"event.duration", "type"=>"long"},
{"name"=>"message", "type"=>"keyword"},
{"name"=>"duration_ms", "type"=>"double"}
],
"values"=>[
["2023-10-23T12:15:03.360Z", "172.21.2.162", 3450233, "Connected to 10.1.0.3", 3.5],
["2023-10-23T12:27:28.948Z", "172.21.2.113", 2764889, "Connected to 10.1.0.2", 2.8],
["2023-10-23T13:33:34.937Z", "172.21.0.5", 1232382, "Disconnected", 1.2],
["2023-10-23T13:51:54.732Z", "172.21.3.15", 725448, "Connection error", 0.7],
["2023-10-23T13:52:55.015Z", "172.21.3.15", 8268153, "Connection error", 8.3],
["2023-10-23T13:53:55.832Z", "172.21.3.15", 5033755, "Connection error", 5.0],
["2023-10-23T13:55:01.543Z", "172.21.3.15", 1756467, "Connected to 10.1.0.1", 1.8]
]}
ES|QL Helper
In Elasticsearch Ruby v8.13.0, the client introduced the ES|QL Helper for the esql.query
API. Instead of the default response, the helper returns an array of hashes with the columns as keys and the respective values instead of the default JSON value.
Additionally, you can iterate through the response values and transform the data in by passing in a Hash of column => Proc
values. You could use this for example to convert a @timestamp
column value into a DateTime
object. We'll take a look at how to use this with example data.
Setup and Ingesting data
For this example, we're using the JSON dump from TheGamesDB, a community driven crowd-sourced games information website. Once we've downloaded the JSON file, we can ingest it into Elasticsearch by using another Helper form the Ruby client, the Bulk Helper.
The data includes the list of all games in the database within the data.games
keys. It also includes platforms and box art information, but for the purpose of this example, we're only going to use the games data. The BulkHelper provides a way to ingest a JSON file directly into Elasticsearch.
To use the helper, we need to require it in our code, and instantiate it with a client and an index on which to perform the bulk action (we can change the index later on an already instantiated helper). We can use ingest_json
and pass in the JSON file, the keys where it can find the data, and slice
to separate the documents in batches before sending them to Elasticsearch:
require 'elasticsearch/helpers/bulk_helper'
file = './database-latest.json'
index = 'videogames'
bulk_helper = Elasticsearch::Helpers::BulkHelper.new(client, index)
bulk_helper.ingest_json(file, keys: ['data', 'games'], slice: 100)
This will ingest all the game titles with their respective information into the videogames
index.
Using the ES|QL Helper
With the data loaded, we can now query it with ES|QL:
require 'elasticsearch/helpers/esql_helper'
query = <<~ESQL
FROM videogames
| WHERE game_title LIKE "*Turrican*"
| LIMIT 100
| SORT release_date
ESQL
If we run this query with the esql.query
API directly, we'll get the columns/values result:
response = client.esql.query(body: { query: query })
response['columns']
[
{"name"=>"alternates", "type"=>"text"},
{"name"=>"alternates.keyword", "type"=>"keyword"},
{"name"=>"coop", "type"=>"text"},
{"name"=>"coop.keyword", "type"=>"keyword"},
{"name"=>"country_id", "type"=>"long"},
{"name"=>"developers", "type"=>"long"},
{"name"=>"game_title", "type"=>"text"},
...
]
response['values']
[
[nil, nil, "No", "No", 0, 6970, "Turrican", ...],
[nil, nil, "No", "No", 0, nil, "Turrican II: The Final Fight", ...]
The helper however, returns an Array of Hashes with the columns as keys and the respective values. So we can work with the response, and access the value for each Hash in the Array with the name of a column as the key:
response = Elasticsearch::Helpers::ESQLHelper.query(client, query)
[
{
"alternates"=>nil,
"alternates.keyword"=>nil,
"coop"=>"No",
"coop.keyword"=>"No",
"country_id"=>0,
"developers"=>6970,
"game_title"=>"Turrican",
...
},
...
]
response.map { |game| "#{game['game_title']} - π
#{game['release_date']}" }
[
"Turrican - π
1990-01-01T00:00:00.000Z",
"Turrican II: The Final Fight - π
1990-01-01T00:00:00.000Z",
...
]
The ESQLHelper also provides the ability to transform the data in the response. We can do this by passing in a Hash of column => Proc
values. For example, let's say we want to format the release date in this previous query to show a more human friendly date. We can run this:
parser = proc do |t|
DateTime.parse(t).strftime('%B %d, %Y')
end
response = Elasticsearch::Helpers::ESQLHelper.query(client, query, parser: { 'release_date' => parser })
If we run the same code from before, we'll get this result:
response.map { |game| "#{game['game_title']} - π
#{game['release_date']}" }
[
"Turrican - π
January 01, 1990",
"Turrican II: The Final Fight - π
January 01, 1990",
...
]
You can pass in as many Procs as there are columns in the response. For example, the data includes a youtube
field, where sometimes the url for a video on YouTube is stored, other times just the video hash (e.g. U4bKxcV5hsg
). The url for a YouTube video follows the convention https://youtube.com/watch?v=VIDEOHASH
. So we could also add a parser to prepend the url to the values that only include the hash:
parser = {
'release_date' => proc { |t| DateTime.parse(t).strftime('%B %d, %Y') },
'youtube' => proc do |video|
if video =~ URI::DEFAULT_PARSER.make_regexp
video
elsif video.nil? || video.empty?
nil
else
"https://www.youtube.com/watch?v=#{video}"
end
end
}
# We run the query again and pass the new parser to the helper:
response = Elasticsearch::Helpers::ESQLHelper.query(client, query, parser: parser)
If we then run response.map { |a| a['youtube'] }.compact
, we'll get just the urls for YouTube videos for the videogames we're looking for.
Conclusion
As you can see, the ESQLHelper class can make it easier to work with the data returned from esql.query
. You can learn more about the Elasticsearch Ruby Client and its helpers in theΒ official documentation. And if you have any feedback, questions or requests, don't hesitate to create a new issue in the client's repository.
Ready to try this out on your own? Start a free trial.
Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running!