Elasticsearch 8.16 has a new functionality that allows you to upload PDF files directly into Kibana and analyze them using Playground. In this article, we'll see how to use this functionality by uploading a resume in PDF format and then using Playground to interact with it.
Playground is a low-code platform hosted in Kibana, that allows you to create a RAG application and chat with your content. You can read more about it in this article and even test it using this link.
steps:
- Configure the Elasticsearch Inference service Endpoint
- Upload PDFs to Kibana
- Interact with the data in Playground
Configure the Elasticsearch Inference service Endpoint
To run semantic searches, we must first configure an inference endpoint. In this example, we'll use the Elasticsearch Inference Endpoint. This endpoint offers:
- rerank
- sparse embedding
- text embedding
For this example, let's select sparse embedding:
PUT _inference/sparse_embedding/my-elser-model
{
"service": "elasticsearch",
"service_settings": {
"adaptive_allocations": {
"enabled": true,
"min_number_of_allocations": 1,
"max_number_of_allocations": 10
},
"num_threads": 1,
"model_id": ".elser_model_2"
}
}
Once configured, confirm that the model was correctly loaded into Kibana by checking search > Relevance > Inference Endpoint in the Kibana UI.
Upload PDFs to Kibana
We'll upload the resume of a junior developer to learn how to use the Kibana upload files functionality.
Go to the Kibana UI and follow these steps:
Next, for Import Data, we have two options:
simple: This is the default option and it allows us to quickly upload our PDF into the index and automatically creates a data view with the indexed info.
Advanced: This option allows us to customize mappings or add ingest pipelines. Within these settings you can:
- Add a semantic text type of field.
- Index settings: If you want to configure things like shards or analyzers.
- Index Mappings: If you want to change a field type or how you define your data.
- Ingest Pipeline: If you want to make changes to your data before indexing it.
Go to "Advanced" and select "Add additional field":
select the field attachment.content
; in “copy to field” type "content" and make sure that the inference endpoint is my-elser-model:
The field Copy to is used to copy the content from attachment.content
to a new semantic_text
field of (content), which automatically generates vector embeddings using the underlying Inference endpoint (Elastic’s ELsER in this case). This makes both the semantic and text fields available so you can run full-text, semantic, or hybrid searches.
Once everything is configured, click on "Import":
Now that the index is created, we can explore it using Playground.
Interact with the data in Playground
Connect to Playground
After configuring the index and uploading the resumes, we now need to connect the index to Playground. Click Connect to an LLM and select one of the options.
Configure the chatbot
Once Playground has been configured and we have indexed Alex Johnson's resume, we can interact with the data. Using semantic search and LLMs we can ask questions using natural language and get answers even if the documents don't have the keywords we used in the query, like in the example below:
Using the instructions menu, we can control the chatbot behavior and define features like the response format. It can also include citations, to make sure the answer is properly grounded.
If we go to the "Query" tab, we can see the query generated by Playground and we add both a text
and a semantic_text
fields, Playground will automatically generate a hybrid query to normalize the score between different types of different types of queries.
Playground not only answers questions but also helps us understand the internal components of a RAG system, like querying, retrieval phase, context and prompt instructions.
Give it a try!
With the Elasticsearch 8.16 update, we can easily upload PDF/Word/Powerpoint files using the Kibana UI. It can automatically create an index in the simple mode, and you can use the advanced mode to customize your index and tailor it to your needs.
Once your files are uploaded, you can access Playground and quickly and easily chat with them since Playground will handle the LLM interactions and provide the best query based on the type of fields you want to search.
Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running!
Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now.