
How to ingest data to Elasticsearch through Logstash

A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search.

What is Logstash?

Logstash is a widely used Elastic Stack tool for processing large volumes of log data in real time. It acts as a data pipeline, integrating information from various sources into a single structured flow. Its primary function is to reliably extract, transform, and load data.

Logstash offers several advantages. It supports many types of inputs, filters, and outputs, which enables integration with a wide range of sources and destinations. It captures and transforms data in real time as events arrive. Its native integration with the Elastic Stack, especially Elasticsearch and Kibana, facilitates data analysis and visualization. It also includes advanced filters for data normalization, enrichment, and transformation.

How does Logstash work?

Logstash is composed of inputs, filters, and outputs, which together form the data processing pipeline. These components are configured in a .conf file that defines the data ingestion flow.

  • Inputs: Capture data from various sources.
  • Filters: Process and transform the captured data.
  • Outputs: Send the transformed data to defined destinations.
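The three stages above can be pictured with a small Python model. This is only a conceptual sketch, not how Logstash is implemented: the `run_pipeline` function, the `uppercase_level` filter, and the sample event are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable

Event = dict  # an event is modeled here as a plain dictionary


def run_pipeline(
    source: Iterable[Event],
    filters: list[Callable[[Event], Event]],
    sink: Callable[[Event], None],
) -> None:
    """Pass every event from the input through each filter in order, then to the output."""
    for event in source:
        for apply_filter in filters:
            event = apply_filter(event)
        sink(event)


def uppercase_level(event: Event) -> Event:
    """Hypothetical filter: normalize the log level to upper case."""
    return {**event, "level": event["level"].upper()}


collected: list[Event] = []
run_pipeline(
    source=[{"level": "info", "message": "service started"}],  # stands in for an input
    filters=[uppercase_level],                                   # stands in for filters
    sink=collected.append,                                       # stands in for an output
)
print(collected)
```

The point of the model is the ordering: every event passes through all filters, in the order they are declared, before it reaches the output.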

The most common types of each component are presented below:

Types of Inputs:

  • File: Reads log files in various formats (text, JSON, CSV, etc.).
  • Message Queues: Kafka, RabbitMQ.
  • APIs: Webhooks or other data collection APIs.
  • Databases: JDBC connections for relational data extraction.

Types of Filters:

  • Grok: For analyzing and extracting text patterns.
  • Mutate: Modifies fields (renames, converts types, removes data).
  • Date: Parses date and time strings from a field and uses the result as the event timestamp.
  • GeoIP: Enriches logs with geographic data.
  • JSON: Parses or generates JSON data.

Types of Outputs:

  • Elasticsearch: The most common destination. Elasticsearch is a search and analytics engine that allows powerful searches and visualizations of data indexed by Logstash.
  • Files: Stores processed data locally.
  • Cloud Services: Logstash can send data to various cloud services, such as AWS S3, Google Cloud Storage, Azure Blob Storage, for storage or analysis.
  • Databases: Logstash can send data to various other databases, such as MySQL, PostgreSQL, MongoDB, etc., through specific connectors.

Data Ingestion for Elasticsearch

In this example, we implement data ingestion into Elasticsearch using Logstash. The pipeline follows this flow:

  1. Kafka will be used as the data source.
  2. Logstash will consume the data and apply filters such as grok, geoip, and mutate to structure it.
  3. The transformed data will be sent to an index in Elasticsearch.
  4. Kibana will be used to visualize the indexed data.


We will use Docker Compose to create an environment with the necessary services: Elasticsearch, Kibana, Logstash, and Kafka. The Logstash configuration file, named logstash.conf, will be mounted directly into the Logstash container. The contents of that file are detailed in the sections that follow.

Here is docker-compose.yml:

As mentioned above, the next step is to define the Logstash pipeline by describing its input, filter, and output configurations.

Create the logstash.conf file in the current directory (where docker-compose.yml is located). In docker-compose.yml, this local file is mounted inside the container at the path /usr/share/logstash/pipeline/logstash.conf.

Logstash Pipeline Configuration

The Logstash pipeline is divided into three sections: input, filter, and output.

  • Input: Defines where the data will be consumed from (in this case, Kafka).
  • Filter: Applies transformations and structuring to the raw data.
  • Output: Specifies where the processed data will be sent (in this case, Elasticsearch).

Next, we will configure each of these steps in detail.

Input Configuration

The data source is a Kafka topic. To consume data from it, we need to configure the Kafka input plugin in Logstash, where we define:

  • bootstrap_servers: Address of the Kafka server.
  • topics: Name of the topic to be consumed.
  • group_id: Consumer group identifier.

With this, we are ready to receive the data.

Filter Configuration

Filters are responsible for transforming and structuring data. Let's configure the following filters:

Grok Filter

Extracts structured information from unstructured data. In this case, it extracts the timestamp, log level, client IP, URI, status, and the JSON payload.

The example log:

Extracted Fields:

  • timestamp: Extracts the date and time (e.g., 2025-01-05T16:30:15).
  • log_level: Captures the log level (e.g., INFO, ERROR).
  • client_ip: Captures the client's IP address (e.g.,
  • uri: Captures the URI path (e.g., /api/products).
  • status: Captures the HTTP status code (e.g., 200).
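To make the extraction concrete, here is a rough Python equivalent that uses a regular expression with named groups, which is conceptually what a grok pattern compiles to. The log layout, the sample line, and the IP address (taken from a documentation-only range) are assumptions made for this sketch, since the article's own example log is not shown here.

```python
import re

# Assumed layout: "<timestamp> <log_level> <client_ip> <uri> <status> <json payload>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+"
    r"(?P<log_level>[A-Z]+)\s+"
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<uri>/\S*)\s+"
    r"(?P<status>\d{3})\s+"
    r"(?P<payload>\{.*\})"
)


def extract_fields(message: str):
    """Return the named fields as a dict, or None when the line does not match.

    Logstash itself would tag a non-matching event with _grokparsefailure.
    """
    match = LOG_PATTERN.fullmatch(message.strip())
    return match.groupdict() if match else None


# Hypothetical log line, invented for illustration only
sample = '2025-01-05T16:30:15 INFO 203.0.113.10 /api/products 200 {"user_id": 1}'
fields = extract_fields(sample)
print(fields)
```

Note that every captured value, including status, comes out as a string. That is why a type conversion is applied later in the pipeline.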

Date Filter

Converts the timestamp field into a format readable by Elasticsearch and stores it in @timestamp.
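As a rough illustration of what this step does, the Python sketch below parses a timestamp string and writes a UTC, ISO 8601 value into @timestamp. The `apply_date_filter` helper is hypothetical, and treating the source timestamp as UTC is an assumption of this sketch; a real pipeline should set the timezone that matches its logs.

```python
from datetime import datetime, timezone


def apply_date_filter(event: dict) -> dict:
    """Parse event["timestamp"] and store the result in event["@timestamp"]."""
    parsed = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%S")
    parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: the log time is already UTC
    iso_utc = parsed.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {**event, "@timestamp": iso_utc}


event = apply_date_filter({"timestamp": "2025-01-05T16:30:15"})
print(event["@timestamp"])
```

The original timestamp field is left in place, so it can be removed in a later step once @timestamp has been set.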

GeoIP Filter

Next, we will use the geoip filter to retrieve geographic information, such as country, region, city, and coordinates, based on the value of the client_ip field.
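The enrichment idea can be sketched in Python with a toy lookup table. The real plugin resolves addresses against a MaxMind GeoLite2 database; the table below, its field names, and every value in it are made-up placeholders for a documentation-only address range.

```python
import ipaddress

# Toy stand-in for a GeoIP database: placeholder values, not real geographic data
TOY_GEO_DB = {
    ipaddress.ip_network("203.0.113.0/24"): {
        "country_name": "Exampleland",
        "city_name": "Sample City",
        "location": {"lat": 0.0, "lon": 0.0},
    },
}


def enrich_with_geo(event: dict, source: str = "client_ip", target: str = "geoip") -> dict:
    """Attach geographic data under `target` when the IP in `source` is known."""
    address = ipaddress.ip_address(event[source])
    for network, geo in TOY_GEO_DB.items():
        if address in network:
            return {**event, target: geo}
    return event  # unknown address: the event passes through unchanged


enriched = enrich_with_geo({"client_ip": "203.0.113.10"})
unknown = enrich_with_geo({"client_ip": "198.51.100.7"})
print(enriched)
```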

Mutate Filter

The mutate filter allows transformations on fields. In this case, we will use two of its properties:

  • remove_field: Removes the timestamp and message fields, as they are no longer needed.
  • convert: Converts the status field from a string to an integer.
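The effect of these two properties on an event can be sketched in Python as follows. The `apply_mutate` helper and the sample event are hypothetical and only mirror the behavior described above.

```python
def apply_mutate(event: dict, remove_field: list, convert: dict) -> dict:
    """Convert the listed fields to new types, then drop the unwanted fields."""
    result = dict(event)
    for field, caster in convert.items():
        if field in result:
            result[field] = caster(result[field])
    for field in remove_field:
        result.pop(field, None)  # ignore fields that are already absent
    return result


# Hypothetical event as it might look after the grok and date steps
event = {
    "timestamp": "2025-01-05T16:30:15",
    "message": "raw log line",
    "status": "200",
    "uri": "/api/products",
}
cleaned = apply_mutate(
    event,
    remove_field=["timestamp", "message"],
    convert={"status": int},
)
print(cleaned)
```

Storing status as an integer rather than a string lets it be used in numeric range queries and aggregations once indexed.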

Output Configuration

The output defines where the transformed data will be sent. In this case, we will use Elasticsearch.
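For intuition about what reaches Elasticsearch, events are typically written in batches through the bulk API, whose request body alternates an action line with a document line. The Python sketch below only builds such a body and sends nothing; the `build_bulk_body` helper and the index name logs-demo are assumptions for illustration.

```python
import json


def build_bulk_body(events: list, index: str) -> str:
    """Build a newline-delimited JSON body in the bulk API format."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                         # document line
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = build_bulk_body(
    [{"status": 200, "uri": "/api/products"}],
    index="logs-demo",  # hypothetical index name
)
print(body)
```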

We now have our configuration file defined. Below is the complete file:

Send and Ingest Data

With the containers running, we can start sending messages to the topic and wait for the data to be indexed. First, create the topic if you haven't already.

To send the messages, execute the following command in the terminal:

Messages to be sent:

To view the indexed data, go to Kibana:

Once the indexing has been successfully completed, we can view and analyze the data in Kibana. The mapping and indexing process ensures that the fields are structured according to the configurations defined in Logstash.


With the configuration presented, we created a pipeline using Logstash to index logs in a containerized environment with Elasticsearch and Kafka. We explored Logstash's flexibility to process messages using filters such as grok, date, geoip, and mutate, structuring the data for analysis in Kibana. Additionally, we demonstrated how to configure the integration with Kafka to consume messages and use them for processing and indexing the data.



