Tutorial

Set up cross-cluster replication in Elasticsearch

A comprehensive guide to setting up and managing cross-cluster replication in Elasticsearch for enhanced data availability and disaster recovery

By Mukesh Dhamat, Ramani Vishal Damjibhai

To learn more about Elasticsearch, check out "The ultimate guide to Elasticsearch."

Cross-Cluster Replication (CCR) in Elasticsearch is a feature that enables the replication of indices from one Elasticsearch cluster (the leader cluster) to another cluster (the follower cluster). This ensures continuous synchronization of data between clusters, maintaining an up-to-date copy of the indices in the follower cluster.

Why use cross-cluster replication?

Disaster recovery: CCR provides a robust disaster recovery solution by maintaining up-to-date copies of critical indices in geographically separated clusters. If the primary cluster fails or suffers from a major outage, the follower cluster can take over with minimal data loss, ensuring business continuity.
Data locality: In scenarios where users or applications are distributed across different geographical regions, CCR ensures that data is available closer to the end-users. This reduces latency and improves the performance of read operations.
High availability: CCR increases the availability of your data. Even if one cluster experiences downtime, the replicated data remains accessible from the follower cluster, ensuring uninterrupted access to critical information.
Load balancing: By distributing read requests between the leader and follower clusters, CCR can help balance the load, reducing the strain on a single cluster and improving overall system performance and reliability.
Regulatory compliance: Certain regulations may require that data be stored in specific geographic locations. CCR allows you to comply with these requirements by replicating data to clusters located in the required regions.

Cross-cluster replication setup in the same OpenShift cluster but in different namespaces

Create a remote cluster connection

To create a remote cluster connection to another Elasticsearch cluster deployed within the same OpenShift cluster, specify the remoteClusters attribute in your Elasticsearch spec. The following example describes how to configure elastic01 in the elastic namespace as a remote cluster in elastic02 in the elastic2 namespace.

Update the Elasticsearch custom resource for elastic02. In this scenario, elastic01 is the leader cluster and elastic02 is the follower cluster.

Note: Add node.roles: - remote_cluster_client in the follower cluster’s Elasticsearch.yaml file.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic02
  namespace: elastic2
spec:
  nodeSets:
  - count: 3
    name: default
  remoteClusters:
  - name: elastic01
    elasticsearchRef:
      name: elastic01
      namespace: elastic
  version: 8.14.0

Configure the remote cluster connection through the Elasticsearch REST API

In the elastic namespace, expose the transport layer of elastic01 through a load balancer. Add a LoadBalancer service in the Elasticsearch.yaml file for elastic01:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic01
  namespace: elastic
spec:
  transport:
    service:
      spec:
        type: LoadBalancer

This configuration creates a service <elastic cluster name>-es-transport of type LoadBalancer. This will create a LoadBalancer resource in IBM Cloud.

Using the Elasticsearch REST API, configure elastic01 (namespace: elastic) as a remote cluster in elastic02 (namespace: elastic2). Run the following request in the Kibana Dev Tools console of elastic02:

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "elastic01": {
          "mode": "proxy",
          "proxy_address": "${LOADBALANCER_IP or LOADBALANCER_DNS}:9300"
        }
      }
    }
  }
}

Use the public IP of the LoadBalancer or the FQDN of the LoadBalancer in the proxy_address.

Once this is done, you will see that the remote cluster is in a Connected state.
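
If you prefer to confirm the connection from a script rather than the UI, the following is a minimal sketch that reads the response of GET _remote/info on elastic02. The base URL, the Authorization header value, and the assumption that the cluster's HTTP certificate is trusted by Python's default SSL context are all illustrative and not part of the original setup; the sample response is trimmed to the fields used here.

```python
import json
import urllib.request


def is_remote_connected(remote_info: dict, alias: str) -> bool:
    """Return True if the remote cluster `alias` is reported as connected."""
    return bool(remote_info.get(alias, {}).get("connected", False))


def fetch_remote_info(base_url: str, auth_header: str) -> dict:
    """Call GET _remote/info on the follower cluster.

    `base_url` and `auth_header` are assumptions for this sketch, for
    example "https://elastic02.example.com:9200" and a Basic or ApiKey
    header value. This function is defined but not called below.
    """
    req = urllib.request.Request(
        f"{base_url}/_remote/info",
        headers={"Authorization": auth_header},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Trimmed example of the response shape for a proxy-mode remote.
sample = {"elastic01": {"connected": True, "mode": "proxy"}}
print(is_remote_connected(sample, "elastic01"))
```

A missing alias or a "connected": false entry both return False, which usually points at the proxy_address or at the transport certificates.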

Cross-cluster replication setup: Leader and Follower index

Go to the Cross-Cluster Replication section under Management in the elastic02 Kibana UI.
Click Create a Follower Index.
Select the remote cluster.
Enter the name of the leader index you want to replicate from the leader cluster (elastic01 in this example). Ensure that the leader index exists in the remote cluster.
Enter the name of the follower index you want to create in the follower cluster.
Start replication.
Click Create.

This will start replicating documents from the Leader index to the Follower index.
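
The Kibana steps above correspond to the CCR follow API (PUT /<follower_index>/_ccr/follow). As a rough sketch, the request can be assembled like this; the index names my-index and my-index-follower are hypothetical, and the helper only builds the request without sending it.

```python
import json


def build_follow_request(follower_index: str, remote_cluster: str, leader_index: str):
    """Return (method, path, body) for the CCR follow API.

    The body names the remote cluster alias configured on the follower
    and the leader index to replicate from.
    """
    path = f"/{follower_index}/_ccr/follow?wait_for_active_shards=1"
    body = {"remote_cluster": remote_cluster, "leader_index": leader_index}
    return "PUT", path, json.dumps(body)


# Hypothetical index names, for illustration only.
method, path, body = build_follow_request("my-index-follower", "elastic01", "my-index")
print(method, path)
print(body)
```

The same method, path, and body can be pasted into the Dev Tools console of elastic02 or sent with any HTTP client.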

Cross-cluster replication setup between different OpenShift clusters

You can also configure a remote cluster connection between two Elasticsearch clusters that run in different OpenShift clusters.

Ensure that both clusters trust each other’s certificate authority.

Configure the remote cluster connection through the Elasticsearch REST API

Example:

elastic01 is hosted in one OpenShift cluster.
elastic02 is hosted in a different OpenShift cluster.

To configure elastic01 as a remote cluster in elastic02:

a. Trust Certificates:

The CA certificate of the Elasticsearch transport layer is stored in a secret named <cluster_name>-es-transport-certs-public. To extract the certificate for elastic01, run the following command in the namespace where elastic01 is deployed:

oc get secret elastic01-es-transport-certs-public -o go-template='{{index .data "ca.crt" | base64decode}}' > remote.ca.crt

This command extracts the certificate from the elastic01 cluster. Next, log in to the elastic02 cluster and run the following command to create a ConfigMap, referencing the extracted file named remote.ca.crt:

oc create configmap remote-certs --from-file=ca.crt=remote.ca.crt
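
For readers who script this step, here is a small sketch of what the go-template in the extraction command does: it reads the ca.crt key from the secret's data map and base64-decodes it. It assumes the secret was dumped as JSON (for example with -o json), and the certificate text below is a placeholder, not a real certificate.

```python
import base64
import json


def extract_ca_cert(secret_json: str) -> str:
    """Decode the base64-encoded `ca.crt` entry of a Kubernetes secret."""
    secret = json.loads(secret_json)
    return base64.b64decode(secret["data"]["ca.crt"]).decode("utf-8")


# Placeholder PEM text standing in for the real CA certificate.
fake_pem = "-----BEGIN CERTIFICATE-----\nplaceholder\n-----END CERTIFICATE-----\n"
fake_secret = json.dumps(
    {"data": {"ca.crt": base64.b64encode(fake_pem.encode("utf-8")).decode("ascii")}}
)

print(extract_ca_cert(fake_secret) == fake_pem)
```

Writing the returned string to remote.ca.crt gives the same file that the ConfigMap command expects.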

b. Configure the Trusted CA:

Use this ConfigMap to configure elastic01’s CA as a trusted CA in elastic02. Open the Elasticsearch YAML file for elastic02 and add the following content:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic02
spec:
  transport:
    tls:
      certificateAuthorities:
        configMapName: remote-certs
  nodeSets:
  - count: 3
    name: default
  version: 8.14.0

c. Repeat for elastic02 in elastic01:

Repeat the steps to add the CA of elastic02 to elastic01 as well.

Configure the remote cluster connection through the Elasticsearch REST API

Expose the Transport Layer of elastic01. Add a LoadBalancer service to the Elasticsearch.yaml file for elastic01:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic01
spec:
  transport:
    service:
      spec:
        type: LoadBalancer

The above changes create a service <elastic cluster name>-es-transport of type LoadBalancer. This will create a LoadBalancer resource in IBM Cloud.

Using the Elasticsearch REST API, configure elastic01 as a remote cluster in elastic02. Run the following request in the Kibana Dev Tools console of elastic02:

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "elastic01": {
          "mode": "proxy",
          "proxy_address": "${LOADBALANCER_IP or LOADBALANCER_DNS}:9300"
        }
      }
    }
  }
}

Use the public IP of the LoadBalancer or the FQDN of the LoadBalancer in the proxy_address.

Once this is done, you will see that the remote cluster is in a connected state.
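
When the load balancer address is only known at deploy time, the settings body can be generated rather than edited by hand. The sketch below builds the same proxy-mode payload and rejects an address without a numeric port; lb.example.com is a hypothetical host name, and the helper does not send anything to the cluster.

```python
import json


def build_proxy_settings(alias: str, proxy_address: str) -> dict:
    """Build the persistent cluster settings body for a proxy-mode remote.

    `proxy_address` must look like "host:port", since the transport
    port (9300 in this tutorial) is part of the setting.
    """
    host, sep, port = proxy_address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {proxy_address!r}")
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {"mode": "proxy", "proxy_address": proxy_address}
                }
            }
        }
    }


# Hypothetical load balancer host name, for illustration only.
settings = build_proxy_settings("elastic01", "lb.example.com:9300")
print(json.dumps(settings, indent=2))
```

The printed JSON can be used as the body of PUT _cluster/settings on elastic02.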

Cross-cluster replication setup: Leader and Follower index

Navigate to the Cross-Cluster Replication section under Management in the elastic02 Kibana UI.
Click Create a Follower Index.
Select the remote cluster.
Enter the name of the leader index you want to replicate from the leader cluster (elastic01 in this example). Ensure that the leader index exists in the remote cluster.
Enter the name of the follower index you want to create in the follower cluster.
Start replication.
Click Create.

This will start replicating documents from the Leader Index to the Follower Index.
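
To check that replication is keeping up, one option is the follower stats API (GET /<follower_index>/_ccr/stats), which reports leader and follower global checkpoints per shard. The following sketch sums the difference per index; the index name and the numbers in the sample are made up, and the sample is trimmed to the fields used here.

```python
def checkpoint_lag(stats: dict) -> dict:
    """Map each follower index to its total global checkpoint lag.

    The lag of a shard is leader_global_checkpoint minus
    follower_global_checkpoint; 0 means the shard has caught up.
    """
    lag = {}
    for index_stats in stats.get("indices", []):
        lag[index_stats["index"]] = sum(
            shard["leader_global_checkpoint"] - shard["follower_global_checkpoint"]
            for shard in index_stats.get("shards", [])
        )
    return lag


# Made-up sample shaped like a follower stats response.
sample_stats = {
    "indices": [
        {
            "index": "my-index-follower",
            "shards": [
                {"shard_id": 0, "leader_global_checkpoint": 1024, "follower_global_checkpoint": 768},
                {"shard_id": 1, "leader_global_checkpoint": 500, "follower_global_checkpoint": 500},
            ],
        }
    ]
}

print(checkpoint_lag(sample_stats))
```

A lag that stays above zero while the leader is idle suggests the follower has stopped reading, which is worth checking against the remote cluster connection state.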

Conclusion

Cross-Cluster Replication (CCR) in Elasticsearch enhances data availability, disaster recovery, and performance by continuously synchronizing indices between clusters. This feature ensures that data remains up-to-date and accessible, even in the event of cluster failures or geographic distribution needs. Setting up CCR involves creating remote cluster connections, configuring Elasticsearch settings, and using the REST API for management. CCR is a robust solution for maintaining seamless access to critical data across different clusters.
