elasticsearch for Apache Hadoop
elasticsearch for Apache Hadoop is an open-source, stand-alone, self-contained, small library that allows Hadoop jobs (whether they use Map/Reduce directly, libraries built on it such as Hive, or newer libraries such as Apache Spark) to interact with elasticsearch. One can think of it as a connector that lets data flow bi-directionally, so that applications can transparently leverage the elasticsearch engine to enrich their capabilities and improve performance.
elasticsearch for Apache Hadoop offers first-class support for vanilla Map/Reduce and Hive, so that using elasticsearch feels much like using resources within the Hadoop cluster. As such, elasticsearch for Apache Hadoop is a passive component: Hadoop jobs use it as a library and interact with elasticsearch through its APIs.
While the official name of the project is elasticsearch for Apache Hadoop, the documentation uses the shorter term elasticsearch-hadoop throughout to improve readability.
If you are looking for the elasticsearch HDFS Snapshot/Restore plugin (a separate project), please refer to its home page.