Understanding Elasticsearch Node Communication

Madhusudhan Konda
3 min read · Jan 29, 2023
Excerpts taken from my upcoming book: Elasticsearch in Action

Me @ Medium || LinkedIn || Twitter || GitHub

Elasticsearch hides the nitty-gritty details of what goes on behind the scenes: starting a node, forming a cluster, indexing data, taking backups and snapshots, and querying.

Adding nodes scales up the cluster straightaway and gives us the benefit of resilience from the start. As clients, we are expected to talk to the cluster through the HTTP interface, using RESTful APIs.

The other side of the coin is the communication between nodes: how each node talks to the other nodes, how the master makes cluster-wide decisions, and so on. Overall, Elasticsearch uses two types of communication:

  • The HTTP interface for the interaction between the clients and the nodes using RESTful APIs.
  • The transport layer interface for node-to-node communications.

The cluster is exposed on port 9200 for HTTP (or HTTPS) communication by default, although we can change this by setting http.port in the configuration file (elasticsearch.yml).
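As a rough illustration of the client-facing HTTP interface, the following Python sketch builds the URL for a cluster health request on the default port and can optionally fetch it. The helper names (cluster_health_url, fetch_cluster_health) and the localhost host are assumptions made for illustration; _cluster/health is a standard Elasticsearch endpoint. The sketch also assumes a node without security enabled; a secured cluster would need HTTPS and credentials.

```python
import json
from urllib.request import urlopen

DEFAULT_HTTP_PORT = 9200  # default client-facing port mentioned above


def cluster_health_url(host="localhost", port=DEFAULT_HTTP_PORT, scheme="http"):
    """Build the URL of the cluster health endpoint on the HTTP interface."""
    return f"{scheme}://{host}:{port}/_cluster/health"


def fetch_cluster_health(url, timeout=5.0):
    """Call the endpoint and decode the JSON body (requires a running node)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With a node running locally, fetch_cluster_health(cluster_health_url()) should return the health document as a Python dictionary. Nothing in this sketch touches the transport port, which is reserved for node-to-node traffic.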

The transport layer, on the other hand, defaults to port 9300, so node-to-node communication happens on that port. Both interfaces are configured per node in the configuration file (elasticsearch.yml): the bind address sits under the network attribute and the ports under the http and transport attributes. We can change all of them as required.

When we start Elasticsearch on a machine, it binds to localhost by default. We can bind it to a specific network address by changing network.host, and we can change the ports through http.port (as well as transport.port for node-to-node networking) if needed.
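To make these settings concrete, here is a minimal Python sketch that renders the corresponding lines of an elasticsearch.yml file. The setting names network.host, http.port, and transport.port are the ones Elasticsearch uses; the function name render_network_settings and the address 192.168.0.10 are hypothetical and only serve the example.

```python
def render_network_settings(host, http_port=9200, transport_port=9300):
    """Return elasticsearch.yml lines for the bind address and both ports."""
    lines = [
        f"network.host: {host}",            # address the node binds to
        f"http.port: {http_port}",          # client-facing REST interface
        f"transport.port: {transport_port}",  # node-to-node interface
    ]
    return "\n".join(lines) + "\n"


# Hypothetical address, used only to show the rendered fragment.
print(render_network_settings("192.168.0.10"), end="")
```

Generating the fragment from one function keeps the three values together, which helps when the same template has to be stamped out for many nodes.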

Changing these settings on a farm of computers is a pain, especially when we need to set up a cluster with hundreds of nodes. Make sure that you have housekeeping scripts handy to ease this chore. One good approach is to keep the configuration in a central folder and point the ES_PATH_CONF variable to that folder (we could also use Ansible, Azure Pipelines, GitOps, etc., for such purposes). Exporting this variable lets Elasticsearch pick up its configuration from that directory.
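One way such a housekeeping script might look is sketched below in Python: it checks that a central folder really contains an elasticsearch.yml file and then prepares an environment with ES_PATH_CONF pointing at it. The function name env_with_config_dir and its parameters are assumptions for illustration, not part of any Elasticsearch tooling.

```python
import os
from pathlib import Path


def env_with_config_dir(conf_dir, base_env=None):
    """Return a copy of the environment with ES_PATH_CONF set to conf_dir."""
    conf_path = Path(conf_dir)
    # Fail early if the folder does not hold the main configuration file.
    if not (conf_path / "elasticsearch.yml").is_file():
        raise FileNotFoundError(f"no elasticsearch.yml in {conf_path}")
    env = dict(os.environ if base_env is None else base_env)
    env["ES_PATH_CONF"] = str(conf_path)
    return env
```

The returned mapping could then be passed as the env argument of subprocess.run when launching bin/elasticsearch, so the node reads its settings from the shared folder rather than from its own config directory.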

Coming back to setting up the network properties, we can use special values in the config (elasticsearch.yml file) for setting the network host rather than configuring the host manually.

Setting the network.host property to _local_ lets Elasticsearch set its address automatically: it binds to the loopback address (127.0.0.1). The _local_ special value is the default for the network.host attribute. My suggestion is to leave it untouched.

There’s also a _site_ value that sets the network.host attribute to site-local addresses (192.168.0.1, for example). You can bind to both of these special values by setting network.host: [_local_, _site_] in the configuration.
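To get a feel for which addresses the two special values cover, the following Python sketch classifies an address as loopback or site-local. It is only an approximation under stated assumptions: site-local is taken to mean the RFC 1918 private IPv4 ranges, the function name special_value_for is hypothetical, and Elasticsearch performs its own resolution internally.

```python
import ipaddress

# Private IPv4 ranges (RFC 1918), treated here as "site-local".
SITE_LOCAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def special_value_for(address):
    """Approximate which special value would cover the given address."""
    ip = ipaddress.ip_address(address)
    if ip.is_loopback:
        return "_local_"
    if any(ip in network for network in SITE_LOCAL_NETWORKS):
        return "_site_"
    return None  # neither loopback nor site-local


for candidate in ("127.0.0.1", "192.168.0.1", "8.8.8.8"):
    print(candidate, "->", special_value_for(candidate))
```

The loopback check comes first on purpose, so that 127.0.0.1 is never mistaken for a site-local address.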


These short articles are condensed excerpts taken from my book Elasticsearch in Action, Second Edition. The code is available in my GitHub repository.



Written by Madhusudhan Konda

Madhusudhan Konda is a full-stack lead engineer, mentor, and conference speaker. He delivers live online training on Elasticsearch, Elastic Stack & Spring Cloud.
