Understanding Elasticsearch Node Communication
Elasticsearch hides the nitty-gritty details of what goes on behind the scenes, from starting a node and forming a cluster to indexing data, taking backups and snapshots, and querying.
Adding nodes scales the cluster straightaway and gives us the benefit of resilience from the start. Clients are expected to talk to the cluster through the HTTP interface, using RESTful APIs.
The other side of the coin is the communication between nodes: how each node talks to the other nodes, how the master makes cluster-wide decisions, and so on. Overall, Elasticsearch uses two types of communication:
- The HTTP interface for the interaction between the clients and the nodes using RESTful APIs.
- The transport layer interface for node-to-node communications.
The cluster is exposed on port 9200 for HTTP (or HTTPS) communication by default, although we can change this by updating the configuration file (elasticsearch.yml) accordingly.
On the other hand, the transport layer is set to port 9300 by default, meaning the node-to-node communications happen on that port. Both ports are configured per node in the configuration file (elasticsearch.yml), through the http.port and transport.port settings, and we can change them as per our requirements.
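As a minimal sketch, the two default ports could be spelled out explicitly in elasticsearch.yml as shown here. The values only restate the defaults mentioned above. The setting names http.port and transport.port are the ones used by recent Elasticsearch versions (older releases used transport.tcp.port), so check the documentation for the version you run.

```yaml
# elasticsearch.yml (set per node)

# Port for client traffic over the RESTful HTTP interface
http.port: 9200

# Port for node-to-node (transport layer) traffic
transport.port: 9300
```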
When we start Elasticsearch on a machine, it binds to localhost by default. We can change this binding to a specific network address by changing network.host, and we can change the HTTP port with http.port (as well as transport.port for node-to-node networking) if needed.
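For illustration, binding a node to a specific address might look like the fragment below. The address 192.168.1.10 is a made-up example rather than a value from this article, so substitute the address or hostname of your own machine.

```yaml
# elasticsearch.yml

# Hypothetical address: replace with this machine's own IP or hostname
network.host: 192.168.1.10
```

With a non-loopback address like this, the node becomes reachable from other machines on that network, which is what a multi-node cluster spread over several hosts needs.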
Changing these settings on a farm of computers is a pain, especially when you need to set up a cluster with hundreds of nodes. Make sure that you have housekeeping scripts handy to ease that pain. One convenient way is to keep your configurations in a central folder and point the ES_PATH_CONF environment variable to that folder (we could also use Ansible, Azure Pipelines, GitOps, etc., for such purposes). Exporting this variable makes Elasticsearch pick up its configuration from that directory.
Coming back to setting up the network properties, we can use special values in the config file (elasticsearch.yml) for setting the network host rather than configuring the host manually.
Setting the network.host property to _local_ lets Elasticsearch set its address automatically: it binds to the loopback address (127.0.0.1). The _local_ special value is the default for the network.host attribute. My suggestion is to leave it untouched unless the node has to be reachable from other machines.
There’s also a _site_ value that sets the network.host attribute to the machine's site-local addresses (192.168.0.1, for example). We can bind to both of these special values by setting network.host: [_local_, _site_] in the configuration.
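The special values discussed above can be summarised in a small configuration sketch. Only one network.host entry should be active at a time, so the alternatives are commented out here. Which addresses _site_ resolves to depends on the machine's network interfaces, so the addresses in the comments are only examples.

```yaml
# elasticsearch.yml: keep exactly ONE network.host entry active

# Default behaviour: loopback only (for example 127.0.0.1)
# network.host: _local_

# Site-local addresses only (for example 192.168.0.1)
# network.host: _site_

# Both loopback and site-local addresses
network.host: [_local_, _site_]
```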