Elasticsearch in Action: Understanding Master: Quorum and Split Brain (2/2)

Madhusudhan Konda
5 min read · Dec 30, 2022
Elasticsearch in Action by M Konda

These excerpts are taken from my book Elasticsearch in Action, Second Edition. The code is available in my GitHub repository, along with executable Kibana scripts so you can run the commands in Kibana straight away. All code is tested against Elasticsearch 8.4.

In the last article, we looked at the master node and master elections. In this article, we look at another concept, the quorum, and then go over the split-brain issue in Elasticsearch.

  1. Understanding Master: Master Node and Elections
  2. Understanding Master: Quorum and Split Brain

Quorum

The master is in charge of maintaining and managing the cluster. However, it consults a quorum of master-eligible nodes for cluster state updates as well as master elections. A quorum is a subset of the master-eligible nodes, specifically a majority of them, that the master needs in order to operate the cluster effectively. The master consults this majority to reach consensus on cluster state and other cluster-wide decisions.

Although we are learning about quorums, the good news is that we (users and administrators) don’t have to worry about how to form one. The cluster automatically forms a quorum based on the available master-eligible nodes. There’s a simple formula for the minimum number of master-eligible nodes (the quorum) required, given a set of master-eligible nodes:

Minimum number of master nodes = (number of master-eligible nodes / 2, rounded down) + 1

Let’s say we have a 20-node cluster with 8 nodes assigned as master-eligible nodes (the node role is set to master). Applying the formula, the quorum is 5 master-eligible nodes (8 / 2 + 1 = 5). In other words, at least 5 of the 8 master-eligible nodes must be available to form a quorum.
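The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula, not Elasticsearch code; the function names quorum_size and tolerated_failures are made up for illustration.

```python
def quorum_size(master_eligible: int) -> int:
    """Smallest majority of the given number of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division rounds down, so 3 -> 2 and 8 -> 5.
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum_size(master_eligible)


if __name__ == "__main__":
    for n in (1, 2, 3, 8):
        print(f"{n} master-eligible: quorum={quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

Note that two master-eligible nodes tolerate no failures at all (the quorum is 2), while three tolerate one.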

The rule of thumb is that any cluster should have at least three master-eligible nodes. With three, the quorum is two, so the cluster can still elect and keep a master if one master-eligible node fails. Another big advantage of having at least three master-eligible nodes is that it alleviates the split-brain issue, which is discussed in the following section.

Split Brain

Elasticsearch’s cluster health relies heavily on multiple factors: network, memory, JVM garbage collection, and so on. In some instances, the cluster gets split into two clusters, with a few nodes in one and the rest in the other. For example, look at the figure below.

Figure: A two-node cluster with one master

As you can see in the figure, we have a cluster with two master-eligible nodes, one of which (Node A) is elected as the master. As long as we are in a happy-day state, the cluster is healthy, and the master carries out its responsibilities diligently.

Let’s throw a wrench in the works. Let’s say Node B dies due to a hardware issue. Because Node A is the master, it continues servicing queries with the one node at hand: we effectively have a one-node cluster while we wait for Node B to boot up and rejoin the cluster.

Here’s where it could get tricky. Let’s say that while Node B is booting up, network connectivity is severed, so Node B cannot see Node A. Node B then assumes the master role because it thinks there’s no master in the cluster, even though Node A is still the master. The result is a split-brain situation, as the figure below shows.

Figure: Split-brain cluster: a cluster with two masters

Because the two nodes cannot communicate due to the network issue, each keeps working happily along, believing it is part of the cluster. Because both nodes are masters, any request coming to either of them is carried out only by the receiving node. The data in one node is not visible to the other node, however, which leads to data discrepancies. This is one of the reasons why we should have at least three master-eligible nodes in a cluster. With three master-eligible nodes, the quorum is two, so an isolated node cannot elect itself master, which prevents the split-brain cluster from forming.
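To make the argument concrete, here is a deliberately simplified Python model of a network partition. It is not Elasticsearch’s actual election algorithm; it only counts how many sides of a split could end up with a master, with and without a majority rule. The names can_elect_master and masters_after_partition are hypothetical.

```python
def can_elect_master(visible_nodes: int, total_master_eligible: int) -> bool:
    """A side of the partition may elect a master only if it holds a majority."""
    return visible_nodes >= total_master_eligible // 2 + 1


def masters_after_partition(partition_sizes, require_quorum=True):
    """Count the sides of a network split that end up with a master.

    partition_sizes lists how many master-eligible nodes landed on each side.
    """
    total = sum(partition_sizes)
    if not require_quorum:
        # Naive rule from the scenario above: a side that sees no master
        # simply promotes one of its own nodes.
        return len(partition_sizes)
    return sum(1 for size in partition_sizes if can_elect_master(size, total))


if __name__ == "__main__":
    # Node A and Node B cut off from each other, no majority rule: split brain.
    print(masters_after_partition([1, 1], require_quorum=False))  # 2 masters
    # Three master-eligible nodes split 2 + 1: only the pair keeps a master.
    print(masters_after_partition([2, 1]))  # 1 master
```

Because two disjoint sides can never both hold a majority, the majority rule allows at most one master per split. The cost is that a two-node cluster split 1 + 1 ends up with no master at all under this rule, which is another reason the rule of thumb is three.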

Dedicated master node

Because a node can be assigned multiple roles, it’s not a surprise to see a cluster with 20 nodes where all nodes are performing all roles. There is no harm in creating this type of cluster architecture; however, this type of setting works only for lightweight cluster needs. As we already learned, the master node is the critical node in the cluster, which keeps the cluster ticking.

If the data is expected to be indexed or searched at an exponential growth rate, every node, including the master nodes, takes a performance hit. A slow-performing master node is asking for trouble: cluster operations run slower or even stall. For this reason, it is advisable to host the master node on a dedicated machine. Having a dedicated master node lets the cluster run smoothly and mitigates data loss and application downtime.

As mentioned, the rule of thumb is to have at least three dedicated master-eligible nodes in a cluster. When you are forming a cluster, set node.roles to master, as shown in the following snippet, to make a node a dedicated master. This way, the dedicated master is not burdened with data- or ingest-related operations and just manages the cluster full time.

node.roles: [ master ]
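
As a quick sanity check on a planned topology, the rule of thumb can be expressed in a few lines of Python. This is a sketch only: the node names and the node_roles mapping below are hypothetical sample data, written to mirror what each node’s node.roles setting would contain, and nothing here talks to a real cluster.

```python
def dedicated_masters(node_roles: dict[str, list[str]]) -> list[str]:
    """Names of nodes whose only role is master."""
    return sorted(name for name, roles in node_roles.items()
                  if roles == ["master"])


def has_enough_dedicated_masters(node_roles: dict[str, list[str]],
                                 minimum: int = 3) -> bool:
    """True when the plan meets the rule of thumb of three dedicated masters."""
    return len(dedicated_masters(node_roles)) >= minimum


if __name__ == "__main__":
    # Hypothetical plan: three dedicated masters plus two mixed-role nodes.
    plan = {
        "master-1": ["master"],
        "master-2": ["master"],
        "master-3": ["master"],
        "data-1": ["data", "ingest"],
        "mixed-1": ["master", "data"],  # master-eligible, but not dedicated
    }
    print(dedicated_masters(plan))
    print(has_enough_dedicated_masters(plan))
```

A node such as mixed-1 is still master-eligible, but it does not count as dedicated because it also carries the data role.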

That’s pretty much about master in Elasticsearch!



Madhusudhan Konda is a full-stack lead engineer, mentor, and conference speaker. He delivers live online training on Elasticsearch, Elastic Stack & Spring Cloud.