Clustering Zookeeper

In this tutorial we have learnt how to set up a ZooKeeper server instance in standalone mode: Zookeeper quickstart. A standalone instance is a potential single point of failure. If the ZooKeeper server fails, the whole application that was using the instance for its distributed coordination will fail and stop functioning. Hence, running ZooKeeper in standalone mode is not recommended for production, although for development and evaluation purposes, it serves the need.

In a production environment, ZooKeeper should be run on multiple servers in a replicated mode, also called a ZooKeeper ensemble. The minimum recommended number of servers is three, and five is the most common in a production environment. The replicated group of servers in the same application domain is called a quorum. In this mode, the ZooKeeper server instance runs on multiple different machines, and all servers in the quorum have copies of the same configuration file. In a quorum, ZooKeeper instances run in a leader/follower format. One of the instances is elected the leader, and others become followers. If the leader fails, a new leader election happens, and another running instance is made the leader. However, these intricacies are fully hidden from applications using ZooKeeper and from developers.

Creating a Zookeeper cluster on the same machine

In the first example we will arrange for a cluster of three nodes on the same host. Here is our cluster map:

Node 1: /usr/share/zookeeper/zoo1/

Node 2: /usr/share/zookeeper/zoo2/

Node 3: /usr/share/zookeeper/zoo3/

Here is the zoo.cfg for the Node1:

tickTime=2000

initLimit=5
syncLimit=2
dataDir=./data
clientPort=2181

server.1=localhost:2666:3666
server.2=localhost:2667:3667
server.3=localhost:2668:3668

Each server.n entry specifies the address and port numbers used by ZooKeeper server n. There are three colon-separated fields for each server.n entry.

  • The first field is the hostname or IP address of server n.
  • The second field, (2666 for server1), is the port used for peer-to-peer communication in the quorum, such as to connect followers to leaders. A follower opens a TCP connection to the leader using this port.
  • The third field, (3666 for server1), is the port for leader election, in case a new leader arises in the quorum. As all communication happens over TCP, a second port is required to respond to leader election inside the quorum.

Here is the zoo.cfg for the Node2:

tickTime=2000
initLimit=5
syncLimit=2
dataDir=./data
clientPort=2182

server.1=localhost:2666:3666
server.2=localhost:2667:3667
server.3=localhost:2668:3668

Here is the zoo.cfg for the Node3:

tickTime=2000
initLimit=5
syncLimit=2
dataDir=./data
clientPort=2183

server.1=localhost:2666:3666
server.2=localhost:2667:3667
server.3=localhost:2668:3668

When we start up a server, it needs to know which server it is. A server figures out its ID by reading a file named myid in the data directory. We can create these files with the following commands:

$ echo 1 > /usr/share/zookeeper/zoo1/data/myid

$ echo 2 > /usr/share/zookeeper/zoo2/data/myid

$ echo 3 > /usr/share/zookeeper/zoo3/data/myid

Now, we are all set to start the ZooKeeper instances. Let’s start the instances as follows:

$ /usr/share/zookeeper/zoo1/bin/zkServer.sh start  

$ /usr/share/zookeeper/zoo2/bin/zkServer.sh start

$ /usr/share/zookeeper/zoo3/bin/zkServer.sh start

Once all the instances start, we can use the zkCli.sh script to connect to the multinode ZooKeeper cluster, like we did earlier:

$ ${ZK_HOME}/bin/zkCli.sh –server localhost:2181, localhost:2182, localhost:2183

Creating a Zookeeper cluster on different machines

The ZooKeeper configuration file for an horizontal cluster is similar to the one we used for a vertical cluster. Instead of localhost we will use the machine names which are supposed to be host1, host2 and host3

tickTime=2000
dataDir=/usr/share/zookeeper/data
clientPort=2181
initLimit=5
syncLimit=2

server.1=host1:2888:3888
server.2=host2:2888:3888
server.3=host3:2888:3888

The difference with our first example is that now each host is using the same peer-to-peer port and the same leader election port.

Starting the server instances

After setting up the configuration file for each of the servers in the quorum, we need to start the ZooKeeper server instances. The procedure is the same as for standalone mode. We have to connect to each of the machines and execute the following command:

${ZK_HOME}/bin/zkServer.sh start

Once the instances are started successfully, we will execute the following command on each of the machines to check the instance states:

${ZK_HOME}/bin/zkServer.sh status

For example, take a look at the next quorum:

[host1] # ${ZK_HOME}/bin/zkServer.sh status

JMX enabled by default
Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[host2] # ${ZK_HOME}/bin/zkServer.sh status

JMX enabled by default
Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
[host3] # ${ZK_HOME}/bin/zkServer.sh status

JMX enabled by default
Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

As seen in the preceding example, zoo2 is made the leader of the quorum, while zoo1 and zoo3 are the followers. Connecting to the ZooKeeper quorum through the command-line shell is also the same as in standalone mode, except that we should now specify a connection string in the host1:port2, host2:port2 … format to the server argument of ${ZK_HOME}/bin/zkCli.sh:

$ zkCli.sh -server host1:2181,host2:2181,host3:2181

Connecting to host1:2181, host2:2181, host3:2181
… … … …
Welcome to ZooKeeper!

Found the article helpful? if so please follow us on Socials