Kafka & Zookeeper for Development: Zookeeper Ensemble

Previously we spun up Zookeeper and Kafka locally but also through Docker. What comes next is spinning up more than just one Kafka and Zookeeper node and create a 3 node cluster. To achieve this the easy way locally docker-compose will be used. Instead of spinning up various instances on the cloud or running various Java processes and altering configs, docker-compose will greatly help us to bootstrap a Zookeeper ensemble and the Kafka brokers, with everything needed preconfigured.

The first step is to create the Zookeeper ensemble but before going there let’s check the ports needed.
Zookeeper needs three ports.

  • 2181 is the client port. On the previous example it was the port our clients used to communicate with the server.
  • 2888 is the peer port. This is the port that zookeeper nodes use to talk to each other.
  • 3888 is the leader port. The port that nodes use to talk to each other when it comes to the leader election.

By using docker compose our nodes will use the same network and the container name will also be an internal dns entry.
The names on the zookeeper nodes would be zookeeper-1, zookeeper-2, zookeeper-3.

Our next goal is to give to each zookeeper node a configuration that will enable the nodes to discover each other.

This is the typical zookeeper configuration expected.

  • tickTime is the unit of time in milliseconds zookeeper uses for heartbeat and minimum session timeout.
  • dataDir is the location where ZooKeeper will store the in-memory database snapshots
  • initlimit and SyncLimit are used for the zookeeper synchronization.
  • server* are a list of the nodes that will have to communicate with each other

zookeeper.properties

clientPort=2181
dataDir=/var/lib/zookeeper
syncLimit=2
DATA.DIR=/var/log/zookeeper
initLimit=5
tickTime=2000
server.1=zookeeper-1:2888:3888
server.2=zookeeper-2:2888:3888
server.3=zookeeper-3:2888:3888

When it comes to a node the server that the node is located, should be bound to `0.0.0.0`. Thus we need three different zookeeper properties per node.

For example for the node with id 1 the file should be the following
zookeeper1.properties

clientPort=2181
dataDir=/var/lib/zookeeper
syncLimit=2
DATA.DIR=/var/log/zookeeper
initLimit=5
tickTime=2000
server.1=0.0.0.0:2888:3888
server.2=zookeeper-2:2888:3888
server.3=zookeeper-3:2888:3888

The next question that arises is the id file of zookeeper. How a zookeeper instance can identify which is its id.

Based on the documentation we need to specify the server ids using the myid file

The myid file is a plain text file located at a nodes dataDir containing only a number the server name.

So three myids files will be created each containing the number of the broker

myid_1.txt

1

myid_2.txt

2

myid_3.txt

3

It’s time to spin up the Zookeeper ensemble. We will use the files specified above. Different client ports are mapped to the host to avoid collision.

version: "3.8"
services:
  zookeeper-1:
    container_name: zookeeper-1
    image: zookeeper
    ports:
      - "2181:2181"
    volumes:
      - type: bind
        source: ./zookeeper1.properties
        target: /conf/zoo.cfg
      - type: bind
        source: ./myid_1.txt
        target: /data/myid
  zookeeper-2:
    container_name: zookeeper-2
    image: zookeeper
    ports:
      - "2182:2181"
    volumes:
    - type: bind
      source: ./zookeeper2.properties
      target: /conf/zoo.cfg
    - type: bind
      source: ./myid_2.txt
      target: /data/myid
  zookeeper-3:
    container_name: zookeeper-3
    image: zookeeper
    ports:
      - "2183:2181"
    volumes:
      - type: bind
        source: ./zookeeper3.properties
        target: /conf/zoo.cfg
      - type: bind
        source: ./myid_3.txt
        target: /data/myid

Eventually instead of having all those files mounted it would be better if we had a more simple option. Hopefully the image that we use gives us the choice of specifying ids and brokers using environment variables.

version: "3.8"
services:
  zookeeper-1:
    container_name: zookeeper-1
    image: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: "1"
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
  zookeeper-2:
    container_name: zookeeper-2
    image: zookeeper
    ports:
      - "2182:2181"
    environment:
      ZOO_MY_ID: "2"
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
  zookeeper-3:
    container_name: zookeeper-3
    image: zookeeper
    ports:
      - "2183:2181"
    environment:
      ZOO_MY_ID: "3"
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181

Now let’s check the status of our zookeeper nodes.

Let’s use the zookeeper shell to see the leaders and followers

docker exec -it zookeeper-1 /bin/bash
>./bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower

And

> docker exec -it zookeeper-3 /bin/bash
root@81c2dc476127:/apache-zookeeper-3.6.2-bin# ./bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader

So in this tutorial we created a Zookeeper ensemble using docker-compose and we also had a leader election. This recipe can be adapted to apply on the a set of VMs or a container orchestration engine.

On the next tutorial we shall add some Kafka brokers to connect to the ensemble.