Best Practices To Install Apache Kafka on Debian 11
In this guide, we show you how to install Apache Kafka on Debian 11. Apache Kafka is an open-source, distributed publish-subscribe messaging platform designed to handle real-time streaming data, supporting distributed streaming, pipelining, and replay of data feeds with fast, scalable operations. It’s a powerful tool for building real-time data pipelines and streaming applications.
Kafka operates as a broker-based solution, managing streams of data as records within a cluster of servers. These Kafka servers can be geographically distributed across multiple data centers, ensuring data persistence by storing streams of records (messages) across multiple server instances within topics. A topic organizes records as an ordered, immutable sequence, where each record consists of a key, a value, and a timestamp.
Let’s delve into the steps required to set up Apache Kafka on Debian 11.
Steps To Install and Configure Apache Kafka on Debian 11
Before proceeding, ensure you are logged in to your Debian 11 server as a non-root user with sudo privileges and have a basic firewall configured. Refer to our guide on Initial Server Setup with Debian 11 for detailed instructions.
1. Install Required Packages For Kafka
First, prepare your server for the Kafka installation. Update and upgrade your local package index using the following command:
sudo apt update && sudo apt upgrade
Next, install the necessary packages, including JRE and JDK, on your Debian 11 system:
sudo apt install default-jre wget git unzip default-jdk -y
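Kafka requires a Java runtime, which the default-jre and default-jdk packages provide. As an optional sanity check, confirm that Java is available before continuing:
java -version
On a default Debian 11 install, this should report OpenJDK 11.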
2. Install Apache Kafka on Debian 11
Now, download the latest release of Kafka.
Download Kafka
Visit the Apache Kafka downloads page and locate the latest release. Under Binary downloads, copy the link to the binary archive and download it with the wget command:
sudo wget https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
Create a directory for Kafka under the /usr/local directory and navigate to it:
sudo mkdir /usr/local/kafka-server && cd /usr/local/kafka-server
Extract the downloaded file into this directory:
sudo tar -xvzf ~/kafka_2.13-3.3.2.tgz --strip-components=1
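You can confirm the extraction succeeded by listing the directory contents:
ls /usr/local/kafka-server
You should see Kafka's bin, config, and libs directories, among others.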
Create Zookeeper Systemd Unit File
Create a Zookeeper systemd unit file to streamline common service actions such as starting, stopping, and restarting the service.
Zookeeper is a top-level Apache project that acts as a centralized service, managing naming and configuration data and providing flexible and robust synchronization within distributed systems. Zookeeper monitors the status of Kafka cluster nodes and tracks Kafka topics, partitions, and more.
Use your preferred text editor (e.g., vi editor) to create the zookeeper systemd unit file:
sudo vi /etc/systemd/system/zookeeper.service
Add the following content to the file:
[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Save and close the file.
Create Systemd Unit File for Kafka
Create a systemd unit file for Apache Kafka on Debian 11. Again, use your preferred text editor:
sudo vi /etc/systemd/system/kafka.service
Add the following content, ensuring that your JAVA_HOME configuration is correct:
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Save and close the file.
Reload the systemd daemon to apply changes and start the services:
sudo systemctl daemon-reload
sudo systemctl enable --now zookeeper
sudo systemctl enable --now kafka
Verify that your Kafka and Zookeeper services are active and running:
sudo systemctl status kafka
Output
● kafka.service - Apache Kafka Server
Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset:>
Active: active (running) since Mon 2023-01-30 04:30:51 EST; 12s ago
Docs: http://kafka.apache.org/documentation.html
Main PID: 5850 (java)
Tasks: 69 (limit: 4679)
Memory: 328.2M
CPU: 7.525s
CGroup: /system.slice/kafka.service
└─5850 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx1G -Xms1G -
...
sudo systemctl status zookeeper
Output
● zookeeper.service - Apache Zookeeper Server
Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor pre>
Active: active (running) since Mon 2023-01-30 04:30:45 EST; 1min 12s ago
Main PID: 5473 (java)
Tasks: 32 (limit: 4679)
Memory: 72.6M
CPU: 2.811s
CGroup: /system.slice/zookeeper.service
└─5473 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseM>
...
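With both services running, you can optionally smoke-test the broker using the command-line tools shipped with Kafka. The commands below create a test topic and then list all topics on the broker; the topic name demo-topic is just an example:
/usr/local/kafka-server/bin/kafka-topics.sh --create --topic demo-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
/usr/local/kafka-server/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
If demo-topic appears in the list, the broker is accepting connections.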
3. Install CMAK on Debian 11 (Kafka Manager)
CMAK (formerly Kafka Manager) is an open-source tool developed by Yahoo for managing Apache Kafka clusters. Clone CMAK from GitHub:
cd ~
sudo git clone https://github.com/yahoo/CMAK.git
Output
Cloning into 'CMAK'...
remote: Enumerating objects: 6542, done.
remote: Counting objects: 100% (266/266), done.
remote: Compressing objects: 100% (142/142), done.
remote: Total 6542 (delta 150), reused 187 (delta 112), pack-reused 6276
Receiving objects: 100% (6542/6542), 3.97 MiB | 7.55 MiB/s, done.
Resolving deltas: 100% (4211/4211), done.
4. Configure Cluster Manager for Apache Kafka
Make configuration changes in the CMAK config file:
sudo vi ~/CMAK/conf/application.conf
Change the cmak.zkhosts="my.zookeeper.host.com:2181" setting to reflect your Zookeeper host(s). You can specify multiple Zookeeper hosts separated by commas, like this: cmak.zkhosts="my.zookeeper.host.com:2181,other.zookeeper.host.com:2181". The hostnames can also be IP addresses. For this example, we’ll set it to:
cmak.zkhosts="localhost:2181"
Save and close the file.
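If you prefer to make this change non-interactively (for example, from a provisioning script), a sed one-liner along these lines should work, assuming the default application.conf layout from the CMAK repository:
sudo sed -i 's|^cmak.zkhosts=.*|cmak.zkhosts="localhost:2181"|' ~/CMAK/conf/application.conf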
Create a zip file for deploying the application:
cd ~/CMAK/
sudo ./sbt clean dist
This process downloads and compiles files, so it may take a while to complete.
When finished, you’ll see output similar to:
Output
[info] Your package is ready in /root/CMAK/target/universal/cmak-3.0.0.7.zip
Navigate to the directory containing the zip file (the exact path is shown in the output) and extract it:
cd ~/CMAK/target/universal
sudo unzip cmak-3.0.0.7.zip
cd cmak-3.0.0.7
5. Access CMAK Service (Cluster Manager for Apache Kafka)
Run the Cluster Manager for Apache Kafka service:
bin/cmak
By default, it uses port 9000. Open your browser and go to http://ip-or-domain-name-of-server:9000. If your firewall is enabled, allow external access to the port:
sudo ufw allow 9000
You should see the CMAK interface.
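Note that bin/cmak runs in the foreground and stops when you close your session. The CMAK README documents passing an alternative config file and HTTP port as Java system properties, for example:
bin/cmak -Dconfig.file=conf/application.conf -Dhttp.port=9000
For a long-running deployment, consider wrapping this command in a systemd unit, similar to the Kafka and Zookeeper units created earlier.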

Add Cluster From the CMAK
Add a cluster by clicking "cluster" and then "add cluster".
Fill in the required details (Cluster Name, Zookeeper Hosts, etc.). If you have multiple Zookeeper hosts, separate them with commas.
Create a Topic in the CMAK interface
From your newly added cluster, click on "Topic" and then "create". Input the necessary details for the new topic (Replication Factor, Partitions, etc.) and click "Create".
Click on Cluster View to see your topics.
You can now add, delete, and configure topics as needed.
Conclusion
You have successfully learned how to install Apache Kafka on Debian 11. Kafka finds extensive application in event-driven architectures, log aggregation, data analytics, and stream processing.
Alternative Solutions for Installing Apache Kafka on Debian 11
While the manual installation method described above is a solid approach, alternative solutions offer varying degrees of automation and ease of use. Here are two different ways to install Apache Kafka on Debian 11.
1. Using Docker Compose
Docker Compose simplifies the deployment of multi-container Docker applications. This is an excellent option for setting up Kafka along with its dependencies (like Zookeeper) in a self-contained and reproducible environment.
Explanation:
Docker Compose uses a YAML file to define the services, networks, and volumes required for the application. This approach ensures consistency across different environments (development, testing, production). It also reduces the risk of dependency conflicts.
Code Example:
Create a docker-compose.yml file:
version: '3.7'

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "2181:2181"

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  control-center:
    image: confluentinc/cp-control-center:latest
    depends_on:
      - kafka
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: 'kafka:9092'
      CONTROL_CENTER_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      CONTROL_CENTER_REPLICATION_FACTOR: 1
      CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
      CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
    restart: always
Steps:
- Install Docker and Docker Compose: If not already installed, follow the official Docker documentation to install Docker and Docker Compose on your Debian 11 system.
- Save the docker-compose.yml file: Save the above YAML file to a directory on your server.
- Run Docker Compose: Navigate to the directory containing the docker-compose.yml file and run the following command:
docker-compose up -d
This command will download the necessary images, create the containers, and start Kafka and Zookeeper in detached mode.
- Access Kafka: Kafka will be accessible on localhost:9092. Confluent Control Center can be accessed via localhost:9021 and provides a web UI for managing and monitoring Kafka. A quick CLI check follows below.
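As a quick check that the containerized broker works, you can invoke the Kafka CLI tools bundled in the cp-kafka image (the topic name demo-topic is just an example):
docker-compose exec kafka kafka-topics --bootstrap-server localhost:9092 --create --topic demo-topic --partitions 1 --replication-factor 1
docker-compose exec kafka kafka-topics --bootstrap-server localhost:9092 --list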
2. Using Ansible Automation
Ansible is a powerful automation tool that can automate the entire Kafka installation on Debian 11. This is useful for managing multiple Kafka instances or deploying Kafka across a cluster of servers.
Explanation:
Ansible uses playbooks written in YAML to define the tasks to be executed on remote hosts. This allows you to automate the entire Kafka installation and configuration process, including installing dependencies, downloading Kafka, configuring Zookeeper, and starting the Kafka service.
Code Example:
Create an Ansible playbook (e.g., kafka_install.yml):
---
- hosts: kafka_servers
  become: true
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes

    - name: Install required packages
      apt:
        name:
          - default-jre
          - wget
          - git
          - unzip
          - default-jdk
        state: present

    - name: Create Kafka directory
      file:
        path: /usr/local/kafka-server
        state: directory
        owner: root
        group: root
        mode: '0755'

    - name: Download Kafka
      get_url:
        url: https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
        dest: /tmp/kafka_2.13-3.3.2.tgz

    - name: Extract Kafka
      unarchive:
        src: /tmp/kafka_2.13-3.3.2.tgz
        dest: /usr/local/kafka-server
        remote_src: yes
        extra_opts: [--strip-components=1]

    - name: Create Zookeeper systemd unit file
      template:
        src: templates/zookeeper.service.j2
        dest: /etc/systemd/system/zookeeper.service

    - name: Create Kafka systemd unit file
      template:
        src: templates/kafka.service.j2
        dest: /etc/systemd/system/kafka.service

    - name: Reload systemd daemon
      systemd:
        daemon_reload: yes

    - name: Enable and start Zookeeper
      systemd:
        name: zookeeper
        enabled: yes
        state: started

    - name: Enable and start Kafka
      systemd:
        name: kafka
        enabled: yes
        state: started
Create template files (templates/zookeeper.service.j2 and templates/kafka.service.j2) with content similar to the systemd unit files shown earlier in this article.
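As a minimal sketch, templates/zookeeper.service.j2 can simply reuse the unit file content from earlier; Jinja2 templating is only needed if you want to parameterize paths or hosts:
[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
templates/kafka.service.j2 follows the same pattern, using the Kafka unit file content from earlier.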
Steps:
- Install Ansible: Install Ansible on your control machine (the machine from which you will run the playbook):
sudo apt update && sudo apt install ansible
- Configure Ansible Inventory: Create an Ansible inventory file (e.g., hosts) that lists the Kafka servers:
[kafka_servers]
kafka1 ansible_host=your_kafka_server_ip ansible_user=your_user ansible_password=your_password
Replace your_kafka_server_ip, your_user, and your_password with the appropriate values for your Kafka server. You can also use SSH keys for authentication instead of passwords.
- Run the Playbook: Execute the Ansible playbook:
ansible-playbook -i hosts kafka_install.yml
This command will connect to the Kafka server(s) and execute the tasks defined in the playbook.
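Optionally, you can preview the changes with Ansible's check mode before applying them, and verify the services afterwards with an ad-hoc command:
ansible-playbook -i hosts kafka_install.yml --check
ansible kafka_servers -i hosts -b -m shell -a "systemctl is-active kafka zookeeper"
Keep in mind that check mode cannot fully simulate every task (the unarchive task, for instance), so some results are only known at run time.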
These alternative solutions offer different advantages. Docker Compose provides a simple and self-contained deployment, while Ansible offers powerful automation capabilities for managing multiple Kafka instances. The best choice depends on your specific requirements and infrastructure.
By understanding these different approaches, you can choose the method that best suits your needs for installing Apache Kafka on Debian 11.