Best Steps To Install Apache Kafka on Rocky Linux 8
In this comprehensive guide, we will walk you through the process to Install Apache Kafka on Rocky Linux 8. Apache Kafka is a distributed, fault-tolerant, high-throughput streaming platform. It belongs to a family of technologies known as queuing, messaging, or streaming engines. Think of Kafka as the NoSQL equivalent of traditional queuing technologies, offering greater scalability and performance.
Follow the steps outlined below on the Orcacore website to successfully Install Apache Kafka on Rocky Linux 8.
Before proceeding, ensure you have a Rocky Linux 8 server. Log in as a non-root user with sudo privileges and set up a basic firewall. You can refer to our guide on Initial Server Setup with Rocky Linux 8 for assistance with this initial configuration.
1. Install Required Packages For Kafka
First, we need to prepare the server environment to Install Apache Kafka on Rocky Linux 8. Begin by refreshing your local package index and upgrading the installed packages with the following command:
sudo dnf update -y && sudo dnf upgrade -y
Next, install the necessary packages, including the Java Development Kit (JDK), which Kafka requires to run.
sudo dnf install wget git unzip java-11-openjdk -y
This command installs wget (for downloading files), git (for cloning the CMAK repository later), unzip (for extracting archive files), and java-11-openjdk, the Java Development Kit that Kafka requires.
2. Install Apache Kafka on Rocky Linux 8
Now that the environment is prepared, we can download and install the latest release of Kafka.
Apache Kafka Download
Visit the Apache Kafka download page and locate the latest release under the Binary downloads section. Copy the URL of the recommended release and use the wget command to download it to your server:
sudo wget https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz
Next, create a directory to house your Kafka installation under the /usr/local directory and navigate into it:
# sudo mkdir /usr/local/kafka-server
# cd /usr/local/kafka-server
Note that cd is a shell built-in, so it is run without sudo.
Extract the downloaded Kafka archive into this directory:
sudo tar -xvzf ~/kafka_2.13-3.4.0.tgz --strip 1
The --strip 1 option removes the top-level directory from the archive during extraction.
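If you want to see what --strip 1 does before running it on the real archive, here is a small sketch using a throwaway tarball; all paths under /tmp/strip-demo are hypothetical and the directory name merely mimics the Kafka archive layout:

```shell
# Build a tiny archive whose top-level directory mimics kafka_2.13-3.4.0/
mkdir -p /tmp/strip-demo/src/kafka_2.13-3.4.0/bin
echo 'demo' > /tmp/strip-demo/src/kafka_2.13-3.4.0/bin/start.sh
tar -czf /tmp/strip-demo/demo.tgz -C /tmp/strip-demo/src kafka_2.13-3.4.0

# Extract with --strip 1: the kafka_2.13-3.4.0/ prefix is dropped,
# so bin/ lands directly in the target directory.
mkdir -p /tmp/strip-demo/out
tar -xzf /tmp/strip-demo/demo.tgz --strip 1 -C /tmp/strip-demo/out
ls /tmp/strip-demo/out
```

Without --strip 1, the listing would show kafka_2.13-3.4.0 instead of bin.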
Create Zookeeper Systemd Unit File
Zookeeper is crucial for managing Kafka clusters. It maintains the state of the Kafka cluster nodes and tracks topics, partitions, and other metadata. We’ll create a systemd unit file for Zookeeper to simplify service management.
Use your preferred text editor (such as vi) to create the Zookeeper systemd unit file:
sudo vi /etc/systemd/system/zookeeper.service
Add the following content to the file:
[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Save and close the file.
Create Systemd Unit File for Kafka
Now, create a systemd unit file for Apache Kafka itself. Again, use your text editor:
sudo vi /etc/systemd/system/kafka.service
Add the following content to the file. Important: verify that the JAVA_HOME environment variable points to your actual Java installation, or Kafka will fail to start.
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Save and close the file.
Reload the systemd daemon to apply the changes and then start and enable the Zookeeper and Kafka services:
# sudo systemctl daemon-reload
# sudo systemctl enable --now zookeeper
# sudo systemctl enable --now kafka
Verify that the Kafka and Zookeeper services are active and running:
sudo systemctl status kafka
**Output**
● kafka.service - Apache Kafka Server
Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2023-03-15 05:10:19 EDT; 6s ago
Docs: http://kafka.apache.org/documentation.html
Main PID: 93631 (java)
Tasks: 69 (limit: 23699)
Memory: 321.2M
CGroup: /system.slice/kafka.service
└─93631 /usr/lib/jvm/jre-11-openjdk/bin/java -Xmx1G -Xms1G -server -...
...
sudo systemctl status zookeeper
**Output**
● zookeeper.service - Apache Zookeeper Server
Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2023-03-15 05:10:13 EDT; 50s ago
Main PID: 93247 (java)
Tasks: 32 (limit: 23699)
Memory: 72.2M
CGroup: /system.slice/zookeeper.service
└─93247 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMi...
...
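With both services running, you can optionally smoke-test the broker using the CLI tools shipped in the Kafka bin/ directory. This assumes the broker listens on localhost:9092 (the default in server.properties); the topic name demo-topic is just an example:

```shell
# Create a test topic on the local broker
sudo /usr/local/kafka-server/bin/kafka-topics.sh --create \
  --topic demo-topic --bootstrap-server localhost:9092 \
  --partitions 1 --replication-factor 1

# List topics to confirm the broker answered and the topic exists
sudo /usr/local/kafka-server/bin/kafka-topics.sh --list \
  --bootstrap-server localhost:9092
```

If the list command prints demo-topic, the broker is accepting requests.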
Install CMAK on Rocky Linux 8
CMAK (formerly known as Kafka Manager) is an open-source tool developed by Yahoo for managing Apache Kafka clusters. Clone the CMAK repository from GitHub:
# cd ~
# sudo git clone https://github.com/yahoo/CMAK.git
**Output**
Cloning into 'CMAK'...
remote: Enumerating objects: 6542, done.
remote: Counting objects: 100% (266/266), done.
remote: Compressing objects: 100% (144/144), done.
remote: Total 6542 (delta 150), reused 195 (delta 110), pack-reused 6276
Receiving objects: 100% (6542/6542), 3.96 MiB | 12.45 MiB/s, done.
Resolving deltas: 100% (4211/4211), done.
Configure Cluster Manager for Apache Kafka
Make the necessary configuration changes in the CMAK configuration file:
sudo vi ~/CMAK/conf/application.conf
Modify the cmak.zkhosts parameter to point to your Zookeeper host(s). You can specify multiple hosts separated by commas, using IP addresses or hostnames.
cmak.zkhosts="localhost:2181"
Save and close the file.
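For a multi-node Zookeeper ensemble, the same parameter takes a comma-separated list; the hostnames below are placeholders for your own servers:

```
cmak.zkhosts="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
```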
Create a zip file that can be used to deploy the application. This process downloads and compiles necessary files, which may take some time.
# cd ~/CMAK/
# ./sbt clean dist
Upon completion, you should see the following output:
**Output**
[info] Your package is ready in /root/CMAK/target/universal/cmak-3.0.0.7.zip
Navigate to the directory containing the zip file and extract it:
# cd /root/CMAK/target/universal
# unzip cmak-3.0.0.7.zip
# cd cmak-3.0.0.7
3. Access CMAK Service
With the previous steps completed, you can run the Cluster Manager for Apache Kafka service on Rocky Linux 8:
bin/cmak
By default, the service runs on port 9000. Open your web browser and access the CMAK interface at http://ip-or-domain-name-of-server:9000. If you have a firewall enabled, allow external access to port 9000 and reload the firewall rules:
sudo firewall-cmd --zone=public --permanent --add-port=9000/tcp
sudo firewall-cmd --reload
The CMAK interface should now be visible.
Add Cluster From the CMAK
Add your Kafka cluster through the CMAK interface. Click "cluster" and then "add cluster".
Fill in the requested details, such as the Cluster Name, Zookeeper Hosts (separated by commas if you have multiple), and other relevant information based on your setup.
Create a Topic in the CMAK interface
Within your newly added Apache Kafka cluster on Rocky Linux 8, click "Topic" and then "create." Input the necessary details for the new topic, including the Replication Factor, Partitions, and any other required configurations. Click "Create" to finalize the topic creation.
Then, click on Cluster view to see your topics.
From there you can add topics, delete them, configure them, etc. The process to Install Apache Kafka on Rocky Linux 8 is now complete.
Conclusion
Apache Kafka is a powerful tool for handling high-throughput, fault-tolerant, and scalable messaging and event processing. By following these steps, you have successfully downloaded and installed Apache Kafka on Rocky Linux 8.
Alternative Solutions for Installing Apache Kafka on Rocky Linux 8
While the previous guide outlines a manual installation process, alternative methods exist that can simplify the deployment and management of Apache Kafka on Rocky Linux 8. Here are two such alternatives:
1. Using Docker Compose
Docker Compose allows you to define and manage multi-container Docker applications. We can use it to create a Kafka cluster with Zookeeper and CMAK, all in separate containers. This approach provides isolation, portability, and ease of management.
Explanation:
- Docker: Containerization technology that allows you to package an application and its dependencies into a standardized unit for software development.
- Docker Compose: A tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.
Steps:
1. Install Docker and Docker Compose: If you haven’t already, install Docker and Docker Compose on your Rocky Linux 8 server. Follow the official Docker documentation for installation instructions.
2. Create a docker-compose.yml file: Create a file named docker-compose.yml in a suitable directory. This file defines the Kafka, Zookeeper, and CMAK services:
version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:latest
    hostname: kafka
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
      - "9999:9999"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,BROKER://kafka:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,BROKER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: BROKER
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
  cmak:
    image: sheepkiller/kafka-manager:latest
    hostname: cmak
    container_name: cmak
    depends_on:
      - kafka
    ports:
      - "9000:9000"
    environment:
      ZK_HOSTS: "zookeeper:2181"
The two advertised listeners use different ports because two listeners cannot share one port: PLAINTEXT on 9092 for clients connecting from the host, and BROKER on 29092 for inter-broker traffic inside the Docker network.
3. Start the Kafka cluster: Navigate to the directory containing the docker-compose.yml file and run the following command:
docker-compose up -d
This command starts all the services defined in the docker-compose.yml file in detached mode (-d).
4. Access CMAK: Once the containers are running, access the CMAK interface in your browser at http://ip-or-domain-name-of-server:9000.
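Before opening the web interface, you can optionally confirm that the containers came up and that the broker answers. This assumes the container name kafka from the compose file above, a running Docker daemon, and that the cp-kafka image puts the kafka-topics tool on the PATH inside the container:

```shell
# Show the state of all services defined in docker-compose.yml
docker-compose ps

# Ask the broker (from inside its own container) to list topics;
# an empty or populated list confirms it is accepting requests
docker exec kafka kafka-topics --bootstrap-server localhost:9092 --list
```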
Advantages:
- Simplified deployment and management.
- Isolation of Kafka, Zookeeper, and CMAK.
- Portability across different environments.
2. Using Ansible
Ansible is an automation tool that can be used to provision and configure servers. We can create an Ansible playbook to automate the installation and configuration of Kafka, Zookeeper, and CMAK on Rocky Linux 8.
Explanation:
- Ansible: An open-source automation tool that simplifies application deployment, system configuration, and task automation.
- Playbook: A YAML file that contains a set of instructions for Ansible to execute.
Steps:
1. Install Ansible: Install Ansible on your control machine (the machine from which you’ll run the playbook). Follow the official Ansible documentation for installation instructions.
2. Create an Ansible playbook: Create a YAML file named kafka_install.yml (or any name you prefer) and define the tasks required to install and configure Kafka, Zookeeper, and CMAK:
---
- hosts: kafka_servers
  become: true
  tasks:
    - name: Install required packages
      dnf:
        name:
          - wget
          - git
          - unzip
          - java-11-openjdk
        state: present
    - name: Download Kafka
      get_url:
        url: https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz
        dest: /tmp/kafka_2.13-3.4.0.tgz
    - name: Create Kafka directory
      file:
        path: /usr/local/kafka-server
        state: directory
        owner: root
        group: root
        mode: '0755'
    - name: Extract Kafka
      command: tar -xvzf /tmp/kafka_2.13-3.4.0.tgz --strip 1 -C /usr/local/kafka-server
    - name: Create Zookeeper systemd unit file
      template:
        src: templates/zookeeper.service.j2
        dest: /etc/systemd/system/zookeeper.service
    - name: Create Kafka systemd unit file
      template:
        src: templates/kafka.service.j2
        dest: /etc/systemd/system/kafka.service
    - name: Reload systemd daemon
      systemd:
        daemon_reload: yes
    - name: Enable and start Zookeeper
      systemd:
        name: zookeeper
        state: started
        enabled: yes
    - name: Enable and start Kafka
      systemd:
        name: kafka
        state: started
        enabled: yes
    - name: Clone CMAK repository
      git:
        repo: https://github.com/yahoo/CMAK.git
        dest: /opt/cmak
    - name: Configure CMAK
      template:
        src: templates/application.conf.j2
        dest: /opt/cmak/conf/application.conf
    # Add tasks to build and run CMAK (omitted for brevity)
3. Create template files: Create Jinja2 template files for zookeeper.service, kafka.service, and application.conf. These templates are used to dynamically generate the configuration files based on your environment. Store them in a directory named templates within the same directory as the playbook. Example: templates/zookeeper.service.j2
[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
4. Configure your Ansible inventory: Define the kafka_servers group in your Ansible inventory file, listing the IP addresses or hostnames of the servers where you want to install Kafka.
5. Run the playbook: Execute the playbook from your control machine using the following command:
ansible-playbook kafka_install.yml -i <inventory_file>
Replace <inventory_file> with the path to your Ansible inventory file.
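A minimal INI-style inventory for the playbook above might look like this; the hostnames and the remote user are placeholders for your own environment:

```
[kafka_servers]
kafka1.example.com ansible_user=rocky
kafka2.example.com ansible_user=rocky
```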
Advantages:
- Automated installation and configuration.
- Idempotent execution (the playbook can be run multiple times without causing unintended changes).
- Centralized management of Kafka infrastructure.
These alternative solutions offer different approaches to simplifying the deployment and management of Apache Kafka on Rocky Linux 8, allowing you to choose the method that best suits your needs and technical expertise.