Install Apache Kafka on AlmaLinux 9 with Best Steps

In this tutorial, we will guide you through installing Apache Kafka on AlmaLinux 9. Apache Kafka is a powerful, open-source, distributed streaming platform ideal for building real-time, event-driven applications. Orcacore provides this comprehensive guide to help you set up Kafka effectively.

Kafka combines three key capabilities: publishing and subscribing to streams of records, storing those streams durably, and processing them as they occur.

Kafka is also a distributed platform: it operates as a fault-tolerant, highly available cluster capable of spanning multiple servers and even data centers. Kafka topics are partitioned and replicated in a way that enables scaling to serve high volumes of concurrent consumers without impacting performance. As Apache.org puts it, "Kafka will perform the same whether you have 50 KB or 50 TB of persistent storage on the server."
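Partitioning is what lets a topic scale: each record key maps deterministically to one partition, so consumers can divide partitions among themselves while ordering is preserved per key. The Python sketch below illustrates the idea only; Kafka's real default partitioner hashes keys with murmur2, and md5 here is just a stand-in for a stable hash:

```python
import hashlib

def pick_partition(key: str, num_partitions: int) -> int:
    """Illustrative only: map a record key to a partition by hashing.

    Kafka's default partitioner uses murmur2; md5 is used here purely
    to demonstrate a stable key -> partition mapping.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Records with the same key always land on the same partition, which is
# how Kafka preserves per-key ordering within a topic.
print(pick_partition("user-42", 6) == pick_partition("user-42", 6))  # → True
```

Because the mapping depends only on the key and the partition count, any producer computes the same placement without coordination.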

Before we begin to Install Apache Kafka on AlmaLinux 9, ensure you have a non-root user with sudo privileges and a basic firewall configured on your server. You can follow our guide on Initial Server Setup with AlmaLinux 9 for assistance.

1. Install Required Packages For Apache Kafka

First, prepare your AlmaLinux 9 server for the Kafka installation. Update your local package index and upgrade installed packages (with dnf, update is simply an alias for upgrade, so one command suffices):

sudo dnf upgrade -y

Next, install the necessary packages, including the Java Development Kit (JDK), using the command below:

sudo dnf install wget git unzip java-11-openjdk -y

2. Set up Apache Kafka on AlmaLinux 9

Now, download and configure the latest release of Kafka.

Download Kafka

Visit the Apache Kafka downloads page and locate the latest release under Binary downloads, then download it with wget. The commands below use version 3.3.2; substitute the current version number if it has been superseded, as older releases move from downloads.apache.org to archive.apache.org:

sudo wget https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz

Create a directory for Kafka under /usr/local and navigate into it (note that cd is a shell builtin, so it is not run with sudo):

sudo mkdir /usr/local/kafka-server && cd /usr/local/kafka-server

Extract the downloaded Kafka archive into this directory:

sudo tar -xvzf ~/kafka_2.13-3.3.2.tgz --strip-components=1

Create Zookeeper Systemd Unit File

Create a Zookeeper systemd unit file to manage Zookeeper as a service. Zookeeper is a centralized service that maintains configuration data and provides synchronization within distributed systems, tracking the status of Kafka cluster nodes, topics, and partitions.

Create the zookeeper.service file using your favorite text editor (e.g., vi):

sudo vi /etc/systemd/system/zookeeper.service

Add the following content to the file:

[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file.

Create Systemd Unit File for Kafka

Create a systemd unit file for Kafka to manage it as a service. Use vi or your preferred text editor:

sudo vi /etc/systemd/system/kafka.service

Add the following content to the file, ensuring your JAVA_HOME configuration is correct:

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file.

Reload the systemd daemon to apply the changes and start the Zookeeper and Kafka services:

sudo systemctl daemon-reload
sudo systemctl enable --now zookeeper
sudo systemctl enable --now kafka

Verify that Kafka and Zookeeper services are active and running:

sudo systemctl status kafka
**Output**
● kafka.service - Apache Kafka Server
     Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset:>
     Active: active (running) since Thu 2023-01-26 02:12:14 EST; 7s ago
       Docs: http://kafka.apache.org/documentation.html
   Main PID: 74791 (java)
      Tasks: 69 (limit: 23609)
     Memory: 332.3M
        CPU: 7.875s
     CGroup: /system.slice/kafka.service
...
sudo systemctl status zookeeper
**Output**
● zookeeper.service - Apache Zookeeper Server
     Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor pre>
     Active: active (running) since Thu 2023-01-26 02:12:09 EST; 42s ago
   Main PID: 74408 (java)
      Tasks: 32 (limit: 23609)
     Memory: 72.1M
        CPU: 2.979s
     CGroup: /system.slice/zookeeper.service
...
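With both services active, you can smoke-test the broker using the CLI scripts bundled with Kafka. The paths below assume the /usr/local/kafka-server layout used above, with the broker listening on its default port 9092:

```shell
# Create a test topic (replication factor 1, since this is a single broker)
/usr/local/kafka-server/bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic smoke-test --partitions 3 --replication-factor 1

# Publish one message...
echo "hello kafka" | /usr/local/kafka-server/bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 --topic smoke-test

# ...and read it back (the consumer exits after one message)
/usr/local/kafka-server/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 --topic smoke-test \
  --from-beginning --max-messages 1
```

If the consumer prints the message back, the broker is storing and serving records correctly.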

3. Install Kafka Manager (CMAK) on AlmaLinux 9

CMAK (previously known as Kafka Manager) is an open-source tool for managing Apache Kafka clusters. Clone the CMAK repository from GitHub:

cd ~
sudo git clone https://github.com/yahoo/CMAK.git
**Output**
Cloning into 'CMAK'...
remote: Enumerating objects: 6542, done.
remote: Counting objects: 100% (266/266), done.
remote: Compressing objects: 100% (143/143), done.
remote: Total 6542 (delta 150), reused 196 (delta 111), pack-reused 6276
Receiving objects: 100% (6542/6542), 3.96 MiB | 12.41 MiB/s, done.
Resolving deltas: 100% (4211/4211), done.

4. Configure Cluster Manager for Apache Kafka

Make configuration changes in the CMAK config file (Kafka Manager File). Open the file with vi:

sudo vi ~/CMAK/conf/application.conf

Change cmak.zkhosts="my.zookeeper.host.com:2181" to reflect your Zookeeper host(s). You can specify multiple Zookeeper hosts, separated by commas: cmak.zkhosts="localhost:2181,other.zookeeper.host.com:2181". Use IP addresses if necessary.

cmak.zkhosts="localhost:2181"

Save and close the file.

Create a zip file for deploying the application:

cd ~/CMAK/
sudo ./sbt clean dist

This process may take some time as files are downloaded and compiled.

**Output**
[info] Your package is ready in /root/CMAK/target/universal/cmak-3.0.0.7.zip

Navigate to the directory containing the zip file, unzip it, and change to the extracted directory:

cd /root/CMAK/target/universal
unzip cmak-3.0.0.7.zip
cd cmak-3.0.0.7

5. Access Kafka Manager Service

Run the Cluster Manager for Apache Kafka service on AlmaLinux 9:

bin/cmak

By default, CMAK listens on port 9000. If your firewall is running, allow external access to port 9000 and reload the rules, then open your browser and navigate to http://ip-or-domain-name-of-server:9000:

sudo firewall-cmd --zone=public --permanent --add-port=9000/tcp
sudo firewall-cmd --reload

You should see the Kafka Manager interface.
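Note that bin/cmak runs in the foreground and stops when your session ends. If you want CMAK supervised like Kafka and Zookeeper, a systemd unit along these lines can work; the ExecStart path (including the cmak-3.0.0.7 version directory) is an assumption based on the build output above, so adjust it to your layout:

```ini
[Unit]
Description=Cluster Manager for Apache Kafka (CMAK)
After=network.target kafka.service

[Service]
Type=simple
# Assumed path from the sbt build output above -- adjust to your layout
ExecStart=/root/CMAK/target/universal/cmak-3.0.0.7/bin/cmak
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
```

Save it as /etc/systemd/system/cmak.service, then run sudo systemctl daemon-reload followed by sudo systemctl enable --now cmak.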

Kafka Manager – Clusters

Add a Cluster from CMAK

Add your Kafka cluster by clicking Cluster, then Add Cluster.

Kafka Manager – Add Cluster

Fill in the requested details (Cluster Name, Zookeeper Hosts, etc.). If you have multiple Zookeeper hosts, separate them with commas. Complete the remaining fields based on your needs.

Kafka Manager – Cluster Info

Create a Topic in the CMAK Interface

From your newly added cluster, click Topic, then Create. Enter the details for the new topic (Replication Factor, Partitions, etc.) and click Create.

Then, click on Cluster view to see your topics.

Kafka Manager – Topic Summary

From there, you can add, delete, and configure topics.

Conclusion

You have successfully learned how to install Apache Kafka on AlmaLinux 9. Apache Kafka is designed to handle large volumes of data in real time, allowing systems to process and transfer data between different applications efficiently and reliably.


Alternative Solutions to Installing Apache Kafka on AlmaLinux 9

While the above steps detail a manual installation process, here are two alternative methods for deploying Apache Kafka on AlmaLinux 9. These methods aim to simplify and automate the setup, providing quicker and more manageable deployments.

1. Using Docker Compose

Docker Compose is a tool for defining and running multi-container Docker applications. Using Docker Compose to deploy Kafka and Zookeeper can significantly simplify the process, encapsulating all dependencies and configurations within a single, declarative file. This approach is especially beneficial for development and testing environments.

Explanation:

Docker Compose allows you to define your application’s services, networks, and volumes in a docker-compose.yml file. This file specifies the Docker images to use, environment variables, port mappings, and dependencies between services. When you run docker-compose up, Docker Compose automatically builds or pulls the necessary images, creates the defined networks and volumes, and starts the containers in the correct order.

Example docker-compose.yml file:

version: '3.7'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "2181:2181"

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

networks:
  default:
    name: kafka-network

Steps:

  1. Install Docker and Docker Compose: Follow the official Docker documentation to install Docker and Docker Compose on your AlmaLinux 9 system.

  2. Create docker-compose.yml file: Create a file named docker-compose.yml in a directory of your choice and paste the above configuration.

  3. Start the services: Navigate to the directory containing the docker-compose.yml file and run the following command:

    docker-compose up -d

This command will start the Zookeeper and Kafka containers in detached mode. You can then interact with your Kafka instance running within the Docker containers.
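Once the containers are up, you can verify the broker from inside the Kafka container. The Confluent images ship the Kafka CLI tools on the PATH (without the .sh suffix), and the service name kafka matches the compose file above:

```shell
# List topics to confirm the broker is answering
docker-compose exec kafka kafka-topics --bootstrap-server localhost:9092 --list

# Follow the broker logs if something looks wrong
docker-compose logs -f kafka
```

An empty topic list (with no error) means the broker started and can reach Zookeeper.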

2. Using Ansible

Ansible is an open-source automation tool that allows you to provision, configure, and manage infrastructure as code. Using Ansible to deploy Kafka offers a more robust and scalable solution, particularly suitable for production environments.

Explanation:

Ansible uses playbooks, which are YAML files containing a series of tasks to be executed on target hosts. These tasks can include installing packages, configuring files, starting services, and more. By creating an Ansible playbook for Kafka deployment, you can automate the entire installation and configuration process, ensuring consistency and repeatability across multiple servers.

Code Example (Simplified Ansible Playbook):

---
- hosts: kafka_servers
  become: true
  tasks:
    - name: Install required packages
      dnf:
        name:
          - wget
          - git
          - unzip
          - java-11-openjdk
        state: present

    - name: Download Kafka
      get_url:
        url: https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
        dest: /tmp/kafka_2.13-3.3.2.tgz

    - name: Create Kafka directory
      file:
        path: /usr/local/kafka-server
        state: directory
        owner: root
        group: root

    - name: Extract Kafka
      command: tar -xvzf /tmp/kafka_2.13-3.3.2.tgz --strip 1 -C /usr/local/kafka-server
      args:
        creates: /usr/local/kafka-server/bin/kafka-server-start.sh

    - name: Create Zookeeper systemd unit file
      template:
        src: zookeeper.service.j2
        dest: /etc/systemd/system/zookeeper.service

    - name: Create Kafka systemd unit file
      template:
        src: kafka.service.j2
        dest: /etc/systemd/system/kafka.service

    - name: Reload systemd daemon
      systemd:
        daemon_reload: yes

    - name: Enable and start Zookeeper
      systemd:
        name: zookeeper
        state: started
        enabled: yes

    - name: Enable and start Kafka
      systemd:
        name: kafka
        state: started
        enabled: yes

Steps:

  1. Install Ansible: Install Ansible on your control machine (the machine from which you will run the playbook).

    sudo dnf install ansible -y
  2. Configure Ansible Inventory: Create an Ansible inventory file (e.g., hosts) and add the IP addresses or hostnames of your Kafka servers.

  3. Create Playbook: Create an Ansible playbook (e.g., kafka_install.yml) similar to the example above. You’ll also need to create the zookeeper.service.j2 and kafka.service.j2 template files based on the content provided in the original article.

  4. Run Playbook: Execute the Ansible playbook:

    ansible-playbook -i hosts kafka_install.yml

This command will run the playbook on the specified Kafka servers, automating the installation and configuration process.
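For reference, a minimal inventory file for step 2 might look like the sketch below; the group name kafka_servers matches the playbook's hosts: line, and the addresses are placeholders to replace with your own servers:

```ini
[kafka_servers]
192.0.2.10
192.0.2.11
```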

These alternative solutions provide more streamlined and manageable approaches to deploying Apache Kafka on AlmaLinux 9, particularly for larger or more complex deployments. Using Docker Compose is excellent for development and testing, while Ansible is better suited for production environments due to its scalability and configuration management capabilities. These different methods can help you Install Apache Kafka on AlmaLinux 9 effectively.
