Best Steps To Install Apache Kafka on Rocky Linux 8

In this comprehensive guide, we will walk you through the process to Install Apache Kafka on Rocky Linux 8. Apache Kafka is a distributed, fault-tolerant, high-throughput streaming platform. It belongs to a family of technologies known as queuing, messaging, or streaming engines. Think of Kafka as the NoSQL equivalent to traditional queuing technologies, offering greater scalability and performance.

Follow the steps outlined below on the Orcacore website to install Apache Kafka on Rocky Linux 8.

Before proceeding, ensure you have a Rocky Linux 8 server. Log in as a non-root user with sudo privileges and set up a basic firewall. You can refer to our guide on Initial Server Setup with Rocky Linux 8 for assistance with this initial configuration.

1. Install Required Packages For Kafka

First, we need to prepare the server environment to Install Apache Kafka on Rocky Linux 8. Begin by updating and upgrading your local package index with the following command:

sudo dnf update -y && sudo dnf upgrade -y

Next, install the necessary packages, including the Java Development Kit (JDK), which Kafka requires to run.

sudo dnf install wget git unzip java-11-openjdk -y

This command installs wget for downloading files, git for cloning the CMAK repository later, unzip for extracting archive files, and java-11-openjdk which is the required Java Development Kit.
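
Once the packages are installed, you can quickly confirm that the JDK is available before continuing:

java -version

The output should report an OpenJDK 11 runtime.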

2. Install Apache Kafka on Rocky Linux 8

Now that the environment is prepared, we can download and install the latest release of Kafka.

Apache Kafka Download

Visit the Apache Kafka download page and locate the latest release under the Binary downloads section. Obtain the URL for the recommended release and use the wget command to download it to your server:

sudo wget https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz

Next, create a directory to house your Kafka installation under the /usr/local directory and navigate into it:

# sudo mkdir /usr/local/kafka-server
# cd /usr/local/kafka-server

Extract the downloaded Kafka archive into this directory:

sudo tar -xvzf ~/kafka_2.13-3.4.0.tgz --strip 1

The --strip 1 option removes the top-level directory from the archive when extracting.
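
You can verify the extraction by listing the installation directory; it should contain the familiar bin, config, and libs directories:

ls /usr/local/kafka-server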

Create Zookeeper Systemd Unit File

Zookeeper is crucial for managing Kafka clusters. It maintains the state of the Kafka cluster nodes and tracks topics, partitions, and other metadata. We’ll create a systemd unit file for Zookeeper to simplify service management.

Use your preferred text editor (like vi) to create the Zookeeper systemd unit file:

sudo vi /etc/systemd/system/zookeeper.service

Add the following content to the file:

[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file.

Create Systemd Unit File for Kafka

Now, create a systemd unit file for Apache Kafka itself. Again, use your text editor:

sudo vi /etc/systemd/system/kafka.service

Add the following content to the file. Important: Verify that the JAVA_HOME configuration is correctly set, or Kafka will fail to start.

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file.
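
If you are unsure which JAVA_HOME path applies to your system, one way to find the home directory of the installed JVM is to resolve the real path of the java binary:

dirname $(dirname $(readlink -f $(which java)))

The directory it prints can be used as the JAVA_HOME value in the unit file above.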

Reload the systemd daemon to apply the changes and then start and enable the Zookeeper and Kafka services:

# sudo systemctl daemon-reload
# sudo systemctl enable --now zookeeper
# sudo systemctl enable --now kafka

Verify that the Kafka and Zookeeper services are active and running:

sudo systemctl status kafka
**Output**
● kafka.service - Apache Kafka Server
   Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset: disabled)
   Active: **active** (**running**) since Wed 2023-03-15 05:10:19 EDT; 6s ago
     Docs: http://kafka.apache.org/documentation.html
 Main PID: 93631 (java)
    Tasks: 69 (limit: 23699)
   Memory: 321.2M
   CGroup: /system.slice/kafka.service
           └─93631 /usr/lib/jvm/jre-11-openjdk/bin/java -Xmx1G -Xms1G -server -...
...
sudo systemctl status zookeeper
**Output**
● zookeeper.service - Apache Zookeeper Server
   Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor preset: disabled)
   Active: **active** (**running**) since Wed 2023-03-15 05:10:13 EDT; 50s ago
 Main PID: 93247 (java)
    Tasks: 32 (limit: 23699)
   Memory: 72.2M
   CGroup: /system.slice/zookeeper.service
           └─93247 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMi...
...
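
With both services running, you can optionally run a quick smoke test using the command-line tools bundled with Kafka. The commands below create an example topic (named test-topic here purely for illustration) and then list all topics on the local broker:

/usr/local/kafka-server/bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic test-topic --partitions 1 --replication-factor 1
/usr/local/kafka-server/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list

If test-topic appears in the list, the broker is accepting client connections.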

Install CMAK on Rocky Linux 8

CMAK (formerly known as Kafka Manager) is an open-source tool developed by Yahoo for managing Apache Kafka clusters. Clone the CMAK repository from GitHub:

# cd ~
# sudo git clone https://github.com/yahoo/CMAK.git
**Output**
Cloning into 'CMAK'...
remote: Enumerating objects: 6542, done.
remote: Counting objects: 100% (266/266), done.
remote: Compressing objects: 100% (144/144), done.
remote: Total 6542 (delta 150), reused 195 (delta 110), pack-reused 6276
Receiving objects: 100% (6542/6542), 3.96 MiB | 12.45 MiB/s, done.
Resolving deltas: 100% (4211/4211), done.

Configure Cluster Manager for Apache Kafka

Make the necessary configuration changes in the CMAK configuration file:

sudo vi ~/CMAK/conf/application.conf

Modify the cmak.zkhosts parameter to point to your Zookeeper host(s). You can specify multiple hosts separated by commas. Use IP addresses or hostnames.

cmak.zkhosts="localhost:2181"
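
For a multi-node Zookeeper ensemble, list every host in the same setting, separated by commas. For example (hypothetical hostnames):

cmak.zkhosts="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"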

Save and close the file.

Create a zip file that can be used to deploy the application. This process downloads and compiles necessary files, which may take some time.

# cd ~/CMAK/
# ./sbt clean dist

Upon completion, you should see the following output:

**Output**
[info] Your package is ready in /root/CMAK/target/universal/cmak-3.0.0.7.zip

Navigate to the directory containing the zip file and extract it:

# cd /root/CMAK/target/universal
# unzip cmak-3.0.0.7.zip
# cd cmak-3.0.0.7

3. Access CMAK Service

With the previous steps completed, you can run the Cluster Manager for Apache Kafka service on Rocky Linux 8:

bin/cmak

By default, the service runs on port 9000. Open your web browser and access the CMAK interface at http://ip-or-domain-name-of-server:9000. If you have a firewall enabled, allow access to port 9000 externally:

sudo firewall-cmd --zone=public --permanent --add-port=9000/tcp
sudo firewall-cmd --reload

The CMAK interface should now be visible.
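
Keep in mind that bin/cmak runs in the foreground and stops when you close your session. If you prefer to manage CMAK with systemd like the other services, a minimal unit file sketch (assuming the cmak-3.0.0.7 path extracted above; adjust it to match your build) could look like this:

sudo vi /etc/systemd/system/cmak.service

[Unit]
Description=Cluster Manager for Apache Kafka (CMAK)
Requires=kafka.service
After=kafka.service

[Service]
Type=simple
WorkingDirectory=/root/CMAK/target/universal/cmak-3.0.0.7
ExecStart=/root/CMAK/target/universal/cmak-3.0.0.7/bin/cmak
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

After saving it, reload systemd and start the service with sudo systemctl daemon-reload followed by sudo systemctl enable --now cmak.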

Add Cluster From the CMAK

Add your Kafka cluster through the CMAK interface. Click "cluster" and then "add cluster".

Fill in the requested details, such as the Cluster Name, Zookeeper Hosts (separated by commas if you have multiple), and other relevant information based on your setup.

Create a Topic in the CMAK interface

Within your newly added Apache Kafka cluster on Rocky Linux 8, click "Topic" and then "create." Input the necessary details for the new topic, including the Replication Factor, Partitions, and any other required configurations. Click "Create" to finalize the topic creation.

Then, click on Cluster view to see your topics.

From there you can add topics, delete them, configure them, etc. The process to Install Apache Kafka on Rocky Linux 8 is now complete.
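
As a final check, you can also verify the new topic from the command line with the console producer and consumer shipped with Kafka. Type a few messages into the producer, press Ctrl+C to exit, and then read them back with the consumer (replace your-topic with the name of the topic you created in CMAK):

/usr/local/kafka-server/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic your-topic
/usr/local/kafka-server/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic your-topic --from-beginning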

Conclusion

Apache Kafka is a powerful tool for handling high-throughput, fault-tolerant, and scalable messaging and event processing. By following these steps, you have successfully downloaded and installed Apache Kafka on Rocky Linux 8 and set up CMAK to manage your cluster.

Alternative Solutions for Installing Apache Kafka on Rocky Linux 8

While the previous guide outlines a manual installation process, alternative methods exist that can simplify the deployment and management of Apache Kafka on Rocky Linux 8. Here are two such alternatives:

1. Using Docker Compose

Docker Compose allows you to define and manage multi-container Docker applications. We can use it to create a Kafka cluster with Zookeeper and CMAK, all in separate containers. This approach provides isolation, portability, and ease of management.

Explanation:

  • Docker: Containerization technology that allows you to package an application and its dependencies into a standardized unit for software development.
  • Docker Compose: A tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.

Steps:

  1. Install Docker and Docker Compose: If you haven’t already, install Docker and Docker Compose on your Rocky Linux 8 server. Follow the official Docker documentation for installation instructions.

  2. Create a docker-compose.yml file: Create a file named docker-compose.yml in a suitable directory. This file will define the Kafka, Zookeeper, and CMAK services.

    version: '3.8'
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:latest
        hostname: zookeeper
        container_name: zookeeper
        ports:
          - "2181:2181"
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
    
      kafka:
        image: confluentinc/cp-kafka:latest
        hostname: kafka
        container_name: kafka
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"
          - "9999:9999"
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,BROKER://kafka:29092
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,BROKER:PLAINTEXT
          KAFKA_INTER_BROKER_LISTENER_NAME: BROKER
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
          KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
    
      cmak:
        image: sheepkiller/kafka-manager:latest
        hostname: cmak
        container_name: cmak
        depends_on:
          - kafka
        ports:
          - "9000:9000"
        environment:
          ZK_HOSTS: "zookeeper:2181"
    
  3. Start the Kafka cluster: Navigate to the directory containing the docker-compose.yml file and run the following command:

    docker-compose up -d

    This command starts all the services defined in the docker-compose.yml file in detached mode (-d).

  4. Access CMAK: Once the containers are running, access the CMAK interface in your browser at http://ip-or-domain-name-of-server:9000.
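
Once the containers are up, you can also confirm that the broker is healthy by listing topics from inside the Kafka container (a quick check that assumes the kafka container name defined in the compose file above):

    docker exec -it kafka kafka-topics --bootstrap-server localhost:9092 --list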

Advantages:

  • Simplified deployment and management.
  • Isolation of Kafka, Zookeeper, and CMAK.
  • Portability across different environments.

2. Using Ansible

Ansible is an automation tool that can be used to provision and configure servers. We can create an Ansible playbook to automate the installation and configuration of Kafka, Zookeeper, and CMAK on Rocky Linux 8.

Explanation:

  • Ansible: An open-source automation tool that simplifies application deployment, system configuration, and task automation.
  • Playbook: A YAML file that contains a set of instructions for Ansible to execute.

Steps:

  1. Install Ansible: Install Ansible on your control machine (the machine from which you’ll run the playbook). Follow the official Ansible documentation for installation instructions.

  2. Create an Ansible playbook: Create a YAML file named kafka_install.yml (or any name you prefer) and define the tasks required to install and configure Kafka, Zookeeper, and CMAK.

    ---
    - hosts: kafka_servers
      become: true
      tasks:
        - name: Install required packages
          dnf:
            name:
              - wget
              - git
              - unzip
              - java-11-openjdk
            state: present
    
        - name: Download Kafka
          get_url:
            url: https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz
            dest: /tmp/kafka_2.13-3.4.0.tgz
    
        - name: Create Kafka directory
          file:
            path: /usr/local/kafka-server
            state: directory
            owner: root
            group: root
            mode: '0755'
    
        - name: Extract Kafka
          command: tar -xvzf /tmp/kafka_2.13-3.4.0.tgz --strip 1 -C /usr/local/kafka-server
          args:
            creates: /usr/local/kafka-server/bin/kafka-server-start.sh
    
        - name: Create Zookeeper systemd unit file
          template:
            src: templates/zookeeper.service.j2
            dest: /etc/systemd/system/zookeeper.service
    
        - name: Create Kafka systemd unit file
          template:
            src: templates/kafka.service.j2
            dest: /etc/systemd/system/kafka.service
    
        - name: Reload systemd daemon
          systemd:
            daemon_reload: yes
    
        - name: Enable and start Zookeeper
          systemd:
            name: zookeeper
            state: started
            enabled: yes
    
        - name: Enable and start Kafka
          systemd:
            name: kafka
            state: started
            enabled: yes
    
        - name: Clone CMAK repository
          git:
            repo: https://github.com/yahoo/CMAK.git
            dest: /opt/cmak
    
        - name: Configure CMAK
          template:
            src: templates/application.conf.j2
            dest: /opt/cmak/conf/application.conf
    
        # Add tasks to build and run CMAK (omitted for brevity)
    
  3. Create template files: Create Jinja2 template files for zookeeper.service, kafka.service, and application.conf. These templates will be used to dynamically generate the configuration files based on your environment. Store these templates in a directory named templates within the same directory as the playbook.

    Example: templates/zookeeper.service.j2

    [Unit]
    Description=Apache Zookeeper Server
    Requires=network.target remote-fs.target
    After=network.target remote-fs.target
    
    [Service]
    Type=simple
    ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
    ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
    Restart=on-abnormal
    
    [Install]
    WantedBy=multi-user.target
  4. Configure your Ansible inventory: Define the kafka_servers group in your Ansible inventory file, listing the IP addresses or hostnames of the servers where you want to install Kafka. An example inventory file is sketched after these steps.

  5. Run the playbook: Execute the playbook from your control machine using the following command:

    ansible-playbook kafka_install.yml -i <inventory_file>

    Replace <inventory_file> with the path to your Ansible inventory file.
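
An inventory file for the kafka_servers group might look like this (INI format, with hypothetical hostnames):

    [kafka_servers]
    kafka1.example.com
    kafka2.example.com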

Advantages:

  • Automated installation and configuration.
  • Idempotent execution (the playbook can be run multiple times without causing unintended changes).
  • Centralized management of Kafka infrastructure.

These alternative solutions offer different approaches to simplifying the deployment and management of Apache Kafka on Rocky Linux 8, allowing you to choose the method that best suits your needs and technical expertise.