Set up Apache Kafka on Centos 7: Best Setup Guide – OrcaCore

This guide on the Orcacore website will teach you how to set up Apache Kafka on Centos 7. Kafka is a distributed event-streaming platform used for real-time data pipelines, collecting big data, and performing real-time analysis. It is often used with in-memory microservices to ensure data durability, and it can feed events to Complex Event Processing (CEP) systems and IoT/IFTTT-style automation platforms. Its core strength is high-throughput, fault-tolerant, and scalable messaging.

Before you begin, ensure you are logged into your Centos 7 server as a non-root user with sudo privileges and have a basic firewall set up. You can find guides for these prerequisites on Orcacore: Initial Server Setup with Centos 7 and Setting Up a Firewall with firewalld on Centos 7. This initial setup is crucial for a secure and properly configured Kafka environment.

1. Install Required Packages For Kafka

First, prepare your Centos server for the Kafka installation. Update and upgrade your local package index using the following command:

sudo yum update -y && sudo yum upgrade -y

Next, install the necessary packages, including JDK, using this command:

sudo yum install wget git unzip java-11-openjdk -y

This command installs wget for downloading Kafka, git for potentially cloning CMAK, unzip for extracting the Kafka archive, and java-11-openjdk as Kafka requires a Java Runtime Environment.
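
To confirm that the Java runtime is available before continuing, you can print its version (the exact build number in the output will vary):

java -version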

2. Install Apache Kafka on Centos 7

Now, download and install the latest release of Kafka.

Download Kafka for Centos 7

Visit the Apache Kafka downloads page, locate the latest release, and obtain the binary downloads link. Use the wget command to download the recommended version:

sudo wget https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
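
Optionally, verify the integrity of the download. Apache publishes a .sha512 checksum file next to each release archive (assuming version 3.3.2 is still hosted at the same location), which you can compare against your local checksum by eye:

sudo wget https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz.sha512
sha512sum kafka_2.13-3.3.2.tgz
cat kafka_2.13-3.3.2.tgz.sha512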

Then, create a directory for Kafka under /usr/local and navigate into it:

sudo mkdir /usr/local/kafka-server && cd /usr/local/kafka-server

Extract the downloaded file into this directory:

sudo tar -xvzf ~/kafka_2.13-3.3.2.tgz --strip 1

The --strip 1 option removes the top-level directory in the archive, placing all the files directly into /usr/local/kafka-server.
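
You can confirm the result with a quick listing; the top level of /usr/local/kafka-server should now contain directories such as bin, config, and libs:

ls /usr/local/kafka-server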

Create Zookeeper Systemd Unit File

Create a Zookeeper systemd unit file to manage Zookeeper service actions (start, stop, restart).

Zookeeper is a centralized service for maintaining naming and configuration data and providing synchronization within distributed systems. It tracks the status of Kafka cluster nodes, topics, and partitions.

Create the zookeeper systemd unit file using a text editor like vi:

sudo vi /etc/systemd/system/zookeeper.service

Add the following content to the file:

[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file.

Create Systemd Unit File for Kafka

Now, create a systemd unit file for Kafka:

sudo vi /etc/systemd/system/kafka.service

Add the following content to the file:

Note: Ensure your JAVA_HOME configuration is correct, or Kafka will fail to start.

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file.

Reload the systemd daemon to apply the changes and start the services:

sudo systemctl daemon-reload
sudo systemctl enable --now zookeeper
sudo systemctl enable --now kafka

Verify that Kafka and Zookeeper services are active and running:

sudo systemctl status kafka
Output
● kafka.service - Apache Kafka Server
   Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2023-02-07 03:23:23 EST; 6s ago
     Docs: http://kafka.apache.org/documentation.html
 Main PID: 9077 (java)
   CGroup: /system.slice/kafka.service
           └─9077 /usr/lib/jvm/jre-11-openjdk/bin/java -Xmx1G -Xms1G -server
...
sudo systemctl status zookeeper
Output
● zookeeper.service - Apache Zookeeper Server
   Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2023-02-07 03:23:16 EST; 23s ago
 Main PID: 8685 (java)
   CGroup: /system.slice/zookeeper.service
           └─8685 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseM.
...
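
As an optional smoke test before installing CMAK, you can create a topic and push a message through it with the command-line tools bundled in the Kafka archive. The commands below assume the broker is listening on the default localhost:9092 and use test-topic as an example name:

/usr/local/kafka-server/bin/kafka-topics.sh --create --topic test-topic --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092
echo "hello kafka" | /usr/local/kafka-server/bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
/usr/local/kafka-server/bin/kafka-console-consumer.sh --topic test-topic --from-beginning --max-messages 1 --bootstrap-server localhost:9092

If the consumer prints hello kafka, the broker is accepting and serving messages.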

3. Install CMAK on Centos 7

CMAK (previously known as Kafka Manager) is an open-source tool for managing Apache Kafka clusters. Clone CMAK from GitHub:

cd ~
sudo git clone https://github.com/yahoo/CMAK.git
Output
Cloning into 'CMAK'...
remote: Enumerating objects: 6542, done.
remote: Counting objects: 100% (266/266), done.
remote: Compressing objects: 100% (143/143), done.
remote: Total 6542 (delta 150), reused 196 (delta 111), pack-reused 6276
Receiving objects: 100% (6542/6542), 3.96 MiB | 0 bytes/s, done.
Resolving deltas: 100% (4211/4211), done.

4. Configure CMAK on Centos 7

Make configuration changes in the CMAK config file:

sudo vi ~/CMAK/conf/application.conf

Set cmak.zkhosts to your Zookeeper host. You can list multiple Zookeeper hosts by comma-delimiting them, for example cmak.zkhosts="my.zookeeper.host.com:2181,other.zookeeper.host.com:2181"; host names can also be IP addresses. Since Zookeeper runs on the same server in this guide, use:

cmak.zkhosts="localhost:2181"

Save and close the file.

Next, build a deployable zip of the application with sbt. This process downloads dependencies and compiles the source, which may take some time.

cd ~/CMAK/
./sbt clean dist
Output
[info] Your package is ready in /root/CMAK/target/universal/cmak-3.0.0.7.zip

Navigate to the directory where the zip file is located and unzip it:

cd /root/CMAK/target/universal
unzip cmak-3.0.0.7.zip
cd cmak-3.0.0.7

Access CMAK Service

Run the Cluster Manager for Apache Kafka service:

bin/cmak

By default, it uses port 9000. Open your browser and go to http://ip-or-domain-name-of-server:9000. If your firewall is active, allow external access to the port:

sudo firewall-cmd --zone=public --permanent --add-port=9000/tcp
sudo firewall-cmd --reload

You should see the CMAK interface.
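
Note that bin/cmak runs in the foreground, so keep that terminal session open while you use the web interface. If port 9000 is already in use, the CMAK README documents a port override on the command line, so something along these lines should work (9001 is only an example):

bin/cmak -Dhttp.port=9001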

Cluster Manager for Apache Kafka

Add a Cluster from CMAK

Add your Kafka cluster from the CMAK interface. Click Cluster, then Add Cluster.

Add Cluster

Fill out the form with the requested details (Cluster Name, Zookeeper Hosts, etc.). If you have multiple Zookeeper Hosts, separate them with commas. Configure other details as needed.

Cluster Info

Create a Topic in the CMAK interface

From your newly added cluster, click Topic and then Create. Enter the necessary details for the new topic (Replication Factor, Partitions, etc.), then click Create.

Create Topic

Click on Cluster view to see your topics.

Topic Summary

From here, you can add, delete, and configure topics. With that, Kafka and its management interface are fully set up on Centos 7.

Conclusion

Kafka offers numerous benefits: it is easy to work with, reliable, fault-tolerant, and scales effectively. With this Orcacore guide, you can now set up Apache Kafka on Centos 7.

We hope you found this guide on installing and configuring Apache Kafka on Centos 7 helpful. You might also be interested in these articles:

Install Bash 5 on Centos 7

Install OpenJDK 19 on Centos 7

How to Install CMake on Fedora 40 or 39

Install Wine on Fedora 39

How to Install FirewallD GUI on Fedora 40/39 Linux

How to Install WordPress on Fedora

Alternative Solutions for Setting up Apache Kafka on Centos 7

While the above guide provides a solid foundation for setting up Apache Kafka on Centos 7, there are alternative approaches that may be more suitable depending on your specific needs and infrastructure. Here are two different methods:

1. Using Docker and Docker Compose

Docker provides a containerization platform, and Docker Compose helps define and manage multi-container Docker applications. Using Docker simplifies the deployment process and ensures consistency across different environments.

Explanation:

Docker packages Kafka and its dependencies into a container. This eliminates the need to manually install and configure components like Java and Zookeeper on your Centos 7 server. Docker Compose allows you to define the services (Kafka, Zookeeper) and their configurations in a single docker-compose.yml file. This makes it easy to start, stop, and scale the Kafka cluster.

Steps:

  1. Install Docker and Docker Compose:

    Follow the official Docker documentation to install Docker and Docker Compose on your Centos 7 server.

  2. Create a docker-compose.yml file:

    Create a docker-compose.yml file with the following content (adjust versions as needed):

    version: '3.7'
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:7.3.0
        hostname: zookeeper
        container_name: zookeeper
        ports:
          - "2181:2181"
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
    
      kafka:
        image: confluentinc/cp-kafka:7.3.0
        hostname: kafka
        container_name: kafka
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"
          - "9999:9999"
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,BROKER://kafka:29092
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,BROKER:PLAINTEXT
          KAFKA_INTER_BROKER_LISTENER_NAME: BROKER
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
          KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
    
  3. Start the Kafka cluster:

    Navigate to the directory containing the docker-compose.yml file and run:

    docker-compose up -d

    This command will download the necessary images and start the Zookeeper and Kafka containers in detached mode.

  4. Verify the deployment:

    Check the container status using:

    docker ps
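
    Both containers should show an Up status. As a further check, the Confluent images include the Kafka command-line tools, so creating a test topic from inside the broker container should work (the topic name here is just an example):

    docker exec -it kafka kafka-topics --create --topic test-topic --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092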

Benefits:

  • Simplified Deployment: Reduces the complexity of setting up Kafka and Zookeeper.
  • Consistency: Ensures consistent environments across development, testing, and production.
  • Scalability: Easily scale the Kafka cluster by adjusting the number of containers.
  • Isolation: Containers provide isolation, preventing conflicts between Kafka and other applications on the server.

2. Using Ansible for Automated Configuration

Ansible is an open-source automation tool that can be used to provision and configure software on remote servers.

Explanation:

Instead of manually executing commands on the server, you define the desired state of the Kafka installation in an Ansible playbook. Ansible then automatically performs the necessary steps to achieve that state. This includes installing packages, configuring files, starting services, and more.

Steps:

  1. Install Ansible:

    Install Ansible on your control machine (the machine from which you will run the playbook). On Centos 7, the ansible package is provided by the EPEL repository:

    sudo yum install epel-release -y
    sudo yum install ansible -y
  2. Create an Ansible Inventory File:

    Create an inventory file (e.g., hosts) that lists the target Centos 7 server:

    [kafka]
    kafka_server ansible_host=<your_server_ip> ansible_user=<your_user> ansible_become=true

    Replace <your_server_ip> with the IP address of your Centos 7 server and <your_user> with the username that has sudo privileges.

  3. Create an Ansible Playbook (kafka.yml):

    Create an Ansible playbook named kafka.yml with the following content (adjust versions and configurations as needed):

    ---
    - hosts: kafka
      become: true
      tasks:
        - name: Update yum cache
          yum:
            name: '*'
            state: latest
            update_cache: yes
    
        - name: Install required packages
          yum:
            name:
              - wget
              - git
              - unzip
              - java-11-openjdk
            state: present
    
        - name: Create kafka directory
          file:
            path: /usr/local/kafka-server
            state: directory
            owner: root
            group: root
            mode: '0755'
    
        - name: Download Kafka
          get_url:
            url: https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
            dest: /tmp/kafka_2.13-3.3.2.tgz
    
        - name: Extract Kafka
          command: tar -xvzf /tmp/kafka_2.13-3.3.2.tgz --strip 1 -C /usr/local/kafka-server
          args:
            creates: /usr/local/kafka-server/bin/kafka-server-start.sh
    
        - name: Create Zookeeper systemd unit file
          template:
            src: templates/zookeeper.service.j2
            dest: /etc/systemd/system/zookeeper.service
            owner: root
            group: root
            mode: '0644'
          notify:
            - restart zookeeper
    
        - name: Create Kafka systemd unit file
          template:
            src: templates/kafka.service.j2
            dest: /etc/systemd/system/kafka.service
            owner: root
            group: root
            mode: '0644'
          notify:
            - restart kafka
    
        - name: Reload systemd daemon
          systemd:
            daemon_reload: yes
    
        - name: Enable and start Zookeeper
          systemd:
            name: zookeeper
            enabled: yes
            state: started
    
        - name: Enable and start Kafka
          systemd:
            name: kafka
            enabled: yes
            state: started
    
      handlers:
        - name: restart zookeeper
          systemd:
            name: zookeeper
            state: restarted
    
        - name: restart kafka
          systemd:
            name: kafka
            state: restarted
    

    Note: You will also need to create the templates directory and the zookeeper.service.j2 and kafka.service.j2 files with the same content as the original systemd unit files, but using Jinja2 templating for dynamic values if needed.
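
    For reference, the playbook above assumes a directory layout along these lines next to kafka.yml (the file names are only suggestions and must match the src: paths used in the template tasks):

    .
    ├── hosts
    ├── kafka.yml
    └── templates/
        ├── zookeeper.service.j2
        └── kafka.service.j2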

  4. Run the Playbook:

    Execute the Ansible playbook from your control machine:

    ansible-playbook -i hosts kafka.yml
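
    To confirm the result, you can query the service state on the target host with an Ansible ad-hoc command (standard Ansible modules and flags, shown here as a suggestion):

    ansible kafka -i hosts -b -m command -a "systemctl is-active kafka zookeeper"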

Benefits:

  • Automation: Automates the entire Kafka setup process, reducing manual effort and potential errors.
  • Idempotency: Ansible ensures that the desired state is achieved regardless of the current state of the server. Running the playbook multiple times will have the same result.
  • Configuration Management: Provides a centralized way to manage Kafka configurations.
  • Scalability: Easily deploy Kafka to multiple servers by adding them to the inventory file.

These alternative solutions offer different advantages and may be more suitable depending on your specific requirements. Docker provides a simple and consistent deployment environment, while Ansible offers powerful automation and configuration management capabilities. By understanding these options, you can choose the best approach for setting up Apache Kafka on Centos 7.
