Install Apache Cassandra on Rocky Linux 9: Best Distributed Database

In this guide, we’ll install Apache Cassandra on Rocky Linux 9 and configure a basic Cassandra cluster. Apache Cassandra is an open-source NoSQL database designed to handle large volumes of data across multiple nodes. Its distributed architecture provides high availability, scalability, and fault tolerance, which makes it a common choice for applications that need real-time data processing and big data analytics. Cassandra is maintained under the Apache Software Foundation.

Key features of Apache Cassandra:

  • Decentralized: No single point of failure.
  • Scalable: Easily add more nodes to increase capacity.
  • Fault-Tolerant: Data is automatically replicated across multiple nodes.
  • High Availability: Minimal downtime due to node failures.
  • Tunable Consistency: Choose the level of consistency that best suits your application’s needs.
  • CQL (Cassandra Query Language): A SQL-like language for interacting with the database.

Steps To Install and Configure Apache Cassandra on Rocky Linux 9

Before you begin, ensure you have a Rocky Linux 9 server with a non-root user with sudo privileges. You can follow a guide like "Initial Server Setup with Rocky Linux 9" to set this up.

1. Install Dependencies for Cassandra

Apache Cassandra is written in Java, so you need to have Java installed on your server.

First, update your local package index:

sudo dnf update -y

Next, install OpenJDK:

sudo dnf install java-1.8.0-openjdk-devel -y

Verify the Java installation:

java -version
Output:
openjdk version "1.8.0_352"
OpenJDK Runtime Environment (build 1.8.0_352-b08)
OpenJDK 64-Bit Server VM (build 25.352-b08, mixed mode)
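
The version string printed above can also be checked from a script. The following is a minimal sketch, assuming the Cassandra 4.1 packages (the 41x repository used in this guide) run on Java 8 or Java 11. The helper name java_major_version is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the major Java version for a string such as
# "1.8.0_352" (legacy scheme, Java 8) or "11.0.20" (modern scheme, Java 11).
java_major_version() {
  jmv_v="$1"
  case "$jmv_v" in
    1.*) jmv_v="${jmv_v#1.}" ;;   # drop the legacy "1." prefix
  esac
  # Keep only the part before the first ".", "_" or "-".
  echo "${jmv_v%%[._-]*}"
}

# Only query Java when it is actually installed.
if command -v java >/dev/null 2>&1; then
  # "java -version" writes to stderr; the version is the first quoted field.
  ver="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  major="$(java_major_version "$ver")"
  case "$major" in
    8|11) echo "Java $major detected" ;;
    *)    echo "Java $major detected; confirm your Cassandra release supports it" ;;
  esac
else
  echo "java not found on PATH"
fi
```

With the output shown above, the helper would report major version 8.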

Now, install Python 3 and pip:

sudo dnf install python3 python3-pip -y

Finally, install cqlsh using pip:

sudo pip3 install cqlsh

2. Install Apache Cassandra on Rocky Linux 9

This section covers adding the Cassandra repository, installing the package, and running Cassandra as a systemd service.

Add Apache Cassandra Repository

Create a yum repository file:

sudo vi /etc/yum.repos.d/cassandra.repo

Add the following content:

[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/41x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS

Save and close the file.
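
If you prefer not to use an interactive editor, the same repository file can be generated from a script. This is a sketch only: the helper name write_cassandra_repo is hypothetical, and the file content is copied from the listing above. It stages the file in a temporary location so that nothing under /etc is touched until you review it.

```shell
# Hypothetical helper: write the repository definition shown above to the
# path given as the first argument.
write_cassandra_repo() {
  cat > "$1" <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/41x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Stage the file in a temporary location first so it can be reviewed.
staged="$(mktemp)"
write_cassandra_repo "$staged"
cat "$staged"

# To put it in place afterwards (requires sudo), something like:
#   sudo install -m 644 "$staged" /etc/yum.repos.d/cassandra.repo
```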

Update the system:

sudo dnf update -y

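Note: If you encounter GPG key errors, switch the system crypto policy to LEGACY and reboot. This re-enables older algorithms system-wide, so apply it only if the key check actually fails:

sudo update-crypto-policies --set LEGACY
sudo reboot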

Install and configure Cassandra

Install Cassandra:

sudo dnf install cassandra -y

Create a systemd Unit File for Cassandra

Create a systemd unit file:

sudo vi /etc/systemd/system/cassandra.service

Add the following content:

[Unit]
Description=Apache Cassandra
After=network.target

[Service]
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always

[Install]
WantedBy=multi-user.target

Save and close the file.

Reload the daemon:

sudo systemctl daemon-reload

Start and enable the Cassandra service:

sudo systemctl start cassandra
sudo systemctl enable cassandra

Verify that Cassandra is running:

sudo systemctl status cassandra

The output should report the service as active (running).
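
Cassandra runs on the JVM and can take a little while to come up after the service is started, so a single status check may run too early. A small retry loop is one way to wait for it. The sketch below is an illustration only: the function name retry and the attempt counts are assumptions, not something the Cassandra packages provide.

```shell
# Hypothetical helper: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  retry_attempts="$1"
  retry_delay="$2"
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt is still to come.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# Example: poll every 5 seconds, up to 12 times, until the service is active.
#   retry 12 5 systemctl is-active --quiet cassandra
```

The function returns 0 as soon as the command succeeds and 1 if every attempt fails, so it can be used directly in an if statement.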

Alternatively, use the nodetool status command:

nodetool status

In the output, the code "UN" at the start of a node’s line means the node is Up and in the Normal state.
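
To check the "UN" state from a script rather than by eye, the nodetool output can be filtered. This is a minimal sketch, assuming the state code is the first whitespace-separated field on each node line. The helper name count_up_normal is hypothetical.

```shell
# Hypothetical helper: read "nodetool status" output on stdin and print how
# many node lines start with the UN (Up/Normal) state code.
count_up_normal() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Example (only meaningful on a host where nodetool is installed):
#   nodetool status | count_up_normal
```

On a single-node install, a result of 1 would indicate that the local node is up and normal.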

Log in to Default Cassandra Cluster (cqlsh shell)

Connect to the Cassandra cluster:

cqlsh

On success, you land at the cqlsh> prompt.

Note: If you get an ImportError: cannot import name 'authproviderhandling' from 'cqlshlib', follow these steps:

Find the path to cqlshlib:

find /usr/lib/ -name cqlshlib

Example result (the Python version in the path depends on your system):

/usr/lib/python3.6/site-packages/cqlshlib

Export the path:

export PYTHONPATH=$PYTHONPATH:/usr/lib/python3.6/site-packages/

Then, re-run the cqlsh command.
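
The find and export steps above can be combined so that the path does not have to be copied by hand. The sketch below assumes cqlshlib lives somewhere under /usr/lib, as in the example. The helper name cqlshlib_parent is hypothetical.

```shell
# Hypothetical helper: print the directory that contains cqlshlib, searching
# under the root given as the first argument (default: /usr/lib).
cqlshlib_parent() {
  cp_root="${1:-/usr/lib}"
  cp_hit="$(find "$cp_root" -type d -name cqlshlib 2>/dev/null | head -n 1)"
  [ -n "$cp_hit" ] || return 1
  dirname "$cp_hit"
}

# Append the directory to PYTHONPATH only if cqlshlib was found.
if parent="$(cqlshlib_parent /usr/lib)"; then
  export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$parent"
  echo "PYTHONPATH=$PYTHONPATH"
else
  echo "cqlshlib not found under /usr/lib"
fi
```

The export only affects the current shell session, so the lines would need to go into a shell profile to persist.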

Now that you have installed Apache Cassandra on Rocky Linux 9, let’s configure your default cluster.

3. Change Apache Cassandra Default Cluster

You can change the default cluster name. From the cqlsh shell, run:

cqlsh> UPDATE system.local SET cluster_name = 'Orca Cluster' WHERE KEY = 'local';

Remember to replace "Orca Cluster" with your desired name.

Exit the cqlsh shell:

cqlsh> exit

Edit the cassandra.yaml file:

sudo vi /etc/cassandra/default.conf/cassandra.yaml

Find the cluster_name directive and update it:

cluster_name: 'Orca Cluster'

Save and close the file.
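
The same cluster_name edit can be scripted instead of made in an editor. The following is a sketch under stated assumptions: the helper name set_cluster_name is hypothetical, the setting is assumed to sit on a single top-level line starting with cluster_name:, and the demo runs on a scratch file rather than the real cassandra.yaml.

```shell
# Hypothetical helper: set_cluster_name FILE NAME
# Rewrites the top-level cluster_name line in FILE. NAME must not contain
# "|", "&", "\" or a single quote, since it is pasted into a sed expression.
set_cluster_name() {
  scn_file="$1"
  scn_name="$2"
  scn_tmp="$(mktemp)" || return 1
  if sed "s|^cluster_name:.*|cluster_name: '$scn_name'|" "$scn_file" > "$scn_tmp"; then
    # Write back through the original path to keep its owner and mode.
    cat "$scn_tmp" > "$scn_file"
    scn_status=$?
  else
    scn_status=1
  fi
  rm -f "$scn_tmp"
  return "$scn_status"
}

# Demo on a scratch file (the shipped default name is 'Test Cluster').
demo="$(mktemp)"
printf "cluster_name: 'Test Cluster'\n" > "$demo"
set_cluster_name "$demo" "Orca Cluster"
cat "$demo"
```

To apply it to the real file, one option is to run the function on a copy of /etc/cassandra/default.conf/cassandra.yaml and then copy the result back with sudo.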

Restart Cassandra:

sudo systemctl restart cassandra

Log in again to verify the change:

cqlsh

You should see the new cluster name in the cqlsh output.
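
The name can also be read without opening cqlsh, using nodetool describecluster. This is a minimal sketch, assuming that command prints the name on a line of the form "Name: <cluster name>". The helper name cluster_name_from_describe is hypothetical.

```shell
# Hypothetical helper: read "nodetool describecluster" output on stdin and
# print the value of the first "Name:" line.
cluster_name_from_describe() {
  sed -n 's/^[[:space:]]*Name:[[:space:]]*//p' | head -n 1
}

# Example (only meaningful on a host where nodetool is installed):
#   nodetool describecluster | cluster_name_from_describe
```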

Conclusion

You have now learned how to install Apache Cassandra on Rocky Linux 9 and configure a basic cluster. Cassandra’s distributed architecture makes it well suited to large-scale databases that need high availability and fault tolerance, including big data applications and real-time analytics.

Now, let’s explore some alternative ways to achieve the same result.

Alternative Installation Methods

While the above method is a standard approach, here are two alternative methods for installing and managing Apache Cassandra on Rocky Linux 9: using Docker and using Kubernetes (specifically Minikube for local testing).

1. Installing Apache Cassandra with Docker

Docker provides a containerized environment, simplifying deployment and ensuring consistency across different systems. Here’s how to install Apache Cassandra on Rocky Linux 9 using Docker:

a. Install Docker:

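If you don’t have Docker installed, add the Docker CE repository and install the engine. On Rocky Linux 9, the plain docker package typically resolves to podman-docker, which provides no docker service to start, so use the following commands:

sudo dnf install dnf-plugins-core -y
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install docker-ce docker-ce-cli containerd.io -y
sudo systemctl start docker
sudo systemctl enable docker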

b. Pull the Cassandra Docker Image:

sudo docker pull cassandra

This command downloads the official Cassandra image from Docker Hub.

c. Run the Cassandra Container:

sudo docker run -d --name my-cassandra -p 9042:9042 cassandra

This command creates and starts a Cassandra container named "my-cassandra". The -d flag runs the container in detached mode (in the background). The -p 9042:9042 flag maps the container’s port 9042 (Cassandra’s default port) to the host’s port 9042.
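
The command above uses the latest image and keeps data inside the container. A common variation is to pin an image tag and mount a named volume for the data. The sketch below only prints such a command so it can be reviewed before running. The helper name cassandra_run_cmd, the 4.1 tag, and the volume name are assumptions chosen for illustration.

```shell
# Hypothetical helper: cassandra_run_cmd NAME TAG HOST_PORT
# Prints (does not run) a docker command for a pinned image tag with a named
# volume mounted at /var/lib/cassandra, the image's data directory.
cassandra_run_cmd() {
  printf 'docker run -d --name %s -p %s:9042 -v %s-data:/var/lib/cassandra cassandra:%s\n' \
    "$1" "$3" "$1" "$2"
}

cassandra_run_cmd my-cassandra 4.1 9042
```

Printing the command first keeps the helper free of side effects. Prefix the printed line with sudo to run it the same way as the other Docker commands in this section.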

d. Access Cassandra:

You can now connect to your Cassandra instance using cqlsh from within the container. First, open a shell in the container:

sudo docker exec -it my-cassandra bash

Then, run cqlsh:

cqlsh

You are now inside the Cassandra CQL shell. You can perform the same cluster name change as described in the main guide.

Explanation:

Docker simplifies the installation process by providing a pre-configured environment for Cassandra. This eliminates the need to manually install dependencies like Java and configure the system. Docker ensures consistency and portability across different environments.

2. Installing Apache Cassandra with Kubernetes (Minikube)

Kubernetes is a container orchestration platform that automates the deployment, scaling, and management of containerized applications. While a full Kubernetes cluster is complex, Minikube provides a lightweight, single-node Kubernetes environment for local development and testing. This option lets you run Apache Cassandra on Rocky Linux 9 in a cloud-native manner and prepares you for larger deployments.

a. Install Minikube and kubectl:

Follow the official Minikube documentation to install Minikube and kubectl (the Kubernetes command-line tool) on your Rocky Linux 9 system. This typically involves downloading binaries and adding them to your PATH.

b. Start Minikube:

minikube start

This command starts the Minikube cluster.

c. Deploy Cassandra using a Helm Chart:

Helm is a package manager for Kubernetes and must be installed separately (see the official Helm documentation). We’ll use a Helm chart to deploy Cassandra. First, add the DataStax Helm repository:

helm repo add datastax https://helm.datastax.com/public
helm repo update

Now, install the Cassandra chart:

helm install my-cassandra datastax/cass-operator --set cassandra.clusterName=my-cluster --set cassandra.datacenterName=dc1 --set cassandra.size=1

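This command is intended to deploy a Cassandra cluster named "my-cluster" with a single datacenter "dc1" and a single node, with the size parameter controlling the number of Cassandra nodes. Note that cass-operator is an operator chart: depending on the chart version, the cluster itself may need to be declared separately as a CassandraDatacenter resource, so check the chart’s documented values before relying on these --set options.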

d. Access Cassandra:

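After a few minutes, Cassandra should be running. To access it, you’ll need to port-forward the Cassandra service to your local machine. The exact service name depends on how the chart names its resources; run kubectl get services to confirm it, then adjust the command below if needed: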

kubectl port-forward service/my-cassandra-dc1-service 9042:9042 -n default

This command forwards port 9042 on the Cassandra service to port 9042 on your local machine.

Now, you can connect to Cassandra using cqlsh:

cqlsh localhost 9042

You are now connected to the Cassandra cluster running in Minikube. You can perform the same cluster name change as described in the main guide.

Explanation:

Kubernetes provides a robust platform for managing Cassandra deployments. Helm simplifies the deployment process using pre-configured charts. Kubernetes automates scaling, rolling updates, and other operational tasks, making it easier to manage Cassandra in production environments. This method provides valuable experience with cloud-native deployments. While more complex initially, it offers significant long-term benefits for managing and scaling Cassandra.
