Install Apache Cassandra on Rocky Linux 9: Best Distributed Database
In this guide, we’ll explore how to install Apache Cassandra on Rocky Linux 9 and configure a basic Cassandra cluster. Apache Cassandra is an open-source NoSQL database designed for handling massive amounts of data across multiple nodes. Its distributed architecture provides high availability, scalability, and fault tolerance, making it a popular choice for applications requiring real-time data processing and big data analytics. Cassandra is managed by the Apache Software Foundation.
Key features of Apache Cassandra:
- Decentralized: No single point of failure.
- Scalable: Easily add more nodes to increase capacity.
- Fault-Tolerant: Data is automatically replicated across multiple nodes.
- High Availability: Minimal downtime due to node failures.
- Tunable Consistency: Choose the level of consistency that best suits your application’s needs.
- CQL (Cassandra Query Language): A SQL-like language for interacting with the database.
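To make "tunable consistency" a little more concrete: with a replication factor RF, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > RF. The sketch below only does that arithmetic in shell; the function names are illustrative and are not part of any Cassandra tool.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not a Cassandra CLI).

# quorum RF: print the number of replicas that make up a QUORUM.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# overlaps RF R W: succeed when every read set intersects every write set.
overlaps() {
  [ $(( $2 + $3 )) -gt "$1" ]
}

rf=3
q=$(quorum "$rf")
if overlaps "$rf" "$q" "$q"; then
  echo "RF=$rf: QUORUM reads + QUORUM writes ($q + $q) overlap"
fi
if ! overlaps "$rf" 1 1; then
  echo "RF=$rf: ONE reads + ONE writes (1 + 1) may miss the latest write"
fi
```

Lower levels such as ONE trade that overlap guarantee for lower latency, which is the choice the feature list refers to.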
Steps To Install and Configure Apache Cassandra on Rocky Linux 9
Before you begin, ensure you have a Rocky Linux 9 server with a non-root user with sudo privileges. You can follow a guide like "Initial Server Setup with Rocky Linux 9" to set this up.
1. Install Dependencies for Cassandra
Apache Cassandra is written in Java, so you need to have Java installed on your server.
First, update your local package index:
sudo dnf update -y
Next, install OpenJDK:
sudo dnf install java-1.8.0-openjdk-devel -y
Verify the Java installation:
java -version
**Output**
openjdk version "1.8.0_352"
OpenJDK Runtime Environment (build 1.8.0_352-b08)
OpenJDK 64-Bit Server VM (build 25.352-b08, mixed mode)
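If you want to check the Java version from a script rather than by eye, the following is a minimal sketch. It assumes the version string is the first quoted value printed by java -version, as in the output above; java_major is a hypothetical helper name. Check the Cassandra documentation for the Java versions your release supports.

```shell
#!/bin/sh
# java_major VERSION: print the major release number.
# Handles the legacy "1.8.0_352" scheme and the newer "11.0.17" scheme.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; take the first quoted string it prints.
  ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  echo "Detected Java major version: $(java_major "$ver")"
else
  echo "java not found on PATH"
fi
```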
Now, install Python 3 and pip:
sudo dnf install python3 python3-pip -y
Finally, install cqlsh using pip:
sudo pip3 install cqlsh
2. Install Apache Cassandra on Rocky Linux 9
This section covers adding the Cassandra repository, installing the package, and running it as a systemd service.
Add Apache Cassandra Repository
Create a yum repository file:
sudo vi /etc/yum.repos.d/cassandra.repo
Add the following content:
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/41x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
Save and close the file.
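If you prefer not to use an interactive editor, the same file can be produced from a script. This is a sketch only: write_cassandra_repo is a hypothetical helper, and the final install step is left commented out so nothing under /etc is touched until you have reviewed the output.

```shell
#!/bin/sh
# write_cassandra_repo PATH: write the repository definition shown above to PATH.
write_cassandra_repo() {
  cat > "$1" <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/41x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Write to a scratch file first so the content can be reviewed.
tmp_repo=$(mktemp)
write_cassandra_repo "$tmp_repo"
cat "$tmp_repo"

# Then copy it into place with root privileges (commented out on purpose):
# sudo install -m 0644 "$tmp_repo" /etc/yum.repos.d/cassandra.repo
```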
Update the system:
sudo dnf update -y
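Note: If you encounter GPG key errors, you can switch the system crypto policy to LEGACY. This relaxes cryptographic settings system-wide, so consider reverting with sudo update-crypto-policies --set DEFAULT once the installation is complete:
sudo update-crypto-policies --set LEGACY
sudo reboot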
Install and configure Cassandra
Install Cassandra:
sudo dnf install cassandra -y
Create a systemd Unit File for Cassandra
Create a systemd unit file:
sudo vi /etc/systemd/system/cassandra.service
Add the following content:
[Unit]
Description=Apache Cassandra
After=network.target
[Service]
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always
[Install]
WantedBy=multi-user.target
Save and close the file.
Reload the daemon:
sudo systemctl daemon-reload
Start and enable the Cassandra service:
sudo systemctl start cassandra
sudo systemctl enable cassandra
Verify that Cassandra is running:
sudo systemctl status cassandra
The output should report the service as active (running).
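An active service does not necessarily mean Cassandra is already accepting client connections, since startup can take a while. The sketch below polls the CQL port (9042, the default mentioned later in this guide) a few times. wait_for_port is a hypothetical helper; it relies on bash's /dev/tcp feature, so it assumes bash is installed. ATTEMPTS and DELAY are illustrative variables you can raise on slow hosts.

```shell
#!/bin/sh
# wait_for_port HOST PORT ATTEMPTS DELAY_SECONDS
# Returns 0 as soon as a TCP connection succeeds, 1 if every attempt fails.
wait_for_port() {
  host=$1; port=$2; attempts=$3; delay=$4
  i=1
  while [ "$i" -le "$attempts" ]; do
    # /dev/tcp is a bash feature, so the probe runs in a bash subprocess.
    if bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

if wait_for_port 127.0.0.1 9042 "${ATTEMPTS:-5}" "${DELAY:-1}"; then
  echo "CQL port 9042 is accepting connections"
else
  echo "CQL port 9042 is not reachable yet"
fi
```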
Alternatively, use the nodetool status command:
nodetool status
Each node is listed with a two-letter state. "UN" means the node is up and normal.
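For scripted checks, you can count the nodes that report UN. This is a minimal sketch that assumes node lines in the nodetool status output begin with the two-letter state; count_up_normal is a hypothetical helper name.

```shell
#!/bin/sh
# count_up_normal: read `nodetool status` output on stdin and print how many
# lines have "UN" (Up/Normal) as their first field.
count_up_normal() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

if command -v nodetool >/dev/null 2>&1; then
  up=$(nodetool status 2>/dev/null | count_up_normal)
  echo "Nodes reporting UN: $up"
else
  echo "nodetool not found on PATH"
fi
```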
Log in to Default Cassandra Cluster (cqlsh shell)
Connect to the Cassandra cluster:
cqlsh
On success, cqlsh prints a connection banner and leaves you at the cqlsh> prompt.
Note: If you get ImportError: cannot import name 'authproviderhandling' from 'cqlshlib', follow these steps.
Find the path to cqlshlib:
find /usr/lib/ -name cqlshlib
The path obtained (example; the Python version in yours may differ):
/usr/lib/python3.6/site-packages/cqlshlib
Export the parent directory of that path:
export PYTHONPATH=$PYTHONPATH:/usr/lib/python3.6/site-packages/
Then re-run the cqlsh command.
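The find and export steps can be combined so that the path does not have to be copied by hand. The sketch below assumes the first cqlshlib directory found under /usr/lib/ is the one you want; cqlshlib_parent is a hypothetical helper name.

```shell
#!/bin/sh
# cqlshlib_parent ROOT: print the directory that contains the first cqlshlib
# package found under ROOT, or print nothing if none is found.
cqlshlib_parent() {
  dir=$(find "$1" -type d -name cqlshlib 2>/dev/null | head -n 1)
  if [ -n "$dir" ]; then
    dirname "$dir"
  fi
}

parent=$(cqlshlib_parent /usr/lib/)
if [ -n "$parent" ]; then
  # Append to PYTHONPATH, avoiding a leading ":" when it was empty.
  export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$parent"
  echo "PYTHONPATH=$PYTHONPATH"
else
  echo "cqlshlib not found under /usr/lib/"
fi
```

The export only affects the current shell session; add it to your shell profile if you need it to persist.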
Now that you have installed Apache Cassandra on Rocky Linux 9, let’s configure your default cluster.
3. Change Apache Cassandra Default Cluster
You can change the default cluster name. From the cqlsh shell, run:
cqlsh> UPDATE system.local SET cluster_name = 'Orca Cluster' WHERE KEY = 'local';
Remember to replace "Orca Cluster" with your desired name.
Exit the cqlsh shell:
cqlsh> exit
Edit the cassandra.yaml file:
sudo vi /etc/cassandra/default.conf/cassandra.yaml
Find the cluster_name directive and update it:
cluster_name: 'Orca Cluster'
Save and close the file.
Restart Cassandra:
sudo systemctl restart cassandra
Log in again to verify the change:
cqlsh
You should see the new cluster name in the cqlsh output.
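To compare the configured and the running cluster name without reading the banner, something like the following sketch can help. It assumes cluster_name appears as a top-level key in cassandra.yaml, as in the edit above; yaml_cluster_name is a hypothetical helper, while cqlsh -e runs a single statement and exits.

```shell
#!/bin/sh
# yaml_cluster_name FILE: print the value of the top-level cluster_name key,
# with surrounding single or double quotes removed.
yaml_cluster_name() {
  sed -n "s/^cluster_name:[[:space:]]*//p" "$1" \
    | sed "s/^['\"]//; s/['\"][[:space:]]*\$//" \
    | head -n 1
}

conf=/etc/cassandra/default.conf/cassandra.yaml
if [ -r "$conf" ]; then
  echo "cassandra.yaml cluster_name: $(yaml_cluster_name "$conf")"
fi

if command -v cqlsh >/dev/null 2>&1; then
  # Ask the running node which cluster name it has stored.
  cqlsh -e "SELECT cluster_name FROM system.local;" || echo "cqlsh could not connect"
fi
```

If the two values differ, Cassandra will typically refuse to start, so it is worth checking both before a restart.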
Conclusion
You have now learned how to install Apache Cassandra on Rocky Linux 9 and configure a basic cluster. Cassandra’s distributed architecture makes it well suited to large-scale databases that need high availability and fault tolerance, including big data applications and real-time analytics.
Now, let’s explore some alternative ways to achieve the same result.
Alternative Installation Methods
While the above method is a standard approach, here are two alternative methods for installing and managing Apache Cassandra on Rocky Linux 9: using Docker and using Kubernetes (specifically Minikube for local testing).
1. Installing Apache Cassandra with Docker
Docker provides a containerized environment, simplifying deployment and ensuring consistency across different systems. Here’s how to run Apache Cassandra on Rocky Linux 9 using Docker:
a. Install Docker:
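If you don’t have Docker installed, be aware that on Rocky Linux 9 the docker package in the default repositories typically resolves to podman-docker, which does not ship a docker service. To install Docker Engine instead, add Docker's CentOS repository (commonly used on Rocky Linux) and install from it:
sudo dnf install dnf-plugins-core -y
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install docker-ce docker-ce-cli containerd.io -y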
sudo systemctl start docker
sudo systemctl enable docker
b. Pull the Cassandra Docker Image:
sudo docker pull cassandra
This command downloads the official Cassandra image from Docker Hub.
c. Run the Cassandra Container:
sudo docker run -d --name my-cassandra -p 9042:9042 cassandra
This command creates and starts a Cassandra container named "my-cassandra". The -d flag runs the container in detached mode (in the background). The -p 9042:9042 flag maps the container’s port 9042 (Cassandra’s default CQL port) to the host’s port 9042.
d. Access Cassandra:
You can now connect to your Cassandra instance using cqlsh from within the container. First, open a shell inside the container:
sudo docker exec -it my-cassandra bash
Then, run cqlsh:
cqlsh
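You are now inside the Cassandra CQL shell. The UPDATE statement from the main guide works here too, but the configuration file lives inside the container and there is no systemd, so a restart is done with sudo docker restart my-cassandra rather than systemctl.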
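A simpler route for containers is to set the cluster name when the container is first created. The sketch below assumes the CASSANDRA_CLUSTER_NAME environment variable described in the official image's documentation on Docker Hub; start_named_cassandra and the DOCKER variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# DOCKER holds the command used to reach Docker. Override it (for example
# DOCKER=docker) if your user does not need sudo.
DOCKER="${DOCKER:-sudo docker}"

# start_named_cassandra CONTAINER CLUSTER_NAME
# Starts the cassandra image with the cluster name passed via the environment.
start_named_cassandra() {
  # $DOCKER is intentionally unquoted so "sudo docker" splits into two words.
  $DOCKER run -d --name "$1" -p 9042:9042 \
    -e CASSANDRA_CLUSTER_NAME="$2" cassandra
}

# Example call (commented out so nothing starts by accident):
# start_named_cassandra my-cassandra "Orca Cluster"
```

A container named my-cassandra already exists if you followed the steps above, so remove it first (sudo docker rm -f my-cassandra) or pick a different container name.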
Explanation:
Docker simplifies the installation process by providing a pre-configured environment for Cassandra. This eliminates the need to manually install dependencies like Java and configure the system. Docker ensures consistency and portability across different environments.
2. Installing Apache Cassandra with Kubernetes (Minikube)
Kubernetes is a container orchestration platform that automates the deployment, scaling, and management of containerized applications. Operating a full Kubernetes cluster is complex, so Minikube provides a lightweight, single-node Kubernetes environment for local development and testing. This option lets you run Apache Cassandra on Rocky Linux 9 in a cloud-native manner and prepares you for larger deployments.
a. Install Minikube and kubectl:
Follow the official Minikube documentation to install Minikube and kubectl (the Kubernetes command-line tool) on your Rocky Linux 9 system. This typically involves downloading binaries and adding them to your PATH.
b. Start Minikube:
minikube start
This command starts the Minikube cluster.
c. Deploy Cassandra using a Helm Chart:
Helm is a package manager for Kubernetes, and it must be installed separately (see the official Helm documentation). We’ll use a Helm chart to deploy Cassandra. First, add the DataStax Helm repository (which provides a Cassandra chart):
helm repo add datastax https://helm.datastax.com/public
helm repo update
Now, install the Cassandra chart:
helm install my-cassandra datastax/cass-operator --set cassandra.clusterName=my-cluster --set cassandra.datacenterName=dc1 --set cassandra.size=1
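This command is intended to deploy a Cassandra cluster named "my-cluster" with a single datacenter "dc1" and a single node, with the size parameter controlling the number of Cassandra nodes. Note that cass-operator is an operator chart: it installs the controller that manages Cassandra, and clusters are normally declared through a CassandraDatacenter resource. Whether the --set values above are honored depends on the chart version, so inspect them with helm show values datastax/cass-operator before relying on them.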
d. Access Cassandra:
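After a few minutes, Cassandra should be running. To access it, port-forward the Cassandra service to your local machine. The service name below is the one used in this guide; confirm the actual name with kubectl get svc before running it: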
kubectl port-forward service/my-cassandra-dc1-service 9042:9042 -n default
This command forwards port 9042 on the Cassandra service to port 9042 on your local machine.
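If the port-forward fails because the service name differs in your deployment, you can look for services that expose port 9042. This sketch assumes the default column layout of kubectl get svc (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE); services_with_port is a hypothetical helper name.

```shell
#!/bin/sh
# services_with_port PORT: read `kubectl get svc` output on stdin and print the
# names of services whose PORT(S) column lists PORT.
services_with_port() {
  awk -v port="$1" 'NR > 1 && $5 ~ ("(^|,)" port "(:[0-9]+)?/") { print $1 }'
}

if command -v kubectl >/dev/null 2>&1; then
  # Show pod state first, then the candidate services for port-forwarding.
  kubectl get pods -n default
  kubectl get svc -n default | services_with_port 9042
else
  echo "kubectl not found on PATH"
fi
```

Use one of the printed names in place of the service name in the port-forward command.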
Now, you can connect to Cassandra using cqlsh:
cqlsh localhost 9042
You are now connected to the Cassandra cluster running in Minikube. You can perform the same cluster name change as described in the main guide.
Explanation:
Kubernetes provides a robust platform for managing Cassandra deployments. Helm simplifies the deployment process using pre-configured charts. Kubernetes automates scaling, rolling updates, and other operational tasks, making it easier to manage Cassandra in production environments. This method provides valuable experience with cloud-native deployments. While more complex initially, it offers significant long-term benefits for managing and scaling Cassandra.