Deploy a Highly Available Apache Cassandra Cluster

Architecture

This reference architecture shows a 6-node deployment of an Apache Cassandra cluster running on Oracle Cloud Infrastructure compute instances.

[Illustration: cassandra-oci.eps]

The architecture has the following components:

Region
An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

All the components in this architecture are deployed in a single region.
Availability domains
Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.
Fault domains
A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.
Virtual cloud network (VCN) and subnets
A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
Apache Cassandra cluster
This architecture shows an Apache Cassandra cluster that consists of three seed nodes and three non-seed nodes running on Oracle Cloud Infrastructure Compute instances. The nodes are distributed across the fault domains within a single availability domain. All the compute instances are attached to a single public subnet.
Internet gateway
The internet gateway in this architecture allows traffic between the public subnet and the public internet.
Security lists
For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

This architecture includes ingress rules for TCP ports 7000, 7001, 7199, 9042, and 9160. Apache Cassandra uses port 7000 for communication between the nodes of a cluster (or port 7001 if SSL is enabled) and port 7199 for JMX. Port 9042 is the CQL native transport port that clients connect to, and port 9160 is the legacy Thrift client port.
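As a rough illustration of the ingress rules listed above, the following Python sketch builds one rule per port. The rule dictionaries are a simplified, hypothetical representation (not the Oracle Cloud Infrastructure API schema), and the source CIDR `10.0.0.0/16` is an assumed value chosen only for the example.

```python
# TCP ports named in the architecture description.
CASSANDRA_TCP_PORTS = [7000, 7001, 7199, 9042, 9160]


def build_ingress_rules(source_cidr, ports=CASSANDRA_TCP_PORTS):
    """Return one simplified TCP ingress rule per port.

    The dictionary layout is hypothetical and only meant to show the
    source / protocol / destination-port triple that each rule carries.
    """
    return [
        {"source": source_cidr, "protocol": "TCP", "destination_port": port}
        for port in ports
    ]


if __name__ == "__main__":
    # 10.0.0.0/16 is an assumed CIDR, not a value from the architecture.
    for rule in build_ingress_rules("10.0.0.0/16"):
        print(
            f"allow {rule['protocol']} from {rule['source']} "
            f"to port {rule['destination_port']}"
        )
```

Restricting the source to a narrow CIDR rather than `0.0.0.0/0` is a design choice worth considering here, because the nodes in this architecture sit in a public subnet.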
Route table
Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

Recommendations

Your requirements might differ from the architecture described here. Use the following recommendations as a starting point.

Compute shape and OS
The Terraform template that's provided for this architecture deploys compute instances running Oracle Linux 7.8. Choose an appropriate shape for the compute instances depending on your requirements. The more memory an Apache Cassandra node has, the better its read performance. A higher number of CPUs translates to better write performance.
VCN
When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.
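The two CIDR recommendations above can be checked before you create the VCN. The following is a minimal sketch that uses Python's standard `ipaddress` module. It assumes that "standard private IP address space" means the RFC 1918 IPv4 ranges, and the function name and example networks are hypothetical.

```python
import ipaddress

# RFC 1918 private IPv4 ranges (assumed meaning of "standard private space").
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def validate_vcn_cidr(candidate, existing_networks):
    """Return a list of problems found with a candidate IPv4 CIDR block."""
    problems = []
    network = ipaddress.ip_network(candidate)

    # Check 1: the block must sit inside one of the private ranges.
    if not any(network.subnet_of(block) for block in RFC1918_BLOCKS):
        problems.append(f"{candidate} is outside the private address space")

    # Check 2: the block must not overlap any network you plan to connect to.
    for other in existing_networks:
        if network.overlaps(ipaddress.ip_network(other)):
            problems.append(f"{candidate} overlaps {other}")

    return problems


if __name__ == "__main__":
    # Hypothetical on-premises and peer networks.
    connected = ["192.168.0.0/24", "10.0.1.0/24"]
    for cidr in ("10.0.0.0/16", "10.1.0.0/16", "8.8.8.0/24"):
        print(cidr, validate_vcn_cidr(cidr, connected) or "ok")
```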

After you create a VCN, you can change, add, and remove its CIDR blocks.

When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

Use a regional subnet.
Security
- Use Oracle Cloud Guard to proactively monitor and maintain the security of your resources in Oracle Cloud Infrastructure. Cloud Guard uses detector recipes that you can define to examine your resources for security weaknesses and to monitor operators and users for risky activities. When any misconfiguration or insecure activity is detected, Cloud Guard recommends corrective actions and assists with taking those actions, based on responder recipes that you can define.
- For resources that require maximum security, Oracle recommends that you use security zones. A security zone is a compartment associated with an Oracle-defined recipe of security policies that are based on best practices. For example, the resources in a security zone must not be accessible from the public internet and they must be encrypted using customer-managed keys. When you create and update resources in a security zone, Oracle Cloud Infrastructure validates the operations against the policies in the security-zone recipe, and denies operations that violate any of the policies.

Considerations

When you implement this architecture, consider the following factors:

Scalability
This architecture deploys one Apache Cassandra seed node and one non-seed node in each fault domain. You might need more nodes to meet your application’s performance or high-availability requirements.

You can scale the Apache Cassandra cluster horizontally by adding more compute instances. You can distribute the seed nodes across the fault domains.
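One way to reason about horizontal scaling is to plan the placement before you add instances. The sketch below assigns seed and non-seed nodes to the three fault domains in round-robin order, so each fault domain keeps a similar share of each role. The node naming scheme is hypothetical, and the function is an illustration, not part of the provided Terraform template.

```python
# Each availability domain has three fault domains.
FAULT_DOMAINS = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]


def plan_placement(seed_count, non_seed_count, fault_domains=FAULT_DOMAINS):
    """Assign seed and non-seed nodes to fault domains in round-robin order."""
    placement = []
    for role, count in (("seed", seed_count), ("non-seed", non_seed_count)):
        for index in range(count):
            placement.append(
                {
                    # Hypothetical naming scheme, used only for readability.
                    "name": f"cassandra-{role}-{index + 1}",
                    "role": role,
                    "fault_domain": fault_domains[index % len(fault_domains)],
                }
            )
    return placement


if __name__ == "__main__":
    # The layout described in this architecture: 3 seeds and 3 non-seeds.
    for node in plan_placement(seed_count=3, non_seed_count=3):
        print(node["name"], node["role"], node["fault_domain"])
```

Calling `plan_placement(3, 6)` would show how three additional non-seed nodes spread evenly across the same fault domains.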

You can scale the cluster vertically by changing the shape of each compute instance. Using a shape with a higher core count increases the memory allocated to the compute instance and its network bandwidth.
Application availability
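In this architecture, compute instances that perform the same tasks are distributed redundantly across multiple fault domains. This design avoids a single point of failure at the fault-domain level. Because all the nodes run in a single availability domain, the design doesn't protect against the loss of that availability domain.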

After the architecture is deployed, you can connect to the public IP address of each Apache Cassandra node by using SSH tools such as PuTTY or Git Bash. You can use the Cassandra Query Language (CQL) for DDL and DML operations on the Apache Cassandra database.
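To make the DDL and DML distinction concrete, the following sketch assembles a small CQL script in Python. The keyspace name `demo`, the `users` table, and the replication settings are all hypothetical values chosen for illustration; pick a replication strategy and factor that match your own cluster.

```python
KEYSPACE = "demo"  # hypothetical keyspace name

STATEMENTS = [
    # DDL: create a keyspace (replication settings are assumed values).
    f"CREATE KEYSPACE IF NOT EXISTS {KEYSPACE} "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}",
    # DDL: create a table in that keyspace.
    f"CREATE TABLE IF NOT EXISTS {KEYSPACE}.users "
    "(user_id uuid PRIMARY KEY, name text)",
    # DML: insert a row and read it back.
    f"INSERT INTO {KEYSPACE}.users (user_id, name) VALUES (uuid(), 'Ada')",
    f"SELECT user_id, name FROM {KEYSPACE}.users",
]


def render_script(statements):
    """Join CQL statements into a script, one terminated statement per line."""
    return "".join(statement + ";\n" for statement in statements)


if __name__ == "__main__":
    print(render_script(STATEMENTS), end="")
```

You could save the output to a file and run it on a node with `cqlsh -f`, after connecting over SSH as described above.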
Cost
A bare metal shape provides better read and write performance. If your application doesn’t need high performance, you can select a VM shape based on the cores, memory, and network bandwidth that you need for your database. You can start with a 1-core shape for the Apache Cassandra nodes, and change the shape later if you need more performance, memory, or network bandwidth.

Deploy

The code required to deploy this reference architecture is available in GitHub. You can pull the code into Oracle Cloud Infrastructure Resource Manager with a single click, create the stack, and deploy it. Alternatively, download the code from GitHub to your computer, customize the code, and deploy the architecture by using the Terraform CLI.

Deploy by using Oracle Cloud Infrastructure Resource Manager:
1. Click the Deploy to Oracle Cloud button.
  If you aren't already signed in, enter the tenancy and user credentials.
2. Review and accept the terms and conditions.
3. Select the region where you want to deploy the stack.
4. Follow the on-screen prompts and instructions to create the stack.
5. After creating the stack, click Terraform Actions, and select Plan.
6. Wait for the job to be completed, and review the plan.
  To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.
7. If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
Deploy by using the Terraform CLI:
1. Go to GitHub.
2. Clone or download the repository to your local computer.
3. Follow the instructions in the README document.
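The README in the repository is the authoritative guide for the CLI path. As a rough orientation only, a generic Terraform workflow usually runs `init`, `plan`, and `apply` in that order. The sketch below wraps those commands in Python with a dry-run default; the working directory is a hypothetical placeholder, and the repository may require extra steps (such as setting variables) that aren't shown here.

```python
import subprocess

# Generic Terraform workflow. Assumption: the repository follows this usual
# order; check its README for any additional required steps.
TERRAFORM_STEPS = [
    ["terraform", "init"],
    ["terraform", "plan", "-out=tfplan"],
    ["terraform", "apply", "tfplan"],
]


def run_workflow(workdir, dry_run=True):
    """Print each Terraform command and run it in workdir unless dry_run."""
    processed = []
    for command in TERRAFORM_STEPS:
        print("$", " ".join(command))
        if not dry_run:
            # check=True stops the workflow at the first failing command.
            subprocess.run(command, cwd=workdir, check=True)
        processed.append(command)
    return processed


if __name__ == "__main__":
    # "." stands in for the directory where you cloned the repository.
    run_workflow(".", dry_run=True)
```

Saving the plan to a file and applying that file mirrors the Plan and Apply steps of the Resource Manager path, so you review exactly what is later applied.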

Change Log

This log lists significant changes:

February 4, 2021

Added recommendations about using Oracle Cloud Guard and security zones.
Added information about the Apache Cassandra cluster, internet gateway, and route table in the Architecture section.
Added instructions for deploying the architecture by using Oracle Cloud Infrastructure Resource Manager.
Updated the link to the GitHub repository.