{"id":3394,"date":"2015-08-19T14:06:23","date_gmt":"2015-08-19T14:06:23","guid":{"rendered":"http:\/\/blog.cloudthat.com\/?p=3394"},"modified":"2024-06-25T11:13:36","modified_gmt":"2024-06-25T11:13:36","slug":"cassandra-multi-az-data-replication","status":"publish","type":"blog","link":"https:\/\/www.cloudthat.com\/resources\/blog\/cassandra-multi-az-data-replication","title":{"rendered":"Cassandra Multi-AZ Data Replication"},"content":{"rendered":"<p>Apache Cassandra is an open source non-relational\/NOSQL database. It is massively scalable and is designed to handle large amounts of data across multiple servers (Here, we shall use Amazon EC2 instances), providing high availability. In this blog, we shall replicate data across nodes running in multiple Availability Zones (AZs) to ensure reliability and fault tolerance.\u00a0We will also learn how to ensure that the data remains intact even when an entire AZ goes down.<\/p>\n<p>The initial setup consists of a Cassandra cluster with 6 nodes with 2 nodes (EC2s) spread across AZ-1a , 2 in AZ-1b and 2 in AZ-1c.<\/p>\n<h4><strong>Initial Setup:<\/strong><\/h4>\n<p>Cassandra Cluster with six\u00a0nodes.<\/p>\n<ul>\n<li>AZ-1a: us-east-1a: Node 1, Node 2<\/li>\n<li>AZ-1b: us-east-1b: Node 3, Node 4<\/li>\n<li>AZ-1c: us-east-1c: Node 5, Node 6<\/li>\n<\/ul>\n<p>Next, we have to make changes in the Cassandra configuration file. <em>cassandra.yaml<\/em> file is the main configuration file for Cassandra. We can control how nodes are configured within a cluster, including inter-node communication, data partitioning and replica placement etc., in this config file. The key value which we need to define in the config file in this context is called <em>Snitch<\/em>. Basically, a <em>snitch<\/em> indicates as to which Region and Availability zones does each node in the cluster belongs to. It gives information about the network topology so as to the requests are routed efficiently. 
Additionally, Cassandra has replication strategies that place replicas based on the information provided by the snitch. There are different types of snitches available; in this case, we shall use <em>Ec2Snitch<\/em>, as all the nodes in our cluster are within a single region.<\/p>\n<p>We shall set the snitch value as shown below: <a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/snitch.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3413\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/snitch.png\" alt=\"snitch\" width=\"652\" height=\"63\" \/><\/a><\/p>\n<p>Also, since we are using multiple nodes, we need to group them. We shall do so by defining the <em>seeds<\/em> key in the configuration file (cassandra.yaml). Cassandra nodes use the seed list during startup to discover each other and learn the topology of the ring.<\/p>\n<p>For instance:<\/p>\n<p><strong>Node 1: <\/strong>Set the <em>seeds<\/em> value as shown below: <a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3401\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node1.png\" alt=\"node1\" width=\"574\" height=\"84\" \/><\/a><\/p>\n<p>Similarly on the other nodes:<\/p>\n<p><strong>Node 2:<\/strong><\/p>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3402\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node2.png\" alt=\"node2\" width=\"561\" height=\"77\" \/><\/a><\/p>\n<p><strong>Node 3:<\/strong><\/p>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node3.png\"><img loading=\"lazy\" 
decoding=\"async\" class=\"alignnone size-full wp-image-3403\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node3.png\" alt=\"node3\" width=\"554\" height=\"76\" \/><\/a><\/p>\n<p><strong>Node 4:<\/strong><\/p>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3404\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node4.png\" alt=\"node4\" width=\"574\" height=\"80\" \/><\/a><\/p>\n<p><strong>Node 5:<\/strong><\/p>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3405\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node5.png\" alt=\"node5\" width=\"579\" height=\"77\" \/><\/a><\/p>\n<p><strong>Node 6:<\/strong><\/p>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node6.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3406\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/node6.png\" alt=\"node6\" width=\"555\" height=\"79\" \/><\/a><\/p>\n<p>Cassandra nodes use this list of hosts to find each other and learn the topology of the ring. The <em>nodetool<\/em> utility is a command line interface for managing a cluster. We shall check the status of the cluster using this command as shown below: <a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/cass.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-3397\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/cass-1024x249.png\" alt=\"cass\" width=\"940\" height=\"228\" \/><\/a> The <em>owns<\/em> field above indicates the percentage of data owned by each node. 
As we can see, the <em>owns<\/em> field is nil, as no keyspaces\/databases have been created yet. So, let us go ahead and create a sample keyspace with a data replication strategy &amp; replication factor. The replication strategy determines the nodes on which replicas are placed, and the replication factor is the total number of replicas across the cluster. We shall use NetworkTopologyStrategy, since our cluster is deployed across multiple Availability Zones. NetworkTopologyStrategy places replicas on distinct racks\/AZs, because nodes in the same rack\/AZ can fail at the same time due to power, cooling or network issues.<\/p>\n<p>Let us set the replication factor to 3 for our \u201cfirst\u201d keyspace:<\/p>\n<pre class=\"lang:default decode:true\">CREATE KEYSPACE \"first\" WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'us-east' : 3};<\/pre>\n<p>The above CQL command creates a database\/keyspace &#8216;first&#8217; with class NetworkTopologyStrategy and 3 replicas in us-east (in this case, one replica in AZ\/rack 1a, one in AZ\/rack 1b and one in AZ\/rack 1c). Cassandra uses a command prompt called the Cassandra Query Language Shell, also known as CQLSH, which acts as an interface for users to communicate with it. 
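<\/p>\n<p>Once the keyspace exists, its replication settings can be confirmed from this shell (a quick sanity check):<\/p>\n<pre class=\"lang:default decode:true\">cqlsh&gt; DESCRIBE KEYSPACE \"first\";<\/pre>\n<p>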
Using CQLSH, you can execute queries using the Cassandra Query Language (CQL).<\/p>\n<p>Next, we shall create a table <em>user<\/em> and load five records for testing.<\/p>\n<pre class=\"lang:default decode:true\">CREATE TABLE user (user_id text, login text, region text, PRIMARY KEY (user_id));<\/pre>\n<p>Now, let us insert some records into this table:<\/p>\n<pre class=\"lang:default decode:true \">insert into user (user_id,login,region) values ('1','test.1','IN');\r\ninsert into user (user_id,login,region) values ('2','test.2','IN');\r\ninsert into user (user_id,login,region) values ('3','test.3','IN');\r\ninsert into user (user_id,login,region) values ('4','test.4','IN');\r\ninsert into user (user_id,login,region) values ('5','test.5','IN');<\/pre>\n<pre class=\"lang:default decode:true\">cqlsh&gt; select * from user;<\/pre>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/query.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3412\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/query.png\" alt=\"query\" width=\"284\" height=\"202\" \/><\/a><\/p>\n<p>Now that our keyspace\/database contains data, let us check the ownership again:<\/p>\n<p><a href=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/12-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3396\" src=\"https:\/\/content.cloudthat.com\/resources\/wp-content\/uploads\/2022\/11\/12-1.png\" alt=\"12 (1)\" width=\"928\" height=\"231\" \/><\/a> As we can see here, the <em>owns<\/em> field is NOT nil after defining the keyspace. 
<\/p>\n<p>Let us perform some tests to make sure the data was replicated intact across multiple Availability Zones.<\/p>\n<p>Test 1:<\/p>\n<ul>\n<li>Node 1 was stopped.<\/li>\n<li>A connection was made to the cluster on the remaining nodes and the records were read from the table <em>user<\/em>.<\/li>\n<li>All records were intact.<\/li>\n<li>Node 1 was started.<\/li>\n<li>On Node 1, &#8216;nodetool -h hostname_of_Node1 repair first&#8217; was run.<\/li>\n<li>A connection was made to the cluster on Node 1 and the records were read from the table <em>user<\/em>.<\/li>\n<li>All records were intact.<\/li>\n<\/ul>\n<p>Test 2:<\/p>\n<ul>\n<li>Node 1 and Node 2 were stopped (the scenario wherein an entire AZ, i.e. us-east-1a, goes down).<\/li>\n<li>A connection was made to the cluster on the remaining nodes in the other AZs (us-east-1b, us-east-1c) and the records were read from the table <em>user<\/em>.<\/li>\n<li>All records were intact.<\/li>\n<li>Node 1 and Node 2 were started.<\/li>\n<li>&#8216;nodetool -h hostname_of_Node1 repair first&#8217; was run on Node 1.<\/li>\n<li>&#8216;nodetool -h hostname_of_Node2 repair first&#8217; was run on Node 2.<\/li>\n<li>A connection was made to the cluster on Node 1 and Node 2 and the records were read from the table <em>user<\/em>.<\/li>\n<li>All records were intact.<\/li>\n<\/ul>\n<p>Similar tests were done by shutting down nodes in the us-east-1b &amp; us-east-1c AZs to check that the records stayed intact even when an entire Availability Zone went down. Hence, from the above tests, it is recommended to use a six-node Cassandra cluster spread across three Availability Zones with a minimum replication factor of 3 (one replica in each of the 3 AZs) to make Cassandra fault tolerant against a whole Availability Zone going down. 
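<\/p>\n<p>Why does a replication factor of 3 across three AZs tolerate a full-AZ outage? A QUORUM read or write needs floor(3\/2) + 1 = 2 replicas to respond, and with one AZ down two replicas are still reachable. This can be verified from cqlsh while the nodes in one AZ are stopped (a sketch; CONSISTENCY is a cqlsh shell command, not CQL):<\/p>\n<pre class=\"lang:default decode:true\">cqlsh&gt; CONSISTENCY QUORUM;\r\ncqlsh&gt; select * from user;<\/pre>\n<p>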
This strategy will also help in case of disaster recovery.<\/p>\n<p>Stay tuned for more blogs!!<\/p>\n","protected":false},"author":219,"featured_media":0,"parent":0,"comment_status":"open","ping_status":"open","template":"","blog_category":[3606,3607,3805,3665],"user_email":"prarthitm@cloudthat.com","published_by":"324","primary-authors":"","secondary-authors":"","acf":[],"_links":{"self":[{"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/blog\/3394"}],"collection":[{"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/types\/blog"}],"author":[{"embeddable":true,"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/users\/219"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/comments?post=3394"}],"version-history":[{"count":1,"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/blog\/3394\/revisions"}],"predecessor-version":[{"id":45941,"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/blog\/3394\/revisions\/45941"}],"wp:attachment":[{"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/media?parent=3394"}],"wp:term":[{"taxonomy":"blog_category","embeddable":true,"href":"https:\/\/www.cloudthat.com\/resources\/wp-json\/wp\/v2\/blog_category?post=3394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}