Cassandra data modelling:
- A Cassandra database is distributed over multiple machines that operate together.
- The outermost container is known as the Cluster. For failure handling, the data on every node is replicated to other nodes, and if a node fails, a replica takes charge.
- Cassandra arranges the nodes of a cluster in a ring and assigns data to them.
- The data model of Cassandra is significantly different from what we normally see in a Relational Database Management System.
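
The ring arrangement described above can be illustrated with a small sketch. It is not Cassandra's actual partitioner: the `Ring` class, the `token_for` helper, the node names, and the use of MD5 are all assumptions chosen for illustration. Each node gets a position (token) on the ring, and a key is assigned to the first node found at or after the key's own position, wrapping around at the end.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a position on the ring (MD5 is used purely for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical token ring: each node owns the range that ends at its token."""

    def __init__(self, nodes):
        # Sort nodes by token so the ring can be walked in clockwise order.
        self._entries = sorted((token_for(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        # First node whose token is >= the key's token; wrap around past the end.
        idx = bisect.bisect_left(self._tokens, token_for(key)) % len(self._entries)
        return self._entries[idx][1]


ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.owner("user:42")
```

The same key always maps to the same node as long as the set of nodes is unchanged, which is what lets any node work out where a piece of data lives.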
Cluster:
- A cluster is a group of independent servers (usually in close proximity to one another) interconnected via a network to work as one centralized data processing resource.
- Clusters handle multiple complex instructions by distributing the workload across all connected servers.
- Clustering improves the system's availability to users, its aggregate performance, and its overall tolerance to faults and component failures.
- A failed server is automatically shut down and its users are switched immediately to the other servers.
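
The failover behaviour above can be sketched in a few lines. This is a toy model, not how Cassandra detects or handles failures: the `Cluster` class, its methods, and the server names are hypothetical. A request goes to its preferred server if that server is up, and otherwise to another live server.

```python
class Cluster:
    """Hypothetical cluster that routes each request to a live server."""

    def __init__(self, servers):
        self.alive = {name: True for name in servers}

    def mark_failed(self, name):
        # A failed server is taken out of rotation.
        self.alive[name] = False

    def route(self, preferred):
        # Use the preferred server if it is up, otherwise fall back to a live one.
        if self.alive.get(preferred):
            return preferred
        for name, up in self.alive.items():
            if up:
                return name
        raise RuntimeError("no live servers in the cluster")


cluster = Cluster(["s1", "s2", "s3"])
cluster.mark_failed("s1")
target = cluster.route("s1")  # s1 is down, so the request moves to s2
```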
Keyspace:
- A keyspace is the outermost container for data in Cassandra.
The basic attributes of a keyspace in Cassandra are:
- Replication factor
- Replica placement strategy
- Column families
Replication factor:
- It is the number of machines in the cluster that will hold copies of the same data.
Replica placement strategy:
- It is the strategy used to place replicas in the ring.
- The available strategies are simple strategy (rack-unaware strategy), old network topology strategy (rack-aware strategy), and network topology strategy (datacenter-shard strategy).
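
To show how the replication factor and a placement strategy interact, here is a minimal sketch in the spirit of the simple strategy: starting from the node that owns the data, it walks the ring clockwise until it has picked as many nodes as the replication factor, ignoring racks and datacenters. The function name, its parameters, and the node names are assumptions for illustration, and rejecting a replication factor larger than the ring is a simplification of this sketch.

```python
def simple_strategy_replicas(ring_nodes, primary_index, replication_factor):
    """Pick replicas by walking the ring clockwise from the primary node.

    ring_nodes: node names already ordered by ring position (assumed input).
    """
    if replication_factor > len(ring_nodes):
        # Simplification: this sketch needs one distinct node per replica.
        raise ValueError("replication factor cannot exceed the number of nodes")
    return [
        ring_nodes[(primary_index + i) % len(ring_nodes)]
        for i in range(replication_factor)
    ]


nodes = ["node-a", "node-b", "node-c", "node-d"]
replicas = simple_strategy_replicas(nodes, primary_index=3, replication_factor=3)
# replicas == ["node-d", "node-a", "node-b"]
```

A rack-aware or datacenter-aware strategy would follow the same walk but skip nodes that would put too many replicas in the same rack or datacenter.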
Column families:
- A keyspace is a container for a list of one or more column families.
- A column family, in turn, is a container for a collection of rows. Each row contains ordered columns.
- Column families define the structure of your data.
- Each keyspace contains at least one and often many column families.
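
The nesting described above (keyspace, then column family, then row, then ordered columns) can be pictured as nested dictionaries. This is only a mental model, not Cassandra's storage format: the helper names, the `users` column family, and the sample values are hypothetical.

```python
from collections import defaultdict


def new_keyspace():
    # keyspace -> column family name -> row key -> {column name: value}
    return defaultdict(lambda: defaultdict(dict))


def insert(keyspace, column_family, row_key, column, value):
    keyspace[column_family][row_key][column] = value


def read_row(keyspace, column_family, row_key):
    # Return columns ordered by name, mirroring the ordered columns in each row.
    row = keyspace[column_family][row_key]
    return sorted(row.items())


ks = new_keyspace()
insert(ks, "users", "u1", "name", "Ada")
insert(ks, "users", "u1", "email", "ada@example.com")
columns = read_row(ks, "users", "u1")
# columns == [("email", "ada@example.com"), ("name", "Ada")]
```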