(what drives Cassandra behind the scenes)

Every node in the cluster needs information about every other node. Nodes must learn when a new node is added so they can rearrange the token ranges, and they must learn when a node goes down so they can store hints for it. This requires communication between the nodes, so that each one can keep its view of the entire cluster up to date. Cassandra uses a gossip protocol for this internode communication. https://www.oreilly.com/library/view/learning-apache-cassandra/9781787127296/840cf7c5-8eca-4b88-ae21-0ff457619f36.xhtml
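A minimal sketch of how two nodes could reconcile their cluster views when they exchange state. The names `EndpointState` and `merge_views` are made up for this note, and the status strings are illustrative. Ordering states by a (generation, version) pair is an assumption modeled loosely on Cassandra's heartbeat state, not a copy of its implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointState:
    generation: int  # assumed to increase when the node restarts
    version: int     # assumed to increase on every local state change
    status: str      # illustrative label, e.g. "NORMAL" or "DOWN"


def merge_views(local, remote):
    """Return a new view holding the newest known state per endpoint."""
    merged = dict(local)
    for endpoint, theirs in remote.items():
        ours = merged.get(endpoint)
        # Keep the remote entry only if it is unknown locally or strictly newer.
        if ours is None or (theirs.generation, theirs.version) > (
            ours.generation,
            ours.version,
        ):
            merged[endpoint] = theirs
    return merged


view_a = {
    "10.0.0.1": EndpointState(1, 5, "NORMAL"),
    "10.0.0.2": EndpointState(1, 3, "NORMAL"),
}
view_b = {
    "10.0.0.2": EndpointState(1, 7, "DOWN"),
    "10.0.0.3": EndpointState(1, 1, "JOINING"),
}
merged = merge_views(view_a, view_b)
```

After the merge, the view contains all three endpoints, and the entry for `10.0.0.2` is the newer one from `view_b`. Because only newer entries win, repeating the same exchange does not change the result.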

Replica synchronization brings nodes up to date after a failure and periodically reconciles replicas with each other.

Gossip is a probabilistic technique for synchronizing replicas. The pattern of communication (e.g. which node contacts which node) is not determined in advance. Instead, any two nodes have some probability p of attempting to synchronize with each other: every t seconds, each node picks a node to communicate with. This gives replicas a way to catch up in addition to the synchronous path (e.g. partial quorum writes).
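The random peer selection described above can be sketched as a small simulation. This is a toy model under stated assumptions, not Cassandra's implementation: rounds stand in for the "every t seconds" timer, each node picks one peer uniformly at random, and both sides end up with the union of what they knew. The function names are made up for this note.

```python
import random


def gossip_round(views, rng):
    """Each node picks one random peer and the pair pool what they know."""
    nodes = list(views)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        combined = views[node] | views[peer]
        views[node] = combined
        views[peer] = set(combined)


def rounds_until_converged(n_nodes, seed=0, max_rounds=1000):
    """Count rounds until every node knows every update, or None on timeout."""
    rng = random.Random(seed)
    # Each node starts out knowing only its own update.
    views = {i: {i} for i in range(n_nodes)}
    everything = set(range(n_nodes))
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(view == everything for view in views.values()):
            return round_no
    return None


rounds = rounds_until_converged(16, seed=1)
```

Changing `seed` changes which peers are chosen and therefore how many rounds are needed. The number of rounds is not fixed in advance, only likely to be small.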

Gossip is scalable, and has no single point of failure, but can only provide probabilistic guarantees.

https://medium.com/@swarnimsinghal/implementing-cassandras-gossip-protocol-part-1-b9fd161e5f49

