Now, let’s go through a few main concepts.
About Consistency in Distributed Systems
Weak Consistency – Replicas misalignment is tolerated.
Normally, it is not guaranteed that at any time replicas in a distributed system are aligned: for some reason, inconsistency windows can be experienced, and have to be tolerated.
Weak Consistency can be further classified in Eventual and Strong-Eventual.
Replicas in a distributed system will eventually be consistent, if updates stop. Conflicts like stale read or concurrent writes can happen, as summarized below.
A guarantee that can be achieved in case Conflict-free Replicated Data Types (CRDT) are used. According to such guarantee, conflicts cannot happen, the replicas will be eventually consistent.
Strong Consistency – Replicas misalignment is considered like a failure.
About CAP Theorem
ACID Vs BASE
ACID is an acronym for Atomicity, Consistency, Isolation and Durability. It is a set of properties that are normally desired in transactional contextes. On the other hand, BASE is an acronym for Basically Available, Soft state and Eventual consistent. It is a set of properties often respected in distributed systems that preserve Availability in case of Network Partitioning, and prefer to resolve conflicts as soon as network connectivity is restored.
Demystifying Eventual Consistency
Eventual Consistency does not mean lack of transactionality, neither lack of consistency, it only means: the overall Distributed System will converge to a coordinated and consistent global state in a non-predictable time frame. In other words: after an update and in case of no successive updates, at a certain point in time, the system will converge to a consistent state, but since the network is supposed to be asynchronous and partition-prone no one can tell or predict how much time the convergence will take.
From a local perspective (one or a quorum of nodes in a region), transactions are to be intended as always, almost in the traditional sense; the scenario changes when the consistency is observed from the global state perspective (set of replicated nodes over regions): a global snapshot can likely spot slight inconsistencies among the nodes, due to many causes like drift in the propagation delay, partitioning, etc.
Systems in the regime of Eventual Consistency suffer of conflicts: every time concurrent modifications of not synced data happen, a conflict raises up. Such conflicts can be managed applying the concept of vector clocks and time machines: each data has a story in time, and progressive versions, the easiest way to solve a conflict consists in spotting the time frame where the conflict happened, and so undo the changes up to the stable version and reapply the updates in order – an intrinsic concept of state machine is sketched by this approach.
Eventually, all reads among the replicas will provide the same output, the point: it is impossible a-priori to establish the time when all replicas will be aligned. Such guarantee may be not acceptable in business-critical contexts where each transaction equates to money; but, on the average case of Internet and Cloud Services, eventual consistency can be tolerated in favor of Availability and Resistance to Network Partitioning. A good example of Eventual Consistent Data Store is the DNS, a scalable and fully distributed directory service at the backbone of today’s Internet. How the DNS works: from central registries the data is propagated among the recursive resolvers, and then stub resolvers, but on the path towards the final client service many levels of caching are passed through.
According to the proposed picture, SQL is a rock-solid piece of technology that has served for many years now; modern volumes of data, mostly considering Internet and Cloud services, create some difficulties to SQL: relational databases have not been designed to scale out, so propagating ACID transactions over a multitude of distributed machines requires a trade-off, strongly impacting the Availability of the overall systems (i.e. long stop-the-world times to align the replicas synchronously).