The CAP Theorem in Distributed Systems

Content

The CAP principle was first proposed by Eric Brewer in 2000 at an ACM symposium (Principles of Distributed Computing, PODC), and was formally proved in 2002 by Seth Gilbert and Nancy Lynch.

• C (consistency): the data on all nodes is kept in sync at all times, so a read from any node returns the same, most recent value

• A (availability): every request receives a response within a bounded period of time, that is, the system stays responsive

• P (partition tolerance): the system keeps running even when the network splits it into groups of nodes that cannot communicate with each other

Theorem: a distributed system can satisfy at most two of these three properties at the same time. That is, consistent data, timely responses, and continued operation under partition cannot all be guaranteed together.

For example:

On a distributed network, a node in partition A holds a piece of data X. When the network has no delay and no congestion, operations that depend on X proceed normally. In the real world, however, a network free of delay and congestion cannot be guaranteed. When the network misbehaves, the distributed system will inevitably split into partitions and isolated islands. If an operation then runs outside partition A and you want to preserve P (the system keeps running while partitioned), copies of X must be kept on multiple nodes across the system. At that point you have to choose between C and A.

If you choose C, that is, you guarantee that the data is consistent across the distributed network, then every time X changes, the copy of X on every node must be refreshed to the latest state. While that refresh is in progress, the system cannot respond to operations that depend on X, so A is lost.

If you choose A, you prioritize low-latency, real-time responses. A response may then be sent before the copies of X on all nodes have been synchronized to the latest state, so C is lost.
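The choice between C and A can be made concrete with a small toy model. The Python sketch below is only an illustration under simplifying assumptions: there are two replicas of a single value X, a partition is simulated with a boolean flag, and every name in it (ToyCluster, Unavailable, demo, the node names n1 and n2) is hypothetical rather than taken from any real system.

```python
class Unavailable(Exception):
    """Raised when the system refuses a request to protect consistency."""


class ToyCluster:
    """Two replicas of a single value X (a hypothetical toy model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"n1": None, "n2": None}  # node name -> its copy of X
        self.partitioned = False  # True means n1 and n2 cannot communicate

    def write(self, node, value):
        if not self.partitioned:
            # Healthy network: update every replica before acknowledging.
            for name in self.replicas:
                self.replicas[name] = value
            return
        if self.mode == "CP":
            # The other replica is unreachable, so refuse rather than diverge.
            raise Unavailable("partitioned: write rejected to keep replicas in sync")
        # AP: answer immediately, updating only the replica we can reach.
        self.replicas[node] = value

    def read(self, node):
        # Reads are served from the local copy; under AP it may be stale.
        return self.replicas[node]


def demo(mode):
    """Write once, partition the network, then try a second write."""
    cluster = ToyCluster(mode)
    cluster.write("n1", "v1")      # replicated to both nodes
    cluster.partitioned = True     # the network splits n1 from n2
    try:
        cluster.write("n1", "v2")
        accepted = True
    except Unavailable:
        accepted = False
    return accepted, cluster.read("n1"), cluster.read("n2")
```

In this sketch, demo("CP") returns (False, "v1", "v1"): the second write is refused, so both replicas still agree, but the system was not available for that request. demo("AP") returns (True, "v2", "v1"): the write is answered at once, but the two replicas now hold different values of X.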

The above looks a bit convoluted, so you just have to remember this sentence,