CAP Theorem and Its Implications

The CAP theorem tells us that when data is distributed across multiple servers, you can only guarantee two out of these three properties: Consistency, Availability, and Partition Tolerance. Let's understand what these properties mean.

Consistency means all users see the same data at the same time, no matter which server they connect to. If you update some data, all subsequent reads will return that updated value.

Availability means the system remains operational and responds to requests even if some servers fail. Every request to a working server should receive a response, whether it succeeds or fails.

Partition Tolerance means the system continues to work even when network issues prevent servers from communicating with each other. This is crucial in distributed systems because network problems are inevitable.

Since network partitions are inevitable, you're forced to choose between consistency and availability.

If you choose Consistency over Availability (CP systems): The system will return an error or timeout if it can't ensure consistent data. Example: Google Cloud Spanner

If you choose Availability over Consistency (AP systems): The system will return the most recent available data, which might be stale. Example: Apache Cassandra

The choice between CP and AP depends on your application's needs:

Use CP when accurate data is crucial (banking, stock trading)

Use AP when the system must stay operational even with stale data (social media, content delivery)

Here's a comparison of CP vs AP systems:

PropertyCP SystemsAP Systems
During Network PartitionMay become unavailableMay return stale data
Data AccuracyAlways consistentEventually consistent
Use CasesFinancial systemsSocial media
ExamplesSpannerCassandra
© 2024 DrawSystem Design. All rights reserved.