Version: 2.0
🕑 Estimated time for completion

This section takes about 25 minutes to complete.

Data Milky Way: Distributed Data Systems (NoSQL & CAP)

We talked a little about NoSQL earlier. Let's just skim the surface and talk about some important concepts driving it.

NoSQL Technologies

Watch: "Introduction to NoSQL" by Martin Fowler (watch until 10:35; the remaining 45 minutes are out of scope for this course).

CAP Theorem

Watch: "CAP Theorem Illustrated" by Mark Richards

Summary: Pick Two:

  • C (consistency)
  • A (availability)
  • P (partition tolerance)

But is it really true in the real world?

  • What unrealistic assumption are we making here?
    • Can we really assume that network communications won’t fail?
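    • Is there really such a thing as a distributed system that won’t have partitions?
  • In reality: partitions cannot be ruled out, so the practical choice is between C and A while a partition lasts, and even that is not a binary choice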
  • Several NoSQL solutions offer a tunable tradeoff between C and A
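To make the "tunable tradeoff" a bit more concrete, here is a minimal sketch of quorum-style tuning, one common way such a dial is exposed. The names `QuorumConfig`, `n`, `w` and `r` are made up for this illustration and do not come from any particular database. It also assumes strict quorums and versioned values, so that a read which touches at least one up-to-date replica can pick the newest value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical settings for one replicated key (illustration only)."""
    n: int  # number of replicas holding the key
    w: int  # acknowledgements required before a write succeeds
    r: int  # replicas consulted before a read returns

    def reads_overlap_writes(self) -> bool:
        # If r + w > n, every read set shares at least one replica
        # with every write set, so a read can see the latest write.
        return self.r + self.w > self.n

    def write_possible(self, unreachable: int) -> bool:
        # A write needs w of the still-reachable replicas.
        return self.n - unreachable >= self.w

    def read_possible(self, unreachable: int) -> bool:
        # A read needs r of the still-reachable replicas.
        return self.n - unreachable >= self.r


# Leans towards C: overlapping quorums, but two lost replicas block requests.
strict = QuorumConfig(n=3, w=2, r=2)

# Leans towards A: keeps answering with one replica left, but may be stale.
relaxed = QuorumConfig(n=3, w=1, r=1)
```

With three replicas, `strict` still serves reads and writes when one replica is unreachable but refuses when two are, while `relaxed` keeps serving with a single replica at the cost of possibly stale reads.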

Further Reading (bonus content): What is the CAP Theorem? 3 types of NoSQL databases

Summary

A CP system will say “Sorry, I can’t be sure yet” to the client, in order to avoid giving an out-of-date answer.

[Figure: cap.png]

An AP system tries to spit out an answer even if it might not be the most up-to-date one.
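The difference between the two behaviours can be sketched as a toy model. `Replica`, `NotSureYet` and `demo` are hypothetical names invented for this example, and a real system detects partitions and staleness in far more involved ways; the sketch only shows what the client sees.

```python
class NotSureYet(Exception):
    """Raised by a CP replica that cannot confirm its value is current."""


class Replica:
    """Toy replica (illustration only, not a real database client)."""

    def __init__(self, mode: str, value: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = value        # last value this replica has seen
        self.partitioned = False  # True while cut off from its peers

    def read(self) -> str:
        # CP: refuse to answer rather than risk an out-of-date value.
        if self.partitioned and self.mode == "CP":
            raise NotSureYet("Sorry, I can't be sure yet")
        # AP (or no partition): answer with whatever is stored locally.
        return self.value


def demo():
    cp = Replica("CP", "v1")
    ap = Replica("AP", "v1")
    # A partition starts; a newer value written elsewhere never arrives.
    cp.partitioned = True
    ap.partitioned = True
    try:
        cp_answer = cp.read()
    except NotSureYet as exc:
        cp_answer = f"error: {exc}"
    return cp_answer, ap.read()
```

Calling `demo()` returns an error message for the CP replica and the possibly stale `"v1"` for the AP replica.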