Distributed systems and consistency

Because everything else is easy.

What we're talking about
● What are distributed systems?
● Why are they good, why are they bad?
● CAP theorem
● Possible CAP configurations
● Strategies for consistency, including:
● Point-in-time consistency with log-structured storage (LSS)
● Vector clocks for distributed consistency
● CRDTs for consistency from the data structure
● Bloom, a natively consistent distributed language

What's a distributed system?
● Short answer: big data systems
● Lots of machines, geographically distributed
● Technical answer:
● Any system where events are not global
● Where events can happen simultaneously

Why are they good?
● Centralized systems scale poorly & expensively
● More locks, more contention
● Expensive hardware
● Vertical scaling
● Distributed systems scale well & cheaply
● No locks, no contention
● (Lots of) cheap hardware
● Linear scaling

So what's the catch?
● Consistency
● “Easy” in centralized systems
● Hard in distributed systems

CAP Theorem
● Consistency
● All nodes see the same data at the same time
● Availability
● Every request gets a definite response: success or failure
● Partition tolerance
● System operates despite message loss, failure
● Pick two!

No P
● No partition tolerance = centralized
● Writes can't reach the store? Broken.
● Reads can't find the data? Broken.
● The most common database type
● MySQL
● Postgres
● Oracle

No A
● An unavailable database = a crappy database
● Read or write didn't work? Try again.
● Everything sacrifices A to some degree
● Has some use-cases
● High-volume logs & statistics
● Google BigTable
● Mars orbiters!

No C
● Lower consistency = distributed systems
● “Eventual consistency”
● Writes will work, or definitely fail
● Reads will work, but might not be entirely true
● The new hotness
● Amazon S3, Riak

Why is this suddenly cool?
● The economics of computing have changed
● Networking was rare and expensive
● Now cheap and ubiquitous – lots more P
● Storage was expensive
● Now ridiculously cheap – allows new approaches
● Partition happens
● Deliberately sacrifice Consistency
● Instead of accidentally sacrificing Availability

Ways to get to eventual consistency
● App level:
● Write locking
● Last write wins
● Infrastructure level
● Log structured storage
● Multiversion concurrency control
● Vector clocks and siblings
● New: language level!
● Bloom

Write-time consistency 1
● Write-time locking
● Distributed reads
● (Semi)-centralized writes
● Cheap, fast reads (but can be stale)
● Slower writes, potential points of failure
● In the wild:
● Clipboard.com
● Awe.sm!
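
The trade-off on this slide can be sketched in a few lines. This is a toy, in-process illustration and not how any of the systems named above are built: the `Primary` and `Replica` classes and the explicit `replicate()` step are hypothetical names made up for the example.

```python
import threading


class Replica:
    """Hypothetical read-only copy: reads are cheap but may be stale."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)  # no lock, no coordination


class Primary:
    """Hypothetical single writer: every write goes through one lock."""

    def __init__(self, replicas):
        self._lock = threading.Lock()
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        with self._lock:  # writers queue up here: consistent but slower
            self.data[key] = value

    def replicate(self):
        # Replication happens later, so replicas lag behind the primary.
        with self._lock:
            snapshot = dict(self.data)
        for replica in self.replicas:
            replica.data = dict(snapshot)


replica = Replica()
primary = Primary([replica])

primary.write("user:1", "alice")
stale = replica.read("user:1")   # the write has not propagated yet
primary.replicate()
fresh = replica.read("user:1")   # now the replica has caught up
```

The primary is also the potential point of failure the slide mentions: if it goes away, nothing can be written.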

Write-time consistency 2
● Last write wins
● Cheap reads
● Cheap writes
● Can silently lose data!
– Arguably a sacrifice of Availability: the write reported success, then vanished
● In the wild:
● Amazon S3
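
A minimal sketch of how last write wins can silently lose data, assuming each write carries a client-supplied timestamp. The `LWWRegister` class is a made-up name for illustration, not the API of any real store.

```python
class LWWRegister:
    """Hypothetical single-value register resolved by timestamp."""

    def __init__(self):
        self.value = None
        self.timestamp = float("-inf")

    def write(self, value, timestamp):
        # Keep the write only if it is newer than what is already stored.
        if timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp
        # Either way the client is told the write succeeded.
        return True


reg = LWWRegister()
ok_a = reg.write("from client A", timestamp=100)
# Client B writes afterwards, but its clock runs slightly behind.
ok_b = reg.write("from client B", timestamp=99)
```

Both clients were told their write worked, yet only client A's value survives. Nothing in the store records that client B ever wrote anything.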

Side note: Twitter
● Twitter is eventually consistent!
● Your timeline isn't guaranteed correct
● Older tweets can appear or disappear
● Twitter sacrifices C for A and P
● But doesn't get a lot of A

Infrastructure level consistency 1
● Log structured storage
● Also called append-only databases
● A new angle on consistency: external consistency
● a.k.a. Point-in-time consistency
● In the wild:
● BigTable
● Spanner

How LSS Works
● Every write is appended
● Indexes are built and appended
● Reads work backwards through the log
● Challenges
● Index-building can get chunky
– Build them in memory, easily rebuilt
● Garbage collection
– But storage is cheap now!
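
The mechanics above can be sketched with an in-memory list standing in for the log. This is a simplified illustration under that assumption; `AppendOnlyStore` and its method names are hypothetical, and a real store would append to disk.

```python
class AppendOnlyStore:
    """Hypothetical log-structured store: records are only ever appended."""

    def __init__(self):
        self.log = []    # (key, value) records, oldest first
        self.index = {}  # key -> position of its latest record

    def put(self, key, value):
        self.log.append((key, value))        # never overwrite in place
        self.index[key] = len(self.log) - 1

    def get_by_scan(self, key):
        # Without an index, a read walks the log from newest to oldest.
        for k, v in reversed(self.log):
            if k == key:
                return v
        return None

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def rebuild_index(self):
        # The index is derived data: later records overwrite earlier ones.
        self.index = {k: i for i, (k, _) in enumerate(self.log)}


store = AppendOnlyStore()
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)            # the old record for "a" stays in the log

store.index = {}             # simulate losing the index
store.rebuild_index()        # ...and recover it from the log alone
latest = store.get("a")
```

The stale record `("a", 1)` is what garbage collection would eventually reclaim.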

Why is LSS so cool?
● Easier to manage big data
● Size, schema, allocation of storage simplified
● Indexes are impossible to corrupt
● Reads and writes are cheap
● Point-in-time consistency is free!
● Called Multiversion Concurrency Control
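
Why point-in-time reads come almost for free: if nothing is overwritten, an old view of the data is just a read that ignores newer records. A rough sketch, assuming a simple integer version counter; `VersionedStore` and the `as_of` parameter are invented for this example.

```python
class VersionedStore:
    """Hypothetical append-only store with a version number per record."""

    def __init__(self):
        self.log = []     # (version, key, value), oldest first
        self.version = 0

    def put(self, key, value):
        self.version += 1
        self.log.append((self.version, key, value))
        return self.version

    def get(self, key, as_of=None):
        # Read the newest record that is not newer than the snapshot.
        if as_of is None:
            as_of = self.version
        for version, k, v in reversed(self.log):
            if version <= as_of and k == key:
                return v
        return None


store = VersionedStore()
snapshot = store.put("balance", 100)   # remember this point in time
store.put("balance", 250)

current = store.get("balance")
as_it_was = store.get("balance", as_of=snapshot)
```

A long-running reader holding `snapshot` keeps seeing a consistent view while writers carry on appending.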

Infrastructure level consistency 2
● Vector clocks
● Vectors as in math
● Basically an array
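
A small sketch of a vector clock, using a dict keyed by node name instead of a fixed-size array so that nodes do not need numbering. The function names are chosen for this example only.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter bumped."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum: everything either clock has seen."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def descends(a, b):
    """True if clock a has seen everything clock b has seen."""
    return all(a.get(n, 0) >= count for n, count in b.items())


def compare(a, b):
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"  # neither saw the other: these are siblings


base = increment({}, "n1")       # first write, handled by node n1
left = increment(base, "n1")     # n1 updates the value again
right = increment(base, "n2")    # n2 updates the same base independently

relation = compare(left, right)
merged = merge(left, right)
```

The "concurrent" result is the interesting one: the clock can tell you that two versions conflict, but not which one should win.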

Not enough for consistency
● Different nodes know different things!
● Quorum reads
● N or more nodes must agree
● Quorum writes
● N or more nodes must receive new value
● Can tune N for your application
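
A toy sketch of quorum reads and writes over in-process replicas. The `quorum` argument plays the role of N on this slide; the `Replica` class, the `up` flag and both function names are assumptions made for the example, and there is no rollback of a write that misses its quorum.

```python
from collections import Counter


class Replica:
    """Hypothetical replica holding one value."""

    def __init__(self):
        self.value = None
        self.up = True


def quorum_write(replicas, value, quorum):
    # The write counts as successful only if enough replicas received it.
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.value = value
            acks += 1
    return acks >= quorum


def quorum_read(replicas, quorum):
    # Return a value only if enough reachable replicas agree on it.
    votes = Counter(r.value for r in replicas if r.up)
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count >= quorum else None


replicas = [Replica() for _ in range(3)]
replicas[2].up = False                          # one node is partitioned away
wrote = quorum_write(replicas, "v1", quorum=2)  # two acks are enough
replicas[2].up = True                           # it returns, still stale

relaxed = quorum_read(replicas, quorum=2)
strict = quorum_read(replicas, quorum=3)
```

Raising the quorum buys more agreement at the cost of more failed requests, which is the tuning knob the last bullet refers to.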

Dealing with siblings
● 1: Consistency at read time
● Slower reads
● Pay every time
● 2: Consistency at write time
● Slower writes
● Pay once
● 3: Consistency at infrastructure level
● CRDTs: Commutative Replicated Data Types (now more often expanded as Conflict-free Replicated Data Types)
● Monotonic lattices of commutative operations

Don't Panic
● We're going to go slowly
● There's no math

Monotonicity
● Operations only ever move the data in one direction
● e.g. increment vs. set
● Instead of storing values, store operations
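
An illustrative comparison of "set" and "increment", assuming two clients that both read a counter at 10 and each want to add one. The variable names are made up for the example.

```python
start = 10

# Storing values: each client computes start + 1 and sends a "set".
set_ops = [start + 1, start + 1]
after_sets = start
for value in set_ops:
    after_sets = value          # the second set overwrites the first

# Storing operations: each client sends "+1" and the store keeps both.
increment_ops = [1, 1]
after_increments = start + sum(increment_ops)
```

With sets, one of the two updates is lost. With increments, the list of operations only ever grows and the value is derived from it.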

Commutativity
● Means the order of operations isn't important
● 1 + 5 + 10 == 10 + 5 + 1
● Grouping doesn't matter either (associativity): (1 + 5) + 10 == 1 + (5 + 10)
● You don't need to know when stuff happened
● Just what happened
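
A quick way to see the difference is to replay the same operations in every possible order. This sketch uses additions as the commutative case and plain overwrites as the counter-example; both helper functions are hypothetical.

```python
from itertools import permutations


def replay_adds(ops, start=0):
    total = start
    for amount in ops:
        total += amount
    return total


def replay_sets(ops, start=0):
    value = start
    for new_value in ops:
        value = new_value   # whichever arrives last wins
    return value


ops = [1, 5, 10]
add_results = {replay_adds(order) for order in permutations(ops)}
set_results = {replay_sets(order) for order in permutations(ops)}
```

Every ordering of the additions lands on the same total, so nodes that receive them in different orders still agree. The overwrites give a different answer depending on arrival order.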

Lattices
● A data structure of operations
● Like a vector clock, sets of operations
● “Partially” ordered
● Means you can throw away oldest operations
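
One common reading of "lattice" here is a join-semilattice: a set of states with a merge that is commutative, associative and idempotent. A minimal sketch using sets of operation labels with union as the merge; the labels and the `join` name are illustrative only.

```python
def join(a, b):
    """Merge two states: set union is the least upper bound of both."""
    return a | b


a = frozenset({"add:x", "add:y"})
b = frozenset({"add:y", "add:z"})
c = frozenset({"add:w"})

commutative = join(a, b) == join(b, a)
associative = join(join(a, b), c) == join(a, join(b, c))
idempotent = join(a, a) == a

# "Partially" ordered: neither a nor b contains the other,
# yet both sit below their join.
incomparable = not (a <= b) and not (b <= a)
both_below_join = a <= join(a, b) and b <= join(a, b)
```

Idempotence is what makes re-delivered messages harmless: merging the same state twice changes nothing.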

Put it all together: CRDTs
● Commutative Replicated Data Types
● Each node stores every entry as a lattice
● Lattices are distributed and merged
● Operations are commutative
– So collisions don't break stuff
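
As a concrete instance, here is a sketch of a grow-only counter (often called a G-Counter), one of the simplest CRDTs. It is a standard textbook example swapped in for illustration, not something the slides describe; the class and method names are this example's own.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}   # node id -> increments seen from that node

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative and idempotent, so merge order is free.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a = GCounter("a")
b = GCounter("b")

a.increment()
a.increment()
b.increment(3)      # concurrent updates on another node

a.merge(b)
b.merge(a)
b.merge(a)          # a duplicate merge is harmless
```

Both nodes converge on the same total without locks and without agreeing on the order in which the increments happened.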

CRDTs are monotonic
● Each new operation adds information
● Data is never deleted or destroyed
● Applications don't need to know
● Everything is in the store

CRDTs are pretty awesome
● But
● use a lot more space
● garbage collection is non-trivial
● In the wild:
● The data processor!

Language level consistency
● Bloom
● A natively distributed-safe language
● Built around monotonic, commutative operations
● Allows compiler-level analysis
● Flag where unsafe things are happening
– And suggest fixes and coordination
● Crazy future stuff

In Summary
● Big data is easy
● Just use distributed systems!
● Consistency is hard
● The solution may be in data structures
● Making use of radically cheaper storage
● Store operations, not values
● And make operations commutative
● Data is so cool!

More reading
● Log Structured Storage:
● http://blog.notdot.net/2009/12/Damn-Cool-Algorithms-Log-structured-storage
● Lattice data structures and CALM theorem:
● http://db.cs.berkeley.edu/papers/UCB-lattice-tr.pdf
● Bloom:
● http://www.bloom-lang.net/
● Ops: Riak in the Cloud
● https://speakerdeck.com/u/randommood/p/getting-starte

Even more reading
● http://en.wikipedia.org/wiki/Multiversion_concurrency_control
● http://en.wikipedia.org/wiki/Monotonic_function
● http://en.wikipedia.org/wiki/Commutative_property
● http://en.wikipedia.org/wiki/CAP_theorem
● http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
● http://pagesperso-systeme.lip6.fr/Marc.Shapiro/papers/RR-6956.pdf
● http://en.wikipedia.org/wiki/Vector_clock
