Thoughts on CRDTs and threshold reads

Update (November 30, 2013): I updated this post to reflect some of my current thinking on threshold reads for CRDTs and why I think high-water-mark reads aren’t necessarily the way to go.

Since getting back from RICON, I’ve been thinking about what LVar-style threshold reads would mean for CRDTs.

One way in which CRDTs differ from LVars is that some CRDTs support non-monotonic operations. For instance, OR-Sets and 2P-Sets support removal of elements, while PN-Counters can count down as well as up. Combining threshold reads with non-monotonic operations like removal of elements from a set would seem to pose a problem for determinism. If only monotonic operations are allowed, as is the case for LVars, then the answer to the question of whether or not a given threshold read will unblock is the same from run to run (although it may unblock at different times, depending on scheduling). But if the state of the data structure being queried were allowed to move both up and down in a lattice, then that would no longer be the case. For instance, if a read were thresholded on an element that was added to and later removed from a set, then it might block forever, or not, depending on whether the read ran after or before that element’s removal. Hence some schedule nondeterminism would be exposed.

What’s the deal with LVars and CRDTs?

Earlier this week, I gave a talk on LVars at RICON West, a distributed systems industry conference. It covered a lot of the same material as my talk at FHPC last month, but since my speaking slot at RICON was a little longer, I used the extra time to try to explain why a programming languages academic was speaking at a distributed systems industry conference, anyway. This post covers some of the same material in written form.

The short version: The mathematical framework of conflict-free replicated data types that distributed systems researchers have developed for reasoning about and enforcing the eventual consistency of distributed replicas is almost exactly the framework that we use in our own work on LVars for reasoning about and enforcing the determinism of parallel computations.

Update (December 13, 2013): A video of my RICON talk is now available. Thanks to everyone at RICON for being an amazing audience for this talk!

Some example MVar, IVar, and LVar programs in Haskell

In the writing and speaking I’ve done about LVars, I’ve done my best to show a lot of example code. For the most part, though, the examples have been written in a toy language created specifically for the purpose of discussing the LVars work.

For some people, this approach is fine. Others, however, find it less than satisfying. So, let’s see what it’s like to use LVars in a real language! If you’d like to run the examples yourself, all of the code from this post, plus some scaffolding for running it, is available on GitHub.

Also, an announcement: I’m going to be giving a talk about my work on LVars tomorrow, September 23, at FHPC. If you’re interested in this stuff and you’re here in Boston for ICFP and affiliated events, please come to my talk!