Draft: "Deterministic Threshold Queries of Distributed Data Structures"

I’m happy to announce a draft paper on my work (in collaboration with my advisor, Ryan Newton) on bringing LVar-style threshold reads to the setting of C(v)RDTs. In this paper, we define what it means for a CvRDT to support threshold reads, and we show that threshold reads of CvRDTs behave deterministically.

Determinism means something a little different in the distributed setting than it does in the shared-memory setting that we’re used to with LVars. The determinism property we show in the paper is: if a threshold query on a replica returns a particular result, then (1) subsequent runs of that query on that replica will always return that same result, and (2) any run of that query on any replica will eventually return that same result, and will block until it does so. (All this is under certain assumptions, of course, which we spell out in more detail in the paper.)

Although this paper is about bringing together threshold reads and CvRDTs, I think it also serves pretty well as an introduction to both topics — a big chunk of the paper is just background material. Here’s the abstract:

Convergent replicated data types, or CvRDTs, are lattice-based data structures for enforcing the eventual consistency of replicated objects in a distributed system. Although CvRDTs are provably eventually consistent, queries of CvRDTs nevertheless allow inconsistent intermediate states of replicas to be observed; and although in practice, many systems allow a mix of eventually consistent and strongly consistent queries, CvRDTs only support the former. Taking inspiration from our previous work on LVars for deterministic parallel programming, we show how to extend CvRDTs to support deterministic, strongly consistent queries using a mechanism called threshold queries. The threshold query technique generalizes to any lattice, and hence any CvRDT, and allows deterministic observations to be made of replicated objects before the replicas’ states have converged.

I mentioned this draft in passing a couple months ago, but this is a new version that tries to address some problems in the earlier one. In particular, we try to do a better job of explaining what threshold queries might be good for, and we try to make clear that threshold queries (and lattices, for that matter) are a semantic modeling and reasoning tool only, and not something that has any kind of runtime representation. Moreover, only the author of an LVar data structure (or of a threshold-queryable CvRDT) would ever have to think about lattices or threshold queries. Clients of that data structure’s API could just use the provided operations without having to think about that stuff, assuming they trust1 the author of the data structure they’re using.

  1. Using the jargon of “trusted code” and “untrusted code”, the LVar data structure is “trusted”, while the client code is “untrusted”. I’m not a big fan of that usage, though, because although “untrusted code” sounds like something nasty that you want to have as little of as possible, it’s actually just the opposite: “trusted” means “code that has to be trusted”, while “untrusted” means “code that doesn’t have to be trusted”. Therefore, you want to keep the amount of “trusted” code small, so that there’s less verification work to do, while the amount of “untrusted” code can grow arbitrarily without adding any verification burden.