# The LVar that wasn't

Suppose we want to write a program in which two threads, say, t1 and t2, each contribute a Boolean value, either `True` or `False`, to a shared result. If both threads write `True`, we want the result to be `True`; if either writes `False` (or if both do), then we want the result to be `False`. That is, we want the shared result to be the outcome of a logical “and” operation.

Let’s further suppose that we want the threads to be able to run in parallel, with their writes arriving in arbitrary order, and we want to be able to guarantee a deterministic result in spite of parallel execution. Sounds like a job for LVars!

Update (February 19, 2014): I’ve updated the links in this post to point to code that runs against the 1.1.2 release of LVish.

## Six states

Consider the states that the shared result might take on as the program runs. There seem to be six such states, falling into four equivalence classes:

• No information: First of all, there’s the state in which neither thread has written yet, so we have no information. We always begin in this state.
• Some information, but not enough: Then there’s the state in which t1 has written `True` but t2 hasn’t written anything, and conversely, the state in which t2 has written `True` and t1 has been silent so far. If we’re in one of these states, we’re not ready to conclude that the result is `True` or `False` yet, because although we have some information, it’s still not enough to draw a conclusion.
• Exactly enough information: Then there’s the state in which at least one thread has written `False`. In this case, we can go ahead and conclude that the result is `False`. On the other hand, there’s the state in which both threads have written `True`, in which case we can conclude that the result is `True`.
• Too much information: Finally, there’s one more possibility — a state that represents the situation in which conflicting information has arrived. For instance, suppose that the shared result is in the “both are true” state, but then a write of `False` comes along. This couldn’t actually happen under the assumptions we’re making — which are that each thread can only write `True` or `False`, and that threads cannot change their answers — but let’s throw this state in anyway. If our program ever did enter this state, it would be the equivalent of shouting, “TMI!”, indicating that something had gone horribly wrong. Why would we need to include such a state? We’ll come back to this in a little while. We can give the states names, order them by how much information we have when we’re in each one, and make a diagram showing us the relationships between states according to that order. `Bot`, short for “bottom”, is the state in which we have no information. `TrueBot` and `BotTrue` are, respectively, the state in which t1 has written `True` but t2 hasn’t written anything, and the state in which t1 hasn’t written anything but t2 has written `True`. `TrueTrue` is the state in which two writes of `True` have arrived, and `F` is the state in which at least one write of `False` has arrived. Finally, `Top` is the “TMI!” state.

Whenever a state is lower down in the diagram than another, we can interpret it as meaning that the lower state has less information than the higher state. For instance, `Bot` has less information than `BotTrue`, but `TrueTrue` does not have less information than `F`. Therefore, we can say that our six states are ordered by a “has less information than” relation. This relation is a partial order, rather than a total order, because some pairs of states, such as `TrueTrue` and `F`, have no relationship according to it — that is, neither one has less information than the other. But other pairs of states, like `Bot` and `BotTrue`, do belong to the relation. Together, the set of states and the partial order are called a partially ordered set, or poset.

## Writing it down with LVish

How are we going to express all of this in code? Well, we want a shared data structure whose states are a partially ordered set, and whose contents grow monotonically with respect to that partial order. Using the LVish Haskell library, we can define an LVar type for the shared result. For instance, we might define it like this:

Here, we’re using the `PureLVar` type constructor provided by the LVish library’s `Data.LVar.Internal.Pure` module, in order to define a type called `Result`. The states that a `Result` can take on are the six states we just discussed.

So far, our code says nothing about how the partial order relates these states to each other; there’s no indication that, for instance, `Bot` has less information than `F`. In LVish, the way we express “has less information than” relationships is by writing a `join` operation that takes two states and returns a state. The join of two states `x` and `y` will be `y` if `x` has less information than `y`, and it will be `x` if `y` has less information than `x`. If neither has less information than the other — for instance, if we take the join of `TrueTrue` and `F` — then the result will be neither one of them; instead, it will be the least state that is greater than or equal to them both. In our case, by “least”, we mean having the least information, and by “greater than or equal to”, we mean having more information or the same amount of information. In other words, when we take the join of two states, the result will be a state that has just enough information to subsume both of them, and no more than that.

For example, `TrueTrue` is the join of `TrueBot` and `BotTrue`, because it’s at or above both `TrueBot` and `BotTrue` in our diagram, and (unlike `Top`) it’s the least element in the poset that meets that condition. Likewise, `TrueBot` is the join of `Bot` and `TrueBot`; `Top` is the join of `TrueTrue` and `F`; and so on.

Here’s the entire definition of the `join` operation, which we’ll call `joinStates`:

That’s kind of a mouthful, so we might want to take a moment to convince ourselves that it corresponds to the diagram up above. Happily, we only have six states, and so there are only thirty-six possible joins that could occur, since each state could be joined with itself or any of the others. That’s few enough that it’s tractable to just write out all thirty-six possibilities and verify by hand that they agree with the diagram. The `printAllJoins` function defined in this file can help by printing out the thirty-six possibilities, and there’s example output from `printAllJoins` in the comments in that file.

## Using `joinStates`

Assuming that `joinStates` does, in fact, express the same relationships between states as the ordering shown in the diagram, then we should be good to go. But why did we have to define `joinStates`, anyway?

Since we defined `Result` to be a `PureLVar`, we can now manipulate `Result`s using the operations on `PureLVar`s that are provided by the `Data.LVar.Internal.Pure` module. For example, the `getPureLVar` operation allows us to do a “threshold read” of the contents of a `PureLVar`. Here’s the signature of `getPureLVar`:

`getPureLVar` takes two arguments: a `PureLVar`, whose state is of type `t`, and a list of possible states (again, of type `t`) that the `PureLVar` might be in. It returns a state wrapped in a `Par` computation.

Notice that there’s a `JoinSemiLattice` typeclass constraint on `t`. We need that constraint because the implementation of `getPureLVar` is written in terms of a `join` function on states of `PureLVars`, like the `joinStates` that we defined up above. Through the `JoinSemiLattice` typeclass constraint, `getPureLVar`’s type expresses its expectation that such a `join` function exists. So we need to declare `State` to be an instance of `JoinSemiLattice` and indicate that the `joinStates` function we wrote is to be used as `join`.

So, that’s why we had to define `joinStates` — so that we could use it to declare `State` to be an instance of `JoinSemiLattice`, allowing us to make use of library functions like `getPureLVar`.

How about the `Eq` constraint on `t`? This part is slightly more subtle. We mentioned that `getPureLVar` is written in terms of `join`, but that’s actually not quite right; it’s actually written in terms of another function, `joinLeq`, and that is written in terms of `join`.

What is `joinLeq`? Suppose we have two elements `x` and `y`, and we want to know if `x` is “less than or equal to” `y` — which would imply that `x` has no more information than `y`. Can we determine the answer to this question if all we have is the `join` operation? Yes! All we have to do is take the join of `x` and `y`. If the result is `y`, then yes, `x` does indeed have no more information than `y`. In fact, the `Algebra.Lattice` library provides just such a `joinLeq` function.

But wait: in the implementation of `joinLeq`, once we’ve done the join of `x` and `y`, we need to compare the result of it with `y`, and we can’t do that comparison unless we have an `==` operation defined on `State`s! This operation is provided by Haskell’s `Eq` typeclass. So, the type of `joinLeq` in `Algebra.Lattice` has an `Eq` constraint, along with a `JoinSemiLattice` constraint, and therefore our own `getPureLVar` function needs the `Eq` constraint, too — because `getPureLVar` calls `joinLeq`. We’ll see in a moment why `getPureLVar` needs to do that!

## Reading a `Result`

There are two things we want to be able to do with a `Result`: write to it from our two parallel threads, and read from it when all the writes that are going to matter are finished. Let’s consider the latter first.

We only want to read from a `Result` when it’s in one of the two “exactly enough information” states: `TrueTrue` or `F`. Together, we call the states `TrueTrue` and `F` a “threshold set”, because they represent the threshold that a `Result` has to reach before we read from it. The `getResult` operation just takes a `Result`, in this case `res`, and calls `getPureLVar` on `res` with the threshold set of `[TrueTrue, F]` as the second argument to `getPureLVar`.

The semantics of `getPureLVar` are that it will block until `res`’s state reaches or surpasses an element of the threshold set. It’s because of this “reaches or surpasses” check that `getPureLVar` has to call `joinLeq`. If `getPureLVar` returns, it will return the member of the threshold set that the current state of `res` is at or above, wrapped in `Par`. `getResult` then returns that `Par`-wrapped `State`.

Recall that `res` is going to be written into from multiple threads that share access to it, and we don’t know what order those writes will occur in. The nice thing about `getResult` is that it will always behave the same way for a given set of writes to `res`, regardless of the order in which those writes occur. This is guaranteed to be the case because only one of `TrueTrue` or `F` can ever be the state of `res`.

If we had provided, say, `[TrueTrue, TrueBot]` as the threshold set, then we wouldn’t have that guarantee. Our program would allow schedule nondeterminism to be observed: `getResult` might return different values depending on the order in which t1 and t2 ran. For example, if t1 and t2 were both going to write `True`, and t1 ran first, then `getResult` might unblock and return `TrueBot` (wrapped in `Par`) immediately, without waiting for t2 to run. But if t2’s write of `True` happened first, followed by t1’s,, then `getResult` would unblock and return `TrueTrue` wrapped in `Par`.

So, we shouldn’t use `[TrueTrue, TrueBot]` as a threshold set. In fact, if we want to ensure that schedule nondeterminism is not observable, then the only valid threshold sets are those in which, for every two elements in the threshold set, their `join` is `Top`. (This is one explanation for why we need a `Top` state: because the validity of threshold sets is defined in terms of it.) Notice that `[TrueTrue, F]`, the threshold set we actually used, is just fine, since `join TrueTrue F` is `Top`.

## Writing a `Result`

Next, let’s consider how we’re going to write into a `Result` in the first place. For that, we’ll write a function called `asyncAnd`. `asyncAnd` takes two `Par`-wrapped `Bool`s, which are the two Boolean values that threads t1 and t2 respectively contribute to the result, and it returns a `Par`-wrapped `Bool`: `True` if the result was `TrueTrue`, and `False` otherwise.

`asyncAnd` creates a new `Result`, named `res`, by calling `newPureLVar`. It then forks two threads, each of which looks at a particular one of the provided `Bool`s and writes the corresponding `State` into `res`. All this is followed by a call to `getResult`. Notice that if either thread writes `F`, then `getResult` can unblock immediately thereafter, without waiting for the other thread to finish. Therefore `asyncAnd` does a “short-circuit” computation: as soon as it encounters something that would make the result false, it doesn’t bother to wait for further results. (This is the reason for the “async” in the name.)

Also worth pointing out is that `True` and `False` don’t belong to the same universe as `TrueTrue`; `True` and `False` are mere `Bool` values, while `TrueTrue` is one of the six inhabitants of the `State` type. So, `asyncAnd` has to do the work of converting a `TrueTrue` back into a `True`, so that it can return the correct type. It just does this by returning the result of a `==` test that checks if the contents of the `Result` are `TrueTrue`.

## Running `asyncAnd`

We’re finally ready to actually try calling `asyncAnd` with some arguments and see what happens. Since `asyncAnd` returns a `Par` computation, we’ll need to run that computation with the `runPar` function provided by LVish to see its results.

We can load up our code in GHCi and try out some example calls. Here’s a simple one:

Here, we’re giving `asyncAnd` two arguments, `True` and `False`, each wrapped in a `return` to give it the `Par Bool` type that `asyncAnd` needs. Unsurprisingly, the result of combining `True` and `False` is `False`. We can go ahead and check the rest of the truth table, too:

How about something a little more interesting? `asyncAnd` only takes two arguments, but we can fold it over a much longer list of arguments, which will give us the conjunction of them all. Here we’re folding it over a long list of alternating `True`s and `False`s, with the result being `False`, of course. We use `(return True)` as the starting value because `True` is the identity element of logical “and”.

Or, how about a list of lots of `True`s with a stray `False` in the middle?

Of course, if we leave out the stray `False`, the result will be `True`:

Finally, while it’s good that these examples seem to work in GHCi, to really exercise `asyncAnd` we’ll want to compile our code with GHC using the `-threaded` flag, and then use the `-N` option to run the program on multiple processors.

## An embarrassing mistake

The determinism guarantee of LVars is premised on the expectation that the states an LVar can take on form a partially ordered set, and that every two elements in that set have a join. In other words, the set of states it can take on are a join-semilattice. If that isn’t true, then all determinism bets are off.

There are two equivalent ways to define a join-semilattice. In fact, if S is a poset, the following three properties are equivalent, meaning that they’re either all true or all false:

1. S is a join-semilattice.

2. For all elements a and b of S, the join of a and b exists.

3. For all elements a, b, and c of S, the following identities hold:

• Commutativity: a join b = b join a.
• Associativity: a join (b join c) = (a join b) join c.
• Idempotence: a join a = a.

Property (2) certainly seems like it ought to be true of `Result`. After all, we defined `joinStates`, and it’s a total function: give it any two `State`s and it’ll give us an answer. Surely we can say that all joins exist.

How about property (3)? In some cases, we can write code to check that the commutativity, associativity, and idempotence identities hold. In particular, for a finite set of states like the one we’re dealing with — and particularly for a small finite set, like ours — it’s no problem to just iterate over all combinations of a, b, and c and check that commutativity, associativity, and idempotence hold. In fact, the current head-of-tree version of `Data.LVar.Internal.Pure` exposes a function called `verifyFiniteJoin` that does just such a check. It’s a quick-and-dirty function that takes as arguments a list of states and a join operation and returns an error message if any of the three properties fail. Here’s the code:

When I wrote `verifyFiniteJoin` a month ago, I felt smug about how easy it would now be to verify that finite posets were join-semilattices. I couldn’t wait to try it on the first finite set of states I had handy, which happened to be the states of `Result`. When I did, here’s what I saw.

Uh-oh.

## What went wrong here?

It turns out that `joinStates` really isn’t associative. In particular, we have:

since `BotTrue` joined with `F` is `F`, which, joined with `TrueBot`, is again `F`. But:

since `TrueBot` joined with `BotTrue` is `TrueTrue`, which, joined with `F`, is `Top`! Therefore, the set of states that `Result` can take on isn’t a join-semilattice, and so, despite all our efforts, `Result` is not “really” an LVar.

Even more embarrassingly, this isn’t the first time I’ve made this mistake. The three-way parallel “or” computation that I wrote about in the second half of this post has the same associativity bug! Consider, for instance, the three states `yes`, `{ horace_no, franz_no }`, and `{ horace_no, kat_no }`. The join of `yes` and `{ horace_no, franz_no }` is supposed to be `yes`, which when joined with `{ horace_no, kat_no }`, is still `yes`. But the join of `{ horace_no, franz_no }` and `{ horace_no, kat_no }` is `{ horace_no, franz_no, kat_no }`, which, when joined with `yes`, results in ⊤, which represents an error. Most of the post still makes sense, but with that particular parallel “or” example, just like with `asyncAnd`, I messed up because I was reaching too hard for an interesting threshold set.

## Property (2) doesn’t hold, either

Returning to property (2) up above: since we’ve shown that property (3) doesn’t hold, it must also be the case the property (2) is false. But how is that possible? Doesn’t `joinStates` return a state for every pair of states it’s given as arguments? It never goes into an infinite loop or raises an error! Isn’t that enough to satisfy property (2)?

The trouble is that, although `joinStates` computes a total function, it doesn’t actually compute a join. Recall that the join of two elements is supposed to be the least element that is greater than or equal to them both. Because of this, the join of two elements is also known as their least upper bound. While an element may, in general, have many upper bounds — for instance, `BotTrue` has `TrueTrue` and `Top` as upper bounds — only one of those can be its least upper bound.

How can we know whether a given join function actually computes a least upper bound? For a give poset S, it would suffice for the following two properties to hold:

• For all elements v1, v2, and v of S, if v1 <= v and v2 <= v, then (v1 join v2) <= v.

• For all elements v1 and v2 of S, v1 <= (v1 join v2) and v2 <= (v1 join v2).

The first of these properties says that, of all the upper bounds v1 and v2 might share, then whatever their join is, it had better be no bigger than any of those upper bounds. This property captures the “least” part of “least upper bound”. The second property just says that, when you join two elements, you have to get something that’s an upper bound on both of them; this property captures the “upper bound” part of “least upper bound”.

Again, with a finite set of states like we’re dealing with, it’s easy to check both of these properties, and as it turns out, `joinStates` fails the first one. `TrueTrue` and `F` are both upper bounds of `TrueBot` and `BotTrue`, and neither `TrueTrue` nor `F` is less then the other. Therefore, `TrueBot` and `BotTrue` don’t actually have a least upper bound, regardless of what the definition of `joinStates` says.

## Does it matter?

So, `joinStates` breaks associativity, and so the set of states that a `Result` can take on is not a join-semilattice. But, is this a problem? Does breaking associativity actually matter in practice?

In the case of the `asyncAnd` example, the lack of associativity doesn’t seem to make a difference in practice, since every call to `asyncAnd` allocates one LVar and then writes to it twice. Because there are never more than two writes to any LVar, the associativity-breaking case of `joinStates` never occurs. Therefore we have a notion of “observable associativity”.

One thing to make note of is that we could have avoided any question about associativity by defining the `State`s and the partial order on them differently. Instead of conflating all the states involving a write of `False` into a single state `F`, we could have defined a poset that looks like this: With this poset, the join operation would have been associative, and we could have written `getResult` with a threshold set of all four final states: `[TrueTrue, FalseTrue, TrueFalse, FalseFalse]`. The disadvantage of doing it this way is that we lose the ability to get early answers from the computation. If a single `False` is written right away, the advantage of our original version is that it allows `getResult` to unblock immediately, since no later write can change the fact that the whole thing is going to be `False`. With this version, we have to wait for the second write regardless of what the first write is. Is associativity at odds with the ability to get early answers?

## Try it yourself

All the code from this post, plus some additional scaffolding, is available on GitHub. If you want to play with it yourself, be sure to check the README for details on how to get set up.

Compared to the version on GitHub, a few of the types in this post are simplified. In the real code, you’ll notice two type variables, `s` and `d`, in a number of places:

• The definition of the `Result` type is really `type Result s = PureLVar s State`.
• `getPureLVar`’s type is really `(JoinSemiLattice t, Eq t) => PureLVar s t -> [t] -> Par d s t`.
• Likewise, `getResult`’s type is really `Result s -> Par d s State`.
• Finally, `asyncAnd`’s type is really `Par d s Bool -> Par d s Bool -> Par d s Bool`.

The `s` type variable is for “session sealing”, and `d` is a “determinism level”; both are described in more detail in a previous post.