composition.al

Using the simplex algorithm for SMT solving

This year, as I’ve started to dive into neural network verification, I’ve had the chance to learn a little about how SMT solvers are implemented. One thing I learned that surprised me is that under the hood, many state-of-the-art SMT solvers use the simplex algorithm to solve satisfiability problems.

The simplex algorithm? Isn’t that for optimization problems?

This post isn’t an introduction to the simplex algorithm itself; for that, there are many good references, such as chapter 29 of CLRS. If you don’t have a physical copy of CLRS handy, or if yours is currently in use as a doorstop, some of chapter 29 is on Google Books. Better yet, try chapter 2 of Linear Programming by Vašek Chvátal, a book that Guy Katz pointed me to and that I like a lot, and almost all of which is on Google Books.

The simplex algorithm is a standard technique for solving linear programming (LP) problems. What’s a linear programming problem? Wikipedia has an example:

Suppose that a farmer has a piece of farm land, say $L$ km2, to be planted with either wheat or barley or some combination of the two. The farmer has a limited amount of fertilizer, $F$ kilograms, and pesticide, $P$ kilograms. Every square kilometer of wheat requires $F_1$ kilograms of fertilizer and $P_1$ kilograms of pesticide, while every square kilometer of barley requires $F_2$ kilograms of fertilizer and $P_2$ kilograms of pesticide. Let $S_1$ be the selling price of wheat per square kilometer, and $S_2$ be the selling price of barley. If we denote the area of land planted with wheat and barley by $x_1$ and $x_2$ respectively, then profit can be maximized by choosing optimal values for $x_1$ and $x_2$.

This is an optimization problem that can be solved by maximizing a function of $x_1$ and $x_2$ — in particular, $S_1 \cdot x_{1} + S_2 \cdot x_2$ — subject to constraints that capture what we know about the relationships between the other variables involved. The simplex algorithm is a recipe for doing that. The steps of the algorithm can be carried out by hand, but LP solver software packages such as GLPK or Gurobi also use some version of the algorithm to find an optimal solution to LP problems.

SAT and SMT solvers, on the other hand, solve satisfiability problems: problems where, given a formula with Boolean variables in it, we need to figure out whether or not there is some assignment of values to variables that will cause the expression to evaluate to “true”. (In the case of SMT solving, the formula is something fancier than a plain old Boolean formula, but the basic idea is the same.)

Determining whether a formula is satisfiable is a decision problem, not an optimization problem. Since SMT problems are decision problems and LP problems are optimization problems, I didn’t imagine that SMT solvers and LP solvers would have much, if anything, in common. But it turns out that that’s not the case!

The two phases of the simplex algorithm

To use the simplex algorithm to solve an LP problem, you first need to find what’s called a feasible solution to the problem, in which all the constraints are satisfied. The feasible solution is sometimes called a primal feasible solution or initial feasible solution, and the process of finding it is sometimes called initialization. Once that’s done, you can use the simplex method to iterate toward an optimal solution.

Some presentations of the simplex algorithm focus on the second part of the process only. In fact, in chapter 2 of the Chvátal textbook, it’s assumed from the outset that you already have a feasible solution, and now you just want to find an optimal one. For many real-world LP problems, this is a realistic assumption to make. For instance, suppose you’re the farmer from the example problem above. Whatever you’re already doing is a feasible solution: assuming you’ve set up the problem correctly, the amounts of fertilizer, pesticide, and land you’re using already fall within the constraints, because it would be impossible for you to do otherwise (for instance, you can’t use more land than exists, or put a negative amount of fertilizer on it). Of course, what you’re currently doing is probably not optimal, which is why you need the simplex algorithm! But at least you have an initial feasible solution from which to start iterating toward an optimal one.

But what do you do when you don’t have a feasible solution to begin with? It turns out that the simplex method is of use here, too. Chapter 3 of Chvátal (on pages 39-42) covers the situation in which you don’t have a feasible solution and need to come up with one. You do this by solving what’s known as an auxiliary problem. From the original problem, we can mechanically construct an auxiliary problem that always has an initial feasible solution, and then use the simplex method to optimize the solution to the auxiliary problem. Once that’s done, if it turns out that the optimal solution to the auxiliary problem is zero, then a feasible solution to the original problem exists and can be easily obtained, and we can then go on and carry out the steps of the simplex method on that feasible solution, resulting in an optimal solution to the original problem. And if the optimal solution to the auxiliary problem turns out to be nonzero, then that means that no feasible solution to the original problem exists.

Chvátal explains that this approach is what’s known as the two-phase simplex method.1 The first phase is setting up and and finding an optimal solution to the auxiliary problem. If doing that results in a feasible solution to the original problem, we can then go on to the second phase, which finds an optimal solution to the original problem. Both phases use the same iterative process.

For satisfiability problems, phase one is all you need

So, you can use an auxiliary problem to either get to a feasible solution for the original problem, or determine that there is no feasible solution. Figuring out whether or not there’s a feasible solution sounds like a decision problem, and it is! But, interestingly, we use an optimization technique to solve it. We turn the decision problem into an optimization problem: if the optimal answer to the auxiliary problem turns out to be zero, then the answer to our decision problem is yes, and if the optimal answer to the auxiliary problem turns out to be nonzero, then the answer to the decision problem is no.

Back in May, I wrote about how for certain kinds of linear real arithmetic formulas, SMT solving and LP solving coincide, but I didn’t mention the simplex algorithm in that earlier post. I can now be more specific and say that for certain kinds of SMT formulas, determining the satisfiability of the formula coincides with carrying out just the first phase of the simplex algorithm that LP solvers use. Finding a feasible solution is the entire problem, and once we have one, we can stop. After all, the goal of SMT solving is just to determine whether a formula is satisfiable or not — we don’t care about finding an “optimal” way to satisfy it!

  1. In fact, if we peek into the guts of GLPK, which is the LP solver on top of which the proof-of-concept implementation of Reluplex is built, we see that in its internal representation of an LP problem, there’s a field called phase that represents which of the two phases we’re in. Often, computer implementations of an algorithm look pretty different from descriptions of how to carry out the algorithm by hand, so I think it’s interesting to see that the notion of phases isn’t specific to by-hand applications of the simplex method, but is instead fundamental enough that at least one computer implementation makes use of it, too!

Comments