Thoughts on “Safety Verification of Deep Neural Networks”

A couple months back, I wrote about Pulina and Tacchella’s 2010 paper on verification of safety properties of neural networks. That paper is the earliest that I’ve found in the literature that makes a serious attempt at verification of neural networks using an SMT solver, and I found it really interesting.

In that post, I also mentioned a number of limitations of Pulina and Tacchella’s approach. First, the network used in the paper’s case study was tiny, and it wasn’t clear that the technique would scale to realistically sized neural networks. Second, the property being verified (ensuring that the network’s output fell within a specified range) seemed a bit boring; I was interested in whether it was possible to verify more interesting properties. Third, the function the network was learning — the position of the end of a robot arm, given its joint angles — seemed like something that could be worked out analytically, without the need for a neural network (or any machine learning at all). And finally, it wasn’t clear that the proposed verification approach would work for neural networks that did classification tasks, rather than regression tasks like the one in the case study. A 2011 follow-up paper by the authors addressed some of those concerns, but not all of them.

An economics analogy for why adversarial examples work

One of the most interesting results from “Explaining and Harnessing Adversarial Examples” is the idea that adversarial examples for a machine learning model do not arise because of the supposed complexity or nonlinearity of the model, but rather because of high dimensionality of the input space. I want to take a stab at explaining “Explaining”’s result with an economics analogy. Take it with a grain of salt, since I have little to no formal training in either machine learning or economics. Let’s go!

The one-sentence version of this post: There are economies of scale with dimensions when constructing adversarial examples.

Where does OBT research go after OBT?

A friend who’s listened to me hype up the Off the Beaten Track workshops for the past several years now asks:

out of curiosity, do y’all have an idea of how OBT papers do after they appear in OBT? i know 5 years is probably not enough time to tell, but have these papers indeed influenced people to open new lines of PL research, or gone on to appear in full-length form elsewhere?

Thoughts on “An Abstraction-Refinement Approach to Verification of Artificial Neural Networks”

Neural networks (NNs) have become a ubiquitous machine learning technique in a wide range of domains. In 2003, some researchers surveyed the literature on verification of NNs and concluded that they “represent a class of systems that do not fit into the current paradigms of software development and certification”. Wondering if that was still the case after the NN renaissance that’s taken place over the last ten years or so, I recently began looking for recent work on verifying interesting properties of neural networks. The first relevant paper I found in the software verification literature was “An Abstraction-Refinement Approach to Verification of Artificial Neural Networks”, by Luca Pulina and Armando Tacchella, which appeared in CAV 2010. Here are some thoughts on that paper.

Thoughts on “Adversarial examples in the physical world”

In machine learning, computers “learn” by being shown lots of examples; for instance, we might use millions of pictures to train a model that can identify pictures of cats. The idea is that if we do a good job training the model, then it should be able to correctly identify new cat pictures (that is, ones it never saw during training) most of the time.

But it’s possible to take a picture that the model classifies correctly as being of a cat, and then tweak that picture very slightly in a way that would be undetectable to a human, but that fools the same model into classifying it with high confidence as, say, a philodendron. These subtly tweaked images are called adversarial examples, and there are known techniques for generating lots of them. Such techniques rely on making tiny, precise perturbations to the original correctly-classified example.