An economics analogy for why adversarial examples work

One of the most interesting results from “Explaining and Harnessing Adversarial Examples” is the idea that adversarial examples for a machine learning model do not arise because of the supposed complexity or nonlinearity of the model, but rather because of high dimensionality of the input space. I want to take a stab at explaining “Explaining”’s result with an economics analogy. Take it with a grain of salt, since I have little to no formal training in either machine learning or economics. Let’s go!

The one-sentence version of this post: There are economies of scale with dimensions when constructing adversarial examples.

Where does OBT research go after OBT?

A friend who’s listened to me hype up the Off the Beaten Track workshops for the past several years now asks:

out of curiosity, do y’all have an idea of how OBT papers do after they appear in OBT? i know 5 years is probably not enough time to tell, but have these papers indeed influenced people to open new lines of PL research, or gone on to appear in full-length form elsewhere?

Thoughts on “An Abstraction-Refinement Approach to Verification of Artificial Neural Networks”

Neural networks (NNs) have become a ubiquitous machine learning technique in a wide range of domains. In 2003, some researchers surveyed the literature on verification of NNs and concluded that they “represent a class of systems that do not fit into the current paradigms of software development and certification”. Wondering if that was still the case after the NN renaissance that’s taken place over the last ten years or so, I began looking for recent work on verifying interesting properties of neural networks. The first relevant paper I found in the software verification literature was “An Abstraction-Refinement Approach to Verification of Artificial Neural Networks”, by Luca Pulina and Armando Tacchella, which appeared in CAV 2010. Here are some thoughts on that paper.

Thoughts on “Adversarial examples in the physical world”

In machine learning, computers “learn” by being shown lots of examples; for instance, we might use millions of pictures to train a model that can identify pictures of cats. The idea is that if we do a good job training the model, then it should be able to correctly identify new cat pictures (that is, ones it never saw during training) most of the time.

But it’s possible to take a picture that the model classifies correctly as being of a cat, and then tweak that picture very slightly in a way that would be undetectable to a human, but that fools the same model into classifying it with high confidence as, say, a philodendron. These subtly tweaked images are called adversarial examples, and there are known techniques for generating lots of them. Such techniques rely on making tiny, precise perturbations to the original correctly-classified example.