composition.al

An economics analogy for why adversarial examples work

One of the most interesting results from “Explaining and Harnessing Adversarial Examples” is the idea that adversarial examples for a machine learning model do not arise because of the supposed complexity or nonlinearity of the model, but rather because of high dimensionality of the input space. I want to take a stab at explaining “Explaining”’s result with an economics analogy. Take it with a grain of salt, since I have little to no formal training in either machine learning or economics. Let’s go!

The one-sentence version of this post: There are economies of scale with dimensions when constructing adversarial examples.

The widget store and the image classifier

Let’s say you run a retail store. You buy widgets from a widget wholesaler for $5 each, then resell them in your store at a markup. Your goal is to make a profit; to accomplish that, you need to sell the widgets for more than you bought them for.

Ideally, you want to make a big profit, which means selling your widgets for a lot more than you bought them for. But you have to be careful: you don’t want customers to notice any substantial increase from the wholesale price. Your customers are okay with a small markup in exchange for getting to shop at your nice store (you play excellent music there, after all), but if they notice a big increase from the wholesale price, they might just go to the wholesaler themselves. You want the prices to be high enough that you rake in a lot of profit, but low enough that a shopper will look at them and still say, “Yep, that’s reasonable!”

To make this more concrete, let’s say that you want to bring in at least $1000 in profit per day. If you were selling 100 widgets in your store each day, you’d need to sell each widget for $15 — a $10 markup from the wholesale price. (This is ignoring overhead costs, which we’ll get to later.) You could, of course, choose to sell some of your 100 widgets for less than $15, but then you’d have to sell others for even more than $15 to make up for it, in order to hit an average markup of $10 per widget.

That’s a pretty huge markup — you’d be selling widgets for three times the wholesale cost! No matter how you tried to hide things by charging less for some widgets, your customers would notice how steep your prices were on average. They would abandon you and go shop at the wholesaler, your great taste in music notwithstanding.

But let’s say you’re selling, say, 2500 widgets a day. Now, all you need to do is sell each widget for an average markup of $0.40. From your customers’ perspective, the difference between $5 and $5.40 isn’t noticeable, but since you’re selling 2500 widgets, you’re still bringing in the desired $1000 in profit per day. In general, if you’re selling a lot of widgets, you can get away with making tiny per-widget markups and still make a big profit.

Okay, so how does that relate to adversarial examples and high dimensionality?

Let’s say you have a machine learning system that classifies pictures. You give it an image, and it tells you what it’s a picture of, with some degree of confidence. Let’s say you want to create an adversarial example for that model: you have a picture of a cat, and you want to tweak your cat picture so that it will be misclassified as a philodendron with high confidence, but in such a way that a human would look at the tweaked image and still say, “Yep, that’s a cat! Nothing seems particularly strange about this cat picture!”

Now, let’s say your original cat picture is only 10x10 pixels. Suppose those 100 pixels are represented by a 100-element vector.1 This means that the input space has 100 dimensions. If you want to tweak the image in a way that will fool the classifier, there are only 100 pixels that you can change, and this really doesn’t give you a lot of leeway: you might need to change some of those 100 pixels a lot in order to fool the system into classifying the image as a philodendron. Since you would have to make a large average change per pixel, a human looking at the tweaked image would be able to tell that it had been tweaked, in much the same way that shoppers at your store would be able to notice the average markup of $10 if you were only selling 100 widgets a day.

But let’s say your cat picture is 50x50 pixels. Now you have 2500 pixels, represented by a 2500-element vector. Your input space has 2500 dimensions. there’s now a lot more leeway to make only a tiny change to each pixel. The changes will be so small, a human looking at the tweaked image might not even suspect anything, much like shoppers at your store wouldn’t notice a difference between a wholesale price of $5 and your average price of $5.40.

Overheads and epsilons

Let’s come back to the notion of overhead of running your store. You have more expenses than just the cost of buying widgets wholesale: there’s also rent, employee wages, buying all that music, and so on. Because of overhead, your per-widget average markup can’t be too small: it needs to add up to enough to cover your overhead costs. After that, all the rest of the overhead is profit, and as we’ve seen, the profit margin can be quite small if you’re selling lots of widgets. But, because of overhead, even if you’re selling hojillions of widgets, there’s always going to be some amount below which your per-widget markup cannot go. For instance, if your overhead is, say, $500 a day and you’re selling 2500 widgets a day, then you need there to be, on average, at least a twenty-cent markup per widget to cover overhead. So, if you’re selling the 2500 widgets for $5.40 each, then only $0.20 per widget will be profit, and you’ll make $500 in profit a day.

$500 in profit is still not bad, but what if you wanted to be really sure to make $1000 in profit per day? One thing you could try is counting that $1000 per day as part of your overhead. Then, your daily overhead would be $1500 (that is, the actual overhead of $500, plus the profit you want to be sure to make), and you could set your average price at $5.60. As long as you still sell your 2500 widgets, you get your desired $1000 in profit.

You have to be careful here, though: the higher average price of $5.60 might be more noticeable to some customers. If you get greedy and you decide you want $2000 in profit per day, the price will go up to $6.00, and even more customers might notice. Choosing the right amount to mark up your items requires careful consideration of what your actual overhead is, what kind of profit you want to make, and how much you’re willing to allow the markup to be noticeable.

Is there a concept analogous to “overhead” for the cat picture you’re trying to tweak? Yes! To start with, there is a minimum amount that a pixel value can be tweaked. That minimum amount is the smallest difference that your storage medium can represent. For instance, if you’re representing each pixel value with 8 bits, then there are only 255 distinct values that a pixel can take on. Therefore, you cannot make a change that brightens or darkens a pixel by only, say, 0.00001, because that’s smaller than 1/255, and the difference will be unrepresentable with the bits that you have.

The literature on adversarial examples has a name for the minimum representable amount that each element of your input can be changed: they call that amount epsilon. The choice of epsilon will depend on characteristics of your storage medium or data representation. In the “Explaining and Harnessing” paper, for instance, the adversarial perturbation shown in Figure 1 uses an epsilon of 0.007, which, they explain, “corresponds to the magnitude of the smallest bit of an 8 bit image encoding after GoogLeNet’s conversion to real numbers.”

Now, the goal of an adversarial perturbation is to get a classifier to misclassify the image. What if you wanted to make a misclassification very likely? One thing you could try is increasing your epsilon value beyond the minimum that it needs to be to represent a change. For instance, in “Explaining and Harnessing”, some experiments use an epsilon of 0.1, while others use a larger epsilon of 0.25. In general, a larger epsilon will make a misclassification more likely.

You have to be careful here, though: as “Adversarial examples in the physical world” points out, “higher values of $\epsilon$ essentially destroys the content of the image and makes it unrecognizable by humans”. If you get greedy and decide that you need your images to be misclassified at a very high rate, then humans looking at the image might notice. Choosing the right epsilon requires careful consideration of what your actual storage medium characteristics are, what kind of error rate you want to achieve, and how much you’re willing the tweaks you make to the image to be noticeable.

They’re losing money on each sale, but they make up for it in volume!

By itself, selling a lot of widgets doesn’t mean you’ll make a profit. If you’re losing money per item, selling a lot of them just makes things worse. Selling in volume gives you a chance to have a big profit even with slim profit margins, but that’s all it is — a chance. That said, it’s okay if you aren’t making money on every single widget, as long as you make more than you lose overall. (Hence the idea of loss leaders.) In the end, when all income and expenses have been accounted for, you want your income minus your expenses to be a positive number.

Likewise, if you’re trying to create an adversarial image, a high-dimensional input space gives you a chance, but now you need to do something with that chance. Returning to our cat picture example, we want to change the image in a way that will make your system think it’s more like a philodendron.2 It turns out that there is an optimal direction of change3 in which to take the image. For a 2500-pixel image, this direction is a 2500-dimensional vector, representing the change each pixel needs to go in — just as, when you move through space in a direction, that direction is represented by a three-dimensional vector.

However, just as you don’t necessarily need to make money on every widget in your store in order to make money overall, you don’t necessarily need to change your image in the optimal direction: you only need to go in a direction that’s close enough to the optimal direction! In the end, when every dimension has been accounted for, the direction you go needs to have positive dot product with the optimal direction.

Is there something analogous to loss leaders in the setting of adversarial examples? I’m not sure. You might choose to make a change in a merely “pretty good” direction if the optimal direction is difficult to compute, but it’s hard to imagine why you would intentionally change some dimensions in the wrong direction. On the other hand, you might choose to leave some dimensions fixed (say, for instance, if they were pixels of an image that you thought a human might be particularly suspicious of any change in!), and you would then have to make a larger change to other dimensions to make up for it. So maybe the fixed dimensions could be thought of as analogous to “loss leaders”! If anyone wants to point me to any work that’s been done on this, I’d be interested to know about it.

Different perspectives; different vector norms

A shopper in your high-volume widget store might be surprised to learn that you’re raking in $1000 a day in profit. After all, from their point of view, each widget is only a few cents more expensive than its wholesale price. From your point of view, though, all those small markups add up. How “big” are your markups, then? Are they small because your customers perceive them as small, or are they large because you’re making a lot of money? It’s a question of perspective.

Linear algebra gives us a way to talk about these differing perspectives: vector norms. A norm is a way to talk about how “big” a vector is. One choice of norm is the maximum norm. The maximum norm of a vector, also known as the infinity norm, is just the absolute value of the largest element in it. For instance, let’s say your 2500 widgets each have a markup of $0.60. If you represent all the markups as a 2500-element vector v, then the maximum norm of v is 0.6. In Julia:

1
2
3
4
5
6
7
8
9
10
11
12
julia> v = fill(0.6, 2500)
2500-element Array{Float64,1}:
 0.6
 0.6
 0.6
 
 0.6
 0.6
 0.6

julia> vecnorm(v, Inf)
0.6

0.6 would also be the maximum norm of v even if only one element were 0.6 and the rest were, say, 0:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
julia> v2 = fill(0.0, 2500)
2500-element Array{Float64,1}:
 0.0
 0.0
 0.0
 
 0.0
 0.0
 0.0

julia> v2[42] = 0.6
0.6

julia> vecnorm(v2, Inf)
0.6

In our analogy, the maximum norm represents the largest markup that a shopper is going to see in our store. However, it doesn’t tell us very much about what things look like from our perspective as the storekeeper. For that perspective, let’s look at a different norm, the Euclidean norm. The Euclidean norm, also known as the $L^{2}$ norm, is the square root of the sum of all its elements squared. Let’s see what the $L^{2}$ norm of our vector v is:

1
2
julia> vecnorm(v, 2)
30.0

The $L^{2}$ norm of v is 30 — a respectably sized $L^{2}$ norm. How about v2, the vector where every element is 0.0 except for the lonely element at index 42?

1
2
julia> vecnorm(v2, 2)
0.6

v2’s $L^{2}$ norm is the same as its maximum norm was — only 0.6. The big difference between the $L^{2}$ norms of v2 and v is analogous to the big difference in profit you’d see if you sold only one widget at a markup of $0.60 and the rest at no markup, versus selling all your widgets at a markup of $0.60. But the fact that v2 and v have the same maximum norm suggests that from a different perspective — the shopper’s perspective — those two pricing strategies look about the same.

This isn’t a perfect analogy; obviously, most shoppers will care about more than just the maximum price of an item in the store. But the maximum norm and the Euclidean norm demonstrate that there are different valid ways of looking at the size of a vector — or, analogously, different valid ways of looking at the size of the markups on items in a store.

Now consider the adversarial perturbation we’re making to our 50x50 cat picture. The tweaks we’re making to each pixel can be represented as a vector, the perturbation vector. If we tweak every pixel by 0.6, then the maximum norm of the perturbation vector will be 0.6; meanwhile, its $L^{2}$ norm will be 30, as we saw above.

What if our cat picture is 100x100? Now we have 10,000 pixels to work with. Again, if we tweak every pixel by 0.6, the maximum norm of the perturbation vector will be the same as before:

1
2
3
4
5
6
7
8
9
10
11
12
julia> v3 = fill(0.6, 10000)
10000-element Array{Float64,1}:
 0.6
 0.6
 0.6
 
 0.6
 0.6
 0.6

julia> vecnorm(v3, Inf)
0.6

But look at what happens to the $L^{2}$ norm as we go to 10,000 pixels:

1
2
julia> vecnorm(v3, 2)
60.0

Now the $L^{2}$ norm is even bigger compared to the maximum norm! Adding more dimensions to our input space opens up the possibility of a huge difference between the maximum norm and the $L^{2}$ norm.4

In our widget store analogy, the maximum norm and the $L^{2}$ norm of the markup vector can be thought of as representing, respectively, markups that shoppers see versus profits that we see as the store owner. Selling more items opens up the possibility of a huge profit, even though individual item markups don’t change.

In the case of adversarial images, the maximum norm and the $L^{2}$ norm of the perturbation vector can be thought of as representing, respectively, the tweaks to an image that a human perceives versus the tweaks to an image that the classifier perceives. Adding more pixels to the input space opens up the possibility of a huge difference in how the classifier perceives the image, even though the human perception of it doesn’t change. For humans, a large change to a few pixels of an image might be a lot more noticeable than a tiny change to many pixels. This accidental difference in human perception and machine perception is what adversarial examples exploit. As section 3 of the “Explaining and Harnessing” paper puts it (replacing some of the notation with English):

Since [the maximum norm of the perturbation vector] does not grow with the dimensionality of the problem but the change in activation caused by perturbation by [the perturbation vector] can grow linearly with [the number of dimensions], then for high dimensional problems, we can make many infinitesimal changes to the input that add up to one large change to the output. We can think of this as a sort of “accidental steganography,” where a linear model is forced to attend exclusively to the signal that aligns most closely with its weights, even if multiple signals are present and other signals have much greater amplitude.

This explanation shows that a simple linear model can have adversarial examples if its input has sufficient dimensionality.

It seems like in general, strange phenomena (loss leaders; adversarial examples) can crop up whenever the same thing (markups of items for sale; tweaks to an image) is being measured in two different ways (by shoppers versus by store owners; according to the maximum norm versus according to the $L^{2}$ norm).

Conclusion

In conclusion, there are economies of scale with dimensions when constructing adversarial examples.

It’s possible for your widget store to make a big profit even if you sell each widget at only a tiny markup, as long as you take overhead into account. In fact, you can even sell some of them at a loss as long as the overall profit is positive, and as long as you sell enough widgets in total.

Likewise, it’s possible to create an adversarial input that will be misclassified with high probability even if you change each dimension of the input a tiny bit, as long as you take epsilon into account. In fact, you can even change some of them non-optimally, as long as the overall direction you change them has positive dot product with the optimal direction, and as long as there are enough dimensions in total.

Thanks to Vaibhav Sagar, Sean Talts, and Julia Evans for discussions that contributed to this post. Thanks to Matt Siegel for spotting an arithmetic error.

  1. For simplicity, let’s say it’s a black-and-white image, so we don’t have to worry about RGB values.

  2. The technique for generating adversarial images in “Explaining and Harnessing” is actually just “change it to make it less like a cat”, not “change it to make it more like a philodendron”; you’ll end up with a misclassified image, but the misclassification may or may not be especially interesting. For more-like-a-philodendron perturbations, see the “iterative least-likely class method” in the follow-up paper “Adversarial examples in the physical world”, or just see Julia Evans’ “How to trick a neural network into thinking a panda is a vulture” article for Code Words, which, as far as I can tell, independently arrived at something similar!

  3. I bet I would have liked calculus better in undergrad if I had known what it had to do with misclassifying cat pictures.

  4. Indeed, even at 28x28 or so, the dimensionality of the input space is starting to be high enough that it’s possible to generate adversarial images that are fairly unnoticeable to humans, as various experiments with the MNIST (28x28) and CIFAR (32x32) image datasets have shown.

Comments