Square Attack

This section has a series of coding problems using PyTorch. As always, we highly recommend you read all the content on this page before starting the coding exercises.

Introduction

The Square Attack (Andriushchenko et al., 2020) is a black-box method used to generate adversarial samples. Unlike 'white-box' approaches such as PGD or FGSM, the Square Attack does not require knowing model weights or gradients.

While many other black-box attacks need a large number of model queries to succeed, the Square Attack requires relatively few. The attack works by repeatedly perturbing a square-shaped region of the image and keeping the change only if it increases the loss of the model. Upon release, the Square Attack was successful enough that it even outperformed some existing white-box approaches on benchmarks.

While both $L_\infty$ and $L_2$ variations of the attack exist, we focus on the $L_\infty$ attack. We believe that the $L_\infty$ attack is sufficient to learn the main concepts behind the attack. At the bottom of this writeup we also include limited information about the $L_2$ attack, but we warn readers that this content is quite technical and not necessary for future sections of the course.

Fig. 1
Source: (Andriushchenko et al., 2020)

The Square Attack Loop

The Square Attack is a random search algorithm. First, the adversarial image $\hat{x}$ is initialized as the input image $x$, and the best loss so far is initialized as the loss between $model(x)$ and the label $y$. At each iteration, a square of pixels is chosen at random and perturbed. If adding this square to $\hat{x}$ increases the loss, the change is kept; otherwise it is discarded. The side length of the square is controlled by the variable $h$, which is gradually reduced over the course of the attack so that later iterations make smaller, more local refinements.
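To make the shrinking square concrete, here is a minimal sketch of one possible schedule for $h$. It assumes a piecewise-constant schedule in which the fraction of pixels to modify, `p`, is halved at fixed points in the query budget. The function name `square_side`, the starting value `p_init`, and the breakpoints are illustrative assumptions, not values taken from this page.

```python
import math

def square_side(i, n_iters, width, p_init=0.1):
    """Side length h of the square at iteration i (0-indexed).

    Assumption: p is the fraction of pixels modified per step, and it is
    halved each time i passes one of the (illustrative) breakpoints below.
    """
    # Breakpoints expressed as fractions of the total query budget.
    breakpoints = [0.001, 0.005, 0.02, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8]
    halvings = sum(i >= b * n_iters for b in breakpoints)
    p = p_init / (2 ** halvings)
    # A square of side h covers h * h pixels, so h is about sqrt(p * width^2).
    h = round(math.sqrt(p * width * width))
    # Keep h valid: at least one pixel, at most the full image width.
    return max(1, min(h, width))

print(square_side(0, 10_000, 32))      # 10
print(square_side(9_999, 10_000, 32))  # 1
```

With these assumed settings, a $32 \times 32$ image starts with a $10 \times 10$ square and ends with single-pixel updates.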

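The algorithm for the attack loop from the paper is shown below. Although it is not necessary for you to understand everything now, we encourage you to try to parse each line and guess what is happening. This is good practice for future sections which use this kind of notation! Note that the paper writes $L$ as a loss the attacker minimizes (a margin between the score of the true class and the best score of any other class), so line 7 accepts a candidate when $l_{\text{new}} < l^*$. This corresponds to the increase in the model's classification loss described above.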

1	$\hat{x} \leftarrow \text{init}(x), \quad l^* \leftarrow L(f(x), y), \quad i \leftarrow 1$
2	while $i < N$ and $\hat{x}$ is not adversarial do
3	$h^{(i)} \leftarrow$ side length of the square to modify (according to some schedule)
4	$\delta \sim P(\epsilon, h^{(i)}, w, c, \hat{x}, x)$
5	$\hat{x}_{\text{new}} \leftarrow \text{Project } \hat{x} + \delta \text{ onto } \{z \in \mathbb{R}^d : \|z - x\|_p \le \epsilon\} \cap [0, 1]^d$
6	$l_{\text{new}} \leftarrow L(f(\hat{x}_{\text{new}}), y)$
7	if $l_{\text{new}} < l^*$ then $\hat{x} \leftarrow \hat{x}_{\text{new}}, \quad l^* \leftarrow l_{\text{new}}$
8	$i \leftarrow i + 1$
9	end
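The loop above can be sketched in PyTorch as follows. This is a simplified illustration rather than the paper's reference implementation: it assumes a single image of shape `(C, H, W)`, a fixed square size `h` instead of a schedule, initialization at the clean image, and cross-entropy as the loss to increase (the paper instead minimizes a margin loss). The function name `square_attack_linf` and its parameters are hypothetical.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()  # black-box setting: forward queries only, no gradients
def square_attack_linf(model, x, y, eps, n_iters=200, h=4):
    """Random-search sketch. x: (C, H, W) image in [0, 1]; y: 0-dim label tensor."""
    c, height, width = x.shape
    x_adv = x.clone()
    logits = model(x_adv[None])
    best_loss = F.cross_entropy(logits, y[None])
    for _ in range(n_iters):
        if logits.argmax(dim=1).item() != y.item():
            break  # x_adv is already misclassified
        # Propose: one h-by-h square, one value in {-2*eps, +2*eps} per channel.
        r = torch.randint(0, height - h + 1, (1,)).item()
        s = torch.randint(0, width - h + 1, (1,)).item()
        signs = torch.randint(0, 2, (c, 1, 1)).float() * 2 - 1
        delta = torch.zeros_like(x)
        delta[:, r:r + h, s:s + h] = 2 * eps * signs
        # Project onto the eps-ball around x, then onto the valid pixel range.
        x_new = torch.min(torch.max(x_adv + delta, x - eps), x + eps).clamp(0, 1)
        new_logits = model(x_new[None])
        new_loss = F.cross_entropy(new_logits, y[None])
        if new_loss > best_loss:  # keep the square only if the loss went up
            x_adv, logits, best_loss = x_new, new_logits, new_loss
    return x_adv
```

Each iteration costs exactly one model query, and the accepted image never leaves the $L_\infty$ ball of radius `eps` around `x`.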

$L_\infty$ Square Attack

For the $L_\infty$ attack, the $\delta$ tensor from line 4 of the previous algorithm is generated by picking a random location for the square on the image. Then, for each color channel, a single value for $\delta$ inside the square is chosen uniformly at random from the two values $-2\epsilon$ and $2\epsilon$, where $\epsilon$ is the $L_\infty$ budget. Everywhere outside the square, $\delta$ is zero.

More concretely, $\delta$ can be visualized below assuming an input image with $32 \times 32$ pixels and 3 color channels. Note that although each color channel has the same square location, the value of the change at each color channel is different.

Fig. 2
Example $\delta$ for $L_\infty$ square attack
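As a rough illustration of the $\delta$ described above, the following sketch builds one for a $32 \times 32$ image with 3 color channels. The helper name `sample_linf_delta` and the values of `eps` and `h` are assumptions chosen for the example.

```python
import torch

def sample_linf_delta(shape, eps, h, generator=None):
    """Return a (C, H, W) tensor that is zero outside one random h-by-h square."""
    c, height, width = shape
    # Top-left corner of the square, shared by every channel.
    r = torch.randint(0, height - h + 1, (1,), generator=generator).item()
    s = torch.randint(0, width - h + 1, (1,), generator=generator).item()
    # One sign per channel, so each channel gets either -2*eps or +2*eps.
    signs = torch.randint(0, 2, (c,), generator=generator) * 2 - 1
    delta = torch.zeros(shape)
    delta[:, r:r + h, s:s + h] = (2 * eps * signs).view(c, 1, 1)
    return delta

delta = sample_linf_delta((3, 32, 32), eps=8 / 255, h=8)
# Number of pixel positions touched by the square (h * h).
print(delta.abs().amax(dim=0).count_nonzero().item())  # 64
```

Every channel shares the same square location, but the sign of the change is drawn independently per channel.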

Next, this $\delta$ is added to the current adversarial image, and the result is projected so that every value lies between 0 and 1 and the adversarial image stays within the $L_\infty$ budget around $x$. For the $L_\infty$ attack, this "projection" is done by clipping the image (similar to what you saw in the FGSM/PGD section).
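A minimal sketch of this clipping step is shown below. The helper name `project_linf` is hypothetical, and the code assumes the clean image `x` already lies in $[0, 1]$.

```python
import torch

def project_linf(x_candidate, x, eps):
    """Clip x_candidate into the eps-ball around x and into [0, 1]."""
    lower = (x - eps).clamp(min=0.0)  # cannot go below x - eps or below 0
    upper = (x + eps).clamp(max=1.0)  # cannot go above x + eps or above 1
    return torch.min(torch.max(x_candidate, lower), upper)

x = torch.tensor([0.0, 0.5, 0.98])
eps = 0.05
print(project_linf(x + 2 * eps, x, eps))  # approximately [0.05, 0.55, 1.00]
```

This also suggests why the square uses $\pm 2\epsilon$: because the current adversarial image is already within $\epsilon$ of $x$, adding $\pm 2\epsilon$ and clipping always lands the affected pixels on the boundary $x \pm \epsilon$ (or on 0 or 1).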

BONUS: $L_2$ Square Attack

Once again, we note that the $L_2$ attack is quite technical and not necessary to understand for future sections of this course. If you are interested, a brief writeup of the attack can be found below, with more details in the paper itself.


References

Andriushchenko, M., Croce, F., Flammarion, N., & Hein, M. (2020). Square Attack: a query-efficient black-box adversarial attack via random search. https://arxiv.org/abs/1912.00049
