Square Attack
This section has a series of coding problems using PyTorch. As always, we highly recommend you read all the content on this page before starting the coding exercises.
Introduction
The Square Attack (Andriushchenko et al., 2020) is a black-box method used to generate adversarial samples. Unlike 'white-box' approaches such as PGD or FGSM, the Square Attack does not require knowing model weights or gradients.
While many other black-box attacks need a large number of queries to succeed, the Square Attack requires relatively few. The attack repeatedly perturbs a randomly placed square region of the image and keeps each change only if it increases the loss of the model. Upon release, the Square Attack was successful enough that it even outperformed some existing white-box approaches on benchmarks.
While both $L_\infty$ and $L_2$ variations of the attack exist, we focus on the $L_\infty$ attack. We believe that the $L_\infty$ attack is sufficient to learn the main concepts behind the method. At the bottom of this writeup we include limited information about the $L_2$ attack as well, but we warn readers that this content is quite technical and is not necessary for understanding future sections of the course.

Fig. 1
Source: (Andriushchenko et al., 2020)
The Square Attack Loop
The Square Attack works through a random sampling algorithm. First, the adversarial image $\hat{x}$ is initialized as the input image $x$, and the best loss so far $l^*$ is initialized as the loss function evaluated on $x$ and its label $y$. At each iteration, a square of pixels is randomly chosen and perturbed by $\delta$. If adding this square to $\hat{x}$ increases the loss, the change is kept. If not, the square is rejected. The size of the square is controlled by the variable $h$, which is gradually reduced over time so that the search makes progressively finer changes as it converges.
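To make the shrinking square size concrete, here is a minimal sketch of one possible schedule. The text above only says that the size is gradually reduced, so everything specific here is an assumption: the function name `side_length`, the starting fraction `p_init`, and the halving points, which are modeled on the piecewise-constant schedule described in the paper (the fraction of pixels to change is halved at fixed iteration counts, rescaled to the query budget).

```python
import math

# Assumed halving points for a 10,000-iteration budget (modeled on the paper).
HALVING_POINTS = (10, 50, 200, 500, 1000, 2000, 4000, 6000, 8000)


def side_length(i, n_iters, w, p_init=0.05):
    """Return the square side length h for iteration i on a w x w image.

    p_init is the (assumed) initial fraction of pixels covered by the square.
    """
    # Rescale the iteration index so the schedule fits any query budget.
    scaled = i / n_iters * 10000
    # Count how many halving points have been passed so far.
    halvings = sum(scaled > t for t in HALVING_POINTS)
    p = p_init / (2 ** halvings)
    # The square covers roughly a fraction p of the w * w pixels.
    h = round(math.sqrt(p * w * w))
    # Keep h a valid side length: at least 1 pixel, at most the image width.
    return min(max(h, 1), w)
```

With these assumed values, a 32 x 32 image starts with a square of side 7 and ends with single-pixel squares near the end of the budget.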
The algorithm for the attack loop from the paper is shown below. Although it is not necessary for you to understand everything now, we encourage you to try to parse each line and guess what is happening. This is good practice for future sections which use this kind of notation!
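| 1 | $\hat{x} \leftarrow \text{init}(x)$, $l^* \leftarrow L(f(x), y)$, $i \leftarrow 1$ |
| 2 | while $i < N$ and $\hat{x}$ is not adversarial do |
| 3 | $h^{(i)} \leftarrow$ side length of the square to modify (according to some schedule) |
| 4 | $\delta \sim P(\epsilon, h^{(i)}, w, c, \hat{x}, x)$ |
| 5 | $\hat{x}_{\text{new}} \leftarrow \text{Project } \hat{x} + \delta \text{ onto } \{z \in \mathbb{R}^d : \lVert z - x \rVert_p \le \epsilon\} \cap [0, 1]^d$ |
| 6 | $l_{\text{new}} \leftarrow L(f(\hat{x}_{\text{new}}), y)$ |
| 7 | if $l_{\text{new}} < l^*$ then $\hat{x} \leftarrow \hat{x}_{\text{new}}$, $l^* \leftarrow l_{\text{new}}$ |
| 8 | $i \leftarrow i + 1$ |
| 9 | end |

Here $N$ is the number of iterations, $w$ is the image width, $c$ is the number of color channels, and $\epsilon$ is the budget. Note that the paper defines $L$ as a margin-based loss that the attack minimizes, so line 7 accepts a candidate when $l_{\text{new}} < l^*$. Lowering this margin corresponds to increasing the model's classification loss.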
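The loop above can be sketched in PyTorch as follows. This is a minimal illustration for a single image, not the reference implementation. It uses a margin-style loss (correct-class logit minus the best other logit), which the attack lowers; this is equivalent to pushing the model toward misclassification. The names `margin_loss` and `square_attack_loop`, and the callables `side_length`, `sample_delta`, and `project`, are hypothetical placeholders for the schedule, sampling, and projection steps.

```python
import torch


def margin_loss(logits, y):
    """Correct-class logit minus the largest other logit (1-D logits).

    The value drops below zero once the image is misclassified.
    """
    others = logits.clone()
    others[y] = float("-inf")
    return (logits[y] - others.max()).item()


def square_attack_loop(model, x, y, eps, n_iters, side_length, sample_delta, project):
    """Random-search loop for one image x of shape (c, w, w) with integer label y.

    side_length(i) -> int, sample_delta(eps, h, x_adv) -> tensor shaped like x,
    and project(z, x, eps) -> tensor are assumed helper callables.
    """
    x_adv = x.clone()
    with torch.no_grad():  # black-box: the model is only queried, never differentiated
        best_loss = margin_loss(model(x_adv.unsqueeze(0))[0], y)
        i = 0
        # A margin below zero means another class now wins, so we can stop.
        while i < n_iters and best_loss >= 0:
            h = side_length(i)
            delta = sample_delta(eps, h, x_adv)
            x_new = project(x_adv + delta, x, eps)
            new_loss = margin_loss(model(x_new.unsqueeze(0))[0], y)
            # Keep the candidate only if it improves the attack objective.
            if new_loss < best_loss:
                x_adv, best_loss = x_new, new_loss
            i += 1
    return x_adv, best_loss
```

Each iteration costs exactly one model query, which is why the number of iterations doubles as the query budget.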
$L_\infty$ Square Attack
For the $L_\infty$ attack, the tensor $\delta$ from line 4 of the previous algorithm is generated by picking a random location for the square on the image. Then, for each color channel, the value of $\delta$ inside the square is chosen uniformly at random from the two values $-2\epsilon$ and $2\epsilon$, where $\epsilon$ is the budget.
More concretely, $\delta$ is visualized below for an example input image with 3 color channels. Note that although each color channel has the same square location, the value of the change at each color channel is different.

Fig. 2
Example $\delta$ for the $L_\infty$ square attack
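A perturbation like the one in the figure could be sampled along these lines. This is a sketch under assumptions: the function name `sample_linf_delta` and its argument order are our own, and each channel's value is drawn from $\{-2\epsilon, 2\epsilon\}$ as in the paper's $L_\infty$ sampling distribution.

```python
import torch


def sample_linf_delta(eps, h, c, w, generator=None):
    """Sample one square perturbation of shape (c, w, w).

    All entries are zero except an h x h square, which holds one constant
    value per color channel (either -2*eps or +2*eps).
    """
    delta = torch.zeros(c, w, w)
    # Top-left corner, chosen so the h x h square fits inside the image.
    r = int(torch.randint(0, w - h + 1, (1,), generator=generator))
    s = int(torch.randint(0, w - h + 1, (1,), generator=generator))
    # One independent random sign per channel, shaped (c, 1, 1) for broadcasting.
    signs = torch.randint(0, 2, (c, 1, 1), generator=generator) * 2 - 1
    delta[:, r:r + h, s:s + h] = 2 * eps * signs.float()
    return delta
```

Passing a seeded `torch.Generator` makes the sampled squares reproducible, which is convenient when debugging an attack.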
Next, this $\delta$ is added to the current adversarial tensor, and the result is projected so that all values lie between 0 and 1 and the adversarial image stays within the budget $\epsilon$ of the original image. For the $L_\infty$ attack, this "projection" is done by clipping the image (similar to what you saw in the FGSM/PGD section).
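The clipping step can be written in a few lines. This is a minimal sketch; the function name `project_linf` is our own choice for illustration.

```python
import torch


def project_linf(x_candidate, x_orig, eps):
    """Clip a candidate image into the L-infinity ball around x_orig and into [0, 1]."""
    # Keep every pixel within eps of the original image.
    x_candidate = torch.max(torch.min(x_candidate, x_orig + eps), x_orig - eps)
    # Keep every pixel a valid intensity.
    return torch.clamp(x_candidate, 0.0, 1.0)
```

For example, with `eps = 0.05`, a pixel whose original value is 0.5 and whose candidate value is 0.6 is clipped back to 0.55, while a candidate that falls below 0 or above 1 is additionally clipped to the valid range.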
BONUS: $L_2$ Square Attack
Once again, we note that the $L_2$ attack is quite technical and is not necessary for understanding future sections of this course. If you are interested, a brief writeup of the $L_2$ attack can be found below, with more details in the paper itself.