AutoAttack and RobustBench

This section has a series of coding problems with PyTorch. To run the code locally, you can follow the installation instructions at the bottom of this page. As always, we highly recommend you read all the content on this page before starting the coding exercises.

Background

High-quality research on adversarial robustness requires an effective way to measure attack and defense quality. Under one attack, a model may retain its performance, while under another, it may break entirely. This point cannot be overemphasized: a model may be highly robust to one common attack while giving nearly 0% accuracy against another.

While this may seem like an obvious problem, many papers have been published in credible conferences that show high robustness, but under a more comprehensive benchmark, their performance slips. To address this issue and other problems with evaluating robustness, Francesco Croce and Matthias Hein proposed AutoAttack (Croce & Hein, 2020). Croce and Hein applied AutoAttack to published defenses and found that the robust accuracy dropped by more than 10% in 13 cases. This illustrates both the difficulty in evaluating one's own defenses and in comparing the effectiveness of defenses across papers. In the AutoAttack paper, the authors lament:

Due to the many broken defenses, the field is currently in a state where it is very difficult to judge the value of a new defense without an independent test. This limits the progress as it is not clear how to distinguish bad from good ideas.

The following year, RobustBench (Croce et al., 2021) followed up AutoAttack, to make evaluation and comparison of defenses more accessible for the research community. RobustBench uses AutoAttack to evaluate defenses and hosts a public leaderboard to track the research community's progress.

In this section, you will learn how to use RobustBench to evaluate published defenses. In the next section, you will learn about how researchers at Lawrence Livermore National Laboratory achieved state-of-the-art performance on RobustBench by scaling up compute and data.

Robust Bench Rules:

To use RobustBench correctly, you will need to be aware of the restrictions of the benchmark. In general, any model is fair game as long as it follows the requirements that the authors lay out in the original paper:

Models submitted must "have in general non-zero gradients with respect to the inputs"

For example, if you preprocess an image by rounding down all values to the nearest tenth (i.e., torch.floor(tensor * 10) / 10), the partial derivative of the loss with respect to a pixels in the image will always be 0. This is not allowed because it makes attacks that rely on backpropagation to find the gradient obsolete.
Models submitted must "have a fully deterministic forward pass."

Doing a random zoom, crop, or other transformation to an image before sending it through the model can be an effective defense, but RobustBench does not allow it because it makes benchmarking difficult and makes common attacks less effective.
Models submitted must "not have an optimization loop in the forward pass."

This is because even if there are non-zero gradients through the forward pass, the backward pass will be very expensive to calculate.

Installation

The best way to ensure that you will have the latest RobustBench features is to run the command below.

pip install git+https://github.com/RobustBench/robustbench.git

You will also want to install AutoAttack:

pip install -q git+https://github.com/fra31/auto-attack

After installation, you should be ready to go with the exercises!

Give feedback on this section

NextState of the Art

References

Croce, F., Andriushchenko, M., Sehwag, V., Debenedetti, E., Flammarion, N., Chiang, M., Mittal, P., & Hein, M. (2021). RobustBench: a standardized adversarial robustness benchmark. Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track. https://openreview.net/forum?id=SSKZPJCt7B

Croce, F., & Hein, M. (2020). Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. ICML.

AutoAttack and RobustBench

Background

Robust Bench Rules:

Installation

References

Contents