Homework 2: Advanced Attacks
- Due May 8, 2024 by 11:59pm
- Points 10
- Submitting a file upload
- Available until May 11, 2024 at 11:59pm
In Homework 1, we generated untargeted adversarial examples to attack models designed for the CIFAR-100 classification task. For Homework 2, we will continue utilizing the same dataset to delve into more interesting and sophisticated attack strategies. Homework 2 is divided into two sections: the "targeted" attack and the "universal" attack.
[Targeted attacks (1pt)]
Following Homework 1, your task is to launch targeted attacks on the same 500 images. Perturb these 500 images so that the victim model classifies all of them as "apple" (label 0). The attack budget, epsilon, is 4/255. Each pixel value must remain between 0 and 255 after perturbation so that the images remain valid. The victim model is an ensemble of five models: ["resnet110_cifar100", "preresnet164bn_cifar100", "seresnet110_cifar100", "densenet40_k36_bc_cifar100", "diaresnet164bn_cifar100"]. Save your 500 perturbed images in a folder named adv_imgs/. No defense will be applied during the TA's evaluation.
- Ensemble: average the probabilities of the five models (after softmax).
- Evaluation: (number of images with pred == 0) / 500. The higher, the better.
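The budget constraint, the ensemble rule, and the evaluation metric above can be sketched as follows. This is a minimal, dependency-free illustration and not the TA's evaluation script: the function names and the flat-list image representation are hypothetical, and real code would work on model outputs in a tensor library. On the 0-255 scale, a budget of 4/255 corresponds to 4 intensity levels per pixel.

```python
import math

TARGET_LABEL = 0  # "apple", per the task description


def project_to_budget(original, adversarial, eps=4):
    """Clip each adversarial pixel to within eps levels of the original,
    then to the valid 8-bit range [0, 255]."""
    out = []
    for o, a in zip(original, adversarial):
        a = min(max(a, o - eps), o + eps)
        out.append(min(max(int(round(a)), 0), 255))
    return out


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(per_model_logits):
    """Average the post-softmax probabilities of several models for one image."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_models for c in range(n_classes)]


def predict(per_model_logits):
    """Return the class index with the highest averaged probability."""
    p = ensemble_probs(per_model_logits)
    return max(range(len(p)), key=p.__getitem__)


def targeted_success_rate(predictions, target=TARGET_LABEL):
    """Fraction of images whose ensemble prediction equals the target label."""
    return sum(1 for p in predictions if p == target) / len(predictions)
```

Averaging probabilities rather than logits means a single very confident model cannot dominate the ensemble by logit magnitude alone, which is worth keeping in mind when choosing the attack loss.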
[Universal attacks (4pts)]
The goal of a universal attack is to use a single perturbation to attack all images. You may use the untargeted attack setting for this task. Given the increased difficulty of this attack, a larger attack budget of 12/255 is allowed. To save computing time, the dataset is narrowed to the images named *_0.png and *_1.png, i.e., the first two images of each of the 100 classes (200 images in total). For the evaluation, the TA will use the "resnet20_cifar100" model, as detailed at https://github.com/osmr/imgclsmob. No defense will be applied during evaluation. Save your universal perturbation as a PNG file named universal.png. The perturbation array is defined in the 0-1 domain, and the pixel values of your saved PNG lie in [0/255, 24/255].
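One encoding that matches the stated PNG range can be sketched as follows. It rests on an assumption that this page does not confirm: a perturbation in [-12/255, 12/255] is shifted by +12/255 before being written as 8-bit pixel values, so the stored levels fall in 0..24. The helper names are hypothetical, images are represented as flat lists for brevity, and the TA's actual decoding convention should be confirmed before submitting.

```python
EPS_LEVELS = 12  # budget of 12/255 expressed in 8-bit intensity levels


def in_universal_subset(filename):
    """True for the images named *_0.png and *_1.png."""
    return filename.endswith(("_0.png", "_1.png"))


def encode_perturbation(delta):
    """Map a perturbation in the 0-1 domain to 8-bit levels in [0, 24].

    Assumed offset encoding: level = round(delta * 255) + 12.
    """
    encoded = []
    for d in delta:
        level = int(round(d * 255)) + EPS_LEVELS
        encoded.append(min(max(level, 0), 2 * EPS_LEVELS))
    return encoded


def decode_perturbation(pixels):
    """Invert the assumed offset encoding back to the 0-1 domain."""
    return [(p - EPS_LEVELS) / 255 for p in pixels]


def apply_universal(image, delta):
    """Add the same perturbation to an image and clip to the valid [0, 1] range."""
    return [min(max(x + d, 0.0), 1.0) for x, d in zip(image, delta)]
```

Because the stored levels are integers, the perturbation is quantized to multiples of 1/255 when saved, so it is safer to evaluate the decoded perturbation rather than the floating-point one used during optimization.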
You need to write a report describing your methods. For example, you can discuss why you chose certain methods and any internal experiments you ran. Please write it in LaTeX using the NeurIPS conference template (https://media.neurips.cc/Conferences/NeurIPS2024/Styles.zip). The report may be at most 4 pages long, excluding references (please cite any work that you used in this homework). Please use the \usepackage[preprint]{neurips_2024} option. A recommended structure is: Abstract, Introduction, Methodology, Experiment, Analysis, and Conclusion. You may add other sections.
Submission format:
Put everything in a folder named "hw2_(your_student_id)":
- Put the report at the top level of the folder, named "hw2_(your_student_id).pdf".
- Put your code in a sub-folder named "src". Please include a README.txt file there.
- Put your universal perturbation at the top level of the folder as a PNG file named "universal.png".
- Put your 500 generated adversarial images in a folder named "adv_imgs", and use the same file names as those in the evaluation set that you downloaded, i.e., i_j.png is the adversarial counterpart of i_j.png in the original benign evaluation set.
So, your folder will look like this:
hw2_(your_student_id)
| - hw2_(your_student_id).pdf
| - src/
| - universal.png
| - adv_imgs/
|   | - 0_0.png
|   ...
|   | - 99_4.png
Then compress this folder into hw2_(your_student_id).zip
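The layout and compression step above can be checked with a short script. This is a sketch using only the Python standard library; the function names are hypothetical, and the list of expected items is taken from the submission format described above.

```python
import shutil
from pathlib import Path


def missing_items(folder):
    """Return the expected submission items that are absent from `folder`."""
    folder = Path(folder)
    expected = [
        folder / f"{folder.name}.pdf",
        folder / "src" / "README.txt",
        folder / "universal.png",
        folder / "adv_imgs",
    ]
    return [p.relative_to(folder).as_posix() for p in expected if not p.exists()]


def count_adv_images(folder):
    """Count files named like i_j.png inside adv_imgs/."""
    return len(list((Path(folder) / "adv_imgs").glob("*_*.png")))


def make_submission_zip(folder):
    """Create <folder>.zip next to the folder, keeping the folder as the top-level entry."""
    folder = Path(folder)
    archive = shutil.make_archive(
        str(folder), "zip", root_dir=folder.parent, base_dir=folder.name
    )
    return Path(archive)
```

For a complete submission, `missing_items` should return an empty list and `count_adv_images` should return 500 before `make_submission_zip` is called.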
Grading policy:
5% on the performance (1% for the targeted attack and 4% for the universal attack)
5% on the report (novelty, clarity, evaluation completeness)
Late submission:
The late submission policy is: final score = raw score * max(1 - 0.2*n, 0),
where n is the number of days late, rounded up to the nearest integer.
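As a worked example of the formula, the following sketch computes the penalized score. The function name is hypothetical, and it assumes that any fraction of a day counts as a full day, following the "rounded up" wording.

```python
import math


def late_score(raw_score, days_late):
    """Apply raw_score * max(1 - 0.2 * n, 0), with n = days late rounded up."""
    n = max(0, math.ceil(days_late))
    return raw_score * max(1 - 0.2 * n, 0)
```

Under this reading, a submission half a day late keeps 80% of its raw score, and one that is five or more days late receives zero.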
Rubric
Criteria | Pts
---|---
Report | 5
Performance (targeted) | 1
Performance (universal) | 4
Total | 10