@@ -162,7 +161,7 @@ The LeNet model is used as an example. You can also create and train your own mo
x = self.conv2(x)
x = self.relu(x)
x = self.max_pool2d(x)
x = self.flatten(x)
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
...
...
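For context, the full `construct` method of the LeNet example can be sketched as follows. This is a minimal sketch assuming the standard LeNet-5 layer sizes (6 and 16 convolution channels, 120/84/10 dense units); the actual tutorial model may differ slightly, but it shows where `nn.Flatten` replaces the manual `reshape` to `(-1, 16*5*5)`.

```python
import mindspore.nn as nn

class LeNet5(nn.Cell):
    """Minimal LeNet-5 sketch; layer sizes are assumed, not copied from the tutorial."""
    def __init__(self, num_classes=10):
        super(LeNet5, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5, pad_mode='valid')
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')
        self.relu = nn.ReLU()
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()            # replaces x = self.reshape(x, (-1, 16*5*5))
        self.fc1 = nn.Dense(16 * 5 * 5, 120)
        self.fc2 = nn.Dense(120, 84)
        self.fc3 = nn.Dense(84, num_classes)

    def construct(self, x):
        x = self.max_pool2d(self.relu(self.conv1(x)))
        x = self.max_pool2d(self.relu(self.conv2(x)))
        x = self.flatten(x)                    # (N, 16, 5, 5) -> (N, 400)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x
```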
@@ -258,17 +257,17 @@ LOGGER.info(TAG, 'The average costing time is %s',
The attack results are as follows:
```
prediction accuracy after attacking is : 0.052083
mis-classification rate of adversaries is : 0.947917
The average confidence of adversarial class is : 0.803375
The average confidence of true class is : 0.042139
The average distance (l0, l2, linf) between original samples and adversarial samples are: (1.698870, 0.465888, 0.300000)
The average structural similarity between original samples and adversarial samples are: 0.332538
The average costing time is 0.003125
```
After the untargeted FGSM attack is performed on the model, the accuracy of the model on adversarial examples decreases from 98.9% to 5.2%, while the misclassification ratio reaches 95%. The Average Confidence of Adversarial Class (ACAC) is 0.803375 and the Average Confidence of True Class (ACTC) is 0.042139. The zero-norm, two-norm, and infinity-norm distances between the generated adversarial examples and the original benign examples are also reported. The average structural similarity between each adversarial example and its original example is 0.332538, and it takes 0.003125 s on average to generate one adversarial example.
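These metrics come from MindArmour's attack evaluation utilities. Below is a minimal sketch of how the untargeted FGSM attack and its evaluation are typically wired together; the module paths follow recent MindArmour releases (older versions expose `mindarmour.attacks` and `mindarmour.evaluations`), and names such as `net`, `test_images`, and `test_labels` are assumptions rather than excerpts from the tutorial.

```python
import mindspore.nn as nn
from mindspore import Tensor
from mindarmour.adv_robustness.attacks import FastGradientSignMethod
from mindarmour.adv_robustness.evaluations import AttackEvaluate

# Assumed to already exist: the trained `net`, plus `test_images` (NCHW numpy
# array in [0, 1]) and `test_labels` (one-hot numpy array) from the test set.
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=False)
attack = FastGradientSignMethod(net, eps=0.3, loss_fn=loss)   # untargeted FGSM
adv_data = attack.generate(test_images, test_labels)

# Model predictions (probabilities) on the adversarial examples.
adv_probs = nn.Softmax()(net(Tensor(adv_data))).asnumpy()

# AttackEvaluate expects HWC images, hence the transposes from NCHW.
evaluator = AttackEvaluate(test_images.transpose(0, 2, 3, 1), test_labels,
                           adv_data.transpose(0, 2, 3, 1), adv_probs)
print('mis-classification rate:', evaluator.mis_classification_rate())
print('ACAC:', evaluator.avg_conf_adv_class())
print('ACTC:', evaluator.avg_conf_true_class())
print('average (l0, l2, linf) distance:', evaluator.avg_lp_distance())
print('average SSIM:', evaluator.avg_ssim())
```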
The following figure shows the effect before and after the attack. The left part shows the original examples, and the right part shows the adversarial examples generated by the untargeted FGSM attack. Visually, there is little difference between the images on the right and those on the left, but all images on the right successfully mislead the model into predicting incorrect categories.
...
...
@@ -341,15 +340,15 @@ LOGGER.info(TAG, 'The average distance (l0, l2, linf) between original '
accuracy of TEST data on defensed model is : 0.974259
accuracy of adv data on defensed model is : 0.856370
defense mis-classification rate of adversaries is : 0.143629
The average confidence of adversarial class is : 0.616670
The average confidence of true class is : 0.177374
The average distance (l0, l2, linf) between original samples and adversarial samples are: (1.493417, 0.432914, 0.300000)
```
After NAD is used to defend against adversarial examples, the model's misclassification ratio on adversarial examples decreases from 95% to 14%, which effectively defends against the attack. In addition, the classification accuracy of the model on the original test dataset reaches 97%, so the NAD function does not reduce the classification accuracy of the model.
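For reference, a minimal sketch of the NAD step that produces these numbers is given below. It assumes the trained `net` together with `train_images`/`train_labels` (numpy arrays with one-hot labels) from the training phase, and uses the recent `mindarmour.adv_robustness.defenses` module path (older releases expose `mindarmour.defenses`).

```python
import mindspore.nn as nn
from mindarmour.adv_robustness.defenses import NaturalAdversarialDefense

# Assumed to already exist: the trained `net`, plus `train_images` (NCHW numpy
# array in [0, 1]) and `train_labels` (one-hot numpy array).
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=False)
opt = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)

# NAD fine-tunes the network on FGSM-perturbed copies of the training data.
nad = NaturalAdversarialDefense(net, loss_fn=loss, optimizer=opt,
                                bounds=(0.0, 1.0), eps=0.3)
nad.defense(train_images, train_labels)   # one adversarial fine-tuning pass
```

After this fine-tuning pass, re-running the FGSM attack and the evaluation step sketched earlier yields the defensed-model figures reported in the log above.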