@@ -175,45 +175,53 @@ The LeNet model is used as an example. You can also create and train your own mo
...
@@ -175,45 +175,53 @@ The LeNet model is used as an example. You can also create and train your own mo
        return x
        return x
```
```
2. Load the pre-trained LeNet model. You can also train and save your own MNIST model. For details, see Quick Start. Use the defined data loading function `generate_mnist_dataset` to load data.
2. Train the LeNet model. Use the defined data loading function `generate_mnist_dataset` to load data.
LOGGER.info(TAG, 'mis-classification rate of adversaries is : %s',
LOGGER.info(TAG, 'mis-classification rate of adversaries is : %s',
            attack_evaluate.mis_classification_rate())
            attack_evaluate.mis_classification_rate())
LOGGER.info(TAG, 'The average confidence of adversarial class is : %s',
LOGGER.info(TAG, 'The average confidence of adversarial class is : %s',
...
@@ -256,8 +269,6 @@ LOGGER.info(TAG, 'The average distance (l0, l2, linf) between original '
...
@@ -256,8 +269,6 @@ LOGGER.info(TAG, 'The average distance (l0, l2, linf) between original '
LOGGER.info(TAG, 'The average structural similarity between original '
LOGGER.info(TAG, 'The average structural similarity between original '
            'samples and adversarial samples are: %s',
            'samples and adversarial samples are: %s',
            attack_evaluate.avg_ssim())
            attack_evaluate.avg_ssim())
LOGGER.info(TAG, 'The average costing time is %s',
            (stop_time - start_time) / (batch_num * batch_size))
```
```
The attack results are as follows:
The attack results are as follows:
...
@@ -269,7 +280,6 @@ The average confidence of adversarial class is : 0.803375
...
@@ -269,7 +280,6 @@ The average confidence of adversarial class is : 0.803375
The average confidence of true class is : 0.042139
The average confidence of true class is : 0.042139
The average distance (l0, l2, linf) between original samples and adversarial samples are: (1.698870, 0.465888, 0.300000)
The average distance (l0, l2, linf) between original samples and adversarial samples are: (1.698870, 0.465888, 0.300000)
The average structural similarity between original samples and adversarial samples are: 0.332538
The average structural similarity between original samples and adversarial samples are: 0.332538
The average costing time is 0.003125
```
```
After the untargeted FGSM attack, the model's accuracy on adversarial examples drops from 98.9% to 5.2%, and the misclassification ratio reaches 95%. The Average Confidence of Adversarial Class (ACAC) is 0.803375, and the Average Confidence of True Class (ACTC) is 0.042139. The output also reports the zero-norm, two-norm, and infinity-norm distances between the generated adversarial examples and the original benign examples. The average structural similarity between each adversarial example and its original example is 0.332538. Generating one adversarial example takes 0.003125 s on average.
After the untargeted FGSM attack, the model's accuracy on adversarial examples drops from 98.9% to 5.2%, and the misclassification ratio reaches 95%. The Average Confidence of Adversarial Class (ACAC) is 0.803375, and the Average Confidence of True Class (ACTC) is 0.042139. The output also reports the zero-norm, two-norm, and infinity-norm distances between the generated adversarial examples and the original benign examples. The average structural similarity between each adversarial example and its original example is 0.332538. Generating one adversarial example takes 0.003125 s on average.
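To make the reported quantities concrete, the following is a minimal NumPy sketch of how such attack metrics could be computed. It is not the `AttackEvaluate` implementation: the function name `attack_metrics` and its arguments are hypothetical, ACAC and ACTC are assumed to be averaged over successfully attacked samples only, and the distances are raw per-sample norms, so the numbers it produces may be scaled differently from the library output shown above.

```python
import numpy as np


def attack_metrics(clean, adv, probs_adv, true_labels):
    """Illustrative attack metrics (hypothetical helper, not a library API).

    clean, adv   : arrays of shape (N, ...) with original / adversarial inputs
    probs_adv    : array of shape (N, C), model probabilities on adv
    true_labels  : integer array of shape (N,)
    """
    n = len(true_labels)
    pred = probs_adv.argmax(axis=1)
    fooled = pred != true_labels          # samples the attack misclassified

    if fooled.any():
        # Assumption: confidences are averaged over fooled samples only.
        acac = probs_adv[fooled, pred[fooled]].mean()
        actc = probs_adv[fooled, true_labels[fooled]].mean()
    else:
        acac = actc = float("nan")

    # Raw per-sample perturbation norms, averaged over all samples.
    diff = (adv - clean).reshape(n, -1)
    return {
        "mis_classification_rate": fooled.mean(),
        "acac": acac,
        "actc": actc,
        "l0": np.count_nonzero(diff, axis=1).mean(),
        "l2": np.linalg.norm(diff, axis=1).mean(),
        "linf": np.abs(diff).max(axis=1).mean(),
    }
```

With a perturbation budget of 0.3, as in the results above, the `linf` value of this sketch is bounded by 0.3, which matches the role of the third entry in the reported distance tuple.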
...
@@ -287,60 +297,55 @@ Natural Adversarial Defense (NAD) is a simple and effective adversarial example
...
@@ -287,60 +297,55 @@ Natural Adversarial Defense (NAD) is a simple and effective adversarial example
LOGGER.info(TAG, 'accuracy of adv data on defensed model is : %s',
LOGGER.info(TAG, 'accuracy of adv data on defensed model is : %s',
            np.mean(acc_list))
            np.mean(accuracy_adv))
LOGGER.info(TAG, 'defense mis-classification rate of adversaries is : %s',
LOGGER.info(TAG, 'defense mis-classification rate of adversaries is : %s',
            attack_evaluate.mis_classification_rate())
            attack_evaluate.mis_classification_rate())
LOGGER.info(TAG, 'The average confidence of adversarial class is : %s',
LOGGER.info(TAG, 'The average confidence of adversarial class is : %s',
            attack_evaluate.avg_conf_adv_class())
            attack_evaluate.avg_conf_adv_class())
LOGGER.info(TAG, 'The average confidence of true class is : %s',
LOGGER.info(TAG, 'The average confidence of true class is : %s',
            attack_evaluate.avg_conf_true_class())
            attack_evaluate.avg_conf_true_class())
LOGGER.info(TAG, 'The average distance (l0, l2, linf) between original '
            'samples and adversarial samples are: %s',
            attack_evaluate.avg_lp_distance())
```
```
### Defense Effect
### Defense Effect
...
@@ -351,9 +356,7 @@ accuracy of adv data on defensed model is : 0.856370
...
@@ -351,9 +356,7 @@ accuracy of adv data on defensed model is : 0.856370
defense mis-classification rate of adversaries is : 0.143629
defense mis-classification rate of adversaries is : 0.143629
The average confidence of adversarial class is : 0.616670
The average confidence of adversarial class is : 0.616670
The average confidence of true class is : 0.177374
The average confidence of true class is : 0.177374
The average distance (l0, l2, linf) between original samples and adversarial samples are: (1.493417, 0.432914, 0.300000)
```
```
After NAD is applied, the model's misclassification ratio on adversarial examples decreases from 95% to 14%, which shows that the defense is effective. In addition, the model's classification accuracy on the original test dataset reaches 97%. The NAD function does not reduce the classification accuracy of the model.
After NAD is applied, the model's misclassification ratio on adversarial examples decreases from 95% to 14%, which shows that the defense is effective. In addition, the model's classification accuracy on the original test dataset reaches 97%.
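The idea behind this kind of defense can be illustrated with a toy example. The sketch below assumes that NAD follows the usual adversarial-training recipe (generate FGSM examples against the current model, then train on clean and adversarial samples together); this excerpt does not spell out the exact procedure. It uses a NumPy logistic regression rather than LeNet, and all function names and hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_grads(w, b, x, y):
    """Binary cross-entropy of a logistic regression, with gradients."""
    p = sigmoid(x @ w + b)
    tiny = 1e-12                          # avoids log(0)
    loss = -np.mean(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    err = p - y                           # dLoss/dz per sample (up to 1/N)
    grad_w = x.T @ err / len(y)
    grad_b = err.mean()
    grad_x = np.outer(err, w)             # per-sample gradient w.r.t. input
    return loss, grad_w, grad_b, grad_x


def fgsm(w, b, x, y, eps):
    """Untargeted FGSM: step of size eps along the sign of the input gradient."""
    grad_x = loss_and_grads(w, b, x, y)[3]
    return x + eps * np.sign(grad_x)


def adversarial_train(x, y, eps=0.3, lr=0.5, epochs=200, seed=0):
    """Toy adversarial training on a mix of clean and FGSM samples."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=x.shape[1])
    b = 0.0
    for _ in range(epochs):
        x_adv = fgsm(w, b, x, y, eps)     # attack the current model
        x_mix = np.concatenate([x, x_adv])
        y_mix = np.concatenate([y, y])
        _, grad_w, grad_b, _ = loss_and_grads(w, b, x_mix, y_mix)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Because the model is retrained against perturbations of the same budget `eps` that the attack uses, its decision boundary is pushed away from the training points, which is the effect the drop in misclassification ratio above reflects.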