Hi @LGB7, thank you very much for the first contribution to ART Discussions! I missed it for a few days, but I'll take a closer look today.
Hello,
I have been working on a project that involves crafting adversarial examples and then testing, on those crafted examples, the accuracy of the pretrained model and that of the ART PyTorchClassifier that wraps it.
Before crafting the examples, the pretrained model and the PyTorchClassifier give the same accuracy on the original normalized CIFAR-10 test set. However, after crafting examples with the classifier and any of the ART attacks, and testing them again on both the classifier and the model, the two accuracies differ, and some results are even illogical, such as accuracies greater than 100% (see the code below).
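For reference, the clean-data check is essentially the following (a minimal sketch using the variable names from the code below; it assumes x_test_norm and y_test are NumPy arrays with one-hot labels):
```python
import numpy as np
import torch

# Sanity check on the clean test set: both paths agree here
preds_art = np.argmax(classifier.predict(x_test_norm), axis=1)
acc_art = np.mean(preds_art == np.argmax(y_test, axis=1))

with torch.no_grad():
    logits = model(torch.from_numpy(x_test_norm).to(device))
preds_torch = logits.argmax(dim=1).cpu().numpy()
acc_torch = np.mean(preds_torch == np.argmax(y_test, axis=1))
# acc_art == acc_torch on the unperturbed test set
```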
Here are samples of the code used. I am using PyTorch.
The pretrained model is VGG11, and the classifier is imported from art.estimators.classification. The loss function is torch.nn.HingeEmbeddingLoss() and the optimizer is Adam (torch.optim.Adam).
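For context, the surrounding setup looks roughly like this (a sketch: the learning rate and the way min_pixel_value / max_pixel_value are computed are assumptions; only the components named above are actually from my code):
```python
import torch
from art.estimators.classification import PyTorchClassifier

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# model is the pretrained VGG11 for CIFAR-10, loaded elsewhere
model = model.to(device).eval()

criterion = torch.nn.HingeEmbeddingLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is a placeholder

# Assumption: clip values are taken from the normalized data
min_pixel_value = float(x_test_norm.min())
max_pixel_value = float(x_test_norm.max())
```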
```python
classifier = PyTorchClassifier(
    model=model,
    clip_values=(min_pixel_value, max_pixel_value),
    loss=criterion,
    optimizer=optimizer,
    input_shape=(3, 32, 32),
    nb_classes=10,
)
```
1- For crafting the adversarial examples:
```python
from art.attacks.evasion import DeepFool

# Craft the adversarial examples with DeepFool
logger.info("Create DeepFool attack")
adv_crafter = DeepFool(classifier, max_iter=10, verbose=True)

logger.info("Craft attack test examples")
x_test_adv = adv_crafter.generate(x_test_norm)

# Convert back to torch tensors for the evaluation below
x_test_adv = torch.from_numpy(x_test_adv)
y_test_adv = torch.from_numpy(y_test)
```
Here "x_test_norm" is the normalized test split of the CIFAR-10 dataset, "y_test" holds the (one-hot) labels, and the DeepFool attack is imported from art.attacks.evasion.
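As a quick sanity check at this point (the expected values in the comments are what I assume they should be, given the setup above):
```python
# Diagnostic sketch: verify tensor shapes/dtypes before evaluation
print(x_test_adv.shape, x_test_adv.dtype)  # expected: torch.Size([N, 3, 32, 32]), torch.float32
print(y_test_adv.shape, y_test_adv.dtype)  # one-hot labels: torch.Size([N, 10])
```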
2- Then, for testing and getting the accuracy results:
```python
# Evaluate the classifier on the adversarial samples
preds = np.argmax(classifier.predict(x_test_adv), axis=1)
acc = np.sum(preds == np.argmax(y_test, axis=1)) / y_test.shape[0]
logger.info("Classifier before adversarial training")
logger.info("Accuracy on adversarial samples: %.2f%%", (acc * 100))

testset_adv = torch.utils.data.TensorDataset(x_test_adv, y_test_adv)
testloader_adv = torch.utils.data.DataLoader(testset_adv, batch_size=10000,
                                             shuffle=False, num_workers=2)

# Evaluate the original model
correct = 0
total = 0
with torch.no_grad():
    for batch_idx, (inputs, targets) in enumerate(testloader_adv):
        inputs, targets = inputs.to(device), targets.to(device)
        outputs = model(inputs)
        targets = targets.permute(1, 0)
        _, predicted = outputs.max(1)
        total += targets.size(1)
        # total += targets.size(0)
        correct += predicted.eq(targets).sum().item()
print('Acc: ', 100. * correct / total, '%')
```
Has anyone encountered a similar problem and can help?
Thank you!