9.5.2 Using the FGSM Algorithm in CleverHans
Below we use MNIST as an example to show how to use the FGSM algorithm in CleverHans. The code is available at:
https://github.com/duoergun0729/adversarial_examples/blob/master/code/9-cleverhans-mnist-fgsm.ipynb
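First, load the required Python libraries; the deep learning framework used here is TensorFlow. CleverHans wraps its attack algorithms in cleverhans.attacks, and the model used to recognize MNIST digits is ModelBasicCNN.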
import logging
import numpy as np
import tensorflow as tf
from cleverhans.loss import CrossEntropy
from cleverhans.dataset import MNIST
from cleverhans.utils_tf import model_eval
from cleverhans.train import train
from cleverhans.attacks import FastGradientMethod
from cleverhans.utils import AccuracyReport, set_log_level
from cleverhans_tutorials.tutorial_models import ModelBasicCNN
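Define the global variables, including the number of training epochs, the batch size, the learning rate, and the number of convolution kernels in the CNN model.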
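# Define global variables
nb_epochs = 6
batch_size = 128
learning_rate = 0.001
clean_train = True
backprop_through_attack = False
nb_filters = 64
# Not in the original listing: names used by later snippets, with values
# assumed from the CleverHans MNIST tutorial defaults
train_start, train_end = 0, 60000
test_start, test_end = 0, 10000
label_smoothing = 0.1
testing = False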
Load the MNIST training and test sets, and read the image height, width, and number of channels from the data.
# Load the MNIST data
mnist = MNIST(train_start=train_start, train_end=train_end,
              test_start=test_start, test_end=test_end)
x_train, y_train = mnist.get_set('train')
x_test, y_test = mnist.get_set('test')
# Read the image parameters
img_rows, img_cols, nchannels = x_train.shape[1:4]
nb_classes = y_train.shape[1]
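To make the shape unpacking above concrete, here is a minimal NumPy sketch. It assumes the arrays returned by get_set are laid out as (samples, rows, columns, channels) with one-hot labels; x_demo, y_demo, and labels are hypothetical stand-ins, not data from the notebook.

```python
import numpy as np

# Hypothetical stand-ins for the arrays returned by get_set('train'):
# 5 grayscale 28x28 images in (N, H, W, C) layout and their one-hot labels.
x_demo = np.zeros((5, 28, 28, 1), dtype=np.float32)
labels = np.array([3, 0, 9, 1, 3])
y_demo = np.eye(10, dtype=np.float32)[labels]

# Same unpacking as in the text: skip the batch axis, keep H, W, C.
img_rows, img_cols, nchannels = x_demo.shape[1:4]
# The width of the one-hot label matrix is the number of classes.
nb_classes = y_demo.shape[1]

print(img_rows, img_cols, nchannels, nb_classes)  # prints: 28 28 1 10
```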
Define the model's input tensors and the training parameters.
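# Not in the original listing: session, report object, and random state used
# by later snippets (assumed, following the CleverHans MNIST tutorial)
report = AccuracyReport()
sess = tf.Session()
rng = np.random.RandomState([2017, 8, 30])
# Define the input TF placeholders
x = tf.placeholder(tf.float32, shape=(None, img_rows, img_cols,
                                      nchannels))
y = tf.placeholder(tf.float32, shape=(None, nb_classes))
# Parameters for training an MNIST model
train_params = {
    'nb_epochs': nb_epochs,
    'batch_size': batch_size,
    'learning_rate': learning_rate
}
# Not in the original listing: evaluation parameters read by do_eval (assumed)
eval_params = {'batch_size': batch_size}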
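Define the evaluation function, where preds is the tensor of predictions and y_set holds the true labels for the dataset x_set. In TensorFlow, the session holds the predefined computation graph; once x_set is fed in, preds yields the corresponding predictions.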
def do_eval(preds, x_set, y_set, report_key, is_adv=None):
    # Run the graph on x_set in batches and compare against y_set
    acc = model_eval(sess, x, y, preds, x_set, y_set, args=eval_params)
    setattr(report, report_key, acc)
    if is_adv is None:
        report_text = None
    elif is_adv:
        report_text = 'adversarial'
    else:
        report_text = 'legitimate'
    if report_text:
        print('test accuracy on %s examples: %0.4f' % (report_text, acc))
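The accuracy that do_eval reports can be illustrated without TensorFlow. The sketch below is not the CleverHans implementation of model_eval; it only assumes that accuracy means the fraction of samples whose arg-max prediction matches the one-hot label, computed batch by batch. batched_accuracy and the toy identity "model" are hypothetical names introduced for illustration.

```python
import numpy as np

def batched_accuracy(predict_fn, x_set, y_set, batch_size=128):
    """Fraction of samples whose arg-max prediction matches the one-hot label."""
    n_correct = 0
    for start in range(0, len(x_set), batch_size):
        x_batch = x_set[start:start + batch_size]
        y_batch = y_set[start:start + batch_size]
        logits = predict_fn(x_batch)
        hits = np.argmax(logits, axis=1) == np.argmax(y_batch, axis=1)
        n_correct += int(np.sum(hits))
    return n_correct / len(x_set)

# Toy check with a hypothetical "model" that returns its input as logits.
x_toy = np.eye(4, dtype=np.float32)   # 4 samples, 4 "classes"
y_toy = np.eye(4, dtype=np.float32)
y_toy[3] = [1, 0, 0, 0]               # make the last label disagree
acc = batched_accuracy(lambda batch: batch, x_toy, y_toy, batch_size=3)
print('test accuracy: %0.4f' % acc)   # prints: test accuracy: 0.7500
```

Processing the data in batches keeps memory use bounded; the last batch may be smaller than batch_size, which the slicing handles automatically.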
Train ModelBasicCNN on the training set, using cross-entropy as the loss function. After training finishes, validate the model on the test set.
model = ModelBasicCNN('model1', nb_classes, nb_filters)
preds = model.get_logits(x)
loss = CrossEntropy(model, smoothing=label_smoothing)
def evaluate():
    do_eval(preds, x_test, y_test, 'clean_train_clean_eval', False)
train(sess, loss, x_train, y_train, evaluate=evaluate,
      args=train_params, rng=rng, var_list=model.get_params())
# Compute the training error
if testing:
    do_eval(preds, x_train, y_train, 'train_clean_train_clean_eval')
After 6 epochs of training, the model reaches 99.29% accuracy on the test set.
test accuracy on legitimate examples: 0.9929
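The cross-entropy loss with a smoothing argument used in training can be sketched in NumPy as follows. This is one common form of label smoothing, which blends the one-hot labels with a uniform distribution, and is an assumption about what the loss does rather than a copy of the CleverHans code. The smoothing value 0.1 and the helper names smooth_labels and softmax_cross_entropy are illustrative.

```python
import numpy as np

def smooth_labels(y_onehot, smoothing):
    """Blend one-hot labels with the uniform distribution over classes."""
    nb_classes = y_onehot.shape[-1]
    return y_onehot - smoothing * (y_onehot - 1.0 / nb_classes)

def softmax_cross_entropy(logits, targets):
    """Mean cross-entropy between softmax(logits) and target distributions."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=1).mean())

y = np.eye(10)[[7]]                        # one sample of class 7
y_smooth = smooth_labels(y, smoothing=0.1)  # assumed smoothing value
logits = np.zeros((1, 10))                 # a model that is maximally unsure

print(y_smooth[0, 7], y_smooth[0, 0])
print(softmax_cross_entropy(logits, y_smooth))
```

With 10 classes and a smoothing value of 0.1, the true class receives a target of about 0.91 and every other class about 0.01, so the targets still sum to 1.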
Set the FGSM attack parameters and initialize a FastGradientMethod object. Then generate adversarial examples from the test set and use the trained ModelBasicCNN to classify them.
fgsm_params = {
    'eps': 0.3,
    'clip_min': 0.,
    'clip_max': 1.
}
# Initialize the FastGradientMethod object
fgsm = FastGradientMethod(model, sess=sess)
adv_x = fgsm.generate(x, **fgsm_params)
preds_adv = model.get_logits(adv_x)
# Evaluate the accuracy of the MNIST model on adversarial examples
do_eval(preds_adv, x_test, y_test, 'clean_train_adv_eval', True)
The results show that ModelBasicCNN correctly classifies only 14.32% of the adversarial examples.
test accuracy on adversarial examples: 0.1432
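To show what the three attack parameters do, here is a minimal NumPy sketch of a one-step FGSM under the L-infinity norm. It is not the CleverHans implementation: it assumes a linear softmax classifier with hypothetical, untrained weights W and b so that the input gradient can be written in closed form, and it uses the true labels when computing the loss.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def cross_entropy(x, y_onehot, W, b):
    """Mean cross-entropy of the linear softmax model on (x, y_onehot)."""
    return float(-(y_onehot * log_softmax(x @ W + b)).sum(axis=1).mean())

def fgsm(x, y_onehot, W, b, eps=0.3, clip_min=0.0, clip_max=1.0):
    """One-step FGSM: move each pixel by eps along the sign of the loss gradient.

    For logits = x @ W + b with cross-entropy loss, the gradient of the
    loss with respect to x is (softmax(logits) - y_onehot) @ W.T.
    """
    probs = np.exp(log_softmax(x @ W + b))
    grad_x = (probs - y_onehot) @ W.T
    x_adv = x + eps * np.sign(grad_x)
    # Keep the result inside the valid pixel range.
    return np.clip(x_adv, clip_min, clip_max)

rng = np.random.RandomState(0)
W = rng.randn(784, 10) * 0.1              # hypothetical, untrained weights
b = np.zeros(10)
x = rng.uniform(0.0, 1.0, size=(4, 784))  # 4 flattened 28x28 "images"
y = np.eye(10)[[1, 2, 3, 4]]

x_adv = fgsm(x, y, W, b)
print('max perturbation:', np.abs(x_adv - x).max())
print('loss before/after:', cross_entropy(x, y, W, b), cross_entropy(x_adv, y, W, b))
```

Here eps bounds how far any single pixel can move, while clip_min and clip_max keep the adversarial image in the same [0, 1] range as the input. Note that the listing above calls generate without labels; to my understanding, CleverHans then uses the model's own predictions as labels, which differs from this sketch.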