An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Last update: Jan 04, 2023

Overview

Neural Attention Distillation

This is an implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

NAD: Quick start with pretrained model

We have already uploaded the all2one pretrained backdoor student model(i.e. gridTrigger WRN-16-1, target label 5) and the clean teacher model(i.e. WRN-16-1) in the path of ./weight/s_net and ./weight/t_net respectively.

For evaluating the performance of NAD, you can easily run command:

$ python main.py

where the default parameters are shown in config.py.

The trained model will be saved at the path weight/erasing_net/.tar

Please carefully read the main.py and configs.py, then change the parameters for your experiment.

Erasing Results on BadNets

Dataset	Baseline ACC	Baseline ASR	NAD ACC	NAD ASR
CIFAR-10	85.65	100.0	82.12	3.57

Training your own backdoored model

We have provided a DatasetBD Class in data_loader.py for generating training set of different backdoor attacks.

For implementing backdoor attack(e.g. GridTrigger attack), you can run the below command:

$ python train_badnet.py

This command will train the backdoored model and print clean accuracies and attack rate. You can also select the other backdoor triggers reported in the paper.

Please carefully read the train_badnet.py and configs.py, then change the parameters for your experiment.

How to get teacher model?

we obtained the teacher model by finetuning all layers of the backdoored model using 5% clean data with data augmentation techniques. In our paper, we only finetuning the backdoored model for 5~10 epochs. Please check more details of our experimental settings in section 4.1 and Appendix A; The finetuning code is easy to get by just setting all the param beta = 0, which means the distillation loss to be zero in the training process.

Other source of backdoor attacks

Attack

CL: Clean-label backdoor attacks

SIG: A New Backdoor Attack in CNNS by Training Set Corruption Without Label Poisoning

Paper

Barni, M., Kallas, K., & Tondi, B. (2019). > A new Backdoor Attack in CNNs by training set corruption without label poisoning. > arXiv preprint arXiv:1902.11237 superimposed sinusoidal backdoor signal with default parameters """ alpha = 0.2 img = np.float32(img) pattern = np.zeros_like(img) m = pattern.shape[1] for i in range(img.shape[0]): for j in range(img.shape[1]): for k in range(img.shape[2]): pattern[i, j] = delta * np.sin(2 * np.pi * j * f / m) img = alpha * np.uint32(img) + (1 - alpha) * pattern img = np.uint8(np.clip(img, 0, 255)) # if debug: # cv2.imshow('planted image', img) # cv2.waitKey() return img ">

## reference code
def plant_sin_trigger(img, delta=20, f=6, debug=False):
    """
    Implement paper:
    > Barni, M., Kallas, K., & Tondi, B. (2019).
    > A new Backdoor Attack in CNNs by training set corruption without label poisoning.
    > arXiv preprint arXiv:1902.11237
    superimposed sinusoidal backdoor signal with default parameters
    """
    alpha = 0.2
    img = np.float32(img)
    pattern = np.zeros_like(img)
    m = pattern.shape[1]
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for k in range(img.shape[2]):
                pattern[i, j] = delta * np.sin(2 * np.pi * j * f / m)

    img = alpha * np.uint32(img) + (1 - alpha) * pattern
    img = np.uint8(np.clip(img, 0, 255))

    #     if debug:
    #         cv2.imshow('planted image', img)
    #         cv2.waitKey()

    return img

Refool: Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

Defense

MCR: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

Fine-tuning & Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks

Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks

STRIP: A Defence Against Trojan Attacks on Deep Neural Networks

Library

Note: TrojanZoo provides a universal pytorch platform to conduct security researches (especially backdoor attacks/defenses) of image classification in deep learning.

Backdoors 101 — is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.

References

If you find this code is useful for your research, please cite our paper

@inproceedings{li2021neural,
  title={Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks},
  author={Li, Yige and Lyu, Xixiang and Koren, Nodens and Lyu, Lingjuan and Li, Bo and Ma, Xingjun},
  booktitle={ICLR},
  year={2021}
}

Contacts

If you have any questions, leave a message below with GitHub.

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Related tags

Overview

Neural Attention Distillation

NAD: Quick start with pretrained model

Erasing Results on BadNets

Training your own backdoored model

How to get teacher model?

Other source of backdoor attacks

Attack

Defense

Library

References

Contacts

Owner

Yige-Li

Christmas face app for Decathlon xmas coding party!

A Human-in-the-Loop workflow for creating HD images from text

Regulatory Instruments for Fair Personalized Pricing.

NasirKhusraw - The TSP solved using genetic algorithm and show TSP path overlaid on a map of the Iran provinces & their capitals.

Implementation of the Paper: "Parameterized Hypercomplex Graph Neural Networks for Graph Classification" by Tuan Le, Marco Bertolini, Frank Noé and Djork-Arné Clevert

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples / ICLR 2018

Learning nonlinear operators via DeepONet

Distributional Sliced-Wasserstein distance code

Python implementation of ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images, AAAI2022.

Learning Confidence for Out-of-Distribution Detection in Neural Networks

PyTorch evaluation code for Delving Deep into the Generalization of Vision Transformers under Distribution Shifts.

Shape-Adaptive Selection and Measurement for Oriented Object Detection

Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

The official PyTorch implementation of the paper: Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." .

PyTorch implementation of DirectCLR from paper Understanding Dimensional Collapse in Contrastive Self-supervised Learning

Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch

Repo 4 basic seminar §How to make human machine readable"

PyElastica is the Python implementation of Elastica, an open-source software for the simulation of assemblies of slender, one-dimensional structures using Cosserat Rod theory.

A pytorch implementation of Paper "Improved Training of Wasserstein GANs"

Using contrastive learning and OpenAI's CLIP to find good embeddings for images with lossy transformations

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Related tags

Overview

Neural Attention Distillation

NAD: Quick start with pretrained model

Erasing Results on BadNets

Training your own backdoored model

How to get teacher model?

Other source of backdoor attacks

Attack

Defense

Library

References

Contacts

Owner

Yige-Li

Christmas face app for Decathlon xmas coding party!

A Human-in-the-Loop workflow for creating HD images from text

Regulatory Instruments for Fair Personalized Pricing.

NasirKhusraw - The TSP solved using genetic algorithm and show TSP path overlaid on a map of the Iran provinces & their capitals.

Implementation of the Paper: "Parameterized Hypercomplex Graph Neural Networks for Graph Classification" by Tuan Le, Marco Bertolini, Frank Noé and Djork-Arné Clevert

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples / ICLR 2018

Learning nonlinear operators via DeepONet

Distributional Sliced-Wasserstein distance code

Python implementation of ADD: Frequency Attention and Multi-View based Knowledge Distillation to Detect Low-Quality Compressed Deepfake Images, AAAI2022.

Learning Confidence for Out-of-Distribution Detection in Neural Networks

PyTorch evaluation code for Delving Deep into the Generalization of Vision Transformers under Distribution Shifts.

Shape-Adaptive Selection and Measurement for Oriented Object Detection

Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

The official PyTorch implementation of the paper: *Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." *.

PyTorch implementation of DirectCLR from paper Understanding Dimensional Collapse in Contrastive Self-supervised Learning

Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch

Repo 4 basic seminar §How to make human machine readable"

PyElastica is the Python implementation of Elastica, an open-source software for the simulation of assemblies of slender, one-dimensional structures using Cosserat Rod theory.

A pytorch implementation of Paper "Improved Training of Wasserstein GANs"

Using contrastive learning and OpenAI's CLIP to find good embeddings for images with lossy transformations

The official PyTorch implementation of the paper: Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." .