This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

Overview

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video]

Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang

CVPR 2021

This repository is a PyTorch re-implementation of TransGAN: Two Transformers Can Make One Strong GAN, and That Can Scale Up (CVPR 2021).

The Generative Adversarial Network (GAN) here is built completely free of convolutions. Instead it uses Transformer architectures, which have become popular in computer vision since the Vision Transformer (ViT). This implementation uses the CIFAR-10 dataset.

Generated samples at epochs 0, 40, 100, and 200 (images not included).

Related Work - Vision Transformers (ViT)

This implementation uses Vision Transformer (ViT) blocks for the discriminator. For more information about ViT, see the original paper (Dosovitskiy et al., 2020), listed under Citation.

Credits for illustration of ViT: @lucidrains
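
To make the idea of a convolution-free, ViT-style discriminator more concrete, here is a minimal sketch. It is not the code from this repository: the class name TinyViTDiscriminator, the patch size of 4, the embedding width, and the depth are all assumptions chosen for illustration, and it relies on PyTorch's built-in nn.TransformerEncoderLayer rather than a custom block.

```python
import torch
import torch.nn as nn


class TinyViTDiscriminator(nn.Module):
    """Hypothetical ViT-style critic: image -> patch tokens -> encoder -> score."""

    def __init__(self, image_size=32, patch_size=4, channels=3,
                 dim=64, depth=2, heads=4):
        super().__init__()
        assert image_size % patch_size == 0, "image must split evenly into patches"
        self.patch_size = patch_size
        num_patches = (image_size // patch_size) ** 2
        patch_dim = channels * patch_size * patch_size

        # Linear patch embedding (no convolution involved).
        self.to_tokens = nn.Linear(patch_dim, dim)
        # Learned class token and position embedding, zero-initialised for brevity.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))

        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, 1)

    def forward(self, images):
        b, c, h, w = images.shape
        p = self.patch_size
        # Cut the image into non-overlapping p x p patches and flatten each one:
        # (b, c, h, w) -> (b, num_patches, c * p * p)
        patches = (
            images.reshape(b, c, h // p, p, w // p, p)
            .permute(0, 2, 4, 1, 3, 5)
            .reshape(b, -1, c * p * p)
        )
        tokens = self.to_tokens(patches)
        cls = self.cls_token.expand(b, -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        # The class token summarises the image; map it to one real/fake score.
        return self.head(encoded[:, 0])


critic = TinyViTDiscriminator()
scores = critic(torch.randn(2, 3, 32, 32))  # two CIFAR-10-sized images
print(scores.shape)
```

The sketch produces one score per image, which is the shape a GAN loss expects from a discriminator. The actual model in this repository may differ in depth, width, normalisation, and patch handling.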

Installation

Before running train.py, make sure the libraries listed in requirements.txt are installed. Also create a ./fid_stat folder and download the fid_stats_cifar10_train.npz file into it. To save your model during training, create a ./checkpoint folder using mkdir checkpoint.
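
The folder setup described above can also be scripted. The following is a small sketch using only the Python standard library; the helper name prepare_workspace is hypothetical and is not part of this repository. It creates the two folders and reports whether the FID statistics file is already in place. It does not download anything.

```python
from pathlib import Path


def prepare_workspace(root="."):
    """Create ./fid_stat and ./checkpoint under root and check for the FID stats file."""
    root = Path(root)
    fid_dir = root / "fid_stat"
    checkpoint_dir = root / "checkpoint"

    # exist_ok=True makes the call safe to repeat.
    fid_dir.mkdir(parents=True, exist_ok=True)
    checkpoint_dir.mkdir(parents=True, exist_ok=True)

    stats_file = fid_dir / "fid_stats_cifar10_train.npz"
    return {
        "fid_dir": fid_dir,
        "checkpoint_dir": checkpoint_dir,
        "stats_file": stats_file,
        "stats_present": stats_file.is_file(),
    }


if __name__ == "__main__":
    status = prepare_workspace()
    if status["stats_present"]:
        print("Found", status["stats_file"])
    else:
        print("Missing", status["stats_file"], "- download it before training.")
```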

Training

python train.py

Pretrained Model

A pretrained model is available on Google Drive. You can download it using:

wget https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

or

curl gdrive.sh | bash -s https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

License

MIT

Citation

@article{jiang2021transgan,
  title={TransGAN: Two Transformers Can Make One Strong GAN},
  author={Jiang, Yifan and Chang, Shiyu and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2102.07074},
  year={2021}
}
@article{dosovitskiy2020,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}
@inproceedings{zhao2020diffaugment,
  title={Differentiable Augmentation for Data-Efficient GAN Training},
  author={Zhao, Shengyu and Liu, Zhijian and Lin, Ji and Zhu, Jun-Yan and Han, Song},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}
Comments
  • GPU memory, Modifying batch size

    Hello,

    I saw your comment in VITA-Group's implementation of TransGAN and started looking at your implementation here.

    Running "python train.py" without modifying anything results in a CUDA out-of-memory error; I believe the GPU I'm using cannot handle the model size/training images that you've specified. I tried editing the batch size on lines 35 and 36 of train.py (--gener_batch_size, changing the default from 64 to 32, etc.), but I get this RuntimeError:

    Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

    My two questions are:

    1. How would you suggest modifying the training parameters to deal with GPU running out of memory? and,
    2. Is there a better way to edit the batch size, and what else do I need to change in order for the code to not break when the batch size is changed?

    Thanks!

    opened by Andrew-X-Wang 10
  • Create your own FID stats file

    Hello, and thanks for the implementation. I'm trying to train this model on a different dataset, but to do so I need a custom fid_stats file for my dataset. How can I create it?

    opened by IlyasMoutawwakil 2
  • FID score: nan

    Thank you for your contribution. During training, however, the FID score is NaN. Is this expected? Should I make some change to solve this problem?

    opened by Jamie-Cheung 1
  • TransGAN fid problem

    Hello, I would like to ask what the difference is between TransGAN-main and TransGAN-master. Can TransGAN-main reproduce results similar to those of the original paper? The results I obtained on CIFAR with TransGAN-main are quite different from those in the paper, and the WGAN-EP loss oscillates, so I wanted to ask.

    opened by Stephenlove 1
  • How do you test on your own dataset with the checkpoint.pth generated?

    I want to use the saved checkpoint to generate my own results from a testing dataset, and then use those images to calculate my own evaluation metrics. Please help.

    opened by meh-naz 0
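
The RuntimeError quoted in the first issue above describes a general PyTorch pattern: a tensor returned by unbind (one of several views of the same storage) is modified in place while autograd is tracking it. The sketch below reproduces that pattern in isolation and shows the out-of-place alternative the message recommends. The function names are hypothetical, and this is not necessarily the line in this repository that triggers the error; whether the in-place version raises or only warns can depend on the installed PyTorch version.

```python
import torch


def scale_first_in_place(x):
    """Failing pattern: in-place edit of a tensor returned by unbind()."""
    q, k, v = (x * 1.0).unbind(0)
    q *= 0.5  # modifies one of several views in place
    return q, k, v


def scale_first_out_of_place(x):
    """Same computation, but the result is written to a new tensor."""
    q, k, v = (x * 1.0).unbind(0)
    q = q * 0.5  # out-of-place: leaves the unbind() outputs untouched
    return q, k, v


x = torch.ones(3, 2, 4, requires_grad=True)

try:
    scale_first_in_place(x)
    print("in-place version ran without an error on this PyTorch version")
except RuntimeError as err:
    print("in-place version failed:", str(err).splitlines()[0])

q, k, v = scale_first_out_of_place(x)
q.sum().backward()  # gradients flow normally through the out-of-place version
print(q.shape, x.grad[0, 0, 0].item())
```

If the error appears in attention code that splits a stacked query/key/value tensor, replacing an update such as `q *= scale` with `q = q * scale` is the kind of change the message points to.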
Releases(v2.0)
Owner
Ahmet Sarigun