TextureGAN in Pytorch

Overview

TextureGAN

This code is our PyTorch implementation of TextureGAN [Project] [Arxiv]

TextureGAN is a generative adversarial network conditioned on sketch and colors/textures. Users “drag” one or more example textures onto sketched objects and the network realistically applies these textures to the indicated objects.

Setup

Prerequisites

  • Linux or OSX
  • Python 2.7
  • NVIDIA GPU + CUDA CuDNN

Dependency

  • Visdom
  • Ipython notebook
  • Pytorch 0.2 (torch and torchvision)
  • Numpy scikit-image matplotlib etc.

Getting Started

  • Clone this repo
git clone [email protected]:janesjanes/texturegan.git
cd texturegan
  • Prepare Datasets Download the training data:
wget https://s3-us-west-2.amazonaws.com/texturegan/training_handbag.tar.gz
tar -xvcf training_handbag.tar.gz

For shoe: https://s3-us-west-2.amazonaws.com/texturegan/training_shoe.tar.gz

For cloth: https://s3-us-west-2.amazonaws.com/texturegan/training_cloth.tar.gz

  • Train the model from scratch. See python main.py --help for training options. Example arguments (see the paper for the exact parameters value):
python main.py --display_port 7779 --gpu 3 --model texturegan --feature_weight 5e3 --pixel_weight_ab 1e4 
--global_pixel_weight_l 5e5 --local_pixel_weight_l 0 --style_weight 0 --discriminator_weight 5e5 --discriminator_local_weight 7e5  --learning_rate 5e-4 --learning_rate_D 1e-4 --batch_size 36 --save_every 100 --num_epoch 100000 --save_dir [./save_dir] 
--data_path [training_handbags_pretrain/] --learning_rate_D_local  1e-4 --local_texture_size 50 --patch_size_min 20 
--patch_size_max 50 --num_input_texture_patch 1 --visualize_every 5 --num_local_texture_patch 5

Models will be saved to ./save_dir

See more training details in section Train

You can also load our pretrained models in section Download Models.

To view results and losses as the model trains, start a visdom server for the ‘display_port’

python -m visdom.server -port 7779

Test the model

  • See our Ipython Notebook Test_script.ipynb

Train

TextureGAN proposes a two-stage training scheme.

  • The first training state is ground-truth pre-training. We extract input edge and texture patch from the same ground-truth image. Here, we show how to train the ground-truth pretrained model using a combination of pixel loss, color loss, feature loss, and adverserial loss.
python main.py --display_port 7779 --gpu 0 --model texturegan --feature_weight 10 --pixel_weight_ab 1e5 
--global_pixel_weight_l 100 --style_weight 0 --discriminator_weight 10 --learning_rate 1e-3 --learning_rate_D 1e-4 --save_dir
[/home/psangkloy3/handbag_texturedis_scratch] --data_path [./save_dir] --batch_size 16 --save_every 500 --num_epoch 100000 
--input_texture_patch original_image --loss_texture original_image --local_texture_size 50 --discriminator_local_weight 100  
--num_input_texture_patch 1
  • The second stage is external texture fine-tuning. This step is important for the network to reproduce textures for which we have no ground-truth output (e.g. a handbag with snakeskin texture). This time, we extract texture patch from an external texture dataset (see more in Section Download Dataset). We keep the feature and adversarial losses unchanged, but modify the pixel and color losses, to compare the generated result with the entire input texture from which input texture patches are extracted. We fine tune on previous pretrained model with addition of local texture loss by training a separate texture discriminator.
python main.py --display_port 7779 --load 1500 --load_D 1500 --load_epoch 222 --gpu 0 --model texturegan --feature_weight 5e3
--pixel_weight_ab 1e4 --global_pixel_weight_l 5e5 --local_pixel_weight_l 0 --style_weight 0 --discriminator_weight 5e5 
--discriminator_local_weight 7e5  --learning_rate 5e-4 --learning_rate_D 1e-4 --batch_size 36 --save_every 100 --num_epoch
100000 --save_dir [skip_leather_handbag/] --load_dir [handbag_texturedis_scratch/] 
--data_path [./save_dir] --learning_rate_D_local  1e-4 --local_texture_size 50 --patch_size_min 20 --patch_size_max 50 
--num_input_texture_patch 1 --visualize_every 5 --input_texture_patch dtd_texture --num_local_texture_patch 5

Download Datasets

The datasets we used for generating sketch and image pair in this paper are collected by other researchers. Please cite their papers if you use the data. The dataset is split into train and test set.

Edges are computed by HED edge detector + post-processing. [Citation]

The datasets we used for inputting texture patches are DTD Dataset and leather dataset we collected from the internet.

  • DTD Dataset:
  • Leather Dataset:

Download Models

Pre-trained models

Citation

If you find it this code useful for your research, please cite:

"TextureGAN: Controlling Deep Image Synthesis with Texture Patches"

Wenqi Xian, Patsorn Sangkloy, Varun Agrawal, Amit Raj, Jingwan Lu, Chen Fang, Fisher Yu, James Hays in CVPR, 2018.

@article{xian2017texturegan,
  title={Texturegan: Controlling deep image synthesis with texture patches},
  author={Xian, Wenqi and Sangkloy, Patsorn and Agrawal, Varun and Raj, Amit and Lu, Jingwan and Fang, Chen and Yu, Fisher and Hays, James},
  journal={arXiv preprint arXiv:1706.02823},
  year={2017}
}
CVPRW 2021: How to calibrate your event camera

E2Calib: How to Calibrate Your Event Camera This repository contains code that implements video reconstruction from event data for calibration as desc

Robotics and Perception Group 104 Nov 16, 2022
Official code of paper "PGT: A Progressive Method for Training Models on Long Videos" on CVPR2021

PGT Code for paper PGT: A Progressive Method for Training Models on Long Videos. Install Run pip install -r requirements.txt. Run python setup.py buil

Bo Pang 27 Mar 30, 2022
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

AugMix Introduction We propose AugMix, a data processing technique that mixes augmented images and enforces consistent embeddings of the augmented ima

Google Research 876 Dec 17, 2022
GRaNDPapA: Generator of Rad Names from Decent Paper Acronyms

GRaNDPapA: Generator of Rad Names from Decent Paper Acronyms Trying to publish a new machine learning model and can't write a decent title for your pa

264 Nov 08, 2022
lightweight python wrapper for vowpal wabbit

vowpal_porpoise Lightweight python wrapper for vowpal_wabbit. Why: Scalable, blazingly fast machine learning. Install Install vowpal_wabbit. Clone and

Joseph Reisinger 163 Nov 24, 2022
Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

WangRui 14 Sep 14, 2022
Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image" Introduction This repo is official Py

Gyeongsik Moon 677 Dec 25, 2022
This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML)

package tests docs license stats support This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML

National Center for Cognitive Research of ITMO University 482 Dec 26, 2022
An energy estimator for eyeriss-like DNN hardware accelerator

Energy-Estimator-for-Eyeriss-like-Architecture- An energy estimator for eyeriss-like DNN hardware accelerator This is an energy estimator for eyeriss-

HEXIN BAO 2 Mar 26, 2022
Array Camera Ptychography

Array Camera Ptychography This repository provides the code for the following papers: Schulz, Timothy J., David J. Brady, and Chengyu Wang. "Photon-li

Brady lab in Optical Sciences 1 Nov 15, 2021
Self-Supervised Learning of Event-based Optical Flow with Spiking Neural Networks

Self-Supervised Learning of Event-based Optical Flow with Spiking Neural Networks Work accepted at NeurIPS'21 [paper, video]. If you use this code in

TU Delft 43 Dec 07, 2022
Editing a Conditional Radiance Field

Editing Conditional Radiance Fields Project | Paper | Video | Demo Editing Conditional Radiance Fields Steven Liu, Xiuming Zhang, Zhoutong Zhang, Rich

Steven Liu 216 Dec 30, 2022
A tool for making map images from OpenTTD save games

OpenTTD Surveyor A tool for making map images from OpenTTD save games. This is not part of the main OpenTTD codebase, nor is it ever intended to be pa

Aidan Randle-Conde 9 Feb 15, 2022
Implementation of Vaswani, Ashish, et al. "Attention is all you need."

Attention Is All You Need Paper Implementation This is my from-scratch implementation of the original transformer architecture from the following pape

Brando Koch 195 Dec 30, 2022
Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).

Deep Text Search - AI Based Text Search & Recommendation System Deep Text Search is an AI-powered multilingual text search and recommendation engine w

19 Sep 29, 2022
This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Developed By Google!

Machine Learning Hand Detector This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Dev

Popstar Idhant 3 Feb 25, 2022
A PyTorch implementation of EfficientDet.

A PyTorch impl of EfficientDet faithful to the original Google impl w/ ported weights

Ross Wightman 1.4k Jan 07, 2023
Python program that works as a contact list

Lista de Contatos Programa em Python que funciona como uma lista de contatos. Features Adicionar novo contato Remover contato Atualizar contato Pesqui

Victor B. Lino 3 Dec 16, 2021
DuBE: Duple-balanced Ensemble Learning from Skewed Data

DuBE: Duple-balanced Ensemble Learning from Skewed Data "Towards Inter-class and Intra-class Imbalance in Class-imbalanced Learning" (IEEE ICDE 2022 S

6 Nov 12, 2022
Project code for weakly supervised 3D object detectors using wide-baseline multi-view traffic camera data: WIBAM.

WIBAM (Work in progress) Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data 3D object dete

Matthew Howe 10 Aug 24, 2022