PyTorch implementation of Weak-shot Fine-grained Classification via Similarity Transfer

Overview

SimTrans-Weak-Shot-Classification

This repository contains the official PyTorch implementation of the following paper:

Weak-shot Fine-grained Classification via Similarity Transfer

Junjie Chen, Li Niu, Liu Liu, Liqing Zhang
MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
https://arxiv.org/abs/2009.09197
Accepted by NeurIPS2021.

Abstract

Recognizing fine-grained categories remains a challenging task, due to the subtle distinctions among different subordinate categories, which results in the need of abundant annotated samples. To alleviate the data-hungry problem, we consider the problem of learning novel categories from web data with the support of a clean set of base categories, which is referred to as weak-shot learning. In this setting, we propose to transfer pairwise semantic similarity from base categories to novel categories. Specifically, we firstly train a similarity net on clean data, and then leverage the transferred similarity to denoise web training data using two simple yet effective strategies. In addition, we apply adversarial loss on similarity net to enhance the transferability of similarity. Comprehensive experiments on three fine-grained datasets demonstrate the effectiveness of our setting and method.

1. Setting

In practice, we often have a set of base categories with sufficient well-labeled data, and the problem is how to learn novel categories with less expense, in which base categories and novel categories have no overlap. Such problem motivates zero-shot learning, few-shot learning, as well as our setting. To bridge the gap between base categories and novel categories, zero-shot learning requires category-level semantic representation for all categories, while few-shot learning requires a few clean examples for novel categories. Considering the drawbacks of zero/few-shot learning and the accessibility of free web data, we intend to learn novel categories by virtue of web data with the support of a clean set of base categories.

2. Our Method

Specifically, our framework consists of two training phases. Firstly, we train a similarity net (SimNet) on base training set, which feeds in two images and outputs the semantic similarity. Secondly, we apply the trained SimNet to obtain the semantic similarities among web images. In this way, the similarity is transferred from base categories to novel categories. Based on the transferred similarities, we design two simple yet effective methods to assist in learning the main classifier on novel training set. (1) Sample weighting (i.e., assign small weights to the images dissimilar to others) reduces the impact of outliers (web images with incorrect labels) and thus alleviates the problem of noise overfitting. (2) Graph regularization (i.e., pull close the features of semantically similar samples) prevents the feature space from being disturbed by noisy labels. In addition, we propose to apply adversarial loss on SimNet to make it indistinguishable for base categories and novel categories, so that the transferability of similarity is strengthened.

3. Results

Extensive experiments on three fine-grained datasets have demonstrated the potential of our learning scenario and the effectiveness of our method. For qualitative analysis, on the one hand, the clean images are assigned with high weights, while the images belonging to outlier are assigned with low weights; on the other hand, the transferred similarities accurately portray the semantic relations among web images.

4. Experiment Codebase

4.1 Data

We provide the packages of CUB, Car, FGVC, and WebVision at Baidu Cloud (access code: BCMI).

The original packages are split by split -b 10G ../CUB.zip CUB.zip., thus we need merge by cat CUB.zip.a* > CUB.zip before decompression.

The ImageNet dataset is publicly available, and all data files are configured as:

├── CUB
├── Car
├── Air
├── WebVision
├── ImageNet:
  ├── train
      ├── ……
  ├── val
      ├── ……
  ├── ILSVRC2012_validation_ground_truth.txt
  ├── meta.mat
  ├── train_files.txt

Just employ --data_path ANY_PATH/CUB to specify the data dir.

4.2 Install

See requirement.txt.

4.3 Evaluation

The trained models are released as trained_models.zip at Baidu Cloud (access code: BCMI).

The command in _scripts/DATASET_NAME/eval.sh is used to evaluate the model.

4.4 Training

We provide the full scripts for CUB dataset in _scripts/CUB/ dir as an example.

For other datasets, just change the data path, i.e., --data_path ANY_PATH/WebVision.

Bibtex

If you find this work is useful for your research, please cite our paper using the following BibTeX [pdf] [supp] [arxiv]:

@inproceedings{SimTrans2021,
title={Weak-shot Fine-grained Classification via Similarity Transfer},
author={Chen, Junjie and Niu, Li and Liu, Liu and Zhang, Liqing},
booktitle={NeurIPS},
year={2021}}
Owner
BCMI
Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University.
BCMI
[CVPR 2021] Pytorch implementation of Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs In this work, we propose a framework HijackGAN, which enables non-linear latent space travers

Hui-Po Wang 46 Sep 05, 2022
Auxiliary data to the CHIIR paper Searching to Learn with Instructional Scaffolding

Searching to Learn with Instructional Scaffolding This is the data and analysis code for the paper "Searching to Learn with Instructional Scaffolding"

Arthur Câmara 2 Mar 02, 2022
C3DPO - Canonical 3D Pose Networks for Non-rigid Structure From Motion.

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion By: David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedal

Meta Research 309 Dec 16, 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Learning the Beauty in Songs: Neural Singing Voice Beautifier Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao Zhejiang University ACL 2022 Mai

Jinglin Liu 257 Dec 30, 2022
Release of the ConditionalQA dataset

ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset

14 Oct 17, 2022
Knowledge Management for Humans using Machine Learning & Tags

HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.

Ravn Tech, Inc. 165 Nov 04, 2022
NLP made easy

GluonNLP: Your Choice of Deep Learning for NLP GluonNLP is a toolkit that helps you solve NLP problems. It provides easy-to-use tools that helps you l

Distributed (Deep) Machine Learning Community 2.5k Jan 04, 2023
Simple and Distributed Machine Learning

Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy

Microsoft 3.9k Dec 30, 2022
A deep learning framework for historical document image analysis

DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https

9 Aug 04, 2022
rastrainer is a QGIS plugin to training remote sensing semantic segmentation model based on PaddlePaddle.

rastrainer rastrainer is a QGIS plugin to training remote sensing semantic segmentation model based on PaddlePaddle. UI TODO Init UI. Add Block. Add l

deepbands 5 Mar 04, 2022
Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification We provide the codes for repr

12 Dec 12, 2022
Optimized code based on M2 for faster image captioning training

Transformer Captioning This repository contains the code for Transformer-based image captioning. Based on meshed-memory-transformer, we further optimi

lyricpoem 16 Dec 16, 2022
Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop) (Pronounced as "strog") Paper Arxiv Why it matters? Scene Text Recognition (STR) req

Rowel Atienza 152 Dec 28, 2022
[NeurIPS 2021] SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning

SSUL - Official Pytorch Implementation (NeurIPS 2021) SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning Sun

Clova AI Research 44 Dec 27, 2022
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

TableMASTER-mmocr Contents About The Project Method Description Dependency Getting Started Prerequisites Installation Usage Data preprocess Train Infe

Jianquan Ye 298 Dec 21, 2022
RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth, in ICCV 2021 (oral)

RINDNet RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth Mengyang Pu, Yaping Huang, Qingji Guan and Haibin Lin

Mengyang Pu 75 Dec 15, 2022
FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI

FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI 声明: 本项目仅限于学习交流,不可用于非法用途,包括但不限于:用于游戏外挂等,使用本项目产生的任何后果与本人无关! 简介 本项目基于yolov5,实现了一款FPS类游戏(CF、CSGO等)的自瞄AI,本项目旨在使用现

Fabian 246 Dec 28, 2022
Official PyTorch implementation of "Improving Face Recognition with Large AgeGaps by Learning to Distinguish Children" (BMVC 2021)

Inter-Prototype (BMVC 2021): Official Project Webpage This repository provides the official PyTorch implementation of the following paper: Improving F

Jungsoo Lee 16 Jun 30, 2022
Deep Inside Convolutional Networks - This is a caffe implementation to visualize the learnt model

Deep Inside Convolutional Networks This is a caffe implementation to visualize the learnt model. Part of a class project at Georgia Tech Problem State

Jigar 61 Apr 15, 2022
[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Crafting Better Contrastive Views for Siamese Representation Learning (CVPR 2022 Oral) 2022-03-29: The paper was selected as a CVPR 2022 Oral paper! 2

249 Dec 28, 2022