The code is an implementation of Feedback Convolutional Neural Network for Visual Localization and Segmentation.

Last update: Dec 04, 2022

Overview

Feedback Convolutional Neural Network for Visual Localization and Segmentation

The implementation is written in PyTorch and is intended to be simple to understand.

A Caffe implementation is also available; use it if you work with Caffe and MATLAB.

Requirements:

Python 3
PyTorch 0.4.0
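
To check an environment against the requirements above, a small version check along these lines may help. It is only a sketch: it assumes that 0.4.0 is a minimum version rather than an exact pin (the README does not say which), and the helper names `parse_version` and `meets_requirement` are made up for this example.

```python
import re
import sys


def parse_version(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    core = version.split("+")[0]  # drop a local build suffix such as '+cu117'
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_requirement(version, minimum=(0, 4, 0)):
    """Assumption: the listed PyTorch version is treated as a lower bound."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    assert sys.version_info[0] == 3, "Python 3 is required"
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "ok:", meets_requirement(torch.__version__))
```

Newer PyTorch releases may behave differently from 0.4.0, so passing this check does not guarantee that the notebooks run unchanged.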

How to run:

Start Jupyter Notebook in the repository directory.

Then open vgg_fr.ipynb or vgg_fsp.ipynb. These are the two main notebooks that demonstrate the feedback idea.

How it looks:

If you run vgg_fsp.ipynb without modifying the code, you should see the following visualizations:

Input image:

Image gradient with respect to the target label:

Image gradient with respect to the target label after 4 iterations of feedback selective pruning (FSP):
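
The idea behind these visualizations can be illustrated on a toy network. The sketch below is not the notebook's code: it uses a two-hidden-layer fully connected ReLU network in plain Python instead of VGG, and it assumes a simplified pruning rule in which a hidden unit is switched off (and stays off) whenever the gradient of the target score with respect to its output is negative. All names (`forward_backward`, `feedback_gradient`, the gates `z1` and `z2`) are hypothetical.

```python
def matvec(matrix, vector):
    """Compute matrix @ vector for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def matvec_t(matrix, vector):
    """Compute transpose(matrix) @ vector."""
    rows, cols = len(matrix), len(matrix[0])
    return [sum(matrix[r][c] * vector[r] for r in range(rows)) for c in range(cols)]


def relu_mask(values):
    """Derivative of ReLU: 1 where the pre-activation is positive, else 0."""
    return [1.0 if v > 0 else 0.0 for v in values]


def forward_backward(x, W1, W2, w3, z1, z2):
    """Gated forward pass, then gradients of the target score."""
    a1 = matvec(W1, x)
    h1 = [z * max(a, 0.0) for z, a in zip(z1, a1)]  # gated ReLU, layer 1
    a2 = matvec(W2, h1)
    h2 = [z * max(a, 0.0) for z, a in zip(z2, a2)]  # gated ReLU, layer 2
    score = sum(w * h for w, h in zip(w3, h2))      # target-class score

    d_h2 = list(w3)                                 # d score / d h2
    d_a2 = [g * z * m for g, z, m in zip(d_h2, z2, relu_mask(a2))]
    d_h1 = matvec_t(W2, d_a2)                       # d score / d h1
    d_a1 = [g * z * m for g, z, m in zip(d_h1, z1, relu_mask(a1))]
    d_x = matvec_t(W1, d_a1)                        # d score / d input
    return score, d_x, d_h1, d_h2


def feedback_gradient(x, W1, W2, w3, iterations=4):
    """Input gradient of the target score after `iterations` pruning passes."""
    z1 = [1.0] * len(W1)  # all gates start open
    z2 = [1.0] * len(W2)
    for _ in range(iterations):
        _, _, d_h1, d_h2 = forward_backward(x, W1, W2, w3, z1, z2)
        # Assumed rule: close gates of units that push the target score down.
        z1 = [0.0 if g < 0 else z for g, z in zip(d_h1, z1)]
        z2 = [0.0 if g < 0 else z for g, z in zip(d_h2, z2)]
    score, d_x, _, _ = forward_backward(x, W1, W2, w3, z1, z2)
    return score, d_x


x = [1.0, 2.0]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, 0.0], [0.0, 1.0]]
w3 = [1.0, -1.0]
plain = feedback_gradient(x, W1, W2, w3, iterations=0)
pruned = feedback_gradient(x, W1, W2, w3, iterations=4)
```

In this toy case the plain gradient is `[1.0, -1.0]` with a score of `-1.0`, while after four pruning passes the gradient is `[1.0, 0.0]` with a score of `1.0`: the path that lowered the target score has been gated out, which is the kind of effect the two gradient images are meant to contrast.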

Files explanation:

vgg_fr.ipynb: the main file that defines the VGG feedback network with the feedback recovering mechanism and runs a feedback visualization on exemplar images.
vgg_fsp.ipynb: the main file that defines the VGG feedback network with the feedback selective pruning mechanism and runs a feedback visualization on exemplar images.
images: stores the exemplar images
imagenet1000_clsid_to_human.txt: stores the 1000 ImageNet class names, used for visualization and interpretation
test/simple_test.ipynb: unit test for a simple feedback network built from a simple fully connected structure
test/vgg_test.ipynb: unit test that loads a pretrained VGG network, then checks that the weights are copied correctly from the pretrained network to a newly defined network interface
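
As an illustration of how the class-name file might be used, the sketch below turns its contents into a lookup table and labels the highest-scoring classes. It assumes the file holds a single Python dict literal mapping class id to name (for example `{0: 'tench, Tinca tinca', ...}`), which is not confirmed by this README; the function names are hypothetical.

```python
import ast


def parse_class_names(text):
    """Parse a dict literal mapping class id -> human-readable name."""
    mapping = ast.literal_eval(text)  # safe: evaluates literals only
    if not isinstance(mapping, dict):
        raise ValueError("expected a dict literal mapping class id to name")
    return {int(key): str(value) for key, value in mapping.items()}


def top_k_labels(scores, class_names, k=5):
    """Return the k best (name, score) pairs, highest score first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(class_names[i], scores[i]) for i in ranked[:k]]
```

A possible use is `names = parse_class_names(open("imagenet1000_clsid_to_human.txt").read())`, followed by `top_k_labels(scores, names)` where `scores` is the network's per-class output converted to a plain list.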

Citation

Please consider citing the paper in your publications if it helps your research:

@inproceedings{cao2015look,
  title={Look and think twice: Capturing top-down visual attention with feedback convolutional neural networks},
  author={Cao, Chunshui and Liu, Xianming and Yang, Yi and Yu, Yinan and Wang, Jiang and Wang, Zilei and Huang, Yongzhen and Wang, Liang and Huang, Chang and Xu, Wei and others},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={2956--2964},
  year={2015}
}
