Faster RCNN with PyTorch

Last update: Dec 23, 2022

Note: I re-implemented Faster R-CNN in this project when I started learning PyTorch, and I have used PyTorch in all of my projects since. I still remember that it took me a week to figure out how to build CUDA code as a PyTorch layer :). However, this is not a good implementation, and I did not achieve the same mAP as the original Caffe code.

This project is no longer maintained and may not be compatible with newer PyTorch versions (after 0.4.0). So I suggest:

You can still read and study this code if you want to re-implement Faster R-CNN yourself;
You can use the better PyTorch implementations by ruotianluo or Detectron.pytorch if you want to train Faster R-CNN on your own data.

This is a PyTorch implementation of Faster RCNN. This project is mainly based on py-faster-rcnn and TFFRCNN.

For details about Faster R-CNN, please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.

Progress

Forward for detecting
RoI Pooling layer with C extensions on CPU (only forward)
RoI Pooling layer on GPU (forward and backward)
Training on VOC2007
TensorBoard support
Evaluation
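
For readers studying the RoI Pooling items above, the forward pass can be illustrated with a minimal pure-Python sketch. This is not the project's C or CUDA code: the function name `roi_max_pool`, the single-channel list-of-lists feature map, and the inclusive `(x1, y1, x2, y2)` RoI convention are all assumptions made for illustration.

```python
import math


def roi_max_pool(feature, roi, pooled_h, pooled_w):
    """Max-pool one RoI of a 2-D feature map into a pooled_h x pooled_w grid.

    feature: H x W list of lists of floats (single channel, illustrative).
    roi: (x1, y1, x2, y2) in feature-map coordinates, inclusive (assumed).
    """
    x1, y1, x2, y2 = roi
    height, width = len(feature), len(feature[0])
    # Size of each pooling bin, in feature-map cells (may be fractional).
    bin_h = max(y2 - y1 + 1, 1) / pooled_h
    bin_w = max(x2 - x1 + 1, 1) / pooled_w

    out = []
    for ph in range(pooled_h):
        # Floor the bin start, ceil the bin end, then clip to the map.
        hs = min(max(y1 + math.floor(ph * bin_h), 0), height)
        he = min(max(y1 + math.ceil((ph + 1) * bin_h), 0), height)
        row = []
        for pw in range(pooled_w):
            ws = min(max(x1 + math.floor(pw * bin_w), 0), width)
            we = min(max(x1 + math.ceil((pw + 1) * bin_w), 0), width)
            cells = [feature[y][x] for y in range(hs, he) for x in range(ws, we)]
            # An empty bin (RoI outside the map) contributes zero.
            row.append(max(cells) if cells else 0.0)
        out.append(row)
    return out
```

Each output cell takes the maximum over its bin, so an RoI of any size is reduced to a fixed-size grid that the fully connected layers can consume.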

Installation and demo

Install the requirements (you can use pip or Anaconda):

conda install pip pyyaml sympy h5py cython numpy scipy
conda install -c menpo opencv3
pip install easydict

Clone the Faster R-CNN repository

git clone git@github.com:longcw/faster_rcnn_pytorch.git

Build the Cython modules for nms and the roi_pooling layer
```
cd faster_rcnn_pytorch/faster_rcnn
./make.sh
```
Download the trained model VGGnet_fast_rcnn_iter_70000.h5 and set the model path in demo.py
Run the demo: python demo.py
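
The nms module built in the steps above performs non-maximum suppression on detection boxes. As a rough guide to what that step computes, here is a plain-Python sketch of greedy NMS. It is not the project's Cython code: the names `iou` and `nms` and the continuous-coordinate box area (no `+1` pixel offset) are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, thresh):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

For example, with boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]`, scores `[0.9, 0.8, 0.7]`, and a threshold of 0.5, the second box is suppressed by the first and the indices `[0, 2]` are kept.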

Training on Pascal VOC 2007

Follow this project (TFFRCNN) to download and prepare the training, validation, and test data and the VGG16 model pre-trained on ImageNet.

Since the program loads data from faster_rcnn_pytorch/data by default, you can set the data path as follows.

cd faster_rcnn_pytorch
mkdir data
cd data
ln -s $VOCdevkit VOCdevkit2007
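
After creating the symlink, it can help to confirm that the devkit is where the loader will look. The sketch below is a hypothetical helper, not part of this repository: the name `missing_voc_dirs` and the assumption that the devkit follows the standard `VOC2007/Annotations`, `VOC2007/ImageSets/Main`, `VOC2007/JPEGImages` layout are mine.

```python
from pathlib import Path

# Sub-directories of a standard VOC2007 devkit (assumed layout).
EXPECTED_SUBDIRS = ("Annotations", "ImageSets/Main", "JPEGImages")


def missing_voc_dirs(data_root):
    """Return the expected VOC2007 directories that are absent under data_root."""
    voc_root = Path(data_root) / "VOCdevkit2007" / "VOC2007"
    return [
        str(voc_root / sub)
        for sub in EXPECTED_SUBDIRS
        if not (voc_root / sub).is_dir()
    ]
```

Calling `missing_voc_dirs("data")` from the faster_rcnn_pytorch directory should return an empty list if the symlink points at a complete devkit.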

Then you can set some hyper-parameters in train.py and training parameters in the .yml file.

I currently get 0.661 mAP on VOC07, while the original paper reports 0.699 mAP. You may need to tune the loss function defined in faster_rcnn/faster_rcnn.py yourself.
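
For reference when tuning, the multi-task loss as formulated in the Faster R-CNN paper is reproduced below. This is the paper's definition; the code in faster_rcnn/faster_rcnn.py may weight or normalize the terms differently, so treat it as a starting point rather than a description of this implementation.

```latex
L(\{p_i\}, \{t_i\}) =
  \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*),
\qquad
L_{reg}(t_i, t_i^*) = \sum_{j \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}(t_{i,j} - t_{i,j}^*),
\qquad
\mathrm{smooth}_{L_1}(x) =
  \begin{cases}
    0.5\,x^2 & \text{if } |x| < 1 \\
    |x| - 0.5 & \text{otherwise}
  \end{cases}
```

Here p_i is the predicted objectness probability of anchor i, p_i^* is its ground-truth label (1 for positive anchors, 0 otherwise), t_i and t_i^* are the predicted and target box offsets, and λ balances the classification and regression terms.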

Training with TensorBoard

With the aid of Crayon, we can access the visualisation power of TensorBoard for any deep learning framework.

To use TensorBoard, install Crayon (https://github.com/torrvision/crayon) and set use_tensorboard = True in faster_rcnn/train.py.

Evaluation

Set the path of the trained model in test.py.

cd faster_rcnn_pytorch
mkdir output
python test.py
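
The mAP figures quoted for VOC07 are conventionally computed with the 11-point interpolated average precision from the PASCAL VOC 2007 protocol. The sketch below shows that metric for a single class; the function name `voc07_ap` and the list-based inputs are assumptions for illustration, and the evaluation code called by test.py may differ in detail.

```python
def voc07_ap(recalls, precisions):
    """11-point interpolated AP (PASCAL VOC 2007 style) for one class.

    recalls, precisions: parallel lists measured along the ranked detections.
    """
    ap = 0.0
    for k in range(11):
        threshold = k / 10.0  # recall levels 0.0, 0.1, ..., 1.0
        # Interpolated precision: best precision at any recall >= threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap
```

The mAP is then the mean of this value over all object classes.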

License: MIT license (MIT)

Owner

Long Chen
