A PyTorch-based R-YOLOv4 implementation that combines the YOLOv4 model with the loss function from R3Det for arbitrary-oriented object detection.

Last update: Dec 03, 2022

Overview

R-YOLOv4

This is a PyTorch-based R-YOLOv4 implementation that combines the YOLOv4 model with the loss function from R3Det for arbitrary-oriented object detection. (Final project for the NCKU course Introduction to Artificial Intelligence.)

Introduction

The objective of this project is to adapt the YOLOv4 model to detect oriented objects, which requires modifying the model's original loss function. I got good results by increasing the number of anchor boxes, giving them different rotation angles, and by combining the smooth-L1-IoU loss function proposed in R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object with the original bounding-box loss.
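The idea of adding anchors with different rotation angles can be sketched as follows. This is only an illustration: the base sizes, the angle set, and the function name `rotated_anchors` are hypothetical and are not taken from this repository's configuration.

```python
import math
from itertools import product


def rotated_anchors(base_sizes, angles):
    """Pair every (w, h) anchor size with every angle, giving (w, h, theta) anchors."""
    return [(w, h, theta) for (w, h), theta in product(base_sizes, angles)]


# Hypothetical values chosen for illustration only.
base_sizes = [(12, 16), (19, 36), (28, 40)]
angles = [-math.pi / 3, -math.pi / 6, 0.0, math.pi / 6, math.pi / 3, math.pi / 2]

anchors = rotated_anchors(base_sizes, angles)
print(len(anchors))  # 18 (3 sizes x 6 angles)
```

Each extra angle multiplies the number of anchors per scale, so the prediction head has to output correspondingly more boxes per grid cell.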

Features

Loss Function (only for x, y, w, h, theta)

Scheduler

Cosine Annealing with Warmup (Reference: Cosine Annealing with Warmup for PyTorch)
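As a rough sketch of what such a schedule computes, the function below ramps the learning rate up linearly and then decays it along a half cosine. It is a simplified single-cycle version written for illustration; the function name and parameters are assumptions, and the referenced scheduler may differ (for example by supporting restarts).

```python
import math


def lr_at_step(step, warmup_steps, total_steps, max_lr, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Linear ramp that reaches max_lr at the last warmup step.
        return max_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 9, 10, 55, 100):
    print(step, lr_at_step(step, warmup_steps=10, total_steps=100, max_lr=1e-3))
```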

Recall

As the paper suggests, I got better results with **f(ariou) = exp(1-ariou)-1**, so I used it in my loss function.
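The sketch below shows how this factor behaves. It assumes the angle-related IoU (ArIoU) definition common in rotated-box literature: the plain IoU of the two boxes after the first is re-oriented to the second's angle, scaled by |cos| of the angle difference. The function names and box layout `(x, y, w, h, theta)` are assumptions for illustration, and the repository's own computation may differ. In the R3Det formulation, the smooth-L1 term is divided by its own magnitude so that it supplies only the gradient direction, while the IoU factor supplies the magnitude.

```python
import math


def ariou(box_a, box_b):
    """Angle-related IoU of two boxes given as (x, y, w, h, theta in radians)."""
    xa, ya, wa, ha, ta = box_a
    xb, yb, wb, hb, tb = box_b
    # Express A's centre in B's rotated frame. With A re-oriented to B's angle,
    # both boxes are axis-aligned in that frame, so plain IoU applies.
    dx, dy = xa - xb, ya - yb
    u = dx * math.cos(tb) + dy * math.sin(tb)
    v = -dx * math.sin(tb) + dy * math.cos(tb)
    inter_w = max(0.0, min(u + wa / 2, wb / 2) - max(u - wa / 2, -wb / 2))
    inter_h = max(0.0, min(v + ha / 2, hb / 2) - max(v - ha / 2, -hb / 2))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    return inter / union * abs(math.cos(ta - tb))


def f_ariou(value):
    """Loss factor from the text: exp(1 - ariou) - 1, which is 0 for a perfect match."""
    return math.exp(1.0 - value) - 1.0


target = (50.0, 50.0, 20.0, 40.0, 0.0)
pred = (52.0, 49.0, 18.0, 42.0, math.pi / 12)
print(f_ariou(ariou(target, target)))  # 0.0
print(f_ariou(ariou(pred, target)))    # positive, grows as overlap or angle agreement drops
```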

Usage

Clone and Setup Environment

$ git clone https://github.com/kunnnnethan/R-YOLOv4.git
$ cd R-YOLOv4/

Create Conda Environment

$ conda env create -f environment.yml

Create Python Virtual Environment

$ python3.8 -m venv your-environment-name
$ source your-environment-name/bin/activate
$ pip3 install torch torchvision torchaudio
$ pip install -r requirements.txt

Download pretrained weights
weights

Make sure your files are arranged as follows.
Note that each dataset folder in data should be split into three subfolders: train, test, and detect.

R-YOLOv4/
├── train.py
├── test.py
├── detect.py
├── xml2txt.py
├── environment.yml
├── requirements.txt
├── model/
├── datasets/
├── lib/
├── outputs/
├── weights/
    ├── pretrained/ (for training)
    └── UCAS-AOD/ (for testing and detection)
└── data/
    └── UCAS-AOD/
        ├── class.names
        ├── train/
            ├── ...png
            └── ...txt
        ├── test/
            ├── ...png
            └── ...txt
        └── detect/
            └── ...png
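A quick way to confirm that a dataset folder matches the layout above is to check that every image in train and test has a label file with the same base name. This is a minimal sketch, assuming .png images and .txt labels as shown in the tree; the helper `missing_labels` is hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_labels(split_dir):
    """Return the names of .png images in split_dir that have no matching .txt file."""
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        return []
    return sorted(
        image.name
        for image in split_dir.glob("*.png")
        if not image.with_suffix(".txt").exists()
    )


# The detect folder holds images only, so only train and test are checked.
for split in ("train", "test"):
    print(split, missing_labels(Path("data/UCAS-AOD") / split))
```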

Train, Test, and Detect
Please refer to lib/options.py to check out all the arguments.

Train

I have implemented loaders for three kinds of datasets: UCAS-AOD, DOTA, and custom datasets. You can see how each dataset is loaded into the model in /datasets. The angle of each bounding box is limited to (-pi/2, pi/2], and the height of each bounding box is always greater than or equal to its width.
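The box convention described above can be enforced with a small normalization step. The sketch below is an illustration under stated assumptions, not the repository's loader: it assumes boxes are `(x, y, w, h, theta)` with theta in radians, and that swapping width and height is equivalent to rotating the box by pi/2. The function name `normalize_box` is hypothetical.

```python
import math


def normalize_box(x, y, w, h, theta):
    """Return an equivalent box with h >= w and theta in (-pi/2, pi/2]."""
    if w > h:
        # Swapping the sides describes the same rectangle rotated by pi/2.
        w, h = h, w
        theta += math.pi / 2
    # A rectangle is unchanged by a rotation of pi, so wrap theta into [0, pi) ...
    theta = theta % math.pi
    # ... and then shift the upper half down to obtain (-pi/2, pi/2].
    if theta > math.pi / 2:
        theta -= math.pi
    return x, y, w, h, theta


print(normalize_box(10.0, 20.0, 4.0, 2.0, 0.0))       # sides swapped, theta = pi/2
print(normalize_box(10.0, 20.0, 2.0, 4.0, math.pi))   # theta wraps back to 0.0
```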

You can run experiments/display_inputs.py to visualize whether your data is loaded successfully.

UCAS-AOD dataset

Please refer to this repository to rearrange files so that it can be loaded and trained by this model.
You can download the weight that I trained from UCAS-AOD.

While training, please specify which dataset you are using.
$ python train.py --dataset UCAS_AOD

DOTA dataset

Download the official dataset from here. The original files should be able to be loaded and trained by this model.

While training, please specify which dataset you are using.
$ python train.py --dataset DOTA

Train with custom dataset

Use labelImg2 to help label your data. labelImg2 is capable of labeling rotated objects.
Move your data folder into the R-YOLOv4/data folder.
Run xml2txt.py
1. generate txt files: python xml2txt.py --data_folder your-path --action gen_txt
2. delete xml files: python xml2txt.py --data_folder your-path --action del_xml

A trash custom dataset that I made and the weight trained from it are provided for your convenience.

While training, please specify which dataset you are using.
$ python train.py --dataset custom

Training Log

---- [Epoch 2/2] ----
+---------------+--------------------+---------------------+---------------------+----------------------+
| Step: 596/600 | loss               | reg_loss            | conf_loss           | cls_loss             |
+---------------+--------------------+---------------------+---------------------+----------------------+
| YoloLayer1    | 0.4302629232406616 | 0.32991039752960205 | 0.09135108441114426 | 0.009001442231237888 |
| YoloLayer2    | 0.7385762333869934 | 0.5682911276817322  | 0.15651139616966248 | 0.013773750513792038 |
| YoloLayer3    | 1.5002599954605103 | 1.1116538047790527  | 0.36262497305870056 | 0.025981156155467033 |
+---------------+--------------------+---------------------+---------------------+----------------------+
Total Loss: 2.669099, Runtime: 404.888372
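In the sample log above, the reported total matches the sum of the three per-layer loss values. The short check below reproduces that arithmetic using the numbers copied from the table; it does not use any code from the repository.

```python
# Per-layer "loss" column copied from the sample log above.
layer_losses = {
    "YoloLayer1": 0.4302629232406616,
    "YoloLayer2": 0.7385762333869934,
    "YoloLayer3": 1.5002599954605103,
}

total = sum(layer_losses.values())
print(f"Total Loss: {total:.6f}")  # Total Loss: 2.669099
```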

Tensorboard

To track the training process with TensorBoard:

Open an additional terminal in the same folder where you are running the program.
Run the command $ tensorboard --logdir='weights/your_model_name/logs' --port=6006
Go to http://localhost:6006/

Results

UCAS_AOD

Method	Plane	Car	mAP
YOLOv4 (smoothL1-iou)	98.05	92.05	95.05
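The mAP column is the mean of the per-class values in the same row, which can be verified directly from the numbers in the table:

```python
# Per-class values copied from the UCAS_AOD table above.
average_precision = {"Plane": 98.05, "Car": 92.05}

mean_ap = sum(average_precision.values()) / len(average_precision)
print(f"{mean_ap:.2f}")  # 95.05
```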

DOTA

DOTA has not been tested yet. (Testing is difficult because of the large resolution of the images.)

trash (custom dataset)

Method	Plane	Car	mAP
YOLOv4 (smoothL1-iou)	100.00	100.00	100.00

TODO

Mosaic Augmentation
Mixup Augmentation

References

yangxue0827/RotationDetection
eriklindernoren/PyTorch-YOLOv3
Tianxiaomo/pytorch-YOLOv4
ultralytics/yolov5

YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao

Abstract There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets...

@article{yolov4,
  title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
  author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
  journal = {arXiv},
  year={2020}
}

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

Xue Yang, Junchi Yan, Ziming Feng, Tao He

Abstract Rotation detection is a challenging task due to the difficulties of locating the multi-angle objects and separating them effectively from the background. Though considerable progress has been made, for practical settings, there still exist challenges for rotating objects with large aspect ratio, dense distribution and category extremely imbalance. In this paper, we propose an end-to-end refined single-stage rotation detector for fast and accurate object detection by using a progressive regression approach from coarse to fine granularity...

@article{r3det,
  title={R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object},
  author={Xue Yang and Junchi Yan and Ziming Feng and Tao He},
  journal = {arXiv},
  year={2019}
}
