Gesture-Detection-and-Depth-Estimation

This is my graduation project.

(1) In this project, I use the YOLOv3 object detection model to detect gestures in RGB images. I trained the model on a self-made gesture dataset to obtain a deep-learning-based gesture detection model. Testing on the test dataset showed that the model meets the requirements of real-time gesture detection while maintaining high accuracy.
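As an illustration of how detection accuracy is commonly checked, here is a minimal sketch of the intersection-over-union (IoU) test used to decide whether a predicted box matches a ground-truth box. The function names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions for this example and are not taken from the project code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical matching rule: a detection counts when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A predicted gesture box would typically count as correct only if it also carries the right class label; that check is omitted here for brevity.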

(2) I then tried monocular depth estimation based on deep learning to estimate the depth of the gesture object from a single RGB image, using two approaches: the FastDepth algorithm and an improved detection model based on YOLOv3. FastDepth was trained and tested on a self-made gesture-depth dataset. For the second approach, I added a depth vector to the output dimensions of YOLOv3 and modified the loss function, so that the model also estimates the target depth. I trained and tested this modified YOLOv3 model on the same gesture-depth dataset. The experimental results show that both methods can estimate the depth of the gesture object in an RGB image to a certain extent.
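The idea of extending the YOLOv3 output with a depth value and adding a depth term to the loss can be sketched roughly as follows. This is only an illustration under stated assumptions: the mean-squared-error form of the depth term, the `depth_weight` factor, and all function names are hypothetical, and the loss actually used in the project may differ (see the final paper).

```python
def output_channels(num_anchors, num_classes, with_depth=True):
    """Channels per grid cell in a YOLO-style head.

    Each anchor predicts 4 box values, 1 objectness score and one score
    per class; the assumed modification appends 1 depth value per anchor.
    """
    per_anchor = 4 + 1 + num_classes + (1 if with_depth else 0)
    return num_anchors * per_anchor


def depth_loss(pred_depths, true_depths, object_mask):
    """Assumed depth term: mean squared error over predictions that are
    responsible for an object (mask is True); others are ignored."""
    squared_errors = [
        (p - t) ** 2
        for p, t, m in zip(pred_depths, true_depths, object_mask)
        if m
    ]
    return sum(squared_errors) / len(squared_errors) if squared_errors else 0.0


def total_loss(detection_loss, pred_depths, true_depths, object_mask,
               depth_weight=1.0):
    """Original detection loss plus the weighted depth term (assumed form)."""
    return detection_loss + depth_weight * depth_loss(
        pred_depths, true_depths, object_mask
    )
```

Masking the depth term to cells that contain a gesture keeps background cells, which have no meaningful depth label, from dominating the loss.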

Gesture detection:

Depth data:

Estimate target depth:

(3) Also, I developed a simple program with PyOpenGL that can use gesture information to draw simple shapes in three-dimensional space.
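To give a rough idea of the geometry involved, the sketch below builds the vertices and edges of a cube around a 3D point, such as a hand position made of image coordinates plus estimated depth. It is a hypothetical illustration, not the project's drawing code: the function names and the choice of the hand position as the cube center are assumptions, and the OpenGL drawing calls are left out so that the example runs without a display.

```python
from itertools import product


def cube_vertices(center, size):
    """Return the 8 corners of an axis-aligned cube.

    center: (x, y, z) point, e.g. a hand position with estimated depth.
    size:   edge length of the cube.
    """
    cx, cy, cz = center
    half = size / 2.0
    # product((-1, 1), repeat=3) enumerates the sign pattern of each corner,
    # so corner index i has its three bits mapped to the x, y, z signs.
    return [
        (cx + sx * half, cy + sy * half, cz + sz * half)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


def cube_edges():
    """Return the 12 edges as pairs of vertex indices.

    Two corners share an edge when their indices differ in exactly one bit,
    i.e. when they differ along exactly one axis.
    """
    return [
        (i, j)
        for i in range(8)
        for j in range(i + 1, 8)
        if bin(i ^ j).count("1") == 1
    ]
```

In a PyOpenGL program, each edge could then be drawn as a line segment between its two vertices.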

Try to draw a cube:

For more information, you can check my final paper.

The YOLOv3 model is based on coldlarry's model: https://github.com/coldlarry/YOLOv3-complete-pruning

Owner: ChaosAT
