A Light in the Dark: Deep Learning Practices for Industrial Computer Vision

Last update: Jan 17, 2022

This repository accompanies our paper contribution to WI2022 in Nürnberg.

Abstract

In recent years, large pre-trained deep neural networks (DNNs) have revolutionized the field of computer vision (CV). Although these DNNs have been shown to be very well suited for general image recognition tasks, application in industry is often precluded for three reasons:

1. large pre-trained DNNs are built on hundreds of millions of parameters, making deployment on many devices impossible,
2. the underlying dataset for pre-training consists of general objects, while industrial cases often consist of very specific objects, such as structures on solar wafers,
3. potentially biased pre-trained DNNs raise legal issues for companies.

As a remedy, we study neural networks for CV that we train from scratch. For this purpose, we use a real-world case from a solar wafer manufacturer. We find that our neural networks achieve performance similar to that of pre-trained DNNs, even though they consist of far fewer parameters and do not rely on third-party datasets.
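
To make the parameter argument concrete, the following is a minimal sketch of how the parameter count of a small convolutional network can be computed by hand. The layer sizes in `SMALL_CNN` are hypothetical and chosen only for illustration; they are not the architecture used in the paper, and the helper names are our own.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def linear_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer."""
    weights = in_features * out_features
    return weights + (out_features if bias else 0)


# Hypothetical small CNN for grayscale inputs (NOT the paper's architecture):
# three 3x3 convolutions, then global average pooling (no parameters),
# then a linear classifier with two output classes.
SMALL_CNN = [
    ("conv", 1, 16, 3),
    ("conv", 16, 32, 3),
    ("conv", 32, 64, 3),
    ("linear", 64, 2),
]


def count_params(layers):
    """Sum the trainable parameters over a list of layer descriptions."""
    total = 0
    for layer in layers:
        kind = layer[0]
        if kind == "conv":
            _, c_in, c_out, k = layer
            total += conv2d_params(c_in, c_out, k)
        elif kind == "linear":
            _, f_in, f_out = layer
            total += linear_params(f_in, f_out)
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return total


if __name__ == "__main__":
    print(f"Total trainable parameters: {count_params(SMALL_CNN):,}")
```

A network of this size has a few tens of thousands of parameters, which is several orders of magnitude below the hundreds of millions mentioned above.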

Structure of this repository

+-- ImageClassification            | Runner notebook + scripts for experiments
+-- ReadMe.md                      | ReadMe
+-- Results.xlsx                   | Results reported in the paper
+-- RunResults                     | Detailed logs of the experiment results reported in the paper (IDs correspond to the old IDs in the .xlsx file due to procedure)

Releases (v1.0)

v1.0 (Jan 5, 2022)

Owner

Maximilian Harl
