9th place solution

Overview

AllDataAreExt-Galixir-Kaggle-HPA-2021-Solution

Team Members

  • Qishen Ha is Master of Engineering from the University of Tokyo. Machine Learning Engineer at LINE Corp. Kaggle Competition Grandmaster. Z by HP & NVIDIA Global Data Science Ambassador.

  • Bo Liu is currently a Senior Deep Learning Data Scientist at NVIDIA based in the U.S. and a Kaggle Competition Grandmaster.

  • Fuxu Liu is currently a Algorithm Engineer at ReadSense based in the China. Kaggle Competition Grandmaster. Z by HP & NVIDIA Global Data Science Ambassador.

  • Daishu is currently a Senior Research Scientist at Galixir. Kaggle Competition Grandmaster.

Methods

Overview of Methods

Image-to-cell augmentation module

We used two methods to train and make predictions in our pipeline.

Firstly, we use 512 x 512 image size to train and test. For predicting, we loop n times for each image (n is the number of cells in the image), leaving only one cell in each time and masking out the other cells to get single cell predictions.

The second method is trained with 768 x 786 images with random crop to 512 x 512 then tested almost the same way as our first approach. Specifically, we not only mask out the other cells but reposition of the cells in the left to the center of the image as well.

The two methods share the same training process, in which we incorporate two augmentation approach specifically designed for this task, in addition to regular augmentation methods such as random rotation, flipping, cropping, cutout and brightness adjusting. The first augmentation approach is, with a small probability, multiplying the data of the green channel (protein) by a random number in the range of [0.0,0.1] while setting the label to negative to improve the model's ability to recognize negative samples. The other augmentation approach is, with a small probability, setting the green channel to red (Microtubules) or yellow (Endoplasmicreticulum), multiplying it by a random number in the range of [0.6,1.0] and changing the label to the Microtubules or Endoplasmicreticulum.

pseudo-3D cell augmentation module

We pre-crop all the cells of each image and save them locally. Then during training, for each image we randomly select 16 cells. We then set bs=32, so for each batch we have 32x16=512 cells in total.

We resize each cell to 128x128, so the returned data shape from the dataloader is (32, 16, 4, 128, 128) . Next we reshape it into (512, 4, 128, 128) and then use a very common CNN to forward it, the output shape is (512, 19).

In the prediction phase we use the predicted average of different augmented images of a cell as the predicted value for each cell. But during the training process, we rereshape this (512, 19) prediction back into (32, 16, 19) . Then the loss is calculated for each cell with image-level GT label.

Featurziation with deep neural network

We use multipe CNN variants to train, such as EfficientNet, ResNet, DenseNet.

Classification

We average the different model predictions from different methods.

Tree-Structured Directory

├── input

│   ├──hpa-512: 512-image and 512-cell mask

│   │   ├── test

│   │   ├── test_cell_mask

│   │   ├── train

│   │   └── train_cell_mask

│   ├── hpa-seg : official segmentation models

│   └── hpa-single-cell-image-classification : official data and kaggle_2021.tsv

├── output : logs, models and submission

Code

  • S1_external_data_download.py: download external train data

  • S2_data_process.py: generate 512-image and 512-cell mask

  • S3_train_pipeline1.py: train image-to-cell augmentation module

  • S4.1_crop_cells.py: crop training cells for pseudo-3D cell augmentation module

  • S4.2_train_pipeline2.py: train pseudo-3D cell augmentation module

  • S5_predict.py: generate submission.csv

Owner
daishu
daishu
中文语音识别系列,读者可以借助它快速训练属于自己的中文语音识别模型,或直接使用预训练模型测试效果。

MASR中文语音识别(pytorch版) 开箱即用 自行训练 使用与训练分离(增量训练) 识别率高 说明:因为每个人电脑机器不同,而且有些安装包安装起来比较麻烦,强烈建议直接用我编译好的docker环境跑 目前docker基础环境为ubuntu-cuda10.1-cudnn7-pytorch1.6.

发送小信号 180 Dec 17, 2022
Codes for AAAI22 paper "Learning to Solve Travelling Salesman Problem with Hardness-Adaptive Curriculum"

Paper For more details, please see our paper Learning to Solve Travelling Salesman Problem with Hardness-Adaptive Curriculum which has been accepted a

14 Sep 30, 2022
Code for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"

Triple-cooperative Video Shadow Detection Code and dataset for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"[arXiv link] [official l

Zhihao Chen 24 Oct 04, 2022
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
[ICLR 2022] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

DAB-DETR This is the official pytorch implementation of our ICLR 2022 paper DAB-DETR. Authors: Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi

336 Dec 25, 2022
VM3000 Microphones

VM3000-Microphones This project was completed by Ricky Leman under the supervision of Dr Ben Travaglione and Professor Melinda Hodkiewicz as part of t

UWA System Health Lab 0 Jun 04, 2021
On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation On Nonlinear Latent Transformations for GAN-based Image Editi

Valentin Khrulkov 22 Oct 24, 2022
Learn other languages ​​using artificial intelligence with python.

The main idea of ​​the project is to facilitate the learning of other languages. We created a simple AI that will interact with you. Just ask questions that if she knows, she will answer.

Pedro Rodrigues 2 Jun 07, 2022
Conditional Gradients For The Approximately Vanishing Ideal

Conditional Gradients For The Approximately Vanishing Ideal Code for the paper: Wirth, E., and Pokutta, S. (2022). Conditional Gradients for the Appro

IOL Lab @ ZIB 0 May 25, 2022
Bayesian optimisation library developped by Huawei Noah's Ark Library

Bayesian Optimisation Research This directory contains official implementations for Bayesian optimisation works developped by Huawei R&D, Noah's Ark L

HUAWEI Noah's Ark Lab 395 Dec 30, 2022
DziriBERT: a Pre-trained Language Model for the Algerian Dialect

DziriBERT DziriBERT is the first Transformer-based Language Model that has been pre-trained specifically for the Algerian Dialect. It handles Algerian

117 Jan 07, 2023
Pytorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"

About this repository This repo contains an Pytorch implementation for the ACL 2017 paper Get To The Point: Summarization with Pointer-Generator Netwo

wxDai 7 Oct 14, 2022
Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta

88 Dec 28, 2022
Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation (RA-L/ICRA 2020)

Aerial Depth Completion This work is described in the letter "Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation", by Lucas

ETHZ V4RL 70 Dec 22, 2022
CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

This is the official repository of the paper: CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability A private copy of the

Fadi Boutros 33 Dec 31, 2022
An AutoML Library made with Optuna and PyTorch Lightning

An AutoML Library made with Optuna and PyTorch Lightning Installation Recommended pip install -U gradsflow From source pip install git+https://github.

GradsFlow 294 Dec 17, 2022
Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight

Revisiting RCAN: Improved Training for Image Super-Resolution Introduction Image super-resolution (SR) is a fast-moving field with novel architectures

Zudi Lin 76 Dec 01, 2022
Pytorch implementation of the paper SPICE: Semantic Pseudo-labeling for Image Clustering

SPICE: Semantic Pseudo-labeling for Image Clustering By Chuang Niu and Ge Wang This is a Pytorch implementation of the paper. (In updating) SOTA on 5

Chuang Niu 154 Dec 15, 2022
RL-GAN: Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation

RL-GAN: Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation RL-GAN is an official implementation of the paper: T

42 Nov 10, 2022
RETRO-pytorch - Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

RETRO - Pytorch (wip) Implementation of RETRO, Deepmind's Retrieval based Attent

Phil Wang 556 Jan 04, 2023