Open-World Entity Segmentation

Related tags

Text Data & NLPEntity
Overview

Open-World Entity Segmentation Project Website

Lu Qi*, Jason Kuen*, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia


This project provides an implementation for the paper "Open-World Entity Segmentation" based on Detectron2. Entity Segmentation is a segmentation task with the aim to segment everything in an image into semantically-meaningful regions without considering any category labels. Our entity segmentation models can perform exceptionally well in a cross-dataset setting where we use only COCO as the training dataset but we test the model on images from other datasets at inference time. Please refer to project website for more details and visualizations.


Installation

This project is based on Detectron2, which can be constructed as follows.

  • Install Detectron2 following the instructions. We are noting that our code is implemented in detectron2 commit version 28174e932c534f841195f02184dc67b941c65a67 and pytorch 1.8.
  • Setup the coco dataset including instance and panoptic annotations following the structure. The code of entity evaluation metric is saved in the file of modified_cocoapi. You can directly replace your compiled coco.py with modified_cocoapi/PythonAPI/pycocotools/coco.py.
  • Copy this project to /path/to/detectron2/projects/EntitySeg
  • Set the "find_unused_parameters=True" in distributed training of your own detectron2. You could modify it in detectron2/engine/defaults.py.

Data pre-processing

(1) Generate the entity information of each image by the instance and panoptic annotation. Please change the path of coco annotation files in the following code.

cd /path/to/detectron2/projects/EntitySeg/make_data
bash make_entity_mask.sh

(2) Change the generated entity information to the json files.

cd /path/to/detectron2/projects/EntitySeg/make_data
python3 entity_to_json.py

Training

To train model with 8 GPUs, run:

cd /path/to/detectron2
python3 projects/EntitySeg/train_net.py --config-file <projects/EntitySeg/configs/config.yaml> --num-gpus 8

For example, to launch entity segmentation training (1x schedule) with ResNet-50 backbone on 8 GPUs and save the model in the path "/data/entity_model". one should execute:

cd /path/to/detectron2
python3 projects/EntitySeg/train_net.py --config-file projects/EntitySeg/configs/entity_default.yaml --num-gpus 8 OUTPUT_DIR /data/entity_model

Evaluation

To evaluate a pre-trained model with 8 GPUs, run:

cd /path/to/detectron2
python3 projects/EntitySeg/train_net.py --config-file <config.yaml> --num-gpus 8 --eval-only MODEL.WEIGHTS model_checkpoint

Visualization

To visualize some image result of a pre-trained model, run:

cd /path/to/detectron2
python3 projects/EntitySeg/demo_result_and_vis.py --config-file <config.yaml> --input <input_path> --output <output_path> MODEL.WEIGHTS model_checkpoint MODEL.CONDINST.MASK_BRANCH.USE_MASK_RESCORE "True"

For example,

python3 projects/EntitySeg/demo_result_and_vis.py --config-file projects/EntitySeg/configs/entity_swin_lw7_1x.yaml --input /data/input/*.jpg --output /data/output MODEL.WEIGHTS /data/pretrained_model/R_50.pth MODEL.CONDINST.MASK_BRANCH.USE_MASK_RESCORE "True"

Pretrained weights of Swin Transformers

Use the tools/convert_swin_to_d2.py to convert the pretrained weights of Swin Transformers to the detectron2 format. For example,

pip install timm
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth
python tools/convert_swin_to_d2.py swin_tiny_patch4_window7_224.pth swin_tiny_patch4_window7_224_trans.pth

Pretrained weights of Segformer Backbone

Use the tools/convert_mit_to_d2.py to convert the pretrained weights of SegFormer Backbone to the detectron2 format. For example,

pip install timm
python tools/convert_mit_to_d2.py mit_b0.pth mit_b0_trans.pth

Results

We provide the results of several pretrained models on COCO val set. It is easy to extend it to other backbones. We first describe the results of using CNN backbone.

Method Backbone Sched Entity AP download
Baseline R50 1x 28.3 model | metrics
Ours R50 1x 29.8 model | metrics
Ours R50 3x 31.8 model | metrics
Ours R101 1x 31.0 model | metrics
Ours R101 3x 33.2 model | metrics
Ours R101-DCNv2 3x 35.5 model | metrics

The results of using transformer backbone as follows.The Mask Rescore indicates that we use mask rescoring in inference by setting MODEL.CONDINST.MASK_BRANCH.USE_MASK_RESCORE to True.

Method Backbone Sched Entity AP Mask Rescore download
Ours Swin-T 1x 33.0 34.6 model | metrics
Ours Swin-L-W7 1x 37.8 39.3 model | metrics
Ours Swin-L-W7 3x 38.6 40.0 model | metrics
Ours Swin-L-W12 3x TBD TBD model | metrics
Ours MiT-b0 1x 28.8 30.4 model | metrics
Ours MiT-b2 1x 35.1 36.6 model | metrics
Ours MiT-b3 1x 36.9 38.5 model | metrics
Ours MiT-b5 1x 37.2 38.7 model | metrics
Ours MiT-b5 3x TBD TBD model | metrics

Citing Ours

Consider to cite Open-World Entity Segmentation if it helps your research.

@inprocedings{qi2021open,
  title={Open World Entity Segmentation},
  author={Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia},
  booktitle={arxiv},
  year={2021}
}
Owner
DV Lab
Deep Vision Lab
DV Lab
Natural Language Processing Specialization

Natural Language Processing Specialization In this folder, Natural Language Processing Specialization projects and notes can be found. WHAT I LEARNED

Kaan BOKE 3 Oct 06, 2022
Semi-automated vocabulary generation from semantic vector models

vec2word Semi-automated vocabulary generation from semantic vector models This script generates a list of potential conlang word forms along with asso

9 Nov 25, 2022
Index different CKAN entities in Solr, not just datasets

ckanext-sitesearch Index different CKAN entities in Solr, not just datasets Requirements This extension requires CKAN 2.9 or higher and Python 3 Featu

Open Knowledge Foundation 3 Dec 02, 2022
A python gui program to generate reddit text to speech videos from the id of any post.

Reddit text to speech generator A python gui program to generate reddit text to speech videos from the id of any post. Current functionality Generate

Aadvik 17 Dec 19, 2022
This repo stores the codes for topic modeling on palliative care journals.

This repo stores the codes for topic modeling on palliative care journals. Data Preparation You first need to download the journal papers. bash 1_down

3 Dec 20, 2022
Machine Learning Course Project, IMDB movie review sentiment analysis by lstm, cnn, and transformer

IMDB Sentiment Analysis This is the final project of Machine Learning Courses in Huazhong University of Science and Technology, School of Artificial I

Daniel 0 Dec 27, 2021
SummerTime - Text Summarization Toolkit for Non-experts

A library to help users choose appropriate summarization tools based on their specific tasks or needs. Includes models, evaluation metrics, and datasets.

Yale-LILY 213 Jan 04, 2023
A paper list for aspect based sentiment analysis.

Aspect-Based-Sentiment-Analysis A paper list for aspect based sentiment analysis. Survey [IEEE-TAC-20]: Issues and Challenges of Aspect-based Sentimen

jiangqn 419 Dec 20, 2022
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model

GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub.

Nathan Cooper 2.3k Jan 01, 2023
Grover is a model for Neural Fake News -- both generation and detectio

Grover is a model for Neural Fake News -- both generation and detection. However, it probably can also be used for other generation tasks.

Rowan Zellers 856 Dec 24, 2022
Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

Kaizhi Qian 193 Dec 30, 2022
Open source annotation tool for machine learning practitioners.

doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ

7.1k Jan 01, 2023
CorNet Correlation Networks for Extreme Multi-label Text Classification

CorNet Correlation Networks for Extreme Multi-label Text Classification Prerequisites python==3.6.3 pytorch==1.2.0 torchgpipe==0.0.5 click==7.0 ruamel

Guangxu Xun 38 Dec 31, 2022
Code for "Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments".

Code for "Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments".

Yu Zhang 50 Nov 08, 2022
It analyze the sentiment of the user, whether it is postive or negative.

Sentiment-Analyzer-Tool It analyze the sentiment of the user, whether it is postive or negative. It uses streamlit library for creating this sentiment

Paras Patidar 18 Dec 17, 2022
Text Analysis & Topic Extraction on Android App user reviews

AndroidApp_TextAnalysis Hi, there! This is code archive for Text Analysis and Topic Extraction from user_reviews of Android App. Dataset Source : http

Fitrie Ratnasari 1 Feb 14, 2022
Random-Word-Generator - Generates meaningful words from dictionary with given no. of letters and words.

Random Word Generator Generates meaningful words from dictionary with given no. of letters and words. This might be useful for generating short links

Mohammed Rabil 1 Jan 01, 2022
BookNLP, a natural language processing pipeline for books

BookNLP BookNLP is a natural language processing pipeline that scales to books and other long documents (in English), including: Part-of-speech taggin

654 Jan 02, 2023
Installation, test and evaluation of Scribosermo speech-to-text engine

Scribosermo STT Setup Scribosermo is a LGPL licensed, open-source speech recognition engine to "Train fast Speech-to-Text networks in different langua

Florian Quirin 3 Jun 20, 2022
NLPShala , the best IDE for all Natural language processing tasks.

The revolutionary IDE for all NLP (Natural language processing) stuffs on the internet.

Abhi 3 Aug 08, 2021