Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

Overview

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22)

Paper Link | Project Page

Abstract :

Manual annotation of large-scale point cloud dataset for varying tasks such as 3D object classification, segmentation and detection is often laborious owing to the irregular structure of point clouds. Self-supervised learning, which operates without any human labeling, is a promising approach to address this issue. We observe in the real world that humans are capable of mapping the visual concepts learnt from 2D images to understand the 3D world. Encouraged by this insight, we propose CrossPoint, a simple cross-modal contrastive learning approach to learn transferable 3D point cloud representations. It enables a 3D-2D correspondence of objects by maximizing agreement between point clouds and the corresponding rendered 2D image in the invariant space, while encouraging invariance to transformations in the point cloud modality. Our joint training objective combines the feature correspondences within and across modalities, thus ensembles a rich learning signal from both 3D point cloud and 2D image modalities in a self-supervised fashion. Experimental results show that our approach outperforms the previous unsupervised learning methods on a diverse range of downstream tasks including 3D object classification and segmentation. Further, the ablation studies validate the potency of our approach for a better point cloud understanding.

Citation

If you find our work, this repository, or pretrained models useful, please consider giving a star and citation.

@inproceedings{afham2022crosspoint,
    title={CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding}, 
    author={Mohamed Afham and Isuru Dissanayake and Dinithi Dissanayake and Amaya Dharmasiri and Kanchana Thilakarathna and Ranga Rodrigo},
    booktitle={IEEE/CVF International Conference on Computer Vision and Pattern Recognition},
    month = {June},
    year={2022}
  }

Dependencies

Refer requirements.txt for the required packages.

Pretrained Models

CrossPoint pretrained models with DGCNN feature extractor are available here.

Download data

Datasets are available here. Run the command below to download all the datasets (ShapeNetRender, ModelNet40, ScanObjectNN, ShapeNetPart) to reproduce the results.

cd data
source download_data.sh

Train CrossPoint

Refer scripts/script.sh for the commands to train CrossPoint.

Downstream Tasks

1. 3D Object Classification

Run eval_ssl.ipynb notebook to perform linear SVM object classification in both ModelNet40 and ScanObjectNN datasets.

2. Few-Shot Object Classification

Refer scripts/fsl_script.sh to perform few-shot object classification.

3. 3D Object Part Segmentation

Refer scripts/script.sh for fine-tuning experiment for part segmentation in ShapeNetPart dataset.

Acknowledgements

Our code borrows heavily from DGCNN repository. We thank the authors of DGCNN for releasing their code. If you use our model, please consider citing them as well.

Owner
Mohamed Afham
Electronics and Telecommunication Engineering Undergraduate | Passionate in Deep Learning Research
Mohamed Afham
Solution of Kaggle competition: Sartorius - Cell Instance Segmentation

Sartorius - Cell Instance Segmentation https://www.kaggle.com/c/sartorius-cell-instance-segmentation Environment setup Build docker image bash .dev_sc

68 Dec 09, 2022
AnimationKit: AI Upscaling & Interpolation using Real-ESRGAN+RIFE

ALPHA 2.5: Frostbite Revival (Released 12/23/21) Changelog: [ UI ] Chained design. All steps link to one another! Use the master override toggles to s

87 Nov 16, 2022
Scripts used to make and evaluate OpenAlex's concept tagging model

openalex-concept-tagging This repository contains all of the code for getting the concept tagger up and running. To learn more about where this model

OurResearch 18 Dec 09, 2022
Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

V-MPO Simple code to demonstrate Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) in Pyt

Nugroho Dewantoro 9 Jun 06, 2022
A synthetic texture-invariant dataset for object detection of UAVs

A synthetic dataset for object detection of UAVs This repository contains a synthetic datasets accompanying the paper Sim2Air - Synthetic aerial datas

LARICS Lab 10 Aug 13, 2022
potpourri3d - An invigorating blend of 3D geometry tools in Python.

A Python library of various algorithms and utilities for 3D triangle meshes and point clouds. Managed by Nicholas Sharp, with new tools added lazily as needed. Currently, mainly bindings to C++ tools

Nicholas Sharp 295 Jan 05, 2023
Collective Multi-type Entity Alignment Between Knowledge Graphs (WWW'20)

CG-MuAlign A reference implementation for "Collective Multi-type Entity Alignment Between Knowledge Graphs", published in WWW 2020. If you find our pa

Bran Zhu 28 Dec 11, 2022
ShapeGlot: Learning Language for Shape Differentiation

ShapeGlot: Learning Language for Shape Differentiation Created by Panos Achlioptas, Judy Fan, Robert X.D. Hawkins, Noah D. Goodman, Leonidas J. Guibas

Panos 32 Dec 23, 2022
Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Collapse by Conditioning: Training Class-conditional GANs with Limited Data Moha

Mohamad Shahbazi 33 Dec 06, 2022
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Mozhdeh Gheini 16 Jul 16, 2022
2021 Artificial Intelligence Diabetes Datathon

A.I.D.D. 2021 2021 Artificial Intelligence Diabetes Datathon A.I.D.D. 2021은 ‘2021 인공지능 학습용 데이터 구축사업’을 통해 만들어진 학습용 데이터를 활용하여 당뇨병을 효과적으로 예측할 수 있는가에 대한 A

2 Dec 27, 2021
Implementation of a protein autoregressive language model, but with autoregressive infilling objective (editing subsequences capability)

Protein GLM (wip) Implementation of a protein autoregressive language model, but with autoregressive infilling objective (editing subsequences capabil

Phil Wang 17 May 06, 2022
Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers Results results on COCO val Backbone Method Lr Schd PQ Config Download

155 Dec 20, 2022
Continual reinforcement learning baselines: experiment specifications, implementation of existing methods, and common metrics. Easily extensible to new methods.

Continual Reinforcement Learning This repository provides a simple way to run continual reinforcement learning experiments in PyTorch, including evalu

55 Dec 24, 2022
Accelerating BERT Inference for Sequence Labeling via Early-Exit

Sequence-Labeling-Early-Exit Code for ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit Requirement: Please refer to re

李孝男 23 Oct 14, 2022
OpenMMLab Text Detection, Recognition and Understanding Toolbox

Introduction English | 简体中文 MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the correspondi

OpenMMLab 3k Jan 07, 2023
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
NeoPlay is the project dedicated to ESport events.

NeoPlay is the project dedicated to ESport events. On this platform users can participate in tournaments with prize pools as well as create their own tournaments.

3 Dec 18, 2021
Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility ICCV2021

Vis2Mesh This is the offical repository of the paper: Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Lear

71 Dec 25, 2022
Dynamic Environments with Deformable Objects (DEDO)

DEDO - Dynamic Environments with Deformable Objects DEDO is a lightweight and customizable suite of environments with deformable objects. It is aimed

Rika 32 Dec 22, 2022