Joint Learning of 3D Shape Retrieval and Deformation, CVPR 2021

Last update: Oct 18, 2022

Overview

Joint Learning of 3D Shape Retrieval and Deformation

Joint Learning of 3D Shape Retrieval and Deformation

Mikaela Angelina Uy, Vladimir G. Kim, Minhyuk Sung, Noam Aigerman, Siddhartha Chaudhuri and Leonidas Guibas

CVPR 2021

Introduction

We propose a novel technique for producing high-quality 3D models that match a given target object image or scan. Our method is based on retrieving an existing shape from a database of 3D models and then deforming its parts to match the target shape. Unlike previous approaches that independently focus on either shape retrieval or deformation, we propose a joint learning procedure that simultaneously trains the neural deformation module along with the embedding space used by the retrieval module. This enables our network to learn a deformation-aware embedding space, so that retrieved models are more amenable to match the target after an appropriate deformation. In fact, we use the embedding space to guide the shape pairs used to train the deformation module, so that it invests its capacity in learning deformations between meaningful shape pairs. Furthermore, our novel part-aware deformation module can work with inconsistent and diverse part structures on the source shapes. We demonstrate the benefits of our joint training not only on our novel framework, but also on other state-of-the-art neural deformation modules proposed in recent years. Lastly, we also show that our jointly-trained method outperforms various non-joint baselines. Our project page can be found here, and the arXiv version of our paper can be found here.

@inproceedings{uy-joint-cvpr21,
      title = {Joint Learning of 3D Shape Retrieval and Deformation},
      author = {Mikaela Angelina Uy and Vladimir G. Kim and Minhyuk Sung and Noam Aigerman and Siddhartha Chaudhuri and Leonidas Guibas},
      booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      year = {2021}
  }

Data download and preprocessing details

Dataset downloads can be found in the links below. These should be extracted in the project home folder.

Raw source shapes are here.
Processed h5 and pickle files are here.
Targets:
- [Optional] (already processed in h5) point cloud
- Images: chair, table, cabinet. You also need to modify the correct path for IMAGE_BASE_DIR in the image training and evaluation scripts.
Automatic segmentation (ComplementMe)
- Source shapes are here.
- Processed h5 and pickle files are here.

For more details on the pre-processing scripts, please take a look at run_preprocessing.py and generate_combined_h5.py. run_preprocessing.py includes the details on how the connectivity constraints and projection matrices are defined. We use the keypoint_based constraint to define our source model constraints in the paper.

The renderer used throughout the project can be found here. Please modify the paths, including the input and output directories, accordingly at global_variables.py if you want to process your own data.

Pre-trained Models

The pretrained models for Ours and Ours w/ IDO, which uses our joint training approach can be found here. We also included the pretrained models of our structure-aware deformation-only network, which are trained on random source-target pairs used to initialize our joint training.

Evaluation

Example commands to run the evaluation script are as follows. The flags can be changed as desired. --mesh_visu renders the output results into images, remove the flag to disable the rendering. Note that --category is the object category and the values should be set to "chair", "table", "storagefurniture" for classes chair, table and cabinet, respectively.

For point clouds:

python evaluate.py --logdir=ours_ido_pc_chair/ --dump_dir=dump_ours_ido_pc_chair/ --joint_model=1 --use_connectivity=1 --use_src_encoder_retrieval=1 --category=chair --use_keypoint=1 --mesh_visu=1

python evaluate_recall.py --logdir=ours_ido_pc_chair/ --dump_dir=dump_ours_ido_pc_chair/ --category=chair

For images:

python evaluate_images.py --logdir=ours_ido_img_chair/ --dump_dir=dump_ours_ido_img_chair/ --joint_model=1 --use_connectivity=1 --category=chair --use_src_encoder_retrieval=1 --use_keypoint=1 --mesh_visu=1

python evaluate_images_recall.py --logdir=ours_ido_img_chair/ --dump_dir=dump_ours_ido_img_chair/ --category=chair

Training

To train deformation-only networks on random source-target pairs, example commands are as follows:

# For point clouds
python train_deformation_final.py --logdir=log/ --dump_dir=dump/ --to_train=1 --use_connectivity=1 --category=chair --use_keypoint=1 --use_symmetry=1

# For images
python train_deformation_images.py --logdir=log/ --dump_dir=dump/ --to_train=1 --use_connectivity=1 --category=storagefurniture --use_keypoint=1 --use_symmetry=1

To train our joint models without IDO (Ours), example commands are as follows:

# For point clouds
python train_region_final.py --logdir=log/ --dump_dir=dump/ --to_train=1 --init_deformation=1 --loss_function=regression --distance_function=mahalanobis --use_connectivity=1 --use_src_encoder_retrieval=1 --category=chair --model_init=df_chair_pc/ --selection=retrieval_candidates --use_keypoint=1 --use_symmetry=1

# For images
python train_region_images.py --logdir=log/ --dump_dir=dump/ --to_train=1 --use_connectivity=1 --selection=retrieval_candidates --use_src_encoder_retrieval=1 --category=chair --use_keypoint=1 --use_symmetry=1 --init_deformation=1 --model_init=df_chair_img/

To train our joint models with IDO (Ours w/ IDO), example commands are as follows:

# For point clouds
python joint_with_icp.py --logdir=log/ --dump_dir=dump/ --to_train=1 --loss_function=regression --distance_function=mahalanobis --use_connectivity=1 --use_src_encoder_retrieval=1 --category=chair --model_init=df_chair_pc/ --selection=retrieval_candidates --use_keypoint=1 --use_symmetry=1 --init_deformation=1 --use_icp_pp=1 --fitting_loss=l2

# For images
python joint_icp_images.py --logdir=log/ --dump_dir=dump/ --to_train=1 --init_joint=1 --loss_function=regression --distance_function=mahalanobis --use_connectivity=1 --use_src_encoder_retrieval=1 --category=chair --model_init=df_chair_img/ --selection=retrieval_candidates --use_keypoint=1 --use_symmetry=1 --init_deformation=1 --use_icp_pp=1 --fitting_loss=l2

Note that our joint training approach is used by setting the flag --selection=retrieval_candidates=1.

Related Work

This work and codebase is related to the following previous work:

Deformation-Aware Embedding 3D Model Embedding and Retrieval by Uy et al. (ECCV 2020).

License

This repository is released under MIT License (see LICENSE file for details).

Joint Learning of 3D Shape Retrieval and Deformation, CVPR 2021

Related tags

Overview

Joint Learning of 3D Shape Retrieval and Deformation

Introduction

Data download and preprocessing details

Pre-trained Models

Evaluation

Training

Related Work

License

Owner

Mikaela Uy

PyTorch implementation for the paper Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime

A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1

Exposure Time Calculator (ETC) and radial velocity precision estimator for the Near InfraRed Planet Searcher (NIRPS) spectrograph

Optical Character Recognition + Instance Segmentation for russian and english languages

Public Code for NIPS submission SimiGrad: Fine-Grained Adaptive Batching for Large ScaleTraining using Gradient Similarity Measurement

A computational optimization project towards the goal of gerrymandering the results of a hypothetical election in the UK.

Script that attempts to force M1 macs into RGB mode when used with monitors that are defaulting to YPbPr.

Attention mechanism with MNIST dataset

Artificial Intelligence playing minesweeper 🤖

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

Convert Table data to approximate values with GUI

Bianace Prediction Pytorch Model

A PyTorch implementation of QANet.

Use your Philips Hue lights as Racing Flags. Works with Assetto Corsa, Assetto Corsa Competizione and iRacing.

Breaching - Breaching privacy in federated learning scenarios for vision and text

An algorithmic trading bot that learns and adapts to new data and evolving markets using Financial Python Programming and Machine Learning.

The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color

Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data

Discerning Decision-Making Process of Deep Neural Networks with Hierarchical Voting Transformation

Unofficial PyTorch implementation of Neural Additive Models (NAM) by Agarwal, et al.