Towards Part-Based Understanding of RGB-D Scans

1 Technical University of Munich
2 Skolkovo Institute of Science and Technology
3 New York University
CVPR 2021

From an input RGB-D scan (left), we propose to detect objects in the scan and predict their complete part decompositions, a task we call semantic part completion: we predict the part masks of the complete object, inferring the part geometry of any missing or unobserved regions in the scan. To achieve this, we predict the part structure of each detected object and use it to drive a prediction of the complete part masks guided by geometric priors.

Abstract

Recent advances in 3D semantic scene understanding have shown impressive progress in 3D instance segmentation, enabling object-level reasoning about 3D scenes; however, a finer-grained understanding is required to enable interactions with objects and their functional understanding. Thus, we propose the task of part-based scene understanding of real-world 3D environments: from an RGB-D scan of a scene, we detect objects, and for each object predict its decomposition into geometric part masks, which composed together form the complete geometry of the observed object. We leverage an intermediary part graph representation to enable robust completion as well as building of part priors, which we use to construct the final part mask predictions. Our experiments demonstrate that guiding part understanding through part graph to part prior-based predictions significantly outperforms alternative approaches to the task of semantic part completion.
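To make the output representation concrete, here is a minimal sketch of the property stated in the abstract: the per-part masks of a detected object, composed together, form its complete geometry, including regions the scan never observed. This is an illustration, not the authors' implementation. The voxel-set representation, the function names `compose_parts` and `unobserved_voxels`, and the toy chair data are all assumptions made for this example.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]  # assumed representation: integer voxel coordinates


def compose_parts(part_masks: Dict[str, Set[Voxel]]) -> Set[Voxel]:
    """Union of all part masks, i.e. the complete object geometry."""
    complete: Set[Voxel] = set()
    for voxels in part_masks.values():
        complete |= voxels
    return complete


def unobserved_voxels(
    part_masks: Dict[str, Set[Voxel]], observed: Set[Voxel]
) -> Dict[str, Set[Voxel]]:
    """Per part, the predicted voxels that are absent from the observed scan."""
    return {name: voxels - observed for name, voxels in part_masks.items()}


# Hypothetical predicted part masks for a toy "chair" (made-up data).
part_masks = {
    "seat": {(x, 1, z) for x in range(2) for z in range(2)},
    "back": {(x, y, 0) for x in range(2) for y in (2, 3)},
    "leg": {(0, 0, 0)},
}

# Hypothetical partial observation: the whole seat and one voxel of the back.
observed = part_masks["seat"] | {(0, 2, 0)}

complete = compose_parts(part_masks)
completed = unobserved_voxels(part_masks, observed)
```

In this toy case the seat is fully observed, while most of the back and the entire leg exist only in the predicted part masks. Those are the regions that semantic part completion has to infer.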

Results

We evaluate our approach to semantic part completion on the ScanNet dataset, comparing against state-of-the-art approaches to part decomposition, including scan completion followed by part segmentation. Our approach produces more consistent and accurate part decompositions.

BibTeX


@inproceedings{bokhovkin2021towards,
  title={Towards Part-Based Understanding of RGB-D Scans},
  author={Bokhovkin, Alexey and Ishimtsev, Vladislav and Bogomolov, Emil and Zorin, Denis and Artemov, Alexey and Burnaev, Evgeny and Dai, Angela},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7484--7494},
  year={2021}
}