Points to patches: Enabling the use of self-attention for 3D shape recognition

Axel Berg, Magnus Oskarsson, Mark O'Connor

Research output: Chapter in Book/Report/Conference proceeding › Paper in conference proceeding › peer-review

Abstract

While the Transformer architecture has become ubiquitous in the machine learning field, its adaptation to 3D shape recognition is non-trivial. Due to its quadratic computational complexity, the self-attention operator quickly becomes inefficient as the set of input points grows larger. Furthermore, we find that the attention mechanism struggles to find useful connections between individual points on a global scale. To alleviate these problems, we propose a two-stage Point Transformer-in-Transformer (Point-TnT) approach which combines local and global attention mechanisms, enabling both individual points and patches of points to attend to each other effectively. Experiments on shape classification show that such an approach provides more useful features for downstream tasks than the baseline Transformer, while also being more computationally efficient. In addition, we extend our method to feature matching for scene reconstruction, showing that it can be used in conjunction with existing scene reconstruction pipelines.
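
The quadratic cost mentioned in the abstract can be made concrete with a rough count of attention scores. The sketch below is illustrative only and is not taken from the paper: it assumes the points are split into equally sized, non-overlapping patches, that local attention runs inside each patch, and that global attention runs over one token per patch. The function names and the example sizes (1024 points, 64 patches of 16 points) are hypothetical.

```python
def global_attention_pairs(num_points: int) -> int:
    """Attention scores computed when every point attends to every point."""
    return num_points * num_points


def two_stage_attention_pairs(num_patches: int, patch_size: int) -> int:
    """Attention scores for a local-then-global scheme.

    Assumption (not from the abstract): patches are equally sized and each
    patch contributes a single token to the global stage.
    """
    local_pairs = num_patches * patch_size * patch_size  # within each patch
    global_pairs = num_patches * num_patches             # between patch tokens
    return local_pairs + global_pairs


if __name__ == "__main__":
    # Hypothetical sizes: 1024 points split into 64 patches of 16 points.
    print(global_attention_pairs(1024))       # 1048576
    print(two_stage_attention_pairs(64, 16))  # 20480
```

Under these assumed sizes the two-stage scheme computes about 51 times fewer attention scores than full global attention, which is consistent with the efficiency claim in the abstract but says nothing about the actual configuration used in the paper.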
Original language: English
Title of host publication: 2022 26th International Conference on Pattern Recognition (ICPR)
Publisher: IEEE - Institute of Electrical and Electronics Engineers Inc.
Pages: 528-534
Number of pages: 7
ISBN (Electronic): 978-1-6654-9062-7
ISBN (Print): 978-1-6654-9062-7
Publication status: Published - 2022 Aug 21
Event: 26th International Conference on Pattern Recognition, 2022 - Montreal, Canada
Duration: 2022 Aug 21 to 2022 Aug 25

Publication series

Name: International Conference on Pattern Recognition
Publisher: IEEE
ISSN (Print): 1051-4651
ISSN (Electronic): 2831-7475

Conference

Conference: 26th International Conference on Pattern Recognition, 2022
Abbreviated title: ICPR 2022
Country/Territory: Canada
City: Montreal
Period: 2022/08/21 to 2022/08/25

Subject classification (UKÄ)

  • Computer Vision and Robotics (Autonomous Systems)
