Research Objective
To incorporate shape information, specifically mean curvature, into convolutional neural networks for 3D object recognition, improving accuracy by recovering the continuous shape detail that voxel representations lose.
Research Findings
Incorporating mean curvature into voxel CNNs consistently improves 3D object recognition accuracy, achieving a 1% increase on ModelNet10 and slight improvements on ModelNet40 without modifying network hyperparameters. The method retains shape information that is lost in voxelization, which helps discriminate objects by their local curvature features. Future work could extend this to 2D projection-based CNNs for higher-resolution applications.
Limitations
The method is limited by the scale-variance of curvature, as curvature values depend on object size, and the voxel representation itself is not scale-invariant. Performance may degrade for classes with similar curvature profiles, such as nightstand and dresser, or in datasets with wide size variations like ModelNet40. The computation relies on finely tessellated meshes for accurate curvature estimation, which may not be available for all objects.
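The scale dependence noted above can be made explicit with a short derivation. The symbols here are introduced only for illustration and are not taken from the paper: $\kappa_1, \kappa_2$ are the principal curvatures, $H$ is the mean curvature, $s > 0$ is a uniform scale factor, and $r$ is a sphere radius.

```latex
H = \tfrac{1}{2}\,(\kappa_1 + \kappa_2),
\qquad
\mathbf{x} \mapsto s\,\mathbf{x}
\;\Longrightarrow\;
\kappa_i \mapsto \frac{\kappa_i}{s}
\;\Longrightarrow\;
H \mapsto \frac{H}{s},
\qquad
|H_{\text{sphere}}| = \frac{1}{r}.
```

Doubling the size of an object therefore halves its mean curvature everywhere, so two models of the same shape at different sizes yield different curvature features.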
1:Experimental Design and Method Selection:
The study uses a voxel-based CNN architecture (Octnet) augmented with mean curvature features. Mean curvature is computed per vertex from triangle meshes using a method from prior work, and integrated into the voxel grid for training. The approach is designed to be rotation-invariant and requires no modification to the network configuration.
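The summary does not say which curvature estimator from prior work is used. As a rough illustration only, the sketch below uses the cotangent-Laplacian discretization with barycentric vertex areas, a common choice for triangle meshes. The function name `mean_curvature_per_vertex` and the octahedron test mesh are hypothetical and do not come from the paper.

```python
import numpy as np


def mean_curvature_per_vertex(vertices, faces):
    """Estimate the magnitude of mean curvature at each mesh vertex.

    Assumption: cotangent-Laplacian estimator with barycentric areas,
    swapped in because the source does not name its estimator.
    """
    V = np.asarray(vertices, dtype=float)
    F = np.asarray(faces, dtype=int)
    laplacian = np.zeros_like(V)  # sum_j w_ij * (x_i - x_j) per vertex
    area = np.zeros(len(V))       # one third of each adjacent triangle

    for k in range(3):
        i, j, o = F[:, k], F[:, (k + 1) % 3], F[:, (k + 2) % 3]
        # Cotangent of the angle at vertex o, which lies opposite edge (i, j).
        u, v = V[i] - V[o], V[j] - V[o]
        cross_norm = np.linalg.norm(np.cross(u, v), axis=1)
        cot = np.einsum("ij,ij->i", u, v) / np.maximum(cross_norm, 1e-12)
        edge = cot[:, None] * (V[i] - V[j])
        np.add.at(laplacian, i, edge)
        np.add.at(laplacian, j, -edge)

    tri_area = 0.5 * np.linalg.norm(
        np.cross(V[F[:, 1]] - V[F[:, 0]], V[F[:, 2]] - V[F[:, 0]]), axis=1
    )
    for k in range(3):
        np.add.at(area, F[:, k], tri_area / 3.0)

    # laplacian / (2 * area) is the mean curvature normal; its length is 2|H|.
    normal = laplacian / (2.0 * np.maximum(area, 1e-12))[:, None]
    return 0.5 * np.linalg.norm(normal, axis=1)


# Hypothetical test mesh: an octahedron with all vertices on the unit sphere.
OCTA_VERTICES = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]],
    dtype=float,
)
OCTA_FACES = np.array(
    [[0, 2, 4], [2, 1, 4], [1, 3, 4], [3, 0, 4],
     [2, 0, 5], [1, 2, 5], [3, 1, 5], [0, 3, 5]]
)
```

Because the estimate depends only on edge vectors and angles, rotating the mesh leaves it unchanged. Scaling the octahedron by 2 halves every value, which is the scale dependence discussed under the limitations.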
2:Sample Selection and Data Sources:
The ModelNet10 and ModelNet40 datasets are used, consisting of CAD models provided as triangle meshes. ModelNet10 has 3991 training and 908 test shapes across 10 categories; ModelNet40 has 9843 training and 2468 test shapes across 40 categories.
3:List of Experimental Equipment and Materials:
Computational resources for running deep learning experiments, including software for curvature computation and CNN training. Specific equipment not detailed in the paper.
4:Experimental Procedures and Operational Workflow:
Precompute mean curvature per vertex from the triangle mesh data. Convert the mesh into a voxel grid with 64x64x64 resolution. Augment Octnet by replacing the binary occupancy grid with the curvature values. Train the network using cross-entropy loss for 20 epochs with a learning rate of 0.001.
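The voxelization step can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's pipeline: it bins mesh vertices only (reasonable for finely tessellated meshes, but it does not rasterize triangle interiors), it averages the curvature of all vertices that fall into one voxel, and the name `curvature_voxel_grid` is hypothetical.

```python
import numpy as np


def curvature_voxel_grid(vertices, curvature, resolution=64):
    """Build a dense grid holding the mean per-vertex curvature in each voxel.

    Assumption: only vertices are binned; empty voxels are set to 0.
    Returns the curvature grid and a boolean occupancy mask, since a value
    of 0 alone cannot distinguish an empty voxel from a flat surface.
    """
    V = np.asarray(vertices, dtype=float)
    c = np.asarray(curvature, dtype=float)

    # Scale uniformly so the longest bounding-box side spans the grid.
    lo = V.min(axis=0)
    extent = (V.max(axis=0) - lo).max()
    extent = extent if extent > 0 else 1.0
    idx = np.floor((V - lo) / extent * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)

    shape = (resolution, resolution, resolution)
    total = np.zeros(shape)
    count = np.zeros(shape)
    cells = (idx[:, 0], idx[:, 1], idx[:, 2])
    np.add.at(total, cells, c)    # accumulate curvature per voxel
    np.add.at(count, cells, 1.0)  # count vertices per voxel

    occupied = count > 0
    grid = np.zeros(shape)
    grid[occupied] = total[occupied] / count[occupied]
    return grid, occupied
```

The resulting grid would stand in for the binary occupancy input. How Octnet consumes it (for example, conversion to its octree format) is not described in this summary and is left out here.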
5. Data Analysis Methods: Accuracy is measured as the overall classification accuracy on the test sets. Confusion matrices are generated to analyze per-class performance. Comparisons are made with baseline methods like Voxnet and original Octnet.
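The evaluation metrics described here can be computed as in the following minimal sketch. It assumes integer class labels in the range 0 to `num_classes - 1`; the function names are illustrative and not taken from the paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def overall_accuracy(cm):
    """Fraction of all test shapes that were classified correctly."""
    return np.trace(cm) / cm.sum()


def per_class_accuracy(cm):
    """Correct predictions per true class; 0 for classes with no samples."""
    row_totals = cm.sum(axis=1)
    return np.divide(
        np.diag(cm), row_totals, out=np.zeros(len(cm)), where=row_totals > 0
    )
```

Off-diagonal entries show which class pairs are confused with each other, which is how a pair such as nightstand and dresser would stand out.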