-
SCN: Switchable Context Network for Semantic Segmentation of RGB-D Images
Abstract: Context representations have been widely used to benefit semantic image segmentation. The emergence of depth data provides additional information for constructing more discriminative context representations. Depth data preserves the geometric relationships of objects in a scene, which are generally hard to infer from RGB images. While deep convolutional neural networks (CNNs) have been successful in semantic segmentation, it remains a problem to optimize CNN training so that depth data yields informative context that enhances segmentation accuracy. In this paper, we present a novel switchable context network (SCN) to facilitate semantic segmentation of RGB-D images. Depth data is used to identify objects that span multiple image regions. The network analyzes the information in the image regions to identify different characteristics, which are then used selectively through switching network branches. With the content extracted from the inherent image structure, we are able to generate effective context representations that are aware of both image structures and object relationships, leading to more coherent learning of the semantic segmentation network. We demonstrate that our SCN outperforms state-of-the-art methods on two public datasets.
Keywords: Context representation, convolutional neural network (CNN), RGB-D images, semantic segmentation
Updated 2025-09-23 15:23:52
-
[ACM Press the 2nd International Conference - Tianjin, China (2018.09.19-2018.09.21)] Proceedings of the 2nd International Conference on Biomedical Engineering and Bioinformatics - ICBEB 2018 - 3D Human Pose Estimation from RGB+D Images with Convolutional Neural Networks
Abstract: In this paper, we explore 3D human pose estimation on RGB+D images. While many researchers try to predict 3D pose directly from a single RGB image, we propose a simple framework that predicts 3D pose from an RGB image together with a depth image. Our approach is based on two aspects. On the one hand, we predict accurate 2D joint locations from the RGB image by applying stacked hourglass networks based on an improved residual architecture. On the other hand, given the obtained 2D joint locations, we estimate the 3D pose from the depth after computing depth image patches. In general, compared with state-of-the-art approaches, our model achieves significant improvement on a benchmark dataset.
Keywords: Deep Learning, Human Pose Estimation, RGB+D Images
Updated 2025-09-23 15:23:52
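The abstract above does not spell out how depth patches turn 2D joints into 3D positions. As a minimal sketch of one common way to do it, and not the authors' implementation, the snippet below takes the median depth in a small patch around a joint and back-projects it with a pinhole camera model. The function names, the patch size, the intrinsics `fx`, `fy`, `cx`, `cy`, and the convention that a depth of 0 marks a missing reading are all assumptions made for illustration.

```python
from statistics import median


def joint_depth(depth, u, v, half=2):
    """Median of valid depths in a (2*half+1)^2 patch centred on pixel (u, v)."""
    rows, cols = len(depth), len(depth[0])
    values = [
        depth[r][c]
        for r in range(max(0, v - half), min(rows, v + half + 1))
        for c in range(max(0, u - half), min(cols, u + half + 1))
        if depth[r][c] > 0  # assumption: 0 marks a missing depth reading
    ]
    return median(values) if values else None


def lift_joint(depth, u, v, fx, fy, cx, cy, half=2):
    """Back-project a 2D joint (u, v) to camera coordinates (X, Y, Z)."""
    z = joint_depth(depth, u, v, half)
    if z is None:
        return None  # no valid depth anywhere in the patch
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


# Toy 5x5 depth map (metres) with a hole exactly at the joint pixel.
depth_map = [[2.0] * 5 for _ in range(5)]
depth_map[2][3] = 0.0
joint_3d = lift_joint(depth_map, u=3, v=2, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(joint_3d)  # (1.0, 0.0, 2.0)
```

Using the patch median rather than the single pixel under the joint makes the estimate tolerant of holes and isolated noisy readings, which are typical of consumer depth sensors.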
-
Holistic and local patch framework for 6D object pose estimation in RGB-D images
Abstract: 6D object pose estimation is a challenging and important problem in computer vision and in many practical applications. In this paper, we present a novel framework for 6D object pose estimation in RGB-D images. In contrast to recent holistic or local patch-based methods, we combine holistic and local patches to accomplish this task. The proposed method has three stages: holistic patch extraction, local patch regression, and 6D pose refinement. In the first stage, we employ an existing convolutional neural network, trained with synthetically rendered data, to roughly predict the location of the target object and extract holistic patches. In the second stage, an improved Convolutional Auto-Encoder (CAE) is employed to learn a condensed feature representation of each local patch, and a coarse 6D object pose is estimated by regression with feature voting. Finally, we utilize Particle Swarm Optimization (PSO) to refine the 6D object pose. The proposed method is evaluated on three challenging public datasets that test performance under background clutter, foreground occlusion, and multiple-instance conditions. Moreover, we provide extensive experiments on various parameters of the framework, such as the dimension of the local patch feature and some parameters of PSO. Several experimental results demonstrate that the proposed method outperforms some other state-of-the-art methods.
Keywords: Convolutional neural network, Particle swarm optimization, Local patch, 6D object pose estimation, Holistic patch, RGB-D images
Updated 2025-09-19 17:15:36
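The final stage above refines a coarse 6D pose with Particle Swarm Optimization. The sketch below shows a generic, textbook PSO searching a box around a coarse estimate; it is not the paper's refinement procedure. The function `pso_refine`, its parameters, and the toy quadratic cost are hypothetical. A real system would score a candidate pose by comparing a rendering of the object against the observed RGB-D data.

```python
import random


def pso_refine(cost, coarse, radius, n_particles=30, n_iters=200,
               inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise cost(pose) inside a box of half-width radius around coarse."""
    rng = random.Random(seed)
    dim = len(coarse)
    lower = [x - radius for x in coarse]
    upper = [x + radius for x in coarse]
    # Particle 0 starts at the coarse pose, so the result is never worse than it.
    pos = [list(coarse)] + [
        [rng.uniform(lower[d], upper[d]) for d in range(dim)]
        for _ in range(n_particles - 1)
    ]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Keep every particle inside the search box.
                pos[i][d] = min(upper[d], max(lower[d], pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


# Toy stand-in for a pose cost: squared distance to a known 6-vector
# (3 translation + 3 rotation parameters, purely illustrative).
target = [0.1, -0.2, 0.05, 0.3, -0.1, 0.2]


def toy_cost(pose):
    return sum((p - t) ** 2 for p, t in zip(pose, target))


coarse_pose = [0.0] * 6
refined_pose, refined_cost = pso_refine(toy_cost, coarse_pose, radius=0.5)
print(toy_cost(coarse_pose), refined_cost)
```

Seeding one particle at the coarse pose is a design choice of this sketch: it guarantees the refined cost cannot exceed the cost of the coarse estimate.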
-
Naked Eye Pseudo 3D Display Technology Outside the Screen
Abstract: 3D display technology based on motion parallax has the advantages of requiring no glasses and imposing no limitation on the viewing angle. However, this method currently has two problems: 3D scenes and models can only be displayed within the screen, and the system is sensitive to light. For the first problem, we propose the concept of a virtual bezel, which lets our system produce a strong illusion that objects are displayed outside the screen and greatly increases the visual impact of the 3D display. For the problem of light sensitivity, unlike other systems that use an RGB camera, we use RGB-D images to detect and track the viewer’s face, which makes our system work properly even at night.
Keywords: Kinect, RGB-D images, 3D display technology, motion parallax, virtual bezel
Updated 2025-09-11 14:15:04
-
[IEEE 2018 24th International Conference on Pattern Recognition (ICPR) - Beijing, China (2018.8.20-2018.8.24)] 2018 24th International Conference on Pattern Recognition (ICPR) - Multimodal Face Spoofing Detection via RGB-D Images
Abstract: While it has been shown that using 3D information might significantly benefit face anti-spoofing systems, traditional color images are still generally used, due to issues such as expensive hardware requirements, high time cost, or poor accessibility when obtaining and using true 3D images. Instead, we can use RGB-D images captured by relatively low-cost sensors, e.g., Kinect cameras, to achieve better performance without consuming a huge amount of time or money. This research presents a novel multimodal face anti-spoofing method that makes full use of the information available in RGB-D images and needs no manually chosen regions. For every pair of RGB-D images, we first calculate the correlation between the color and depth images to detect multimodal properties; then, by analyzing the consistency of subregions extracted from the depth image, we are able to distinguish flat spoofing faces from genuine human beings. Both anti-spoofing features are fused to make the final anti-spoofing decision. Experiments on both self-collected and public 3DMAD datasets show that our proposed approach is effective in intra-dataset and cross-dataset testing scenarios, and that our method can deal with different presentation attacks carried out with photos, tablet screens, and face masks.
Keywords: presentation attack detection, depth consistency, RGB-D images, face anti-spoofing, multimodal correlation
Updated 2025-09-09 09:28:46
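The abstract above separates flat spoofing faces from genuine ones by analyzing depth subregions, without giving the exact feature. The sketch below is one simple reading of that idea, labeled as an assumption and not the authors' feature: split a face depth crop into a grid, average each cell, and treat a very small spread of cell means as "flat". The grid size, the millimetre units, and the threshold are hypothetical, and a tilted photo would need a plane fit instead of a plain spread.

```python
from statistics import mean, pstdev


def subregion_means(depth, grid=3):
    """Mean depth of each cell in a grid x grid split of the face crop.

    Assumes the crop has at least `grid` rows and columns, so no cell is empty.
    """
    rows, cols = len(depth), len(depth[0])
    means = []
    for gr in range(grid):
        for gc in range(grid):
            cell = [
                depth[r][c]
                for r in range(gr * rows // grid, (gr + 1) * rows // grid)
                for c in range(gc * cols // grid, (gc + 1) * cols // grid)
            ]
            means.append(mean(cell))
    return means


def looks_flat(depth, grid=3, threshold=5.0):
    """True when cell means barely vary (threshold in mm is a made-up value)."""
    return pstdev(subregion_means(depth, grid)) < threshold


# Toy crops in millimetres: a frontal printed photo versus a face whose
# central region (the nose) is 40 mm closer to the sensor.
photo = [[800.0] * 6 for _ in range(6)]
face = [row[:] for row in photo]
for r in (2, 3):
    for c in (2, 3):
        face[r][c] = 760.0

print(looks_flat(photo), looks_flat(face))  # True False
```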
-
[IEEE NAECON 2018 - IEEE National Aerospace and Electronics Conference - Dayton, OH, USA (2018.7.23-2018.7.26)] NAECON 2018 - IEEE National Aerospace and Electronics Conference - Real-time 3D scene reconstruction and localization with surface optimization
Abstract: A real-time 3D scene reconstruction and localization system with surface optimization is proposed. The dense 3D point cloud model is created by applying rotation- and orientation-invariant feature matching, along with a loop-closure detection algorithm, to RGB-D images on a mobile robot. The high-resolution, smooth mesh model is implemented on a GPU-based computer through wireless communication.
Keywords: localization, RGB-D images, surface optimization, mobile robot, GPU, 3D scene reconstruction
Updated 2025-09-04 15:30:14