Research Objective
To develop a multi-information fusion network (MIFNet) for accurate sea-land segmentation in high-resolution remote sensing images, addressing the limitations of traditional threshold-based methods and existing CNN approaches by integrating multi-scale edges, multi-scale segmentation, and global context information.
Research Findings
MIFNet effectively integrates multi-scale edges, multi-scale segmentation, and global context information through network learning, achieving superior performance in sea-land segmentation compared to state-of-the-art methods. The fusion of edge information and global context enhances both segmentation accuracy and edge precision, demonstrating the importance of multi-information fusion in complex remote sensing scenes. Future work will focus on improving edge accuracy.
Research Limitations
The method may have high computational complexity due to the deep network architecture and fusion processes. It relies on manually labeled ground truth, which can be time-consuming and subjective. The dataset is limited to natural-colored images from Google Earth, potentially not generalizing to other types of remote sensing data or resolutions. Edge accuracy could be further improved, as noted for future work.
1. Experimental Design and Method Selection:
The study uses an end-to-end convolutional neural network (MIFNet) designed to fuse multi-scale edge maps, multi-scale segmentation results, and global context information. The network consists of three parts: a multi-task network that generates multi-scale edges and segmentation maps, a global network that incorporates global context via multi-scale sub-region pooling, and a fusion network that integrates all information through learned fusion.
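The global network's multi-scale sub-region pooling can be illustrated with a minimal NumPy sketch: a feature map is average-pooled over several grid partitions, each pooled map is upsampled back to the input size, and the results are concatenated as context channels. The function names and the grid sizes (1, 2, 4) here are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def subregion_pool(feat, grid):
    """Average-pool an (H, W, C) feature map over a grid x grid partition."""
    h, w, c = feat.shape
    out = np.zeros((grid, grid, c))
    for i in range(grid):
        for j in range(grid):
            ys = slice(i * h // grid, (i + 1) * h // grid)
            xs = slice(j * w // grid, (j + 1) * w // grid)
            out[i, j] = feat[ys, xs].mean(axis=(0, 1))
    return out

def global_context(feat, grids=(1, 2, 4)):
    """Multi-scale sub-region pooling: pool at several grid sizes, then
    upsample each pooled map back to (H, W) by nearest-neighbour repeat
    and concatenate along the channel axis. Assumes H and W are divisible
    by every grid size."""
    h, w, _ = feat.shape
    maps = []
    for g in grids:
        pooled = subregion_pool(feat, g)
        up = pooled.repeat(h // g, axis=0).repeat(w // g, axis=1)
        maps.append(up)
    return np.concatenate(maps, axis=2)
```

The grid-1 branch reduces to global average pooling, so the coarsest context channels are constant over the image; finer grids retain coarse spatial layout, which is what lets the fusion network weigh local edge evidence against scene-level context.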
2. Sample Selection and Data Sources:
The dataset comprises 230 natural-colored images collected from Google Earth with spatial resolution of 3-5m. 150 images are used for training, and 80 for validation and testing. Ground truth segmentation maps are labeled manually, and edge maps are derived automatically from segmentation ground truth.
3. List of Experimental Equipment and Materials:
No specific hardware or equipment is mentioned; the implementation uses software tools including the Caffe deep learning library.
4. Experimental Procedures and Operational Workflow:
Data augmentation is performed by cropping 384x384 samples from the original images at 150-pixel intervals and at random positions, followed by rotation and flipping, increasing the training set to 21,280 samples. The network is trained with the SGD optimizer, an initial learning rate of 0.01, momentum of 0.9, a batch size of 8, and learning rate reduction when the validation loss plateaus. Testing involves resizing images to 896x896 and using block processing for large images.
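The interval-based cropping step can be sketched as computing sliding-window top-left positions at a fixed stride, with a final crop flush against the image border so no pixels are missed. The 1024x1024 example size below is a hypothetical illustration, not the paper's stated image dimensions.

```python
def crop_positions(size, crop, stride):
    """Top-left coordinates for sliding-window crops of width `crop`
    taken every `stride` pixels, always including a final crop flush
    with the far border."""
    pos = list(range(0, size - crop + 1, stride))
    if pos[-1] != size - crop:
        pos.append(size - crop)
    return pos

# Hypothetical 1024x1024 source image, 384x384 crops, 150-pixel stride:
cols = crop_positions(1024, 384, 150)   # 6 positions per axis
n_crops = len(cols) ** 2                # 36 grid crops per image
```

With four rotations and flipping, each grid crop yields several augmented samples; together with the random crops this is how a small set of source images expands into tens of thousands of training patches.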
5. Data Analysis Methods:
Evaluation metrics include Precision, Recall, and F-score for segmentation, and edge precision (EP), edge recall (ER), and edge F-score (F-score-e) for edges. Metrics are computed from true positives, false positives, and false negatives, with edge correctness defined by distance thresholds (N=1 to 5).
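A minimal sketch of these metrics for binary masks: pixel-wise precision/recall/F-score for segmentation, and a tolerance-based edge precision where a predicted edge pixel counts as correct if it lies within N pixels of a ground-truth edge pixel. Chebyshev distance is an assumption here; the paper only specifies a distance threshold.

```python
import numpy as np

def f_score(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

def seg_metrics(pred, gt):
    """Pixel-wise Precision, Recall, and F-score for boolean masks."""
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r, f_score(p, r)

def edge_precision(pred_edge, gt_edge, n=3):
    """Fraction of predicted edge pixels within n pixels (Chebyshev,
    an assumed distance) of some ground-truth edge pixel. Edge recall
    (ER) is obtained by swapping the two arguments."""
    gt_pts = np.argwhere(gt_edge)
    pred_pts = np.argwhere(pred_edge)
    if len(pred_pts) == 0 or len(gt_pts) == 0:
        return 0.0
    hits = sum(
        1 for p in pred_pts
        if np.max(np.abs(gt_pts - p), axis=1).min() <= n
    )
    return hits / len(pred_pts)
```

F-score-e then follows as `f_score(EP, ER)`, evaluated at each tolerance N from 1 to 5.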