oe1 (光电查) - Scientific Papers

14 results
  • [Lecture Notes in Computer Science] Euro-Par 2018: Parallel Processing Workshops Volume 11339 (Euro-Par 2018 International Workshops, Turin, Italy, August 27-28, 2018, Revised Selected Papers) || Modeling and Optimizing Data Transfer in GPU-Accelerated Optical Coherence Tomography

    Abstract: Signal processing of optical coherence tomography (OCT) has become a bottleneck for using OCT in medical and industrial applications. Recently, GPUs gained more importance as compute device to achieve video frame rate of 25 frames/s. Therefore, we develop a CUDA implementation of an OCT signal processing chain: We focus on reformulating the signal processing algorithms in terms of high-performance libraries like CUBLAS and CUFFT. Additionally, we use NVIDIA’s stream concept to overlap computations and data transfers. Performance results are presented for two Pascal GPUs and validated with a derived performance model. The model gives an estimate for the overall execution time for the OCT signal processing chain, including compute and transfer times.

    Keywords: GPU, OCT, CUDA, Performance model

    Updated: 2025-09-23 15:23:52

  • [IEEE IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - Valencia (2018.7.22-2018.7.27)] IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - A Car-Borne SAR System for Interferometric Measurements: Development Status and System Enhancements

    Abstract: Terrestrial radar systems are used operationally for area-wide measurement and monitoring of surface displacements on steep slopes, as prevalent in mountainous areas or also in open pit mines. One limitation of these terrestrial systems is the decreasing cross-range resolution with increasing distance of observation due to the limited antenna size of the real aperture radar or the limited synthetic aperture of the quasi-stationary SAR systems. Recently, we have conducted a first experiment using a car-borne SAR system at Ku-band, demonstrating the time-domain back-projection (TDBP) focusing capability for the FMCW case and single-pass interferometric capability of our experimental Ku-band car-borne SAR system. The cross-range spatial resolution provided by such a car-based SAR system is potentially independent from the distance of observation, given that an adequate sensor trajectory can be built. In this paper, we give (1) an overview of the updated system hardware (radar setup and high-precision combined INS/GNSS positioning and attitude determination), and (2) present SAR imagery obtained with the updated prototype Ku-band car-borne SAR system.

    Keywords: azimuth focusing, Ku-band, SAR imaging, ground-based SAR system, car-borne SAR, parallelization, SAR interferometry, GPU, CUDA, interferometry, CARSAR, Synthetic aperture radar (SAR)

    Updated: 2025-09-23 15:22:29

  • [IEEE 2018 IEEE International Conference on Imaging Systems and Techniques (IST) - Krakow, Poland (2018.10.16-2018.10.18)] 2018 IEEE International Conference on Imaging Systems and Techniques (IST) - Real-Time 3D Reconstruction in Minimally Invasive Surgery with Quasi-Dense Matching

    Abstract: In this work, a method for 3D reconstruction of Minimally Invasive Surgery data in real-time is presented. It is formulated on top of the already established framework of Quasi-Dense Matching, optimizing its components for speed. First, it recovers a set of sparse features, which are matched robustly. Then, 3D information is propagated in a spatial neighbourhood, until similarity reaches a predefined threshold, to cover a semi-dense portion of operating field domain. Matching on dense level is achieved with Zero Mean Normalized Cross Correlation metric to establish correspondences. The algorithm is able to recover disparity maps with relatively small error, while maintaining real-time performance.

    Keywords: CUDA, MIS, Disparity Estimation, Stereo Matching, 3D Reconstruction

    Updated: 2025-09-23 15:22:29
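
The Zero Mean Normalized Cross Correlation (ZNCC) metric named in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `zncc` and the representation of a patch as a flat list of intensities are assumptions made for illustration only.

```python
import math


def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-length patches.

    Patches are flat lists of pixel intensities (an assumption for this sketch).
    The result lies in [-1, 1]; 0.0 is returned when either patch is constant.
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and equal in length")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    # Subtract the mean so a uniform brightness offset does not affect the score.
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    # Normalize by both energies so a gain change does not affect the score.
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return 0.0 if denominator == 0.0 else numerator / denominator
```

Because the score ignores offset and gain differences between the two patches, it is a common choice when the left and right views are lit differently.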

  • GPU Acceleration of Clustered DPCM for Lossless Compression of Hyperspectral Images

    Abstract: With the development of remote sensing technology, spatial and spectral resolutions of hyperspectral images have become increasingly dense. In order to overcome difficulties in the storage, transmission and manipulation of hyperspectral images, an effective compression algorithm is requisite. The Clustered Differential Pulse Code Modulation (C-DPCM), which is a prediction-based hyperspectral lossless compression algorithm, can achieve a relatively high compression ratio, but its efficiency still requires improvement. This paper presents a parallel implementation of the C-DPCM algorithm on Graphics Processing Units (GPUs) with the Compute Unified Device Architecture (CUDA), which is a parallel computing platform and programming model developed by NVIDIA. Three optimization strategies are utilized to implement the C-DPCM algorithm in parallel, including a version that uses shared memory and registers, a version that employs multi-stream, and a version that uses multi-GPU. In addition, we studied how to assign all classes to each GPU to minimize the processing time. Finally, we reduced the compression time from approximately half an hour to an hour to several seconds, with almost no loss in accuracy.

    Keywords: C-DPCM, GPU, CUDA, Hyperspectral image lossless compression

    Updated: 2025-09-23 15:22:29
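
The prediction-based lossless idea behind DPCM can be sketched as follows. This is a deliberately simplified stand-in, not C-DPCM: the abstract above describes a clustered predictor, whereas this sketch assumes a plain previous-sample predictor, and the names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
def dpcm_encode(samples):
    """Replace each integer sample by its difference from the previous one.

    Assumption: the predictor is simply the previous sample (0 for the first),
    which is far simpler than the clustered predictor used by C-DPCM.
    """
    residuals = []
    previous = 0
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the residuals."""
    samples = []
    previous = 0
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples
```

For smoothly varying data the residuals are small integers, which an entropy coder can store in fewer bits than the raw samples; decoding recovers the input exactly.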

  • Parallel K-Means Clustering for Brain Cancer Detection Using Hyperspectral Images

    Abstract: The precise delineation of brain cancer is a crucial task during surgery. There are several techniques employed during surgical procedures to guide neurosurgeons in the tumor resection. However, hyperspectral imaging (HSI) is a promising non-invasive and non-ionizing imaging technique that could improve and complement the currently used methods. The HypErspectraL Imaging Cancer Detection (HELICoiD) European project has addressed the development of a methodology for tumor tissue detection and delineation exploiting HSI techniques. In this approach, the K-means algorithm emerged in the delimitation of tumor borders, which is of crucial importance. The main drawback is the computational complexity of this algorithm. This paper describes the development of the K-means clustering algorithm on different parallel architectures, in order to provide real-time processing during surgical procedures. This algorithm will generate an unsupervised segmentation map that, combined with a supervised classification map, will offer guidance to the neurosurgeon during the tumor resection task. We present parallel K-means clustering based on OpenMP, CUDA and OpenCL paradigms. These algorithms have been validated through an in-vivo hyperspectral human brain image database. Experimental results show that the CUDA version can achieve a speed-up of ~150× with respect to a sequential processing. The remarkable result obtained in this paper makes possible the development of a real-time classification system.

    Keywords: unsupervised clustering, brain cancer detection, Graphics Processing Units (GPUs), OpenCL, CUDA, K-means, OpenMP, hyperspectral imaging

    Updated: 2025-09-23 15:21:01
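
A sequential K-means loop, the kind of baseline the abstract above compares against, might look like the sketch below. It is not the paper's code: the function name `kmeans`, the deterministic initialisation from the first `k` points, and the fixed iteration count are all assumptions chosen to keep the example short and reproducible.

```python
def kmeans(points, k, iterations=20):
    """Cluster points (sequences of floats) into k groups.

    Assumptions for this sketch: centroids start at the first k points and
    the loop runs a fixed number of iterations instead of testing convergence.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point picks its nearest centroid.
        # Every point is handled independently of the others.
        for i, point in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(point, centroids[c])),
            )
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids
```

The assignment step has no dependencies between points, which is why it maps naturally onto one thread per pixel in OpenMP, CUDA, or OpenCL versions.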

  • Fatigue crack branching in laser melting deposited Ti–55511 alloy

    Abstract: With the improvement of GPU’s general computing capacity, the use of parallel computing to solve some difficult problems with large amount of data and intensive computing tasks has become the trend of the times. In GPU general computing, CUDA and OpenCL have been widely used and studied. However, the two parallel programming models generally exist the weakness that whose API is too close to the underlying hardware, which makes programming inefficient and is not suitable for the large-scale parallel tasks that require rapid implementation. OpenACC is a relatively advanced and simple programming language, which can achieve rapid parallelization, but the computing effect of the program is relatively low (generally lower than CUDA). Therefore, this paper tries to combine CUDA and OpenACC for mixed parallelization. This way not only greatly reduces the workload of code conversion, but also has a computing performance no less than a pure CUDA program.

    Keywords: matrix multiplication, OpenACC, CUDA

    Updated: 2025-09-11 14:15:04

  • A CUDA-based GPU engine for gprMax: Open source FDTD electromagnetic simulation software

    Abstract: The Finite-Difference Time-Domain (FDTD) method is a popular numerical modelling technique in computational electromagnetics. The volumetric nature of the FDTD technique means simulations often require extensive computational resources (both processing time and memory). The simulation of Ground Penetrating Radar (GPR) is one such challenge, where the GPR transducer, subsurface/structure, and targets must all be included in the model, and must all be adequately discretised. Additionally, forward simulations of GPR can necessitate hundreds of models with different geometries (A-scans) to be executed. This is exacerbated by an order of magnitude when solving the inverse GPR problem or when using forward models to train machine learning algorithms. We have developed one of the first open source GPU-accelerated FDTD solvers specifically focussed on modelling GPR. We designed optimal kernels for GPU execution using NVIDIA’s CUDA framework. Our GPU solver achieved performance throughputs of up to 1194 Mcells/s and 3405 Mcells/s on NVIDIA Kepler and Pascal architectures, respectively. This is up to 30 times faster than the parallelised (OpenMP) CPU solver can achieve on a commonly-used desktop CPU (Intel Core i7-4790K). We found the cost-performance benefit of the NVIDIA GeForce-series Pascal-based GPUs – targeted towards the gaming market – to be especially notable, potentially allowing many individuals to benefit from this work using commodity workstations. We also note that the equivalent Tesla-series P100 GPU – targeted towards data-centre usage – demonstrates significant overall performance advantages due to its use of high-bandwidth memory. The performance benefits of our GPU-accelerated solver were demonstrated in a GPR environment by running a large-scale, realistic (including dispersive media, rough surface topography, and detailed antenna model) simulation of a buried anti-personnel landmine scenario.

    Keywords: GPGPU, Finite-Difference Time-Domain, GPU, CUDA, GPR, NVIDIA

    Updated: 2025-09-11 14:15:04
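
The leapfrog field update at the core of FDTD can be shown in one dimension. The sketch below is not taken from gprMax: it assumes free space, normalised field units, a Courant number of 0.5, a soft Gaussian source, and a hypothetical function name `fdtd_1d`. A GPR solver such as the one described above works on 3D grids with materials and absorbing boundaries.

```python
import math


def fdtd_1d(n_cells=200, n_steps=100, source_cell=100):
    """Advance a 1D free-space FDTD grid and return the final electric field.

    Assumptions for this sketch: normalised units, Courant number 0.5,
    and no absorbing boundary (the grid ends simply reflect).
    """
    ez = [0.0] * n_cells  # electric field samples
    hy = [0.0] * n_cells  # magnetic field samples, staggered half a cell
    courant = 0.5
    for step in range(n_steps):
        # Update H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Update E from the spatial difference of H.
        for i in range(1, n_cells):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Soft source: add a Gaussian pulse at one cell.
        ez[source_cell] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez
```

Each cell's update reads only its immediate neighbours from the previous half step, so all cells can be updated concurrently, which is the property GPU kernels exploit.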

  • CUDA-based Volume Rendering and Inspection for Time-varying Ultrasonic Testing Datasets

    Abstract: We present a framework for time-varying 3D UT datasets based on volume real-time CUDA-based rendering with high quality. In addition, we design a novel trapezoid-shaped transfer functions (TF) with color gradient. Furthermore, we propose an interactive 3D inspection method via clipping plane with enhanced ability of the cross-section data.

    Keywords: Clipping Plane, Volume Rendering, Ultrasonic Testing, Transfer Function, CUDA, Time-varying Datasets

    Updated: 2025-09-10 09:29:36

  • [IEEE 2018 New York Scientific Data Summit (NYSDS) - New York, NY, USA (2018.8.6-2018.8.8)] 2018 New York Scientific Data Summit (NYSDS) - High-Performance Multi-Mode Ptychography Reconstruction on Distributed GPUs

    Abstract: Ptychography is an emerging imaging technique that is able to provide wavelength-limited spatial resolution from specimen with extended lateral dimensions. As a scanning microscopy method, a typical two-dimensional image requires a number of data frames. As a diffraction-based imaging technique, the real-space image has to be recovered through iterative reconstruction algorithms. Due to these two inherent aspects, a ptychographic reconstruction is generally a computation-intensive and time-consuming process, which limits the throughput of this method. We report an accelerated version of the multi-mode difference map algorithm for ptychography reconstruction using multiple distributed GPUs. This approach leverages available scientific computing packages in Python, including mpi4py and PyCUDA, with the core computation functions implemented in CUDA C. We find that interestingly even with MPI collective communications, the weak scaling in the number of GPU nodes can still remain nearly constant. Most importantly, for realistic diffraction measurements, we observe a speedup ranging from a factor of 10 to 10^3 depending on the data size, which reduces the reconstruction time remarkably from hours to typically about 1 minute and is thus critical for real-time data processing and visualization.

    Keywords: MPI, Python, CUDA, GPU, X-ray ptychography

    Updated: 2025-09-10 09:29:36

  • Fast parallel beam propagation method based on multi-core and many-core architectures

    Abstract: In this paper, a fast technique is suggested to accelerate the computation of the fast Fourier transform beam propagation method (FFT-BPM). The FFT-BPM is executed on a graphical processing unit (GPU) and multi-core processor GPUs to speed up the computation of huge number of propagation steps with a higher speed than the traditional CPU. Further, the suggested technique is implemented in parallel approach which is faster than serial implementation. The achieved speedup factor is 150× and 5× using GPU and eight cores multiprocessor, respectively with respect to a single core processing time of 215 steps input Gaussian beam. In order to verify the speed of the proposed technique, the possibility of using the BPM to compute the time-consuming Goos–Hänchen shift calculation is proposed. Further, the propagation of a single mode light beam in fiber optic for 5 × 10^6 steps is executed using GPU. It is found that the speed up of the studied mode is equal to 168× over a single core calculation.

    Keywords: Compute unified device architecture (CUDA), Parallel computing, Beam propagation method, Open multiprocessors (OpenMP), Graphical processing unit (GPU)

    Updated: 2025-09-10 09:29:36