Research Objective
To construct an efficient Wiener filter for noisy and incomplete image data and to quickly draw probabilistic samples from the Wiener posterior using Dataflow Engines (DFEs), with applications in astronomy for improving dark matter maps and cosmological parameter determination.
Research Findings
The implementation draws samples from the Wiener posterior at least 11.3 times faster on 8 DFEs than on 32 CPU threads, which highlights the efficiency of dataflow computing for large-scale Bayesian inference. The approach can benefit cosmological applications such as dark matter mapping and parameter estimation, and it leaves room for further optimization and for extension to larger datasets.
Research Limitations
The image size is constrained to 128^2 pixels by the on-chip memory of the DFEs; larger images may require off-chip memory support. The reported speed-up is a lower bound, because the DFE implementation does not make full use of the available CPU threads. The study assumes that the signal and noise covariances are known and that both are Gaussian distributed, which may not hold in all real-world scenarios.
1: Experimental Design and Method Selection:
The study uses a messenger field algorithm (Algorithm 1) to draw samples from the Wiener posterior distribution, leveraging Dataflow Engines (DFEs) for efficient computation. The algorithm involves iterative steps with Fourier transforms and Gaussian random variates.
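The iteration described above can be sketched in NumPy as a two-step Gibbs sampler. This is a minimal illustration of a generic messenger field scheme, not a transcription of the study's Algorithm 1: it assumes a signal covariance that is diagonal in Fourier space (`power`), a noise covariance that is diagonal in pixel space (`noise_var`), and a messenger field with covariance `tau` times the identity, where `tau` is the smallest noise variance among observed pixels. The function name, argument names, and iteration count are hypothetical.

```python
import numpy as np

def draw_wiener_sample(d, noise_var, mask, power, n_iter=200, rng=None):
    """Draw one approximate sample from the Wiener posterior (sketch).

    d         : data image; values under the mask are ignored
    noise_var : per-pixel noise variance
    mask      : boolean image, True where a pixel is observed
    power     : signal power per Fourier mode (unitary FFT convention),
                assumed symmetric under k -> -k
    """
    rng = np.random.default_rng() if rng is None else rng
    tau = noise_var[mask].min()                 # messenger covariance T = tau * I
    w = np.where(mask, tau / noise_var, 0.0)    # masked pixels: infinite noise
    d = np.where(mask, d, 0.0)
    s = np.zeros_like(d, dtype=float)
    for _ in range(n_iter):
        # Step 1 (pixel space): draw the messenger field t given s and d.
        t = w * d + (1.0 - w) * s
        t = t + np.sqrt(tau * (1.0 - w)) * rng.standard_normal(d.shape)
        # Step 2 (Fourier space): draw the signal s given t.
        t_k = np.fft.fft2(t, norm="ortho")
        # The FFT of real white noise has unit variance per mode and the
        # Hermitian symmetry required for a real-valued image.
        xi_k = np.fft.fft2(rng.standard_normal(d.shape), norm="ortho")
        s_k = power / (power + tau) * t_k
        s_k = s_k + np.sqrt(power * tau / (power + tau)) * xi_k
        s = np.fft.ifft2(s_k, norm="ortho").real
    return s
```

Each iteration costs two forward FFTs, one inverse FFT, and two sets of Gaussian random variates, which matches the operations the study lists. Successive iterations are correlated, so in practice the early part of each chain is discarded as burn-in.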
2: Sample Selection and Data Sources:
Simulated datasets are generated with 128^2 pixel images, where signals are Gaussian random fields with known power spectra, and noise is Gaussian with varying variance across pixels; some pixels are masked to represent missing data.
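A dataset of this kind can be generated along the following lines. The sketch is illustrative only: the power-law spectrum, the range of noise variances, and the 20% mask fraction are placeholder assumptions, not the values used in the study, and the function name `simulate_dataset` is hypothetical.

```python
import numpy as np

def simulate_dataset(n=128, seed=0):
    """Simulate a masked, noisy Gaussian random field (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Isotropic power spectrum on the FFT grid (assumed power law, P(0) = 0).
    freq = np.fft.fftfreq(n)
    k = np.hypot(freq[:, None], freq[None, :])
    power = np.zeros((n, n))
    power[k > 0] = k[k > 0] ** -2.0
    # Colour white noise in Fourier space to obtain the Gaussian signal.
    white_k = np.fft.fft2(rng.standard_normal((n, n)), norm="ortho")
    signal = np.fft.ifft2(np.sqrt(power) * white_k, norm="ortho").real
    # Gaussian noise whose variance differs from pixel to pixel.
    noise_var = rng.uniform(0.5, 2.0, size=(n, n))
    noise = np.sqrt(noise_var) * rng.standard_normal((n, n))
    # Boolean mask: True where a pixel is observed, False where data are missing.
    mask = rng.random((n, n)) > 0.2
    data = np.where(mask, signal + noise, 0.0)
    return data, signal, noise_var, mask, power
```

Because the true signal is returned alongside the data, posterior samples drawn from `data` can be compared directly with `signal` as a sanity check.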
3: List of Experimental Equipment and Materials:
DFEs (MAX4 Maia with an Altera Stratix V FPGA and 48 GB of DRAM), CPU servers (Intel Xeon E5-2650 v2), MPC-X boxes, and software libraries: FFTW3 on the CPU and the MaxPower libraries on the DFE.
4: Experimental Procedures and Operational Workflow:
Implement the messenger field algorithm on both CPU and DFE platforms, run multiple MCMC chains in parallel, measure execution times for drawing samples, and compare performance.
5: Data Analysis Methods:
Statistical analysis of execution times, calculation of speed-up factors, and verification of algorithm correctness by comparing outputs between CPU and DFE implementations.
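The two quantities named here, the speed-up factor and the agreement between CPU and DFE outputs, can be computed along the following lines. The helper names and the tolerances are hypothetical placeholders; the summary does not state which statistic of the timings or which tolerance the study used, so the median and the `np.allclose` defaults are assumptions.

```python
import numpy as np

def speed_up(cpu_times, dfe_times):
    """Ratio of median CPU time to median DFE time over repeated runs."""
    return float(np.median(cpu_times) / np.median(dfe_times))

def outputs_agree(cpu_out, dfe_out, rtol=1e-5, atol=1e-8):
    """True if two output images match element-wise within tolerance."""
    return bool(np.allclose(cpu_out, dfe_out, rtol=rtol, atol=atol))
```

Since the sampler is stochastic, an element-wise comparison is only meaningful when both implementations are fed the same sequence of random variates; otherwise the comparison has to be made on summary statistics of many samples.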