Research Objective
To compute correctly rounded floating-point sums in O(log N) depth using parallel associative reduction, iterative refinement, and conservative early termination detection, exploiting the scaling in transistor count to accelerate floating-point performance even when clock rates remain flat.
Research Findings
The paper concludes that floating-point values can be summed in parallel to produce a correctly rounded result in O(log N) depth. A lightweight test for early termination keeps the algorithm fast and predictable, requiring only two iterations in virtually all cases. The FPART, implemented as an extension of a standard floating-point adder (FPA), runs as fast as the standard FPA while requiring only 22 percent more area.
Research Limitations
Technical and application constraints include the potential for intermediate overflows and the need to handle undetermined cases (undet), in which the convergence test may fail. The algorithm's performance may degrade on extremely ill-conditioned data.
1:Experimental Design and Method Selection:
The methodology involves using parallel associative reduction, iterative refinement, and conservative early termination detection to compute correctly rounded floating-point sums. The theoretical models include tree-reduce parallelism and residue-preserving floating-point adders.
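As a software illustration of what a residue-preserving adder provides, the sketch below uses Knuth's TwoSum, a standard error-free transformation. It is an assumption that this matches the behavior of the hardware adder described here; the function name `two_sum` and its variable names are hypothetical. The point is only that the rounded sum and the residue together represent the exact sum of the two inputs (barring overflow).

```python
def two_sum(a: float, b: float) -> tuple[float, float]:
    """Return (s, residue) with s = fl(a + b) and s + residue == a + b exactly.

    Software stand-in (Knuth's TwoSum) for a residue-preserving adder;
    exactness assumes round-to-nearest arithmetic and no overflow.
    """
    s = a + b                  # rounded sum
    b_virtual = s - a          # the part of b that made it into s
    a_virtual = s - b_virtual  # the part of a that made it into s
    residue = (a - a_virtual) + (b - b_virtual)  # what rounding discarded
    return s, residue


# 1e16 + 1.0 is not representable in double precision: the sum rounds to
# 1e16 and the lost 1.0 is recovered in the residue.
s, r = two_sum(1e16, 1.0)      # s == 1e16, r == 1.0
```

Because the residue is itself a floating-point number, it can be fed back into later additions, which is what makes iterative refinement possible.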
2:Sample Selection and Data Sources:
The experiments use different datasets with varying condition numbers to evaluate the algorithm's performance.
3:List of Experimental Equipment and Materials:
The hardware implementation includes two residue-preserving IEEE-754 double-precision floating-point adders on a Virtex-6 FPGA.
4:Experimental Procedures and Operational Workflow:
The algorithm performs a parallel tree-reduce sum over the sum output of the FPART, using ⌈log N⌉ stages and N − 1 nodes, and keeps the residue output for potential refinement and error estimation.
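The workflow above can be sketched sequentially in Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `two_sum` (Knuth's TwoSum) stands in for the FPART, and the names `tree_reduce`, `refined_sum`, and `max_iters` are hypothetical. The termination check shown is a simple conservative bound chosen for this sketch; it is not the paper's lightweight test. When the check cannot decide within `max_iters` passes, the result is flagged as undetermined.

```python
import sys


def two_sum(a, b):
    """Software stand-in for a residue-preserving adder: (fl(a + b), exact error)."""
    s = a + b
    bb = s - a
    return s, (a - (s - bb)) + (b - bb)


def tree_reduce(values):
    """Pairwise tree reduction over N values.

    Returns (sum, residues, stages). In this pairwise sketch there are
    N - 1 adder nodes (one residue each) and ceil(log2 N) stages.
    """
    level, residues, stages = list(values), [], 0
    if not level:
        raise ValueError("need at least one value")
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            s, r = two_sum(level[i], level[i + 1])
            nxt.append(s)
            residues.append(r)       # kept for refinement and error estimation
        if len(level) % 2:           # odd element passes through unchanged
            nxt.append(level[-1])
        level, stages = nxt, stages + 1
    return level[0], residues, stages


def refined_sum(values, max_iters=8):
    """Iterative refinement; returns (sum, iterations, determined)."""
    terms = list(values)
    for iteration in range(1, max_iters + 1):
        s, residues, _ = tree_reduce(terms)
        # Assumed conservative test: inflate sum(|r_i|) slightly so it bounds
        # the exact magnitude of the remaining residue total.
        n = len(residues)
        bound = sum(abs(r) for r in residues) * (1.0 + 4.0 * n * sys.float_info.epsilon)
        # Rounding is monotonic: if both ends of [s - bound, s + bound]
        # round to s, so does the exact sum.
        if s + bound == s and s - bound == s:
            return s, iteration, True
        terms = [s] + residues       # exact sum is unchanged; reduce again
    return s, max_iters, False       # undetermined within max_iters


print(refined_sum([1e16, 1.0, -1e16, 1.0, 3.0]))   # (5.0, 2, True)
```

In this sketch an input whose exact sum lies exactly halfway between two doubles, such as `[1.0, 2.0 ** -53]`, never passes the check and is returned with `determined` set to `False`, which corresponds to the undetermined case that a caller would need to handle separately.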
5:Data Analysis Methods:
The approach for analyzing experimental data includes statistical techniques and software tools utilized to evaluate the algorithm's convergence and accuracy.