Fast fourier transform accelerated fast multipole algorithm
This paper describes an O(p2 log2(p)N) implementation of the fast multipole algorithm (FMA) for N-body simulations. This method of computing the FMA is faster than the original, which is O(p4N), where p is the number of terms retained in the truncated multipole expansion representation of the potential field of a collection of charged particles. The p term determines the accuracy of the calculation. The limiting O(p4) computation in the original FMA is a convolution-like operation on a matrix of multipole coefficients. This paper describes the implementation details of a conversion of this limiting computation to linear convolution, which is then computed in the Fourier domain using the fast Fourier transform (FFT), based on a method originally outlined by Greengard and Rokhlin. In addition, this paper describes a new block decomposition of the multipole expansion data that provides numerical stability and efficient computation. The resulting O(p2 log2(p)) subroutine has a speedup of 2 on a sequential processor over the original method for p = 8, and a speedup of 4 for p = 16. The new subroutine vectorizes well and has a speedup of 3 on a vector processor at p = 8 and a speedup of 6 at p = 16.
Duke Scholars
Published In
DOI
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Numerical & Computational Mathematics
- 4903 Numerical and computational mathematics
- 4901 Applied mathematics
- 0802 Computation Theory and Mathematics
- 0103 Numerical and Computational Mathematics
- 0102 Applied Mathematics
Citation
Published In
DOI
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Numerical & Computational Mathematics
- 4903 Numerical and computational mathematics
- 4901 Applied mathematics
- 0802 Computation Theory and Mathematics
- 0103 Numerical and Computational Mathematics
- 0102 Applied Mathematics