Fast fourier transform accelerated fast multipole algorithm

Journal Article (Journal Article)

This paper describes an O(p2 log2(p)N) implementation of the fast multipole algorithm (FMA) for N-body simulations. This method of computing the FMA is faster than the original, which is O(p4N), where p is the number of terms retained in the truncated multipole expansion representation of the potential field of a collection of charged particles. The p term determines the accuracy of the calculation. The limiting O(p4) computation in the original FMA is a convolution-like operation on a matrix of multipole coefficients. This paper describes the implementation details of a conversion of this limiting computation to linear convolution, which is then computed in the Fourier domain using the fast Fourier transform (FFT), based on a method originally outlined by Greengard and Rokhlin. In addition, this paper describes a new block decomposition of the multipole expansion data that provides numerical stability and efficient computation. The resulting O(p2 log2(p)) subroutine has a speedup of 2 on a sequential processor over the original method for p = 8, and a speedup of 4 for p = 16. The new subroutine vectorizes well and has a speedup of 3 on a vector processor at p = 8 and a speedup of 6 at p = 16.

Full Text

Duke Authors

Cited Authors

  • Elliott, WD; Board, JA

Published Date

  • January 1, 1996

Published In

Volume / Issue

  • 17 / 2

Start / End Page

  • 398 - 415

International Standard Serial Number (ISSN)

  • 1064-8275

Digital Object Identifier (DOI)

  • 10.1137/S1064827594264259

Citation Source

  • Scopus