Automatic parallel code generation for NuFFT data translation on multicores

Journal Article

The nonuniform FFT (NuFFT) is used in many applications. Focusing on the data translation step, the most time-consuming part of the NuFFT computation, we develop an automatic parallel code generation tool for data translation targeting emerging multicores. The key components of this tool are two scalable parallelization strategies: source-driven parallelization and target-driven parallelization. Both strategies employ equally sized geometric tiling and binning to improve data locality, and both try to balance workloads across the cores through dynamic task allocation. They differ in the partitioning and scheduling schemes used to guarantee mutual exclusion in data updates. The tool also includes a code generator and a code optimizer for the data translation. We evaluated the tool on a commercial multicore machine for both 2D and 3D inputs under different sample distributions with large data set sizes. The results indicate that both parallelization strategies scale well as the number of cores and the number of dimensions of the data space increase. In particular, target-driven parallelization outperforms source-driven parallelization when samples are nonuniformly distributed. The experiments also show that our code optimizations can bring about a 32% to 43% performance improvement to the data translation step of the NuFFT. © 2012 World Scientific Publishing Company.
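
As a rough illustration of the tiling and binning idea mentioned in the abstract, the sketch below groups 2D samples by the equally sized tile that contains them and then orders the tiles for a task queue. It is not the paper's generated code. The function names `bin_samples` and `tasks_largest_first`, the square grid, and the largest-first ordering heuristic are all assumptions made for this example; the abstract does not specify how tasks are ordered or how mutual exclusion is enforced.

```python
from collections import defaultdict


def bin_samples(samples, grid_size, tile_size):
    """Group 2D sample indices by the equally sized tile that contains them.

    samples: list of (x, y) coordinates with 0 <= x, y < grid_size.
    Returns a dict mapping (tile_row, tile_col) -> list of sample indices.
    Hypothetical helper, not taken from the paper.
    """
    if grid_size % tile_size != 0:
        raise ValueError("tile_size must divide grid_size evenly")
    bins = defaultdict(list)
    for idx, (x, y) in enumerate(samples):
        if not (0 <= x < grid_size and 0 <= y < grid_size):
            raise ValueError(f"sample {idx} lies outside the grid")
        # Integer division by the tile edge gives the tile coordinates.
        bins[(int(y // tile_size), int(x // tile_size))].append(idx)
    return dict(bins)


def tasks_largest_first(bins):
    """Order tiles by descending sample count (ties broken by tile index).

    Assumption: handing out the heaviest tiles first is one common
    heuristic for dynamic task allocation; the paper may do otherwise.
    """
    return sorted(bins, key=lambda tile: (-len(bins[tile]), tile))


# Five samples on an 8x8 grid split into 4x4 tiles.
samples = [(0.5, 0.5), (1.2, 0.1), (5.0, 5.0), (7.9, 4.1), (0.3, 1.9)]
bins = bin_samples(samples, grid_size=8, tile_size=4)
order = tasks_largest_first(bins)
```

With a clustered (nonuniform) sample set, a few tiles hold most of the samples, which is the situation where dynamic allocation of tile tasks to cores matters most for load balance.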

Cited Authors

  • Zhang, Y; Liu, J; Kultursay, E; Kandemir, M; Pitsianis, N; Sun, X

Published Date

  • April 1, 2012

Published In

Volume / Issue

  • 21 / 2

International Standard Serial Number (ISSN)

  • 0218-1266

Digital Object Identifier (DOI)

  • 10.1142/S021812661240004X

Citation Source

  • Scopus