Scholars@Duke publication: Exploring parallelization strategies for NUFFT data translation

Publication, Journal Article

Zhang, Y; Kandemir, M; Pitsianis, NP; Sun, X

Published in: Embedded Systems Week 2009 - Proceedings of the 7th ACM International Conference on Embedded Software, EMSOFT '09

December 24, 2009

This paper introduces parallelization strategies for the Non-Uniform FFT (NUFFT) data translation on multicore architectures. The NUFFT enables the use of the celebrated FFT with unequally spaced data in numerous situations in signal and image processing as well as in scientific computing. The critical extension lies in the translation of unequally spaced or non-uniformly sampled data onto an equally spaced Cartesian grid, or vice versa. The data translation can be made sufficiently accurate, with arithmetic complexity linearly proportional to the size of the data ensemble. For large NUFFTs, however, the data translation is found to dominate the computation time on modern computers, even though the FFT would be expected to dominate. In order to match the FFT performance achieved by FFTW, data locality and parallelism in the data translation must be explored and exploited as well. We are concerned with two fundamental issues. First, the data translation can be described as a matrix-vector multiplication with a matrix of irregular sparsity. This is beyond the effective scope of the conventional tiling and parallelization schemes that a compiler applies to improve the performance of computations with dense matrices. Second, multicore processors exist and emerge in many different configurations, and are expected to evolve further in architectural variety. This may mean the end of performance tuning on a single type of architecture. In this paper, we introduce an automation tool that takes two specifications as input: one for an application-specific data translation algorithm, the other for a target multicore processor architecture. The tool generates a parallel code that explores the data locality and parallelism by utilizing both the geometric structures in the data translation and the processor-memory configurations in the target architecture. We present preliminary experimental results on both a simulator and a commercial multicore machine. The results show that our parallelization strategy brings significant performance improvement for the NUFFT data translation by efficiently exploiting the data locality and concurrency in the application. Copyright 2009 ACM.
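
As a rough illustration of why the data translation behaves like a sparse matrix-vector product with irregular sparsity, the following minimal one-dimensional sketch spreads non-uniform samples onto a uniform periodic grid. It is not the paper's tool or algorithm: the function name `translate_to_grid`, the truncated Gaussian kernel, and the parameters `half_width` and `tau` are all assumptions made for this example.

```python
import math

def translate_to_grid(points, values, grid_size, half_width=2, tau=0.5):
    """Spread non-uniform samples onto a uniform periodic 1D grid.

    Equivalent to grid = A @ values, where A[g, j] is nonzero only for the
    few grid nodes g near sample j. The sparsity pattern of A therefore
    follows the (irregular) sample positions.

    Assumptions for this sketch: sample positions lie in [0, 1), and the
    kernel is a truncated Gaussian (hypothetical choice, not from the paper).
    """
    grid = [0.0] * grid_size
    for x, v in zip(points, values):
        center = round(x * grid_size)           # nearest grid node
        for offset in range(-half_width, half_width + 1):
            g = center + offset
            dist = x * grid_size - g            # distance in grid units
            weight = math.exp(-dist * dist / (4.0 * tau))
            grid[g % grid_size] += weight * v   # periodic wrap-around
    return grid

# Each sample touches only 2 * half_width + 1 grid nodes, so the work
# grows linearly with the number of samples.
grid = translate_to_grid([0.5, 0.13, 0.91], [1.0, 2.0, -1.0], grid_size=8)
```

Because nearby samples update overlapping grid nodes, a parallel version has to partition either the samples or the grid so that concurrent updates do not conflict, which is where locality-aware scheduling becomes relevant.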

Duke Scholars

Author Nikos Pitsianis Computer Science

Author Xiaobai Sun Computer Science

Published In

Embedded Systems Week 2009 - Proceedings of the 7th ACM International Conference on Embedded Software, EMSOFT '09

DOI

10.1145/1629335.1629361

Publication Date

December 24, 2009

Start / End Page

187 / 196

Citation

Zhang, Y., Kandemir, M., Pitsianis, N. P., & Sun, X. (2009). Exploring parallelization strategies for NUFFT data translation. Embedded Systems Week 2009 - Proceedings of the 7th ACM International Conference on Embedded Software, EMSOFT ’09, 187–196. https://doi.org/10.1145/1629335.1629361

Zhang, Y., M. Kandemir, N. P. Pitsianis, and X. Sun. “Exploring parallelization strategies for NUFFT data translation.” Embedded Systems Week 2009 - Proceedings of the 7th ACM International Conference on Embedded Software, EMSOFT ’09, December 24, 2009, 187–96. https://doi.org/10.1145/1629335.1629361.

Zhang Y, Kandemir M, Pitsianis NP, Sun X. Exploring parallelization strategies for NUFFT data translation. Embedded Systems Week 2009 - Proceedings of the 7th ACM International Conference on Embedded Software, EMSOFT ’09. 2009 Dec 24;187–96.

Zhang, Y., et al. “Exploring parallelization strategies for NUFFT data translation.” Embedded Systems Week 2009 - Proceedings of the 7th ACM International Conference on Embedded Software, EMSOFT ’09, Dec. 2009, pp. 187–96. Scopus, doi:10.1145/1629335.1629361.
