Exploiting 162-nanosecond end-to-end communication latency on Anton

Publication, Conference
Dror, RO; Grossman, JP; Mackenzie, KM; Towles, B; Chow, E; Salmon, JK; Young, C; Bank, JA; Batson, B; Deneroff, MM; Kuskin, JS; Larson, RH ...
Published in: 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010
December 1, 2010

Strong scaling of scientific applications on parallel architectures is increasingly limited by communication latency. This paper describes the techniques used to mitigate latency in Anton, a massively parallel special-purpose machine that accelerates molecular dynamics (MD) simulations by orders of magnitude compared with the previous state of the art. Achieving this speedup required a combination of hardware mechanisms and software constructs to reduce network latency, sender and receiver overhead, and synchronization costs. Key elements of Anton's approach, in addition to tightly integrated communication hardware, include formulating data transfer in terms of counted remote writes, leveraging fine-grained communication, and establishing fixed, optimized communication patterns. Anton delivers software-to-software inter-node latency significantly lower than any other large-scale parallel machine, and the total critical-path communication time for an Anton MD simulation is less than 4% that of the next fastest MD platform. © 2010 IEEE.
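The counted remote write mentioned in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not Anton's implementation: the names `CountedBuffer`, `remote_write`, `wait_all`, and `run_demo` are hypothetical, threads stand in for sending nodes, and a condition variable stands in for whatever mechanism the receiver uses to observe the counter. The idea shown is that each sender writes directly into a preassigned location in the receiver's memory, the receiver knows in advance how many writes to expect, and a single counter tells it when all the data has arrived.

```python
import threading


class CountedBuffer:
    """Toy model of receive-side memory guarded by a write counter.

    Hypothetical illustration only; not Anton's hardware interface.
    """

    def __init__(self, num_slots, expected_writes):
        self.slots = [None] * num_slots
        self.expected = expected_writes  # fixed in advance by the communication pattern
        self._count = 0
        self._cond = threading.Condition()

    def remote_write(self, slot, value):
        # The sender pushes data straight into its preassigned slot.
        # There is no request/reply handshake with the receiver.
        with self._cond:
            self.slots[slot] = value
            self._count += 1
            if self._count >= self.expected:
                self._cond.notify_all()

    def wait_all(self):
        # The receiver proceeds once the counter reaches the expected
        # number of writes; a single check covers every sender.
        with self._cond:
            self._cond.wait_for(lambda: self._count >= self.expected)
            return list(self.slots)


def run_demo(num_senders=4):
    buf = CountedBuffer(num_slots=num_senders, expected_writes=num_senders)
    senders = [
        threading.Thread(target=buf.remote_write, args=(i, i * i))
        for i in range(num_senders)
    ]
    for t in senders:
        t.start()
    result = buf.wait_all()  # returns only after all writes have landed
    for t in senders:
        t.join()
    return result


if __name__ == "__main__":
    print(run_demo())
```

Because the destination slots and the expected count are fixed ahead of time, the receiver needs no per-message bookkeeping, which is one way to read the abstract's pairing of counted remote writes with fixed communication patterns.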

Citation

APA: Dror, R. O., Grossman, J. P., Mackenzie, K. M., Towles, B., Chow, E., Salmon, J. K., … Shaw, D. E. (2010). Exploiting 162-nanosecond end-to-end communication latency on Anton. In 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010. https://doi.org/10.1109/SC.2010.23

Chicago: Dror, R. O., J. P. Grossman, K. M. Mackenzie, B. Towles, E. Chow, J. K. Salmon, C. Young, et al. “Exploiting 162-nanosecond end-to-end communication latency on Anton.” In 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010, 2010. https://doi.org/10.1109/SC.2010.23.

ICMJE: Dror RO, Grossman JP, Mackenzie KM, Towles B, Chow E, Salmon JK, et al. Exploiting 162-nanosecond end-to-end communication latency on Anton. In: 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010. 2010.

MLA: Dror, R. O., et al. “Exploiting 162-nanosecond end-to-end communication latency on Anton.” 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010, 2010. Scopus, doi:10.1109/SC.2010.23.

NLM: Dror RO, Grossman JP, Mackenzie KM, Towles B, Chow E, Salmon JK, Young C, Bank JA, Batson B, Deneroff MM, Kuskin JS, Larson RH, Moraes MA, Shaw DE. Exploiting 162-nanosecond end-to-end communication latency on Anton. 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010. 2010.
