Skip to main content

Tuning Strassen's matrix multiplication for memory efficiency

Publication ,  Conference
Thottethodi, M; Chatterjee, S; Lebeck, AR
Published in: Proceedings of the International Conference on Supercomputing
January 1, 1998

Strassen's algorithm for matrix multiplication gains its lower arithmetic complexity at the expense of reduced locality of reference, which makes it challenging to implement the algorithm efficiently on a modern machine with a hierarchical memory system. We report on an implementation of this algorithm that uses several unconventional techniques to make the algorithm memory-friendly. First, the algorithm internally uses a nonstandard array layout known as Morton order that is based on a quad-tree decomposition of the matrix. Second, we dynamically select the recursion truncation point to minimize padding without affecting the performance of the algorithm, which we can do by virtue of the cache behavior of the Morton ordering. Each technique is critical for performance, and their combination as done in our code multiplies their effectiveness. Performance comparisons of our implementation with that of competing implementations show that our implementation often outperforms the alternative techniques (up to 25%). However, we also observe wide variability across platforms and across matrix sizes, indicating that at this time, no single implementation is a clear choice for all platforms or matrix sizes. We also note that the time required to convert matrices to/from Morton order is a noticeable amount of execution time (5% to 15%). Eliminating this overhead further reduces our execution time.

Duke Scholars

Published In

Proceedings of the International Conference on Supercomputing

DOI

ISBN

081868707X

Publication Date

January 1, 1998

Volume

1998-November
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Thottethodi, M., Chatterjee, S., & Lebeck, A. R. (1998). Tuning Strassen's matrix multiplication for memory efficiency. In Proceedings of the International Conference on Supercomputing (Vol. 1998-November). https://doi.org/10.1109/SC.1998.10045
Thottethodi, M., S. Chatterjee, and A. R. Lebeck. “Tuning Strassen's matrix multiplication for memory efficiency.” In Proceedings of the International Conference on Supercomputing, Vol. 1998-November, 1998. https://doi.org/10.1109/SC.1998.10045.
Thottethodi M, Chatterjee S, Lebeck AR. Tuning Strassen's matrix multiplication for memory efficiency. In: Proceedings of the International Conference on Supercomputing. 1998.
Thottethodi, M., et al. “Tuning Strassen's matrix multiplication for memory efficiency.” Proceedings of the International Conference on Supercomputing, vol. 1998-November, 1998. Scopus, doi:10.1109/SC.1998.10045.
Thottethodi M, Chatterjee S, Lebeck AR. Tuning Strassen's matrix multiplication for memory efficiency. Proceedings of the International Conference on Supercomputing. 1998.

Published In

Proceedings of the International Conference on Supercomputing

DOI

ISBN

081868707X

Publication Date

January 1, 1998

Volume

1998-November