Error detection via online checking of cache coherence with token coherence signatures

Publication, Journal Article
Meixner, A; Sorin, DJ
Published in: Proceedings International Symposium on High Performance Computer Architecture
August 10, 2007

To provide high dependability in a multithreaded system despite hardware faults, the system must detect and correct errors in its shared memory system. Recent research has explored dynamic checking of cache coherence as a comprehensive approach to memory system error detection. However, existing coherence checkers are costly to implement, incur high interconnection network traffic overhead, and do not scale well. In this paper, we describe the Token Coherence Signature Checker (TCSC), which provides comprehensive, low-cost, scalable coherence checking by maintaining signatures that represent recent histories of coherence events at all nodes (cache and memory controllers). Periodically, these signatures are sent to a verifier to determine if an error occurred. TCSC has a small constant hardware cost per node, independent of cache and memory size and the number of nodes. TCSC's interconnect bandwidth overhead has a constant upper bound and never exceeds 7% in our experiments. TCSC has negligible impact on system performance. © 2007 IEEE.
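The signature idea in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's actual signature function or verifier protocol, which this record does not describe. It assumes each node folds token transfers into a fixed-width running sum, that every message sent in a checking interval is also received within it, and that the verifier simply checks that all signatures cancel. The names `Node`, `addr_hash`, `verify`, and `MOD` are hypothetical.

```python
import hashlib

MOD = 1 << 64  # assumed signature width; the real width is not given in this record


def addr_hash(block_addr: int) -> int:
    """Map a block address to an odd 64-bit weight (illustrative choice)."""
    digest = hashlib.sha256(block_addr.to_bytes(8, "little")).digest()
    return int.from_bytes(digest[:8], "little") | 1


class Node:
    """Hypothetical cache or memory controller holding one fixed-size signature."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.signature = 0

    def send_tokens(self, block_addr: int, count: int) -> None:
        # Tokens leaving this node subtract from its signature.
        self.signature = (self.signature - addr_hash(block_addr) * count) % MOD

    def receive_tokens(self, block_addr: int, count: int) -> None:
        # Tokens arriving at this node add to its signature.
        self.signature = (self.signature + addr_hash(block_addr) * count) % MOD

    def report_and_reset(self) -> int:
        # Called once per checking interval: hand the signature to the verifier.
        sig, self.signature = self.signature, 0
        return sig


def verify(signatures: list[int]) -> bool:
    """If every sent token was received, the per-node signatures sum to zero."""
    return sum(signatures) % MOD == 0


# Fault-free interval: node a sends 3 tokens for block 0x40, node b receives 3.
a, b = Node("cache0"), Node("memory0")
a.send_tokens(0x40, 3)
b.receive_tokens(0x40, 3)
ok_interval = verify([a.report_and_reset(), b.report_and_reset()])

# Faulty interval: one of two tokens is lost in the interconnect.
a.send_tokens(0x40, 2)
b.receive_tokens(0x40, 1)
bad_interval = verify([a.report_and_reset(), b.report_and_reset()])
```

In this toy model each node stores a single 64-bit value regardless of cache size or node count, which mirrors the constant per-node cost the abstract claims. The real design must also handle messages still in flight at an interval boundary, which this sketch ignores.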

Published In

Proceedings International Symposium on High Performance Computer Architecture

DOI

10.1109/HPCA.2007.346193

ISSN

1530-0897

Publication Date

August 10, 2007

Start / End Page

145 / 156
 

Citation

APA: Meixner, A., & Sorin, D. J. (2007). Error detection via online checking of cache coherence with token coherence signatures. Proceedings International Symposium on High Performance Computer Architecture, 145–156. https://doi.org/10.1109/HPCA.2007.346193
Chicago: Meixner, A., and D. J. Sorin. “Error detection via online checking of cache coherence with token coherence signatures.” Proceedings International Symposium on High Performance Computer Architecture, August 10, 2007, 145–56. https://doi.org/10.1109/HPCA.2007.346193.
ICMJE: Meixner A, Sorin DJ. Error detection via online checking of cache coherence with token coherence signatures. Proceedings International Symposium on High Performance Computer Architecture. 2007 Aug 10;145–56.
MLA: Meixner, A., and D. J. Sorin. “Error detection via online checking of cache coherence with token coherence signatures.” Proceedings International Symposium on High Performance Computer Architecture, Aug. 2007, pp. 145–56. Scopus, doi:10.1109/HPCA.2007.346193.
NLM: Meixner A, Sorin DJ. Error detection via online checking of cache coherence with token coherence signatures. Proceedings International Symposium on High Performance Computer Architecture. 2007 Aug 10;145–156.
