
iCFP: Tolerating all-level cache misses in in-order processors

Publication, Conference

Hilton, A; Nagarakatte, S; Roth, A

Published in: Proceedings - International Symposium on High-Performance Computer Architecture

January 1, 2009


Growing concerns about power have revived interest in in-order pipelines. In-order pipelines sacrifice single-thread performance. Specifically, they do not allow execution to flow freely around data cache misses. As a result, they have difficulties overlapping independent misses with one another. Previously proposed techniques like Runahead execution and Multipass pipelining have attacked this problem. In this paper, we go a step further and introduce iCFP (in-order Continual Flow Pipeline), an adaptation of the CFP concept to an in-order processor. When iCFP encounters a primary data cache or L2 miss, it checkpoints the register file and transitions into an "advance" execution mode. Miss-independent instructions execute as usual and even update register state. Miss-dependent instructions are diverted into a slice buffer, un-blocking the pipeline latches. When the miss returns, iCFP "rallies" and executes the contents of the slice buffer, merging miss-dependent state with miss-independent state along the way. An enhanced register dependence tracking scheme and a novel store buffer design facilitate the merging process. Cycle-level simulations show that iCFP out-performs Runahead, Multipass, and SLTP, another non-blocking in-order pipeline design. © 2008 IEEE.
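The advance and rally phases described in the abstract can be illustrated with a small software model. The sketch below is a toy under stated assumptions, not the paper's hardware design: the `Instr` record, the `run_icfp_toy` and `run_in_order` functions, the poison set, and the "last writer" sequence-number check are all hypothetical simplifications introduced here for illustration. It models a single outstanding miss, treats every non-load instruction as an add of its source registers, and leaves out the register checkpoint, recovery, and the store buffer entirely.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    dst: str            # destination register name
    srcs: tuple = ()    # source register names
    op: str = "add"     # "add" sums its sources; "load" is a load that misses
    value: int = 0      # for "load": the value the miss eventually returns


def run_in_order(program, regs):
    """Reference model: plain sequential execution with the miss already resolved."""
    regs = dict(regs)
    for ins in program:
        if ins.op == "load":
            regs[ins.dst] = ins.value
        else:
            regs[ins.dst] = sum(regs[s] for s in ins.srcs)
    return regs


def run_icfp_toy(program, regs):
    """Toy advance/rally model (assumption: simplified, one miss region).

    Returns the final register state and the sequence numbers of the
    instructions that were diverted into the slice buffer.
    """
    regs = dict(regs)
    poisoned = set()     # registers whose value depends on the pending miss
    last_writer = {}     # register -> sequence number of its youngest writer
    slice_buffer = []    # deferred (seq, instruction, captured ready operands)

    # Advance mode: keep executing past the miss.
    for seq, ins in enumerate(program):
        last_writer[ins.dst] = seq
        if ins.op == "load" or any(s in poisoned for s in ins.srcs):
            # Miss-dependent: divert, capturing the operands that are ready now.
            captured = {s: regs[s] for s in ins.srcs if s not in poisoned}
            slice_buffer.append((seq, ins, captured))
            poisoned.add(ins.dst)
        else:
            # Miss-independent: execute and update register state directly.
            regs[ins.dst] = sum(regs[s] for s in ins.srcs)
            poisoned.discard(ins.dst)

    # Rally: the miss has returned, so replay the slice in program order.
    slice_vals = {}      # values produced inside the slice so far
    for seq, ins, captured in slice_buffer:
        if ins.op == "load":
            result = ins.value
        else:
            result = sum(
                captured[s] if s in captured else slice_vals[s]
                for s in ins.srcs
            )
        slice_vals[ins.dst] = result
        # Merge only if no younger instruction has overwritten the register.
        if last_writer[ins.dst] == seq:
            regs[ins.dst] = result

    return regs, [seq for seq, _, _ in slice_buffer]


if __name__ == "__main__":
    start = {"r1": 1, "r2": 2, "r3": 0, "r4": 0, "r5": 0}
    prog = [
        Instr("r3", (), "load", 10),   # the missing load
        Instr("r4", ("r3", "r1")),     # miss-dependent, deferred
        Instr("r5", ("r1", "r2")),     # miss-independent
        Instr("r4", ("r2", "r2")),     # younger independent write to r4
        Instr("r1", ("r5", "r4")),     # miss-independent
    ]
    print(run_icfp_toy(prog, start))
```

In this example only the load and its direct consumer enter the slice buffer, while the three younger instructions complete during advance mode. The last-writer check keeps the deferred write to `r4` from clobbering the younger miss-independent value during the merge, which is the kind of hazard the abstract's register dependence tracking is said to address; the actual scheme in the paper may differ.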

Duke Scholars

Andrew Douglas Hilton (Author), Electrical and Computer Engineering

Published In

Proceedings - International Symposium on High-Performance Computer Architecture

DOI

10.1109/HPCA.2009.4798281

ISSN

1530-0897

ISBN

9781424429325

Publication Date

January 1, 2009

Start / End Page

431 / 442

Citation

APA

Hilton, A., Nagarakatte, S., & Roth, A. (2009). iCFP: Tolerating all-level cache misses in in-order processors. In Proceedings - International Symposium on High-Performance Computer Architecture (pp. 431–442). https://doi.org/10.1109/HPCA.2009.4798281

Chicago

Hilton, A., S. Nagarakatte, and A. Roth. “iCFP: Tolerating all-level cache misses in in-order processors.” In Proceedings - International Symposium on High-Performance Computer Architecture, 431–42, 2009. https://doi.org/10.1109/HPCA.2009.4798281.

ICMJE

Hilton A, Nagarakatte S, Roth A. iCFP: Tolerating all-level cache misses in in-order processors. In: Proceedings - International Symposium on High-Performance Computer Architecture. 2009. p. 431–42.

MLA

Hilton, A., et al. “iCFP: Tolerating all-level cache misses in in-order processors.” Proceedings - International Symposium on High-Performance Computer Architecture, 2009, pp. 431–42. Scopus, doi:10.1109/HPCA.2009.4798281.

NLM

Hilton A, Nagarakatte S, Roth A. iCFP: Tolerating all-level cache misses in in-order processors. Proceedings - International Symposium on High-Performance Computer Architecture. 2009. p. 431–442.
