Scholars@Duke publication: Genesis: A Hardware Acceleration Framework for Genomic Data Analysis

Genesis: A Hardware Acceleration Framework for Genomic Data Analysis

Publication, Conference

Ham, TJ; Bruns-Smith, D; Sweeney, B; Lee, Y; Seo, SH; Song, UG; Oh, YH; Asanovic, K; Lee, JW; Wills, LW

Published in: Proceedings International Symposium on Computer Architecture

May 1, 2020

In this paper, we describe our vision to accelerate algorithms in the domain of genomic data analysis by proposing a framework called Genesis (genome analysis) that contains an interface and an implementation of a system that processes genomic data efficiently. This framework can be deployed in the cloud and exploit the FPGAs-as-a-service paradigm to provide cost-efficient secondary DNA analysis. We propose conceptualizing genomic reads and associated read attributes as a very large relational database and using extended SQL as a domain-specific language to construct queries that form various data manipulation operations. To accelerate such queries, we design a Genesis hardware library, which consists of primitive hardware modules that can be composed to construct a dataflow architecture specialized for those queries. As a proof of concept for the Genesis framework, we present the architecture and the hardware implementation of several genomic analysis stages in the secondary analysis pipeline corresponding to the best-known software analysis toolkit, the GATK4 workflow proposed by the Broad Institute. We walk through the construction of genomic data analysis operations using a sequence of SQL-style queries and show how Genesis hardware library modules can be used to construct the hardware pipelines designed to accelerate such queries. We exploit parallelism and data reuse by using a dataflow architecture together with on-chip scratchpads, and we use non-blocking APIs to manage the accelerators, allowing concurrent execution of the accelerator and the host. Our accelerated system deployed on the cloud FPGA performs up to 19.3× better than GATK4 running on a commodity multi-core Xeon server and obtains up to 15× better cost savings.
We believe that if a software algorithm can be mapped onto a hardware library to utilize the underlying accelerator(s) using an already-standardized software interface such as SQL, while allowing the efficient mapping of such an interface to primitive hardware modules as we have demonstrated here, it will expedite the acceleration of domain-specific algorithms and allow the easy adaptation of algorithm changes.
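The idea of treating genomic reads and their attributes as rows in a relational table can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module and plain SQL, not the extended SQL or the hardware library described in the abstract. The table name, column names, sample rows, and the helper count_usable_reads are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: these column names are illustrative assumptions,
# not the schema used by Genesis.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE reads (
        read_id      INTEGER PRIMARY KEY,
        chrom        TEXT,     -- reference sequence the read aligned to
        pos          INTEGER,  -- alignment start position
        mapq         INTEGER,  -- mapping quality
        is_duplicate INTEGER   -- 1 if flagged as a duplicate read
    )
    """
)

# Made-up sample rows, one per read.
rows = [
    (1, "chr1", 100, 60, 0),
    (2, "chr1", 100, 60, 1),
    (3, "chr1", 250, 20, 0),
    (4, "chr2", 75, 55, 0),
]
conn.executemany("INSERT INTO reads VALUES (?, ?, ?, ?, ?)", rows)


def count_usable_reads(conn, min_mapq):
    """Count non-duplicate reads per chromosome at or above a quality threshold."""
    query = """
        SELECT chrom, COUNT(*)
        FROM reads
        WHERE is_duplicate = 0 AND mapq >= ?
        GROUP BY chrom
        ORDER BY chrom
    """
    return conn.execute(query, (min_mapq,)).fetchall()


print(count_usable_reads(conn, 30))
```

A data manipulation step such as filtering and counting reads becomes a declarative query here, which is the kind of operation the abstract proposes mapping onto composable hardware modules rather than running in software.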

Duke Scholars

Lisa Wills, Author, Computer Science

Published In

Proceedings International Symposium on Computer Architecture

DOI

10.1109/ISCA45697.2020.00031

ISSN

1063-6897

Publication Date

May 1, 2020

Volume

2020-May

Start / End Page

254 / 267

Citation

APA: Ham, T. J., Bruns-Smith, D., Sweeney, B., Lee, Y., Seo, S. H., Song, U. G., … Wills, L. W. (2020). Genesis: A Hardware Acceleration Framework for Genomic Data Analysis. In Proceedings International Symposium on Computer Architecture (Vol. 2020-May, pp. 254–267). https://doi.org/10.1109/ISCA45697.2020.00031

Chicago: Ham, T. J., D. Bruns-Smith, B. Sweeney, Y. Lee, S. H. Seo, U. G. Song, Y. H. Oh, K. Asanovic, J. W. Lee, and L. W. Wills. “Genesis: A Hardware Acceleration Framework for Genomic Data Analysis.” In Proceedings International Symposium on Computer Architecture, 2020-May:254–67, 2020. https://doi.org/10.1109/ISCA45697.2020.00031.

ICMJE: Ham TJ, Bruns-Smith D, Sweeney B, Lee Y, Seo SH, Song UG, et al. Genesis: A Hardware Acceleration Framework for Genomic Data Analysis. In: Proceedings International Symposium on Computer Architecture. 2020. p. 254–67.

MLA: Ham, T. J., et al. “Genesis: A Hardware Acceleration Framework for Genomic Data Analysis.” Proceedings International Symposium on Computer Architecture, vol. 2020-May, 2020, pp. 254–67. Scopus, doi:10.1109/ISCA45697.2020.00031.

NLM: Ham TJ, Bruns-Smith D, Sweeney B, Lee Y, Seo SH, Song UG, Oh YH, Asanovic K, Lee JW, Wills LW. Genesis: A Hardware Acceleration Framework for Genomic Data Analysis. Proceedings International Symposium on Computer Architecture. 2020. p. 254–267.
