CoBloom: An FPGA Accelerator System for Bloom Filter Insertion in Genomics Applications
The rapidly growing volume of genomic data has made processing speed a significant bottleneck in genomic analysis workflows, and FPGA acceleration offers an effective way to speed up such computationally intensive workloads. K-mer counting is a widely used operation that records the number of occurrences of each length-k substring (K-mer) in a nucleotide sequence. It is frequently implemented with a counting Bloom filter, a data structure that requires multiple hash computations and multiple memory accesses to a large lookup table for each K-mer. This work makes two primary contributions. First, we profile Bloom filter implementations, examining hash rate, memory access patterns with and without prefetching, and the impact of sorting, in order to identify scaling bottlenecks and optimal resource allocation strategies. Second, we present CoBloom, an FPGA-accelerated counting Bloom filter that employs a hybrid design combining hardware-based hash computation with CPU-managed memory operations. CoBloom's architecture addresses the identified bottlenecks to create a more efficient K-mer counting pipeline.
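To make the per-K-mer cost concrete, the following is a minimal software-only sketch of K-mer counting with a counting Bloom filter. It is not CoBloom's implementation. The class name `CountingBloomFilter`, the helper `kmers`, the table size, the number of hash functions, and the use of salted BLAKE2b as the hash family are all assumptions chosen for illustration. The sketch separates index computation (`_indices`) from counter updates (`insert`), which loosely mirrors the split between hash computation and memory operations described above.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative counting Bloom filter (hypothetical, not CoBloom's design)."""

    def __init__(self, num_counters=1 << 16, num_hashes=4):
        # Both parameters are assumed values for this sketch.
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _indices(self, kmer):
        # Hash stage: one salted hash per table index.
        data = kmer.encode("ascii")
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(2, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_counters

    def insert(self, kmer):
        # Memory stage: each index is a separate access into the large table.
        for idx in self._indices(kmer):
            self.counters[idx] += 1

    def count(self, kmer):
        # The minimum counter is an upper bound on the true count:
        # hash collisions can inflate it but never lower it.
        return min(self.counters[idx] for idx in self._indices(kmer))


def kmers(sequence, k):
    """Yield every length-k substring of the sequence, in order."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


cbf = CountingBloomFilter()
for kmer in kmers("ACGTACGTAC", 4):
    cbf.insert(kmer)

print(cbf.count("ACGT"), cbf.count("TACG"))
```

Each insertion performs `num_hashes` hash evaluations followed by `num_hashes` effectively random accesses into the counter table, which is the access pattern the profiling in this work targets. The reported counts are estimates that are never below the true number of occurrences.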