High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape.

Publication, Journal Article
Helmy, M; Li, JU; Yan, XF; Meade, RK; Anderson, E; Chen, PB; Czechanski, AM; Di Domenico, T; Flint, J; Garrison, E; Gontijo, MTP; Haggerty, L ...
Published in: Cell Genom
February 11, 2026

We present a collection of 17 high-quality long-read inbred mouse strain genomes with complete annotation (contig N50s of 0.8-33.9 Mbp). This collection includes 12 widely used classical laboratory strains and 5 wild-derived strains. We have resolved previously incomplete genomic regions, including the major histocompatibility complex (MHC), defensin cluster, T cell receptor, and Ly49 complexes. Hundreds of non-reference genes from previous publications not found in GRCm39, such as Defa1, Raet1a, and Klra20 (Ly49T), were localized in the new reference genomes. We conducted a genome-wide scan of variable number tandem repeats (VNTRs) within the coding regions, identifying over 400 genes with VNTR polymorphisms with up to 600 repeat copies and repeat units reaching 990 nucleotides. Our strain-specific annotations enhance RNA sequencing (RNA-seq) analyses, as demonstrated in PWK/PhJ, where we observed a 5.1% improvement in read mapping and expression-level differences in 2.1% of coding genes compared to using GRCm39.
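The contig N50 quoted in the abstract is a standard assembly contiguity metric: the length L such that contigs of length L or longer together account for at least half of all assembled bases. A minimal sketch of that calculation follows. The function name `contig_n50` is hypothetical, the lengths are made-up toy values rather than data from the paper, and this is not the authors' pipeline.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths (in bp).

    N50 is the length of the contig at which the running total,
    taken from the longest contig downward, first reaches half
    of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (illustrative values only, not from the paper).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))
```

A higher N50 means more of the assembly sits in long contigs. The abstract reports a range across strains (0.8-33.9 Mbp), so each strain assembly would be summarized separately in this way.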

Published In

Cell Genom

DOI

10.1016/j.xgen.2025.101074

EISSN

2666-979X

Publication Date

February 11, 2026

Volume

6

Issue

2

Start / End Page

101074

Location

United States

Related Subject Headings

  • Molecular Sequence Annotation
  • Mice, Inbred Strains
  • Mice
  • Major Histocompatibility Complex
  • Genome
  • Animals
 

Citation

APA
Helmy, M., Li, J. U., Yan, X. F., Meade, R. K., Anderson, E., Chen, P. B., … Keane, T. M. (2026). High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape. Cell Genom, 6(2), 101074. https://doi.org/10.1016/j.xgen.2025.101074

Chicago
Helmy, Mohab, Jin U. Li, Xinyu F. Yan, Rachel K. Meade, Elizabeth Anderson, Patrick B. Chen, Anne M. Czechanski, et al. “High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape.” Cell Genom 6, no. 2 (February 11, 2026): 101074. https://doi.org/10.1016/j.xgen.2025.101074.

ICMJE
Helmy M, Li JU, Yan XF, Meade RK, Anderson E, Chen PB, et al. High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape. Cell Genom. 2026 Feb 11;6(2):101074.

MLA
Helmy, Mohab, et al. “High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape.” Cell Genom, vol. 6, no. 2, Feb. 2026, p. 101074. Pubmed, doi:10.1016/j.xgen.2025.101074.

NLM
Helmy M, Li JU, Yan XF, Meade RK, Anderson E, Chen PB, Czechanski AM, Di Domenico T, Flint J, Garrison E, Gontijo MTP, Guarracino A, Haggerty L, Heard E, Howe K, Meena N, Martin FJ, Miska EA, Rall I, Ramakrishna NB, Sapetschnig A, Sinha S, Sun D, Tricomi FF, Qu R, Wood JMD, Wu T, Zhou DJ, Reinholdt L, Adams DJ, Smith CM, Lilue J, Keane TM. High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape. Cell Genom. 2026 Feb 11;6(2):101074.
