Unraveling the functional dark matter through global metagenomics.

Publication, Journal Article
Pavlopoulos, GA; Baltoumas, FA; Liu, S; Selvitopi, O; Camargo, AP; Nayfach, S; Azad, A; Roux, S; Call, L; Ivanova, NN; Chen, IM; Karatzas, E ...
Published in: Nature
October 2023

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.

Published In

Nature

DOI

10.1038/s41586-023-06583-7

EISSN

1476-4687

ISSN

0028-0836

Publication Date

October 2023

Volume

622

Issue

7983

Start / End Page

594 / 602

Related Subject Headings

  • Proteins
  • Protein Conformation
  • Microbiology
  • Metagenomics
  • Metagenome
  • General Science & Technology
  • Databases, Protein
  • Cluster Analysis
 

Citation

APA

Pavlopoulos, G. A., Baltoumas, F. A., Liu, S., Selvitopi, O., Camargo, A. P., Nayfach, S., … Kyrpides, N. C. (2023). Unraveling the functional dark matter through global metagenomics. Nature, 622(7983), 594–602. https://doi.org/10.1038/s41586-023-06583-7

Chicago

Pavlopoulos, Georgios A., Fotis A. Baltoumas, Sirui Liu, Oguz Selvitopi, Antonio Pedro Camargo, Stephen Nayfach, Ariful Azad, et al. “Unraveling the functional dark matter through global metagenomics.” Nature 622, no. 7983 (October 2023): 594–602. https://doi.org/10.1038/s41586-023-06583-7.

ICMJE

Pavlopoulos GA, Baltoumas FA, Liu S, Selvitopi O, Camargo AP, Nayfach S, et al. Unraveling the functional dark matter through global metagenomics. Nature. 2023 Oct;622(7983):594–602.

MLA

Pavlopoulos, Georgios A., et al. “Unraveling the functional dark matter through global metagenomics.” Nature, vol. 622, no. 7983, Oct. 2023, pp. 594–602. Epmc, doi:10.1038/s41586-023-06583-7.

NLM

Pavlopoulos GA, Baltoumas FA, Liu S, Selvitopi O, Camargo AP, Nayfach S, Azad A, Roux S, Call L, Ivanova NN, Chen IM, Paez-Espino D, Karatzas E, Novel Metagenome Protein Families Consortium, Iliopoulos I, Konstantinidis K, Tiedje JM, Pett-Ridge J, Baker D, Visel A, Ouzounis CA, Ovchinnikov S, Buluç A, Kyrpides NC. Unraveling the functional dark matter through global metagenomics. Nature. 2023 Oct;622(7983):594–602.