Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

Publication, Journal Article
Zhu, Y; Engström, PG; Tellgren-Roth, C; Baudo, CD; Kennell, JC; Sun, S; Billmyre, RB; Schröder, MS; Andersson, A; Holm, T; Sigurgeirsson, B ...
Published in: Nucleic Acids Res
March 17, 2017

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies.
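The percentage gains quoted in the abstract can be checked directly from the gene counts it reports. The following is a minimal sketch, not part of the publication; the function and variable names are illustrative assumptions, and only the three counts (3612, 4113, 4493) come from the abstract.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Protein-coding gene counts reported in the abstract.
transcriptomics_only = 3612   # annotation from RNA-seq evidence alone
with_proteomics = 4113        # after integrating proteomics evidence
after_curation = 4493         # after manual curation

# Gain from adding proteomics, relative to transcriptomics alone.
proteomics_gain = percent_increase(transcriptomics_only, with_proteomics)

# Further gain from manual curation, relative to the proteogenomic annotation.
curation_gain = percent_increase(with_proteomics, after_curation)

# Combined gain relative to the transcriptomics-only starting point
# (this figure is derived here and is not stated in the abstract).
overall_gain = percent_increase(transcriptomics_only, after_curation)

print(f"proteomics: {proteomics_gain:.1f}%")
print(f"curation:   {curation_gain:.1f}%")
print(f"overall:    {overall_gain:.1f}%")
```

The first two values round to the 14% and 9% stated in the abstract. Each is computed against the count that preceded it, which is why the two steps do not simply add up to the overall gain.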

Published In

Nucleic Acids Res

DOI

10.1093/nar/gkx006

EISSN

1362-4962

Publication Date

March 17, 2017

Volume

45

Issue

5

Start / End Page

2629 / 2643

Location

England

Related Subject Headings

  • Sequence Analysis, RNA
  • Proteogenomics
  • Protein Domains
  • Peptides
  • Molecular Sequence Annotation
  • Malassezia
  • Genome, Mitochondrial
  • Genome, Fungal
  • Genes, Fungal
  • Fungal Proteins
 

Citation

APA
Zhu, Y., Engström, P. G., Tellgren-Roth, C., Baudo, C. D., Kennell, J. C., Sun, S., … Lehtiö, J. (2017). Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis. Nucleic Acids Res, 45(5), 2629–2643. https://doi.org/10.1093/nar/gkx006

Chicago
Zhu, Yafeng, Pär G. Engström, Christian Tellgren-Roth, Charles D. Baudo, John C. Kennell, Sheng Sun, R Blake Billmyre, et al. “Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.” Nucleic Acids Res 45, no. 5 (March 17, 2017): 2629–43. https://doi.org/10.1093/nar/gkx006.

ICMJE
Zhu Y, Engström PG, Tellgren-Roth C, Baudo CD, Kennell JC, Sun S, et al. Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis. Nucleic Acids Res. 2017 Mar 17;45(5):2629–43.

MLA
Zhu, Yafeng, et al. “Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.” Nucleic Acids Res, vol. 45, no. 5, Mar. 2017, pp. 2629–43. PubMed, doi:10.1093/nar/gkx006.

NLM
Zhu Y, Engström PG, Tellgren-Roth C, Baudo CD, Kennell JC, Sun S, Billmyre RB, Schröder MS, Andersson A, Holm T, Sigurgeirsson B, Wu G, Sankaranarayanan SR, Siddharthan R, Sanyal K, Lundeberg J, Nystedt B, Boekhout T, Dawson TL, Heitman J, Scheynius A, Lehtiö J. Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis. Nucleic Acids Res. 2017 Mar 17;45(5):2629–2643.