Common to rare transfer learning (CORAL) enables inference and prediction for a quarter million rare Malagasy arthropods.
DNA-based biodiversity surveys produce massive-scale data covering up to millions of species, most of which are rare. Making the most of such data for inference and prediction requires modeling approaches that can relate species occurrences to environmental and spatial predictors while incorporating information about their taxonomic or phylogenetic placement. Although the scalability of joint species distribution models to large communities has greatly advanced, incorporating hundreds of thousands of species has not been feasible to date, leading to compromised analyses. Here we present a 'common to rare transfer learning' (CORAL) approach, which borrows information from the common species to enable statistically and computationally efficient modeling of both common and rare species. We illustrate that CORAL leads to much improved prediction and inference in the context of DNA metabarcoding data from Madagascar, comprising 255,188 arthropod species detected in 2,874 samples.
Duke Scholars
Related Subject Headings
- Phylogeny
- Madagascar
- Machine Learning
- Developmental Biology
- DNA Barcoding, Taxonomic
- Biodiversity
- Arthropods
- Animals
- 31 Biological sciences
- 11 Medical and Health Sciences