Research Interests
My research interests fall broadly within biological and biomedical data science, in particular making data and knowledge more findable, accessible, interoperable, and reusable ("FAIR"). For more than 15 years, my research program has focused on making the vast body of observational data expressed in natural-language descriptions fully computable through knowledge representation and discovery technologies, in particular ontologies and machine reasoning. This work has been funded by the National Science Foundation (NSF) through a series of collaborative grants, as part of which I design and build eScience infrastructure, including data standards, ontologies, data integration systems, and reusable programming interfaces and tools.
Specifically, I am one of the PIs of Phenoscape, a collaborative project of more than 10 years that aims to make evolutionary phenotype descriptions amenable to large-scale computation and reuse by allowing machines to understand their semantics. Within the recently created HDR Imageomics Institute, where I am part of the leadership team, my research centers on fully reproducible and automated machine learning (ML) workflows and on making structured knowledge available to ML algorithms. I was also a PI of the Phyloreferencing project, a collaborative effort to enable machines to understand phylogenetic clade definitions and to use them for reproducible computational integration of taxon-linked data.
Selected Grants
HDR Institute: Imageomics: A new frontier of biological information powered by knowledge-guided machine learning
Research · Principal Investigator · Awarded by Ohio State University · 2021 - 2026
Collaborative Research: ABI Innovation: Enabling machine-actionable semantics for comparative analyses of trait evolution
Research · Principal Investigator · Awarded by National Science Foundation · 2017 - 2023
HARDAC-M: Enabling memory-intensive computation for genomics
Equipment · Principal Investigator · Awarded by North Carolina Biotechnology Center · 2020 - 2021
Collaborative Research: ABI Innovation: An Ontology-Based system for Querying Life in a Post-Taxonomic Age
Research · Principal Investigator · Awarded by National Science Foundation · 2015 - 2020
Phenoscape Knowledgebase Interop Codefest
Conference · Principal Investigator · Awarded by University of South Dakota · 2017 - 2018
External Relationships
- Amazon, Inc
- Dryad Digital Data Repository
- Open Bioinformatics Foundation (OBF)
- Phoenix Bioinformatics
- Ronin Institute for Independent Scholarship
This faculty member (or a member of their immediate family) has reported outside activities with the companies, institutions, or organizations listed above. This information is available to institutional leadership and, when appropriate, management plans are in place to address potential conflicts of interest.