
Subset clustering of binary sequences, with an application to genomic abnormality data.

Publication, Journal Article
Hoff, PD
Published in: Biometrics
December 2005

This article develops a model-based approach to clustering multivariate binary data, in which the attributes that distinguish a cluster from the rest of the population may depend on the cluster being considered. The clustering approach is based on a multivariate Dirichlet process mixture model, which allows for the estimation of the number of clusters, the cluster memberships, and the cluster-specific parameters in a unified way. Such a clustering approach has applications in the analysis of genomic abnormality data, in which the development of different types of tumors may depend on the presence of certain abnormalities at subsets of locations along the genome. Additionally, such a mixture model provides a nonparametric estimation scheme for dependent sequences of binary data.
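As a rough illustration of the kind of model the abstract describes, the sketch below simulates binary sequences from a Dirichlet process mixture in which each cluster differs from the population only on its own subset of attributes. It is not the article's model or its estimation algorithm. The function names (`crp_assignments`, `simulate`), the 25% chance that an attribute is cluster-specific, and the uniform ranges for the Bernoulli rates are all assumptions made for this example.

```python
import random


def crp_assignments(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels, sizes = [], []
    for _ in range(n):
        # An existing cluster k has weight sizes[k]; a new cluster has weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(0)  # open a new cluster
        sizes[k] += 1
        labels.append(k)
    return labels


def simulate(n=200, m=12, alpha=1.0, seed=0):
    """Simulate n binary sequences of length m (illustrative assumptions only)."""
    rng = random.Random(seed)
    # Population-level success probability for each attribute (assumed range).
    base = [rng.uniform(0.05, 0.30) for _ in range(m)]
    labels = crp_assignments(n, alpha, rng)
    n_clusters = max(labels) + 1

    subsets, rates = [], []
    for _ in range(n_clusters):
        # Each cluster gets its own subset of distinguishing attributes.
        subset = {j for j in range(m) if rng.random() < 0.25}
        theta = list(base)
        for j in subset:
            theta[j] = rng.uniform(0.60, 0.95)  # cluster-specific rate (assumed)
        subsets.append(subset)
        rates.append(theta)

    # Each observation is a vector of independent Bernoulli draws given its cluster.
    data = [[int(rng.random() < rates[k][j]) for j in range(m)] for k in labels]
    return labels, subsets, data


if __name__ == "__main__":
    labels, subsets, data = simulate()
    print("number of clusters:", len(subsets))
    for k, subset in enumerate(subsets):
        print("cluster", k, "size", labels.count(k), "attributes", sorted(subset))
```

The number of clusters is not fixed in advance here: it emerges from the Chinese restaurant process draw and tends to grow with `alpha` and `n`, which is the property that lets a Dirichlet process mixture treat the number of clusters as something to estimate.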

Published In

Biometrics

DOI

10.1111/j.1541-0420.2005.00381.x

EISSN

1541-0420

ISSN

0006-341X

Publication Date

December 2005

Volume

61

Issue

4

Start / End Page

1027 / 1036

Related Subject Headings

  • Statistics & Probability
  • Monte Carlo Method
  • Models, Statistical
  • Models, Genetic
  • Markov Chains
  • Humans
  • Genome, Human
  • Cluster Analysis
  • Chromosome Aberrations
  • Carcinoma, Renal Cell
 

Citation

APA
Hoff, P. D. (2005). Subset clustering of binary sequences, with an application to genomic abnormality data. Biometrics, 61(4), 1027–1036. https://doi.org/10.1111/j.1541-0420.2005.00381.x

Chicago
Hoff, Peter D. “Subset clustering of binary sequences, with an application to genomic abnormality data.” Biometrics 61, no. 4 (December 2005): 1027–36. https://doi.org/10.1111/j.1541-0420.2005.00381.x.

MLA
Hoff, Peter D. “Subset clustering of binary sequences, with an application to genomic abnormality data.” Biometrics, vol. 61, no. 4, Dec. 2005, pp. 1027–36. Epmc, doi:10.1111/j.1541-0420.2005.00381.x.