Coupling optional Pólya trees and the two sample problem

Journal Article

Testing and characterizing the difference between two data samples is of fundamental interest in statistics. Existing methods such as the Kolmogorov-Smirnov and Cramér-von Mises tests do not scale well as the dimensionality increases and provide no easy way to characterize the difference should it exist. In this work, we propose a theoretical framework for inference that addresses these challenges in the form of a prior for Bayesian nonparametric analysis. The new prior is constructed based on a random-partition-and-assignment procedure similar to the one that defines the standard optional Pólya tree distribution, but has the ability to generate multiple random distributions jointly. These random probability distributions are allowed to "couple," that is, to have the same conditional distribution, on subsets of the sample space. We show that this "coupling optional Pólya tree" prior provides a convenient and effective way both to test for a two-sample difference and to learn the underlying structure of that difference. In addition, we discuss some practical issues in the computational implementation of this prior and provide several numerical examples to demonstrate how it works. Supplementary materials for this article are available online. © 2011 American Statistical Association.
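The notion of "coupling" described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' construction: it assumes a fixed dyadic partition of [0, 1) in place of the random partition, leaves out any stopping rule, and uses a single coupling probability at every node. The function name `coupled_pair` and the parameters `rho` (coupling probability) and `a` (Beta parameter for the mass split) are hypothetical and chosen only for this example.

```python
import random


def coupled_pair(depth, rho=0.5, a=1.0, rng=None):
    """Draw two random distributions on the 2**depth dyadic bins of [0, 1).

    Simplified illustration only: fixed dyadic partition, no stopping rule.
    At each node the pair couples with probability rho. Once coupled, both
    distributions share the same conditional distribution on that node.
    """
    rng = rng or random.Random()

    def cond(d, coupled):
        # Conditional bin masses of both distributions inside the current node.
        if d == 0:
            return [1.0], [1.0]
        # A coupled node stays coupled on all of its descendants.
        coupled = coupled or rng.random() < rho
        # Fraction of the node's mass assigned to its left child.
        t1 = rng.betavariate(a, a)
        t2 = t1 if coupled else rng.betavariate(a, a)
        l1, l2 = cond(d - 1, coupled)
        r1, r2 = cond(d - 1, coupled)
        p = [t1 * m for m in l1] + [(1 - t1) * m for m in r1]
        q = [t2 * m for m in l2] + [(1 - t2) * m for m in r2]
        return p, q

    return cond(depth, False)
```

With `rho=1.0` the pair couples at the root, so the two distributions coincide on every bin. With `rho=0.0` they never couple and are drawn independently. Intermediate values yield pairs that agree conditionally on some subsets of [0, 1) and differ elsewhere, which is the kind of structure the abstract refers to when it speaks of learning where two samples differ.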

Cited Authors

  • Ma, L; Wong, WH

Published Date

  • December 1, 2011

Published In

  • Journal of the American Statistical Association

Volume / Issue

  • 106 / 496

Start / End Page

  • 1553 - 1565

International Standard Serial Number (ISSN)

  • 0162-1459

Digital Object Identifier (DOI)

  • 10.1198/jasa.2011.tm10003

Citation Source

  • Scopus