Scholars@Duke publication: On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data.

Publication, Journal Article

Molenberghs, G; Kenward, MG; Aerts, M; Verbeke, G; Tsiatis, AA; Davidian, M; Rizopoulos, D

Published in: Stat Methods Med Res

February 2014

The vast majority of settings for which frequentist statistical properties are derived assume a fixed, a priori known sample size. Familiar properties then follow, such as the consistency, asymptotic normality, and efficiency of the sample average for the mean parameter, under a wide range of conditions. We are concerned here with the alternative situation in which the sample size is itself a random variable which may depend on the data being collected. Further, the rule governing the sample size may be deterministic or probabilistic. There are many important practical examples of such settings, including missing data, sequential trials, and informative cluster size. It is well known that special issues can arise when evaluating the properties of statistical procedures under such sampling schemes, and much has been written about specific areas (Grambsch P. Sequential sampling based on the observed Fisher information to guarantee the accuracy of the maximum likelihood estimator. Ann Stat 1983; 11: 68-77; Barndorff-Nielsen O and Cox DR. The effect of sampling rules on likelihood statistics. Int Stat Rev 1984; 52: 309-326). Our aim is to place these various related examples into a single framework derived from the joint modeling of the outcomes and sampling process and so derive generic results that in turn provide insight, and in some cases practical consequences, for different settings. It is shown that, even in the simplest case of estimating a mean, some of the results appear counterintuitive. In many examples, the sample average may exhibit small sample bias and, even when it is unbiased, may not be optimal. Indeed, there may be no minimum variance unbiased estimator for the mean. Such results follow directly from key attributes such as non-ancillarity of the sample size and incompleteness of the minimal sufficient statistic formed by the sample size and sample sum.
Although our results have direct and obvious implications for estimation following group sequential trials, there are also ramifications for a range of other settings, such as random cluster sizes, censored time-to-event data, and the joint modeling of longitudinal and time-to-event data. Here, we use the simplest group sequential setting to develop and explicate the main results. Some implications for random sample sizes and missing data are also considered. Consequences for other related settings will be considered elsewhere.
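
The small-sample bias of the sample average under a data-dependent sample size can be illustrated with a short simulation. The sketch below is only an illustration under assumptions that are not taken from the paper: outcomes are N(mu, 1), there are two stages of n observations each, and the trial stops after the first stage whenever the first-stage sum is positive (a hypothetical deterministic rule). The function names are likewise hypothetical. For mu = 0 the expected value of the sample average under this rule works out to 1 / (2 * sqrt(2 * pi * n)), which the code uses as a reference value.

```python
import math
import random


def simulate_sample_average(mu, n, reps, seed=0):
    """Monte Carlo mean of the sample average under a two-stage stopping rule.

    Assumed (hypothetical) design: draw n observations from N(mu, 1); stop
    if their sum is positive, otherwise draw n more. The estimator is the
    plain average of all observations actually collected.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        first_sum = sum(rng.gauss(mu, 1.0) for _ in range(n))
        if first_sum > 0:
            # Trial stops early: realised sample size is n.
            estimate = first_sum / n
        else:
            # Trial continues: realised sample size is 2n.
            second_sum = sum(rng.gauss(mu, 1.0) for _ in range(n))
            estimate = (first_sum + second_sum) / (2 * n)
        total += estimate
    return total / reps


def exact_bias_at_zero(n):
    """Closed-form bias of the sample average when mu = 0 under the rule above."""
    return 1.0 / (2.0 * math.sqrt(2.0 * math.pi * n))


if __name__ == "__main__":
    for n in (1, 4, 16):
        simulated = simulate_sample_average(mu=0.0, n=n, reps=200_000, seed=1)
        print(n, round(simulated, 4), round(exact_bias_at_zero(n), 4))
```

Under these assumptions the bias is positive because early stopping is triggered by large first-stage sums, and it shrinks at rate 1 / sqrt(n), which is consistent with describing it as a small-sample effect. With n = 1 the closed-form value is about 0.199.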

Duke Scholars

Marie Davidian (Author), Biostatistics & Bioinformatics, Division of Biostatistics

Published In

Stat Methods Med Res

DOI

10.1177/0962280212445801

EISSN

1477-0334

Publication Date

February 2014

Volume

23

Issue

1

Start / End Page

11 / 41

Location

England

Related Subject Headings

Statistics & Probability
Sample Size
Probability
Models, Statistical
Likelihood Functions
4905 Statistics
4202 Epidemiology
1117 Public Health and Health Services
0104 Statistics

Citation

APA

Molenberghs, G., Kenward, M. G., Aerts, M., Verbeke, G., Tsiatis, A. A., Davidian, M., & Rizopoulos, D. (2014). On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data. Stat Methods Med Res, 23(1), 11–41. https://doi.org/10.1177/0962280212445801

Chicago

Molenberghs, Geert, Michael G. Kenward, Marc Aerts, Geert Verbeke, Anastasios A. Tsiatis, Marie Davidian, and Dimitris Rizopoulos. “On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data.” Stat Methods Med Res 23, no. 1 (February 2014): 11–41. https://doi.org/10.1177/0962280212445801.

ICMJE

Molenberghs G, Kenward MG, Aerts M, Verbeke G, Tsiatis AA, Davidian M, et al. On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data. Stat Methods Med Res. 2014 Feb;23(1):11–41.

MLA

Molenberghs, Geert, et al. “On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data.” Stat Methods Med Res, vol. 23, no. 1, Feb. 2014, pp. 11–41. Pubmed, doi:10.1177/0962280212445801.

NLM

Molenberghs G, Kenward MG, Aerts M, Verbeke G, Tsiatis AA, Davidian M, Rizopoulos D. On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data. Stat Methods Med Res. 2014 Feb;23(1):11–41.
