Missing data in the 2 x 2 table: patterns and likelihood-based analysis for cross-sectional studies with supplemental sampling.

Published

Journal Article

Standard measures of crude association in the context of a cross-sectional study are the risk difference, relative risk and odds ratio as derived from a 2 x 2 table. Most such studies are subject to missing data on disease, exposure, or both, introducing bias into the usual complete-case analysis. We describe several scenarios distinguished by the manner in which missing data arise, and for each we adjust the natural multinomial likelihood to properly account for missing data. The situations presented allow for increasing levels of generality with regard to the missing data mechanism. The final case, quite conceivable in epidemiologic studies, assumes that the probability of missing exposure depends on true exposure and disease status, as well as upon whether disease status is missing (and conversely for the probability of missing disease information). When parameters relating to the missing data process are inestimable without strong assumptions, we propose maximum likelihood analysis subsequent to collecting supplemental data in the spirit of a validation study. Analytical results give insight into the bias inherent in complete-case analysis for each scenario, and numerical results illustrate the performance of likelihood-based point and interval estimates in the most general case. Adjustment for potential confounders via stratified analysis is also discussed.
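As a minimal illustration of the three crude measures named in the abstract, and of why complete-case analysis can be biased, the Python sketch below computes the risk difference, relative risk and odds ratio from a 2 x 2 table, then recomputes them after thinning each cell by a cell-specific probability of being fully observed. The cell counts, the observation probabilities and the function name `crude_measures` are hypothetical choices made for this example and are not taken from the article; the sketch does not implement the article's likelihood-based adjustment.

```python
from fractions import Fraction


def crude_measures(a, b, c, d):
    """Crude association measures from a 2 x 2 table.

    a: exposed and diseased        b: exposed and not diseased
    c: unexposed and diseased      d: unexposed and not diseased
    """
    risk_exposed = Fraction(a) / (a + b)
    risk_unexposed = Fraction(c) / (c + d)
    return {
        "RD": risk_exposed - risk_unexposed,   # risk difference
        "RR": risk_exposed / risk_unexposed,   # relative risk
        "OR": Fraction(a * d) / (b * c),       # odds ratio
    }


# Hypothetical full-data cell counts (a, b, c, d); not from the article.
full_counts = (40, 60, 20, 80)

# Hypothetical probability that a subject in each cell has both exposure and
# disease recorded. Here only exposed, diseased subjects are prone to
# missingness, so missingness depends on true exposure and disease status.
obs_prob = (Fraction(1, 2), Fraction(1), Fraction(1), Fraction(1))

# Expected complete-case counts: each cell thinned by its observation probability.
complete_case_counts = tuple(n * p for n, p in zip(full_counts, obs_prob))

full = crude_measures(*full_counts)
complete_case = crude_measures(*complete_case_counts)

for name in ("RD", "RR", "OR"):
    print(name, "full data:", full[name], "complete cases:", complete_case[name])
```

Under these assumed values all three measures move toward the null in the complete cases, because the cell that loses subjects is the exposed, diseased one. With a different pattern of observation probabilities the distortion could go in either direction.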

Cited Authors

  • Lyles, RH; Allen, AS

Published Date

  • February 28, 2003

Published In

Volume / Issue

  • 22 / 4

Start / End Page

  • 517 - 534

PubMed ID

  • 12590411

PubMed Central ID

  • 12590411

International Standard Serial Number (ISSN)

  • 0277-6715

Digital Object Identifier (DOI)

  • 10.1002/sim.1348

Language

  • eng

Conference Location

  • England