Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach.

Published online

Journal Article

BACKGROUND: In testing for differential gene expression involving multiple serial analysis of gene expression (SAGE) libraries, it is critical to account for both between-library and within-library variation. Several methods have been proposed, including the t test, the tw test, and an overdispersed logistic regression approach. The merits of these tests, however, have not been fully evaluated, and it remains an open question whether further improvements can be made. RESULTS: In this article, we introduce an overdispersed log-linear model approach to analyzing SAGE data; we evaluate and compare its performance with three other tests: the two-sample t test, the tw test, and a test based on overdispersed logistic linear regression. Analysis of simulated and real datasets shows that both the log-linear and logistic overdispersion methods generally perform better than the t and tw tests; the log-linear method is further found to perform better than the logistic method, showing equal or higher statistical power over a range of parameter values and with different data distributions. CONCLUSION: Overdispersed log-linear models provide an attractive and reliable framework for analyzing SAGE experiments involving multiple libraries. For convenience, an implementation of this method is provided through a user-friendly web interface available at
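As a rough illustration of what an overdispersed log-linear analysis of one tag can look like, the sketch below fits a quasi-Poisson model with a log link and a log(library size) offset to counts from two groups of libraries. It is not the authors' implementation: the function name, the Pearson-based dispersion estimate, and the Wald-type statistic are assumptions chosen to keep the example short, and the counts are made up.

```python
import math


def quasi_poisson_two_group(counts, sizes, groups):
    """Hypothetical helper: fit log(mu_i) = log(size_i) + b0 + b1 * group_i.

    counts: tag counts, one per library
    sizes:  total tags sequenced per library (used as the offset)
    groups: 0 or 1 for each library
    Returns (b1, phi, se, stat): the log rate ratio, the estimated
    overdispersion, the dispersion-scaled standard error, and b1 / se.
    """
    # With a single binary covariate the Poisson ML fit has a closed form:
    # the fitted rate in each group is total count / total library size.
    y = [0.0, 0.0]
    n = [0.0, 0.0]
    for c, s, g in zip(counts, sizes, groups):
        y[g] += c
        n[g] += s
    if min(y) <= 0:
        raise ValueError("each group needs at least one observed tag")
    rate = [y[0] / n[0], y[1] / n[1]]
    b1 = math.log(rate[1] / rate[0])

    # Between-library variation shows up as a Pearson chi-square that is
    # large relative to its residual degrees of freedom.
    df = len(counts) - 2
    if df <= 0:
        raise ValueError("need more than two libraries to estimate dispersion")
    pearson = sum(
        (c - s * rate[g]) ** 2 / (s * rate[g])
        for c, s, g in zip(counts, sizes, groups)
    )
    phi = pearson / df

    # Poisson variance of b1 is 1/y0 + 1/y1; quasi-likelihood scales it by phi.
    se = math.sqrt(phi * (1.0 / y[0] + 1.0 / y[1]))
    return b1, phi, se, b1 / se


# Made-up counts for one tag in three libraries per group.
b1, phi, se, stat = quasi_poisson_two_group(
    counts=[10, 30, 20, 60, 90, 120],
    sizes=[10000] * 6,
    groups=[0, 0, 0, 1, 1, 1],
)
print(f"log ratio={b1:.3f} dispersion={phi:.2f} se={se:.3f} stat={stat:.2f}")
```

A dispersion estimate well above 1 indicates more variation between libraries than sampling alone would produce, which is why the standard error is inflated before the group effect is judged.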

Cited Authors

  • Lu, J; Tomfohr, JK; Kepler, TB

Published Date

  • June 29, 2005

Published In

Volume / Issue

  • 6 /

Start / End Page

  • 165 -

PubMed ID

  • 15987513

PubMed Central ID

  • 15987513

Electronic International Standard Serial Number (EISSN)

  • 1471-2105

Digital Object Identifier (DOI)

  • 10.1186/1471-2105-6-165


Language

  • eng

Conference Location

  • England