Nonparametric methods for microarray data based on exchangeability and borrowed power

MLT Lee, GA Whitmore, Harry Björkbacka, MW Freeman

Research output: Contribution to journal › Article › peer-review

Abstract

This article proposes nonparametric inference procedures for analyzing microarray gene expression data that are reliable, robust, and simple to implement. They are conceptually transparent and require no special-purpose software. The analysis begins by normalizing gene expression data in a unique way. The resulting adjusted observations consist of gene-treatment interaction terms (representing differential expression) and error terms. The error terms are considered to be exchangeable, which is the only substantial assumption. Thus, under a family null hypothesis of no differential expression, the adjusted observations are exchangeable and all permutations of the observations are equally probable. The investigator may use the adjusted observations directly in a distribution-free test method or use their ranks in a rank-based method, where the ranking is taken over the whole data set. For the latter, the essential steps are as follows: 1. Calculate a Wilcoxon rank-sum difference or a corresponding Kruskal-Wallis rank statistic for each gene. 2. Randomly permute the observations and repeat the previous step. 3. Independently repeat the random permutation a suitable number of times. Under the exchangeability assumption, the permutation statistics are independent random draws from a null cumulative distribution function (c.d.f.) approximated by the empirical c.d.f. Reference to the empirical c.d.f. tells whether the test statistic for a gene is outlying and, hence, shows differential expression. This feature is judged by using an appropriate rejection region or by computing a p-value for each test statistic, taking into account multiple testing. The distribution-free analog of the rank-based approach is also available and has parallel steps, which are described in the article. The proposed nonparametric analysis tends to give good results with no additional refinement, although a few refinements are presented that may interest some investigators.
The implementation is illustrated with a case application involving differential gene expression in wild-type and knockout mice under an E. coli lipopolysaccharide (LPS) endotoxin treatment, relative to a baseline untreated condition.
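
The three rank-based steps in the abstract can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the normalization step is not reproduced (the input is treated as already-adjusted observations), the per-gene statistic is taken to be the difference in mean rank between treated and control arrays, and the permutation statistics are pooled across genes to form the empirical null c.d.f. All function and variable names are hypothetical.

```python
import bisect
import random


def rank_all(values):
    """1-based ranks over a flat list, with average ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def gene_stats(flat_ranks, n_genes, n_ctrl, n_trt):
    """Mean-rank difference (treated minus control) for each gene."""
    per_gene = n_ctrl + n_trt
    stats = []
    for g in range(n_genes):
        row = flat_ranks[g * per_gene:(g + 1) * per_gene]
        stats.append(sum(row[n_ctrl:]) / n_trt - sum(row[:n_ctrl]) / n_ctrl)
    return stats


def permutation_pvalues(data, n_ctrl, n_perm=200, seed=0):
    """data: one list per gene, control values first, then treated values."""
    rng = random.Random(seed)
    n_genes = len(data)
    n_trt = len(data[0]) - n_ctrl
    # Rank over the whole data set, not within each gene.
    ranks = rank_all([x for row in data for x in row])
    observed = gene_stats(ranks, n_genes, n_ctrl, n_trt)  # step 1
    null = []
    for _ in range(n_perm):  # step 3: repeat the permutation
        shuffled = ranks[:]
        rng.shuffle(shuffled)  # step 2: permute all observations
        null.extend(gene_stats(shuffled, n_genes, n_ctrl, n_trt))
    # Assumption: pool |statistic| over genes and permutations as the null.
    null_abs = sorted(abs(s) for s in null)
    pvals = []
    for s in observed:
        n_extreme = len(null_abs) - bisect.bisect_left(null_abs, abs(s))
        pvals.append((n_extreme + 1) / (len(null_abs) + 1))
    return observed, pvals


# Synthetic example: 50 genes, 3 control and 3 treated arrays each.
# Gene 0 is given opposite shifts in the two groups, so it is
# differentially expressed by construction.
rng = random.Random(1)
n_genes, n_ctrl, n_trt = 50, 3, 3
data = [[rng.uniform(-1, 1) for _ in range(n_ctrl + n_trt)]
        for _ in range(n_genes)]
data[0] = [x - 10 for x in data[0][:n_ctrl]] + [x + 10 for x in data[0][n_ctrl:]]
observed, pvals = permutation_pvalues(data, n_ctrl)
print(observed[0], pvals[0])
```

The p-values returned here are unadjusted. As the abstract notes, a multiple-testing procedure still has to be applied before declaring any gene differentially expressed.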
Original language: English
Pages (from-to): 783-797
Journal: Journal of Biopharmaceutical Statistics
Volume: 15
Issue number: 5
Publication status: Published - 2005

Subject classification (UKÄ)

  • Cardiology and Cardiovascular Disease

Free keywords

  • rank methods
  • normalization
  • nonparametric methods
  • multiple testing
  • microarray
  • gene expression
  • false discovery rate
  • distribution-free
  • exchangeable random variables
  • SAM
  • statistical analysis
