References for Microarray Data Analysis
PubH 5470-2 (Spring 2002)
(With comments in parentheses)
http://www.biostat.umn.edu/~weip/course/ge/ref02s.html
- Introduction to microarray technologies
- Brown P and Botstein D. Exploring the new world of the
genome with DNA microarrays. Nature Genetics
Supplement 21:33-37, 1999.
(A general introduction to cDNA array technology and its
applications)
- Duggan DJ, Bittner M, Chen Y, Meltzer P and Trent JM.
Expression profiling using cDNA microarrays.
Nature Genetics Supplement 21:10-14, 1999.
(Intro to cDNA arrays)
- Lipshutz RJ, Fodor SPA, Gingeras TR and Lockhart DJ.
High density synthetic oligonucleotide arrays.
Nature Genetics Supplement 21:20-24, 1999.
(Intro to Affy oligonucleotide arrays)
- Affymetrix Inc. Statistical Algorithms Reference Guide.
(Intro to the algorithms used to summarize gene
expression levels in Affymetrix Microarray Suite version 5.0)
- Li C and Wong WH. Model-based analysis of oligonucleotide
arrays: expression index computation and outlier detection.
PNAS, 98:31-36, 2001.
(Use of a multiplicative model to summarize expression levels
for Affy arrays)
- Li C and Wong WH. Model-based analysis of oligonucleotide
arrays: model validation, design issues and standard error
application.
Genome Biology,
2(8):research0032, 2001.
(Further development and more numerical evaluations)
- Detecting differentially expressed genes
- Chen Y, Dougherty ER and Bittner ML. Ratio-based decisions
and the quantitative analysis of cDNA microarray images.
J Biomedical Optics, 2:364-367, 1997.
(Probably the earliest paper on statistical analysis of
array data; use of the Wilcoxon nonparametric test;
proposed Normal-based parametric models)
- Newton M et al. On differential variability of expression
ratios: improving statistical inference about gene
expression changes from microarray data. Journal of
Computational Biology, 8:37-52, 2001.
(Online access at U of M)
(Parametric Bayesian approach, w/o replications)
- Lin Y, Nadler ST, Attie AD and Yandell BS. Mining for
low-abundance transcripts in microarray data.
(Nonparametric approach, w/o replications)
- Kerr MK, Martin M and Churchill GA. Analysis of variance
for gene expression microarray data.
Journal of Computational Biology, 7:819-837, 2000.
(Online access at U of M)
(Use of ANOVA model)
- Kerr MK et al. Statistical analysis of a gene expression
microarray experiment with replication.
To appear in Statistica Sinica, 2002.
- Dudoit S, Yang YH, Callow MJ and Speed TP. Statistical
methods for identifying differentially expressed genes
in replicated cDNA microarray experiments.
Statistica Sinica, 12:111-139, 2002.
(Use of loess curve to center;
permutation test using t-statistic; adjustment for multiple
tests)
- Ideker T, Thorsson V, Siegel AF and Hood LE. Testing
for differentially-expressed genes by maximum likelihood analysis of
microarray data. Journal of Computational Biology, 7:805-817, 2000.
(Online access at U of M)
(A Normal-based linear regression approach)
- Thomas JG, Olson JM, Tapscott SJ and Zhao LP. An
efficient and robust statistical modeling approach to discover differentially
expressed genes using genomic expression profiles.
Genome Research, 11:1227-1236, 2001.
(A regression approach using the robust/sandwich estimator)
- Wolfinger RD et al. Assessing gene significance from
cDNA microarray expression data via mixed models.
Journal of Computational Biology, 8:625-637, 2001.
(Online access at U of M)
(Use of Normal-based
linear mixed models for normalization and for detecting
differential expression)
- Tusher VG, Tibshirani R and Chu G. Significance analysis
of microarrays applied to the ionizing radiation response.
PNAS, 98:5116-5121, 2001.
(Two conditions with replications; SAM:
use permutation-type tests
and FDR to control for multiplicity)
- Efron B, Tibshirani R, Goss V and Chu G. Microarrays and
their use in a comparative experiment. 2000.
(A modified/newer version: Efron B, Tibshirani R, Storey JD
and Tusher V. Empirical Bayes analysis of a microarray
experiment.
Journal of the American Statistical Association, 96:1151-1160, 2001)
(Empirical Bayesian and Frequentist approaches, with
replications)
- Pan W. A Comparative Review of Statistical Methods for
Discovering Differentially Expressed Genes in Replicated
Microarray Experiments.
To appear in Bioinformatics. Also
Research Report 2001-028,
Division of Biostatistics, University of Minnesota, 2001.
(Compared the t-test, the Wilcoxon rank test,
the robust regression of Thomas et al, the EB of Efron et al,
the SAM of Tusher et al, and the mixture model of Pan et al)
- Pavlidis P and Noble WS. Analysis of strain and region variation
in gene expression in mouse brain.
Genome Biology,
2(10):research0042, 2001.
(Normal-based two-way ANOVA for two factors with
possibly more than two categories)
- Baldi P and Long AD. A Bayesian framework for the analysis of
microarray expression data: regularized t-test and statistical
inferences of gene changes.
Bioinformatics, 17:509-519, 2001.
(Online access at U of M)
(Parametric Bayesian approach to t-test)
- Clustering: hierarchical, K-means, SOM and model-based clustering
- Eisen M, Spellman P, Brown P and Botstein D. Cluster
analysis and display of genome-wide expression patterns.
PNAS, 95:14863-14868, 1998.
(Hierarchical clustering)
- Tavazoie et al. Systematic determination of genetic
network architecture. Nature Genetics, 22:281-285,
1999.
(K-means clustering)
- Tamayo et al. Interpreting patterns of gene expression
with self-organizing maps: methods and application to
hematopoietic differentiation.
PNAS, 96:2907-2912, 1999.
(SOM clustering)
- Zhang K and Zhao H. Assessing reliability of gene clusters
from gene expression data.
Funct Integr Genomics, 1:156-173, 2000.
- Kerr MK and Churchill GA. Bootstrapping cluster analysis:
assessing the reliability of conclusions from microarray
experiments.
PNAS, 98:8961-8965, 2001.
- Tibshirani R, Walther G, Botstein D and Brown P.
Cluster validation by prediction strength.
- Lee M-L T, Kuo FC, Whitmore GA and Sklar J. Importance of
replication in microarray gene expression studies:
statistical methods and evidence from repetitive cDNA
hybridizations. PNAS,
97:9834-9839, 2000.
(One condition with replications: A mixture of two normals)
- Pan W, Lin J and Le C. Model-based cluster analysis of microarray
gene expression data.
Genome Biology,
3(2): research0009.1-0009.8, 2002.
(Model-based clustering of t-statistics to
explore differential gene expression)
- Ghosh D and Chinnaiyan AM. Mixture modelling of gene expression
data from microarray experiments. Bioinformatics,
18:275-286, 2002.
(Online access at U of M)
(Model-based clustering of gene expression patterns)
- Bhattacharjee et al. Classification of human lung carcinomas
by mRNA expression profiling reveals distinct adenocarcinoma
subclasses.
PNAS, 98:13790-13795, 2001.
- Hastie T et al. 'Gene shaving' as a method for identifying
distinct sets of genes with similar expression patterns.
Genome Biology,
1(2):research0003.1-0003.21, 2000.
- Hastie T et al. Supervised harvesting of expression trees.
Genome Biology,
2(1):research0003.1-0003.12, 2001.
- Li H and Hong F.
Cluster-Rasch models for microarray gene expression data.
Genome Biology,
2(8):research0031.1-0031.13, 2001.
- Classification: discriminant analysis
- Golub T et al. Molecular classification of cancer: class
discovery and class prediction by gene expression
monitoring. Science, 286:531-537, 1999.
(Proposed a weighted voting algorithm)
- Slonim DK, Tamayo P, Mesirov JP, Golub TR and Lander ES.
Class prediction and discovery using gene expression data.
(A more detailed description of the methods of Golub et al.)
- S. Dudoit, J. Fridlyand, and T. P. Speed.
Comparison of Discrimination Methods for the Classification
of Tumors Using Gene Expression Data. June 2000.
- Radmacher MD, McShane LM and Simon R. A paradigm for class
prediction using gene expression profiles.
Technical Report 001, National Cancer Institute.
(Assessing the "significance" of classification results using
a breast cancer data set)
- Hedenfalk I et al. Gene-expression profiles in hereditary breast
cancer. New England Journal of Medicine, 344:539-548, 2001.
(More scientific background and results using the
previous breast cancer data set)