Statistical Genomics and Spatial Statistics: Incorporating Biological Knowledge of Genes into Analysis of Genomic Data
Wei Pan
Division of Biostatistics
University of Minnesota
Wednesday, December 5, 2007
3:30pm
MoosT 1-450G
Minneapolis Campus
Abstract:
It is a common task in genomic studies to identify a subset of the genes satisfying
certain conditions, such as differentially expressed genes or regulatory target
genes of a transcription factor (TF). This can be formulated as a statistical
hypothesis testing problem. Most existing approaches treat the genes as having
an identical and independent distribution a priori, testing each gene independently
or testing some subsets of the genes one by one. However, genes are known to
work in coordination, as dictated by gene networks. Treating genes
equally and independently ignores the important information contained in gene
networks, leading to inefficient analysis and reduced power. We propose incorporating
gene network information into statistical analysis of genomic data. Specifically,
rather than treating the genes equally and independently a priori in a standard
mixture model, we assume that gene-specific prior probabilities are correlated
as induced by a gene network: while the genes are allowed to have different
prior probabilities, neighboring genes in the network have similar prior
probabilities, reflecting their shared biological functions. We applied both
the standard and the proposed approach to a real ChIP-chip dataset (and to simulated data) to identify
the transcriptional target genes of TF GCN4. The new method was found to be
more powerful in discovering the target genes. This is joint work with Peng
Wei.
A social tea will be held at 3:00 P.M. in A434 Mayo. All are welcome.
For more details contact 612-624-4655 or see http://www.biostat.umn.edu/seminar_academic.html