Fall 2006

Dec 01, 2006: Tracy Bergemann

Building Models for Case-Parent Triad Data

Studies that genotype individuals within nuclear families are now widespread. Generally, samples are drawn from an affected offspring, manifesting a disease or phenotype of interest, as well as from the parents [Ashan H et al, 2002]. In my collaborations, we are applying this design to a genetic study of adolescent osteosarcoma patients. We will genotype tagSNPs and cSNPs within each of twelve candidate gene regions and collect exposure information for three different variables. We will test for association not only with single SNPs but also with any possible gene-gene and gene-environment interactions. Hence the number of potential models to fit is quite large. I will discuss how to build these models and the issues involved in characterizing biological phenomena for this study design. I will also introduce an extension of the Bayesian Information Criterion that incorporates information specific to genetic data.

Nov 10, 2006: LeeAnn Higgins

Signal vs. Noise and Other Inherent Challenges in Protein Mass Spectral Data Interpretation

The role of mass spectrometry in proteomics projects escalated after the optimization and implementation of two protein-friendly ionization techniques within the last decade. Applications of protein mass spectrometry quickly broadened to include disease biomarker investigation and potential use in the diagnosis of diseases. During these phases, large amounts of data have been generated and analyzed by one or more software programs that provide reports and summaries for the end user. Software output is typically accompanied by statistical analysis in order to provide levels of confidence and measures of random error. Some protein datasets are grouped according to patterns, with the final goal of distinguishing a diseased state from a healthy state. Getting software programs to properly discern mass spectrometric signal from noise, to provide accurate reports, and to accurately assess rates of false positives and false negatives is a computational challenge. The challenges inherent in the interpretation of mass spectrometric data can be met by keeping an open mind to their existence, using proper experimental controls, and properly understanding the software programs used for data analysis.

Nov 03, 2006: Yan Zhang

Challenges and Opportunity in Mass Spectrometry-Based Proteomics Data

Mass spectrometry was not widely used in proteomics studies until two soft ionization methods were introduced about twenty years ago. These two methods are matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI). Despite the advances in proteomics techniques and computational analysis methods over the last few years, biomarker studies still face many challenges due to the complexity of biological samples, the wide dynamic range of protein concentrations (on the order of 10^10), and, more importantly, the gap between the ability to generate large amounts of biological data and the capability to analyze those data.

Here at the University of Minnesota, the mass spectrometry core facility provides state-of-the-art instruments. Recent research, such as the biomarker study in our lab on the diagnosis and early detection of chronic lung transplant rejection, has demonstrated the feasibility of applying statistical analysis to proteomics data. In the near future, further incorporation of sophisticated experimental design and statistical analysis into high-performance mass spectrometry techniques will provide tremendous opportunities for proteomics and biomarker studies.