Aronow's Lab
Bruce J. Aronow, Ph.D.

The Bioinformatics Core for the University of Cincinnati Comparative Mouse Genome Centers Consortium (CMGCC) is pursuing several aims to assist researchers (i) in designing and analyzing gene and protein expression experiments and (ii) in evaluating their experimental data in the context of other relevant expression profiles and other genetic data (e.g., genotype and sequence data). It is also building three databases that can support efforts to create and understand mouse models generated by the entire consortium. The three principal systems are:

GeneServer: Information Server for Genes of High Interest to CMGCC Investigators
The goal of the GeneServer database is to act as a repository and lookup source that integrates information about genes, transcripts, proteins, and pathways. This is being approached in successive steps:
Comparative genomics analysis tool to find highly conserved genomic regions that may contain functional domains. This has been pursued mainly to identify critical promoter and enhancer elements through the comparative genomics of cis-elements within conserved sequences.
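To illustrate the general idea of locating conserved regions, the following Python sketch scans two pre-aligned sequences with a sliding window and reports stretches whose identity meets a threshold. It is only a minimal sketch under stated assumptions: the function name, the window size, the identity cutoff, and the use of `-` as the gap character are hypothetical choices for illustration and are not taken from the Core's actual tool.

```python
def conserved_windows(aligned_a, aligned_b, window=20, min_identity=0.8):
    """Return merged (start, end) half-open intervals of alignment columns
    covered by at least one window whose identity is >= min_identity.

    Both inputs are assumed to be rows of the same pairwise alignment,
    with '-' marking gaps (an assumption made for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aligned_a)
    # 1 where the column holds an identical, non-gap base; 0 otherwise.
    match = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    regions = []
    if n < window:
        return regions
    count = sum(match[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += match[start + window - 1] - match[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent window: extend the last region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    seq_a = "ACGTACGTAC" + "TTTTTTTTTT"
    seq_b = "ACGTACGTAC" + "GGGGGGGGGG"
    print(conserved_windows(seq_a, seq_b, window=5, min_identity=1.0))
```

Conserved intervals found this way would then be candidates for a closer search for promoter and enhancer elements; a real analysis would start from a genomic alignment rather than the toy strings used here.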
GENET
Through Genet, the Gene Expression Data Server, we have begun to make available our published and some unpublished gene expression data for web viewing, searching, re-analysis, and download. The system is available at http://genet.chmcc.org. We have now placed on this web server data on the effects of Rb delta CK (PSMRB), which accompany a submitted manuscript (Markey et al., from the Knudsen laboratory); these data are available by logging in with username CMGCC and password CMGCC. We are planning to add further data pertinent to DNA damage and cancer models. The overall goal of these efforts is to identify gene responses that can predict the likely harmful consequences of introducing human polymorphism substitutions into the mouse genome. We have also completed the incorporation of a very large dataset of 350 microarrays from children with a variety of leukemia types in which different oncogenes are active.

Identifying differentially expressed genes in multifactorial experiments
We have developed a complete strategy for analyzing microarray data generated by the Genomics Core. The strategy incorporates curvilinear within-array and between-array normalization approaches that effectively remove systematic biases in the data. Linear-model-based statistical analysis is then applied to the processed data, which allows us to use information from the whole experiment to identify genes whose expression is affected by various combinations of treatments. Because such analyses involve fitting thousands of ANOVA models, we developed SAS programs that draw on the extensive linear-model capabilities of that statistical package. We also developed Perl routines that pre-process the raw data generated by the Genomics Core into a format that SAS can read directly. Complete processing of a 15-array, 3-factor experiment, including data pre-processing, comprehensive normalization, fitting of all applicable ANOVA models, calculation of various measures of statistical significance (e.g., individual p-values and False Discovery Rate based significance measures), generation of comprehensive model and data quality diagnostics, and merging of gene annotations, takes about 1 hour of computing time on a high-end PC workstation. Depending on the preferences of individual investigators, the outputs are then uploaded to GeneSpring or transferred to Microsoft Excel and/or other electronic formats.

The linear model (i.e., Analysis of Variance) approach allows us to reduce the cost of multifactorial experiments by reducing the number of combinations of experimental factors that need to be directly compared on a microarray. An appropriate experimental design is crucial for one's ability to identify statistically significant changes. To choose an optimal design, investigators whose microarray experiments are subsidized by CMGCC are required to consult a CMGCC member with experience in designing such experiments before conducting the experiment.
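As an illustration of the False Discovery Rate step in the analysis strategy for multifactorial experiments, the following Python sketch applies the Benjamini-Hochberg adjustment to a list of per-gene p-values, such as those produced by fitting one ANOVA model per gene. The choice of Benjamini-Hochberg, the function name, the gene labels, and the 0.05 cutoff are assumptions made for illustration; the Core's pipeline is implemented in SAS, and its specific FDR procedure is not stated here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Gene indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical gene labels and p-values, for illustration only.
    genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
    p_vals = [0.01, 0.04, 0.03, 0.20]
    q_vals = benjamini_hochberg(p_vals)
    significant = [g for g, q in zip(genes, q_vals) if q <= 0.05]
    print(significant)
```

Controlling the false discovery rate rather than testing each gene at a fixed p-value threshold matters here because thousands of models are fitted at once, so some small p-values are expected by chance alone.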
We are exploring mechanisms for mining the leukemia microarray data for gene functions in cell cycle control and DNA damage that are related to oncogenic pathways.

PathMaker Objects:
Components:
Bioinformatics Projects