[Proc Amer Assoc Cancer Res, Volume 47, 2006]
Cellular and Molecular Biology 21: Computational Biology and Bioinformatics
SCIMS: A new algorithm for associating anticancer drug mechanisms with gene ontology categories
Gabriel S. Eichler,
Mark Reimers and
John N. Weinstein
NCI/NIH, Bethesda, MD
Introduction: In recent years, our research group has performed a number of experimental and computational studies that integrate different types of molecular data on the NCI-60, a diverse panel of human cancer cell lines from nine tissues of origin. The NCI-60 panel has been used by the Developmental Therapeutics Program (DTP) of the NCI to screen >100,000 chemical compounds for anticancer activity since 1990. Using Pearson's correlation as a metric, we previously shed light on the mechanisms and potential medical uses of a number of anticancer agents, including L-asparaginase, oxaliplatin, MDR1-inverse compounds, and the ellipticiniums. The present study improves on that type of analysis by computing a more robust multi-gene correlation metric between drug growth inhibition data and biologically associated groups of genes, such as those in Gene Ontology (GO) categories or those with common Protein Family (Pfam) domains. Methods: To quantify the strength of association between anticancer drug activity patterns and mRNA transcriptional profiles, we developed a new algorithm (SCIMS) based on a stratified correlation metric and a shrunken scoring scheme, with statistical significance computed using a false discovery rate (FDR), which adjusts for multiple comparisons. First, we validated the approach on a synthetic dataset of 100 drugs and 2000 genes. Then, we applied it to Affymetrix U133A microarray profiles of the NCI-60 and DTP GI50 data for 118 drugs with putatively known mechanisms of action. Because the program is compute-intensive, we ran it on the NIH's Biowulf supercomputing cluster. Results: The results for the synthetic dataset showed >95% sensitivity and specificity for prediction of known associations.
When applied to the NCI-60 cell line data, the SCIMS algorithm identified a number of robust associations that coincide with previously documented mechanisms of action, including a particularly strong one between the activity patterns of antifols and the DNA processing GO category (FDR = 0.05). Conclusion: The results on the synthetic dataset suggest that the new algorithm is highly sensitive and specific, and its recovery of known mechanisms of action indicates that the algorithm is useful for real data.
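Illustrative sketch: the abstract does not specify the stratified correlation metric or the shrunken scoring scheme, so the following is not the SCIMS algorithm. It is a minimal stand-in, under stated assumptions, for the general idea of scoring a drug activity profile against a group of genes and correcting for multiple comparisons. The assumptions are: the category score is the plain mean of per-gene Pearson correlations, significance comes from permuting the drug profile across cell lines, and the FDR adjustment is the Benjamini-Hochberg procedure. All function names and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def category_score(drug, genes):
    """Assumed score: mean Pearson correlation between the drug profile
    and each gene profile in the category (not the SCIMS metric)."""
    return sum(pearson(drug, g) for g in genes) / len(genes)


def permutation_pvalue(drug, genes, n_perm=200, seed=0):
    """Two-sided p-value obtained by shuffling the drug profile across cell lines."""
    rng = random.Random(seed)
    observed = abs(category_score(drug, genes))
    shuffled = list(drug)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(category_score(shuffled, genes)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Synthetic example (hypothetical data): 60 cell lines, one drug, two gene categories.
rng = random.Random(42)
n_lines = 60
drug = [rng.gauss(0, 1) for _ in range(n_lines)]
categories = {
    # genes that track the drug profile plus noise
    "linked": [[d + rng.gauss(0, 0.5) for d in drug] for _ in range(5)],
    # genes generated independently of the drug profile
    "unlinked": [[rng.gauss(0, 1) for _ in range(n_lines)] for _ in range(5)],
}
scores = {name: category_score(drug, genes) for name, genes in categories.items()}
pvals = [permutation_pvalue(drug, genes) for genes in categories.values()]
qvals = dict(zip(categories, benjamini_hochberg(pvals)))
print(scores)
print(qvals)
```

In this toy setup the category built to track the drug profile should receive a high score and a small adjusted p-value, while the independent category should not. A real analysis would replace the mean correlation with the stratified, shrunken score described in the Methods.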
Copyright © 2006 by the American Association for Cancer Research.