Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles


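dc.contributor.author Faith, Jeremiah J en_US
dc.contributor.author Hayete, Boris en_US
dc.contributor.author Thaden, Joshua T en_US
dc.contributor.author Mogno, Ilaria en_US
dc.contributor.author Wierzbowski, Jamey en_US
dc.contributor.author Cottarel, Guillaume en_US
dc.contributor.author Kasif, Simon en_US
dc.contributor.author Collins, James J en_US
dc.contributor.author Gardner, Timothy S en_US
dc.date.accessioned 2012-01-09T21:00:16Z
dc.date.available 2012-01-09T21:00:16Z
dc.date.issued 2007-1-9 en_US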
dc.identifier.citation Faith, Jeremiah J, Boris Hayete, Joshua T Thaden, Ilaria Mogno, Jamey Wierzbowski, Guillaume Cottarel, Simon Kasif, James J Collins, Timothy S Gardner. "Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles" PLoS Biology 5(1): e8. (2007) en_US
dc.identifier.issn 1545-7885 en_US
dc.description.abstract Machine learning approaches offer the potential to systematically identify transcriptional regulatory interactions from a compendium of microarray expression profiles. However, experimental validation of the performance of these methods at the genome scale has remained elusive. Here we assess the global performance of four existing classes of inference algorithms using 445 Escherichia coli Affymetrix arrays and 3,216 known E. coli regulatory interactions from RegulonDB. We also developed and applied the context likelihood of relatedness (CLR) algorithm, a novel extension of the relevance networks class of algorithms. CLR demonstrates an average precision gain of 36% relative to the next-best performing algorithm. At a 60% true positive rate, CLR identifies 1,079 regulatory interactions, of which 338 were in the previously known network and 741 were novel predictions. We tested the predicted interactions for three transcription factors with chromatin immunoprecipitation, confirming 21 novel interactions and verifying our RegulonDB-based performance estimates. CLR also identified a regulatory link providing central metabolic control of iron transport, which we confirmed with real-time quantitative PCR. The compendium of expression data compiled in this study, coupled with RegulonDB, provides a valuable model system for further improvement of network inference algorithms using experimental data.

Author Summary

Organisms can adapt to changing environments (becoming more virulent, for example, or activating stress responses) thanks to a flexible gene expression program controlled by the dynamic interactions of hundreds of transcriptional regulators. To unravel this regulatory complexity, multiple computational algorithms have been developed to analyze gene expression profiles and detect dependencies among genes over different conditions. It has been difficult to judge whether these algorithms can generate accurate global maps of regulatory interactions, however, because of the absence of a model organism with both a compendium of gene expression data and a corresponding network of experimentally determined regulatory interactions. To address this issue, we assembled 445 Escherichia coli microarrays, applied four classes of inference algorithms to the dataset, and validated the predictions against 3,216 experimentally determined E. coli interactions. The top-performing algorithm identifies 1,079 regulatory interactions at a confidence level of 60% or higher. Of these predicted interactions, 741 are novel and illuminate the regulation of amino acid biosynthesis, flagella biosynthesis, osmotic stress response, antibiotic resistance, and iron regulation. By defining the capabilities and limitations of network inference algorithms for large-scale mapping of prokaryotic regulatory networks, our work should facilitate their application to the mapping of novel microbes.

A novel machine-learning method is developed to predict transcriptional regulatory interactions, making use of microarray data. One interaction identified appears to be important for the control of iron transport. en_US
dc.description.sponsorship Pharmaceutical Research and Manufacturers of America Foundation; United States Department of Energy Office of Science (DEFG02-04ER63803); National Institutes of Health; National Science Foundation (EF-0425719); National Heart, Lung, and Blood Proteomics Initiative (HHSN268200248178C); Whitaker Foundation; Cellicon Biotechnologies, Inc. en_US
dc.language.iso en en_US
dc.publisher Public Library of Science en_US
dc.title Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles en_US
dc.type article en_US
dc.identifier.doi 10.1371/journal.pbio.0050008 en_US
dc.identifier.pubmedid 17214507 en_US
dc.identifier.pmcid 1764438 en_US
