by Jason Ernst, Qasim K. Beg, Krin A. Kay, Gabor Balazsi, Zoltán N.
Oltvai, and Ziv Bar-Joseph
PLoS Computational Biology 4(3): e1000044, 2008.
Abstract
While Escherichia coli has one of the most comprehensive datasets of
experimentally verified transcriptional regulatory interactions of any
organism, it is still far from complete. This presents a problem when
trying to combine gene expression and regulatory interactions to model
transcriptional regulatory networks. Using the available regulatory
interactions to predict new interactions may lead to better coverage and
more accurate models. Here, we develop SEREND (SEmi-supervised REgulatory
Network Discoverer), a semi-supervised learning method that uses a curated
database of verified transcription factor-gene interactions, DNA
sequence binding motifs, and a compendium of gene expression data in order
to make thousands of new predictions about transcription factor-gene
interactions, including whether the transcription factor activates or
represses the gene. Using genome-wide binding datasets for several
transcription factors, we demonstrate that our semi-supervised
classification strategy improves the prediction of targets for a given
transcription factor. To further demonstrate the utility of our inferred
interactions, we generated a new microarray gene expression dataset for the
aerobic to anaerobic shift response in E. coli. We used our inferred
interactions with the verified interactions to reconstruct a dynamic
regulatory network for this response. The network reconstructed when using
our inferred interactions was better able to correctly identify known
regulators and suggested additional activators and repressors as having
important roles during the aerobic-anaerobic shift interface.
To view the aerobic-anaerobic shift response maps from the paper in the
DREM software, download DREM and this zip file. The zip file contains the
aerobic-anaerobic shift response data, the settings files, and the cmd
launch scripts for both the curated and the prediction-extended TF-gene
interaction inputs, as well as the version of the EBI UniProt E. coli K12
GO Annotations used in the paper. Place the unzipped files in the root of
the drem directory.
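
The unpacking step above can also be scripted. The following is a minimal
sketch using only the Python standard library; the function name
unpack_into_drem and the example paths in the comment are hypothetical and
must be replaced with the actual locations of the downloaded zip file and
the DREM installation.

```python
import zipfile
from pathlib import Path


def unpack_into_drem(zip_path, drem_root):
    """Extract every member of zip_path directly into drem_root.

    Both arguments are supplied by the caller; nothing here assumes a
    particular archive name or DREM install location.
    """
    drem_root = Path(drem_root)
    if not drem_root.is_dir():
        raise FileNotFoundError(f"DREM directory not found: {drem_root}")
    with zipfile.ZipFile(zip_path) as archive:
        # Files land in the root of the DREM directory, as the text asks.
        archive.extractall(drem_root)
        return sorted(archive.namelist())


# Hypothetical usage (placeholder paths, not taken from the paper):
# unpack_into_drem("downloads/shift_response.zip", "drem")
```

The function returns the sorted list of extracted member names, which makes
it easy to confirm that the data, settings, and launch script files are
present before starting DREM.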