Supplemental Results
Yeang CH*, Mak HC*, McCuine S, Workman C, Jaakkola T, Ideker T
Genome Biology (2005) 6:R62


Computer Science and Artificial Intelligence Laboratory, MIT
Laboratory for Integrative Network Biology, UC San Diego

Building Physical Network Models

The Potential of Protein-DNA Data

The ChIP-chip (location analysis) experiments of Lee et al. reported a p-value for the association of each transcription factor with each intergenic promoter region, and we transformed these confidence values into potential functions. Let $e_i$ denote a candidate protein-DNA interaction, $x_{e_i}$ a binary variable indicating the presence or absence of the interaction, and $y_{e_i}$ its measurement in the ChIP-chip data. The potential function associated with a specific binding event was then defined from the likelihood ratio:

$$\phi_{e_i}(x_{e_i}; y_{e_i}) = \left[ \frac{P(y_{e_i} \mid x_{e_i} = 1)}{P(y_{e_i} \mid x_{e_i} = 0)} \right]^{x_{e_i}}$$

We assumed that the p-value came from a test of the null hypothesis $H_0$ ($x_{e_i} = 0$, binding does not occur) against the alternative hypothesis $H_1$ ($x_{e_i} = 1$, binding occurs). Under certain regularity conditions, the asymptotic sampling distribution of the log-likelihood ratio statistic under $H_0$ is $\chi^2$ with one degree of freedom, where we assumed that the extra degree of freedom in the alternative hypothesis came from the unknown binding affinity. The value of this log-likelihood ratio statistic was computed by

$$2 \log \frac{P(y_{e_i} \mid H_1)}{P(y_{e_i} \mid H_0)} = F^{-1}(1 - p)$$

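where $p$ was the reported p-value and $F$ was the cumulative $\chi^2$ distribution function with one degree of freedom. We interpreted $P(y_{e_i} \mid x_{e_i})$ as a Bayesian marginal likelihood, which has a simple asymptotic approximation (Schwarz 1978):

$$\log P(y_{e_i} \mid x_{e_i} = 1) \approx \log P(y_{e_i} \mid H_1) - \frac{d_1}{2} \log n$$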

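where $P(y_{e_i} \mid H_1)$ involved a maximum likelihood fit, $d_1$ was the number of degrees of freedom in $H_1$, and $n$ was the sample size from which the p-value was computed (the number of replicates; $n = 3$ in our case). The same approximation applies under $H_0$, which has one fewer degree of freedom, so the two penalty terms differ by $\frac{1}{2} \log n$. Thus

$$\log \frac{P(y_{e_i} \mid x_{e_i} = 1)}{P(y_{e_i} \mid x_{e_i} = 0)} \approx \frac{1}{2} F^{-1}(1 - p) - \frac{1}{2} \log n$$

where $n = 3$.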
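
As a rough illustration of the conversion from a ChIP-chip p-value to a log-potential, the sketch below assumes the approximation takes the form of half the $\chi^2$ quantile $F^{-1}(1 - p)$ minus $\frac{1}{2} \log n$, with one extra degree of freedom in $H_1$. The function names are hypothetical and not part of the released software. To stay within the Python standard library, the one-degree-of-freedom $\chi^2$ quantile is obtained as the square of a standard normal quantile rather than from a dedicated $\chi^2$ routine.

```python
import math
from statistics import NormalDist


def chi2_1df_upper_quantile(p: float) -> float:
    """Return F^{-1}(1 - p) for a chi-square distribution with 1 degree of freedom.

    A chi-square variable with one degree of freedom is the square of a
    standard normal variable, so the quantile equals z**2 with
    z = Phi^{-1}(p / 2). Using the lower tail p / 2 avoids the loss of
    precision in 1 - p / 2 for very small p-values.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def log_potential(p: float, n: int = 3) -> float:
    """Approximate log[P(y | x=1) / P(y | x=0)] from a reported p-value.

    Assumes the BIC-style form 0.5 * F^{-1}(1 - p) - 0.5 * log(n), where n is
    the number of replicates (3 in the text).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 0.5 * chi2_1df_upper_quantile(p) - 0.5 * math.log(n)


if __name__ == "__main__":
    for p_value in (0.001, 0.05, 0.5):
        print(f"p = {p_value}: log-potential = {log_potential(p_value):.3f}")
```

Under these assumptions, a p-value of 0.001 gives a log-potential of about 4.86 and a p-value of 0.05 gives about 1.37, while weak evidence (p-values above roughly 0.29) gives a negative log-potential, so the measurement then counts against the interaction.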