Date of Award
Campus Access Thesis
Master of Science (MS)
Understanding transcription factor (TF) binding preferences is vital to understanding gene regulation. Many computational methods have been developed to model these preferences from experimental binding data, but the most common methods share a serious deficiency: they cannot model insertions and deletions (indels) in a binding site. Profile hidden Markov models (pHMMs) are probabilistic models that, unlike other methods, can include indels. pHMMs can be used on their own to model binding, or incorporated into more complex HMM topologies that can accommodate variable-length spacers. We performed computational analyses of the binding preferences of two TFs using HMMs. First, we used six different HMM topologies to model the binding motif of Gcn4 and compared their accuracies. From this analysis, we found complex dependencies between the variable-length spacer and the half-sites, which cannot be modeled by the HMM topologies currently in use. We also developed a new methodology for comparing different HMM topologies for a TF in order to choose the optimal one: a topology that is accurate, biologically sound, and contains as few parameters as possible. In our second application, we used pHMMs to analyze how mutations in p53 that affect its ability to dimerize (cooperativity) may also affect its binding preferences. We used pHMMs to characterize the binding motif of wild-type p53, as well as variants that demonstrate lower cooperativity than wild-type. We also analyzed microarray data for each variant, and observed differential gene expression associated with cooperativity. However, our models are very similar for low- and high-cooperativity p53, which suggests that differences in binding specificity are not responsible for the differential gene expression between the variants. We provide strong evidence that, instead, the differences in gene expression are caused by an overall reduction in binding affinity in those variants.
Importantly, we show that it is critical to choose a model that accommodates the unique binding characteristics of a TF when modeling its binding preferences.
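To illustrate how a pHMM can score a site containing an indel, the following is a minimal sketch of Viterbi log-odds scoring against a toy profile HMM with match, insert, and delete states. It is not the model or code used in the thesis: the four-position motif, the transition probabilities, the uniform background, and the names `viterbi_log_odds`, `MATCH_PROBS`, and `TRANSITIONS` are all assumptions made for illustration. Insert states are assumed to emit at the background frequency, and the transition into the end state is ignored for simplicity.

```python
import math

NEG_INF = float("-inf")
BACKGROUND = 0.25  # assumed uniform background frequency for A, C, G, T

# Hypothetical toy motif with consensus "ACGT" (not a real TF motif).
MATCH_PROBS = [
    {base: (0.85 if base == consensus else 0.05) for base in "ACGT"}
    for consensus in "ACGT"
]

# Hypothetical transition probabilities, stored as logs.
# Keys read "from-state, to-state": M = match, I = insert, D = delete.
TRANSITIONS = {
    key: math.log(p)
    for key, p in {
        "MM": 0.90, "MI": 0.05, "MD": 0.05,
        "IM": 0.50, "II": 0.50,
        "DM": 0.50, "DD": 0.50,
    }.items()
}


def viterbi_log_odds(seq, match_probs=MATCH_PROBS, t=TRANSITIONS):
    """Best-path log-odds score for a global alignment of seq to the profile.

    M[j][i], I[j][i], D[j][i] hold the best score of any path that ends in
    match/insert/delete state j after emitting the first i bases of seq.
    """
    n_states, n = len(match_probs), len(seq)
    M = [[NEG_INF] * (n + 1) for _ in range(n_states + 1)]
    I = [[NEG_INF] * (n + 1) for _ in range(n_states + 1)]
    D = [[NEG_INF] * (n + 1) for _ in range(n_states + 1)]
    M[0][0] = 0.0  # the begin state is treated as match state 0

    for j in range(n_states + 1):
        for i in range(n + 1):
            if i > 0:
                # Insert state j emits seq[i-1] at background (log-odds 0).
                I[j][i] = max(M[j][i - 1] + t["MI"], I[j][i - 1] + t["II"])
            if j > 0:
                # Delete state j skips profile position j and emits nothing.
                D[j][i] = max(M[j - 1][i] + t["MD"], D[j - 1][i] + t["DD"])
            if i > 0 and j > 0:
                # Match state j emits seq[i-1] with a position-specific score.
                emit = math.log(match_probs[j - 1][seq[i - 1]] / BACKGROUND)
                M[j][i] = emit + max(
                    M[j - 1][i - 1] + t["MM"],
                    I[j - 1][i - 1] + t["IM"],
                    D[j - 1][i - 1] + t["DM"],
                )
    return max(M[n_states][n], I[n_states][n], D[n_states][n])


if __name__ == "__main__":
    for site in ("ACGT", "ACTGT", "AGT"):  # consensus, insertion, deletion
        print(site, round(viterbi_log_odds(site), 3))
```

Under these assumed parameters, a site with one inserted or one deleted base still receives a finite score, penalized by the indel transitions, whereas a fixed-width position weight matrix could only score sites of exactly the motif length.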
Colaneri, Cory, "Using Hidden Markov Models to Capture Unique Binding Characteristics of Gcn4 and p53" (2017). Graduate Masters Theses. 422.