Date of Award
5-31-2017
Document Type
Campus Access Thesis
Degree Name
Master of Science (MS)
Department
Biology
First Advisor
Todd Riley
Second Advisor
Jill Macoska
Third Advisor
Linda Huang
Abstract
Understanding transcription factor (TF) binding preferences is vital to understanding gene regulation. Many computational methods have been developed to model these preferences from experimental binding data, but the most common methods share a serious deficiency: they cannot model insertions and deletions (indels) in a binding site. Profile hidden Markov models (pHMMs) are probabilistic models that, unlike these methods, can include indels. A pHMM can be used on its own to model binding, or incorporated into a more complex HMM topology that can accommodate variable-length spacers. We performed computational analyses of the binding preferences of two TFs using HMMs. First, we used six different HMM topologies to model the binding motif of Gcn4 and compared their accuracies. From this analysis, we found complex dependencies between the variable-length spacer and the half-sites, which cannot be modeled by the HMM topologies currently in use. We also developed a new methodology for comparing different HMM topologies for a TF in order to choose the optimal one that is accurate, biologically sound, and contains as few parameters as possible. In our second application, we used pHMMs to analyze how mutations in p53 that affect its ability to dimerize (cooperativity) may also affect its binding preferences. We used pHMMs to characterize the binding motif of wild-type p53, as well as variants that demonstrate lower cooperativity than the wild type. We also analyzed microarray data for each variant, and observed differential gene expression associated with cooperativity. However, our models are very similar for low- and high-cooperativity p53, which suggests that differences in binding specificity are not responsible for the differential gene expression between the variants. We provide strong evidence that, instead, the differences in gene expression are caused by an overall reduction in binding affinity in those variants.
Importantly, we show that it is critical to choose a model that accommodates the unique binding characteristics of a TF when modeling its binding preferences.
Recommended Citation
Colaneri, Cory, "Using Hidden Markov Models to Capture Unique Binding Characteristics of Gcn4 and p53" (2017). Graduate Masters Theses. 422.
https://scholarworks.umb.edu/masters_theses/422
Comments
Free and open access to this Campus Access Thesis is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this thesis through resources like ProQuest Dissertations & Theses Global or through Interlibrary Loan.