Date of Award


Document Type

Campus Access Dissertation

Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Nurit Haspel

Second Advisor

Dan Simovici

Third Advisor

Marc Pomplun


Proteins are essential molecules in living organisms that perform a wide range of functions, such as catalyzing biochemical reactions, providing structural support, and acting as signaling molecules. Understanding the structure and dynamics of proteins is crucial for elucidating their functions and developing therapies for various diseases. However, the conformational space of proteins is vast, and exploring it experimentally is challenging and time-consuming. Computational methods are therefore becoming increasingly important for investigating protein conformational changes and dynamics. These methods have considerable limitations of their own, which this dissertation discusses alongside potential techniques for addressing them.

The goal of this Ph.D. research is to survey the literature on protein conformational changes and to develop efficient and effective methods for exploring protein conformational space, which is central to understanding how proteins function.

The first two projects focus on using Rapidly-exploring Random Trees (RRT) and Monte Carlo (MC) simulations to sample and explore the protein conformational space efficiently. In the first project, we propose a new RRT*-based search algorithm that outperforms previous methods in terms of exploration efficiency. We further improve the search by integrating rigidity analysis information into the exploration process to guide the search toward lower-energy conformations. In the second project, we use topological data analysis to gain insights into the shape of protein conformational spaces and to develop more practical exploration strategies. These methods are demonstrated on several benchmark protein systems and show significant improvements over existing techniques.

The final two projects explore new directions in using machine learning techniques to analyze protein conformations. In the third project, we design a method for classifying protein families based on learned compressed representations. We show that these compact representations, which we call fingerprints, capture the relevant features of protein families. In the last project, we use Variational Autoencoders (VAEs) to explore the conformational space of proteins using molecular dynamics simulation data. Together, this work contributes to advancing our understanding of protein conformational dynamics and to developing new tools for studying protein structure-function relationships.


Free and open access to this Campus Access Dissertation is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this dissertation through resources like ProQuest Dissertations & Theses Global or through Interlibrary Loan.