Date of Award

8-30-2022

Document Type

Campus Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Nurit Haspel

Second Advisor

Dan Simovici

Third Advisor

Marc Pomplun

Abstract

The large volume of biological data that has become available in recent years calls for advanced computational methods that can analyze these complex, high-dimensional datasets, shed light on biological processes, and lead to new discoveries. Machine learning is increasingly used in biology and proteomics to build predictive models of the underlying biological processes.

This dissertation provides machine learning solutions for three problems related to proteins and their structures. The first problem is to investigate how mutations in a protein sequence affect its structural stability, using machine learning methods to predict the change in free energy between the mutated and non-mutated (wild-type) proteins. In this project, we compare three machine learning models for predicting the mutation effect. The second problem focuses on exploring protein dynamics and conformational changes. We employ a hybrid algorithm that combines Monte Carlo sampling and a robotics-based method called RRT* to find conformational pathways using rigidity analysis. We also use a topological data analysis algorithm called Mapper to find the intermediate conformations by clustering the conformations that our algorithm generates most frequently. The last problem is the classification of protein families. In this part, we propose a method comprising two steps: dimensionality reduction and classification. We present a variational autoencoder for the first step and a convolutional neural network classifier for the second step.

Comments

Free and open access to this Campus Access Dissertation is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this dissertation through resources like ProQuest Dissertations & Theses Global or through Interlibrary Loan. If you have a UMass Boston campus username and password and would like to download this work from off-campus, click on the "Off-Campus UMass Boston Users" link above.
