Date of Award
8-2024
Document Type
Campus Access Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
First Advisor
Wei Ding
Second Advisor
Dan Simovici
Third Advisor
Ping Chen, Tiago Cogumbreiro
Abstract
Despite the ease of data collection in the current technological era, obtaining sufficient data for neural network training remains challenging due to factors such as subject rarity, high costs, time constraints, incomplete historical records, and the limited data available in early research phases. Few-shot learning addresses these issues by enabling effective learning from small datasets, reducing data collection costs, and broadening access to advanced machine-learning techniques. It also improves training efficiency and generalization and enables rapid adaptation to new tasks, making it invaluable in dynamic environments. At the same time, few-shot learning faces significant challenges rooted in data scarcity and the need for robust generalization: limited training examples hinder effective learning and increase the risk of overfitting, leading to poor performance on new data. This dissertation explores methodologies to overcome these challenges, aiming to improve model robustness and performance in real-world scenarios.

First, we address the data scarcity problem, which requires a multifaceted approach; the overwhelming number of parameters in modern neural networks is a primary culprit, since this surplus intensifies the scarcity issue, leads to suboptimal performance, and hinders effective learning. We tackle it through a comprehensive set of methodologies: leveraging domain knowledge to incorporate prior information, employing specialized algorithms tailored to the dataset, integrating Conditional Restricted Boltzmann Machines (CRBMs) with Deep Neural Networks (DNNs) to enhance learning efficiency, and improving efficiency through modular sparsification techniques.

Second, we underscore the significance of generalization and robustness for real-world problems. Given the limited availability of labeled data typical of such scenarios, the ability of models to generalize effectively from sparse examples becomes pivotal. We address this through a diverse array of methodologies, including network-based regularization, techniques that prioritize domain knowledge, and the strategic use of hybrid models.

Finally, we merge the preceding methods into a cohesive model for tackling real-world problems, and we perform experiments comparing our approach with state-of-the-art (SOTA) methods to demonstrate its effectiveness.
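To make the CRBM-with-DNN methodology concrete, the following is a minimal sketch of one way such an integration could look: a conditional RBM trained with one-step contrastive divergence, whose hidden-unit probabilities serve as features for a small feed-forward classification head. The energy parameterization, the CD-1 update, and all layer sizes here are illustrative assumptions, not the dissertation's exact architecture.

```python
import torch
import torch.nn as nn

class CRBM(nn.Module):
    """Conditional RBM: binary visible units v, hidden units h, condition u."""
    def __init__(self, n_vis, n_hid, n_cond):
        super().__init__()
        self.W = nn.Parameter(torch.randn(n_vis, n_hid) * 0.01)  # v-h weights
        self.A = nn.Parameter(torch.zeros(n_cond, n_vis))        # u -> visible bias shift
        self.B = nn.Parameter(torch.zeros(n_cond, n_hid))        # u -> hidden bias shift
        self.b = nn.Parameter(torch.zeros(n_vis))                # visible bias
        self.c = nn.Parameter(torch.zeros(n_hid))                # hidden bias

    def hidden_prob(self, v, u):
        return torch.sigmoid(v @ self.W + u @ self.B + self.c)

    def visible_prob(self, h, u):
        return torch.sigmoid(h @ self.W.t() + u @ self.A + self.b)

    @torch.no_grad()
    def cd1_step(self, v, u, lr=1e-3):
        """One contrastive-divergence (CD-1) parameter update."""
        ph = self.hidden_prob(v, u)        # positive phase
        h = torch.bernoulli(ph)
        v_r = self.visible_prob(h, u)      # one-step reconstruction
        ph_r = self.hidden_prob(v_r, u)    # negative phase
        n = v.size(0)
        self.W += lr * (v.t() @ ph - v_r.t() @ ph_r) / n
        self.A += lr * u.t() @ (v - v_r) / n
        self.B += lr * u.t() @ (ph - ph_r) / n
        self.b += lr * (v - v_r).mean(0)
        self.c += lr * (ph - ph_r).mean(0)

class CRBMClassifier(nn.Module):
    """CRBM hidden probabilities used as features for a small DNN head."""
    def __init__(self, n_vis, n_hid, n_cond, n_classes):
        super().__init__()
        self.crbm = CRBM(n_vis, n_hid, n_cond)
        self.head = nn.Sequential(nn.Linear(n_hid, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))

    def forward(self, v, u):
        return self.head(self.crbm.hidden_prob(v, u))
```

The appeal of this design for few-shot settings is that the CRBM can be pre-trained unsupervised (via `cd1_step`) on whatever unlabeled data is available, leaving only the small head to fit from the scarce labeled examples.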
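The modular sparsification idea can likewise be sketched as per-module magnitude pruning, here using PyTorch's built-in pruning utilities. Reading "modular" as "each submodule receives its own sparsity level" is our assumption; the module names and pruning amounts below are purely illustrative.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def sparsify_by_module(model: nn.Module, amounts: dict) -> nn.Module:
    """Prune each named submodule's weights to its own sparsity level."""
    for name, module in model.named_modules():
        if name in amounts and hasattr(module, "weight"):
            # Zero out the smallest-magnitude weights in this module.
            prune.l1_unstructured(module, name="weight", amount=amounts[name])
            prune.remove(module, "weight")  # make the sparsity permanent
    return model

net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
# nn.Sequential names its children "0", "1", "2"; prune layers at different rates.
net = sparsify_by_module(net, {"0": 0.8, "2": 0.5})
```

Pruning modules at different rates reflects the abstract's premise that surplus parameters aggravate data scarcity: layers with the most redundancy can be cut hardest while more sensitive layers are pruned lightly.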
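Generalization from sparse examples is commonly measured episodically in the few-shot literature. The sketch below shows a prototypical-network-style episode in which query points are classified by distance to class-mean embeddings; the episode shapes and the use of pre-embedded points (rather than a named encoder) are assumptions for illustration, not the dissertation's evaluation protocol.

```python
import torch

def prototype_logits(support, support_y, query, n_way):
    """Classify queries by negative distance to class-mean (prototype) embeddings."""
    protos = torch.stack([support[support_y == c].mean(0) for c in range(n_way)])
    return -torch.cdist(query, protos)  # shape [n_query, n_way]

# Example: a 5-way 1-shot episode with 64-dimensional embeddings.
n_way, emb = 5, 64
support = torch.randn(n_way, emb)   # one embedded support example per class
support_y = torch.arange(n_way)
query = torch.randn(15, emb)        # embedded query points
pred = prototype_logits(support, support_y, query, n_way).argmax(dim=1)
```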
Recommended Citation
Kang, Tianyu, "Effective Neural Network Architecture for Few-Shot Learning" (2024). Graduate Doctoral Dissertations. 991.
https://scholarworks.umb.edu/doctoral_dissertations/991
Comments
Free and open access to this Campus Access Dissertation is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this dissertation through resources like ProQuest Dissertations & Theses Global (https://www.proquest.com/) or through Interlibrary Loan.