Date of Award
12-2024
Document Type
Campus Access Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
First Advisor
Kenneth Fletcher
Second Advisor
David Degras-Valabregue
Third Advisor
Daniel Haehn, Duc Tran
Abstract
Recommender systems help users discover products, content, or actions aligned with their preferences. Traditionally, these systems have used content-based and collaborative filtering approaches: content-based recommenders suggest items similar to those the user has liked, while collaborative filtering identifies patterns of similarity between users or items. More recently, sequential recommendation has introduced time and order into recommendations, recognizing that user interests can shift dynamically. Unlike traditional methods that treat preferences as static, sequential recommenders incorporate sequential context, making recommendations personalized, timely, and contextually relevant. Yet although sequential context is recognized as central to sequential recommendation, it has rarely been explored in depth. This dissertation addresses key challenges in quantifying and modeling sequential context in recommender systems.

In the first part, we address the dynamic, hierarchical, and cascading nature of sequential context, proposing a novel framework called Higher Order Latent Interactional Context (HOLIC). HOLIC combines a multi-layer recurrent neural network, an attention mechanism, and a sequence clustering module to capture the underlying dynamical properties of sequential context.

The second part examines the unique characteristics of training data for sequential recommendation, particularly when transformer-based masked language models (MLMs) are used for item representation, a common pretraining approach. We find that applying MLMs naively to sequential recommendation data, without accounting for the differences between user interaction sequences and natural language, can degrade and distort item relationship signals. We propose a method to recover and amplify these signals by augmenting the training data with synthetic permutations that preserve local and global sequential context.
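The abstract does not specify how the synthetic permutations are constructed; the dissertation's actual augmentation method may differ. As a purely illustrative sketch, one simple way to permute an interaction sequence while keeping its coarse order intact is to shuffle items only within small local windows (the function name `augment_sequence` and the windowing scheme are my own):

```python
import random

def augment_sequence(seq, window=3, n_aug=2, seed=0):
    """Create synthetic permutations of an interaction sequence.

    Items are shuffled only within consecutive windows of size
    `window`, so the global ordering of windows is preserved while
    local item order is varied. Returns `n_aug` permuted copies.
    """
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_aug):
        new_seq = []
        for start in range(0, len(seq), window):
            block = seq[start:start + window]  # copy of one local window
            rng.shuffle(block)                 # permute within the window
            new_seq.extend(block)
        augmented.append(new_seq)
    return augmented

# Example: two augmented copies of a 9-item sequence
copies = augment_sequence(list(range(9)), window=3, n_aug=2)
```

Each copy is a permutation of the original sequence, and every window still contains the same items it started with, so item co-occurrence at both the local and global scale is retained.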
In the final part, we present pioneering work on multi-modal contexts for sequential recommendation, leveraging the rise of multimedia data (such as images, text, audio, and video) in modern applications. We observe that multi-modal interactions add significant complexity: item relationships may be unique to each modality, redundant across modalities, or arise from inter-modal synergies. We define and systematically categorize these challenges and propose two models for uncovering dynamic sequential relationships within and across modalities. The first model treats the user interaction sequence as a heterogeneous temporal graph (HTG) and defines meta-relations on the HTG that encode modality types as node types and connections between modalities as edge types. These meta-relations parameterize message passing and attention mechanisms that distill complex dynamic inter-modal and intra-modal signals into rich contextualized item embeddings. The second model is a self-attention-based multi-modal fusion framework with a novel sequence compression technique that manages the transformer's quadratic time complexity, allowing us to effectively model long multi-modal sequences of text and image interactions.
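The abstract does not describe the sequence compression technique itself. As a minimal sketch of the general idea only (not the dissertation's method), mean-pooling consecutive groups of item embeddings shortens a length-n sequence to length n/ratio, reducing self-attention cost from O(n²) to O((n/ratio)²); the function name `compress_sequence` is illustrative:

```python
import numpy as np

def compress_sequence(embs, ratio=4):
    """Compress a (n, d) array of item embeddings by mean-pooling
    each run of `ratio` consecutive embeddings into one vector.

    A shorter sequence makes subsequent self-attention cheaper,
    since attention cost grows quadratically with sequence length.
    """
    n, _ = embs.shape
    pooled = []
    for start in range(0, n, ratio):
        # Mean of one group; the last group may be shorter than `ratio`.
        pooled.append(embs[start:start + ratio].mean(axis=0))
    return np.stack(pooled)

# Example: a sequence of 6 two-dimensional embeddings compressed 2x
x = np.arange(12.0).reshape(6, 2)
compressed = compress_sequence(x, ratio=2)  # shape (3, 2)
```

Real fusion models typically use learned compression (e.g. strided convolutions or attention pooling) rather than fixed mean-pooling, but the length-reduction principle is the same.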
Recommended Citation
Amankwata, Benjamin, "Quantifying and Modeling Contexts for Sequential Recommendation" (2024). Graduate Doctoral Dissertations. 1028.
https://scholarworks.umb.edu/doctoral_dissertations/1028
Comments
Free and open access to this Campus Access Thesis is made available to the UMass Boston community by ScholarWorks at UMass Boston. Those not on campus and those without a UMass Boston campus username and password may gain access to this thesis through Interlibrary Loan.