Campus Access Dissertation
Doctor of Philosophy (PhD)
In applying data mining methodologies to extract hidden patterns and learn useful knowledge from real-world Big Data, data scientists have run up against the limits of current computational capabilities for processing and modeling data of such sheer size and complex heterogeneity. On one hand, the Big Data of certain domains, such as climate data, are temporally and spatially intensive; they require special attention in using domain knowledge to extract the most suitable features and to develop simulation or prediction models that incorporate spatio-temporal dimensions. On the other hand, the Big Data of other domains are temporally and spatially scattered, which presents a different set of challenges for predictive modeling.
To address these issues, this dissertation proposes two methodologies. The first, Dimensional Ensemble Modeling, identifies spatial signatures, such as points of interest or spatial distributions, within segmented time windows for ensemble modeling. The second, Hierarchical Ensemble Modeling, uses patterns extracted locally at various granularity levels to construct a global ensemble pattern hierarchically through a specially designed boosting strategy, CCRBoost.
In close collaboration with domain experts, both proposed methodologies were empirically evaluated in two case studies using two real-world data sets. In both cases, they showed promising predictive performance.
Yu, Chung-Hsien, "Spatio-Temporal Ensemble Modeling for Big Data" (2016). Graduate Doctoral Dissertations. 250.