Date of Award
Campus Access Thesis
Master of Science (MS)
Tomasz F. Stepinski
Data mining extracts useful and novel patterns from large amounts of data. Depending on the objective and the structure of the data, it usually involves one of four tasks: classification, clustering, association rule learning, and regression. This thesis addresses two branches of research on data mining techniques. In the first chapter, several classification methods are applied to a real-world data set, the Mars crater data set. The goal of this case study is to improve the accuracy of crater detection in remote sensing images of Mars. In the second chapter, Sammon's Mapping is studied and improved. Sammon's Mapping is a projection method that maps a high-dimensional space to a low-dimensional one. The motivation of this project is to visualize the internal structure of a data set and to facilitate clustering of that data set. Because vectors can be easily visualized only in two or three dimensions, a high-dimensional space needs to be mapped to a low-dimensional space. Once the low-dimensional Sammon's Mapped space has been created, the number of clusters can be observed. An external measure of the clustering result is also implemented in the project; this measure objectively shows the accuracy of the clustering. The steps needed to implement Sammon's Mapping are assembled into a pipeline.
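As an illustration of the projection idea described in the abstract, the following is a minimal sketch of Sammon's Mapping in Python with NumPy. It is not the thesis's implementation or its improved method: the function names (`pairwise_distances`, `sammon_stress`, `sammon_map`) and all parameters are hypothetical, and plain gradient descent with step halving is swapped in here for simplicity in place of the update rule from Sammon's original formulation. The sketch assumes the input contains no duplicate points.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(d_high, d_low):
    """Sammon's stress between two distance matrices (upper triangles only)."""
    iu = np.triu_indices_from(d_high, k=1)
    dh, dl = d_high[iu], d_low[iu]
    return float(((dh - dl) ** 2 / dh).sum() / dh.sum())


def sammon_map(data, n_components=2, n_iter=200, step=0.5, seed=0, eps=1e-9):
    """Project `data` to `n_components` dimensions by reducing Sammon's stress.

    Assumption: simple gradient descent; a step is accepted only if it
    lowers the stress, otherwise the step size is halved.
    """
    rng = np.random.default_rng(seed)
    d_high = np.maximum(pairwise_distances(data), eps)
    low = rng.normal(size=(len(data), n_components))  # random initial layout
    scale = d_high[np.triu_indices_from(d_high, k=1)].sum()
    stress = sammon_stress(d_high, pairwise_distances(low))

    for _ in range(n_iter):
        d_low = np.maximum(pairwise_distances(low), eps)
        # Per-pair weight (d*_ij - d_ij) / (d*_ij * d_ij) from the stress gradient.
        ratio = (d_high - d_low) / (d_high * d_low)
        np.fill_diagonal(ratio, 0.0)
        grad = (-2.0 / scale) * (ratio.sum(axis=1)[:, None] * low - ratio @ low)

        candidate = low - step * grad
        new_stress = sammon_stress(d_high, pairwise_distances(candidate))
        if new_stress < stress:
            low, stress = candidate, new_stress
        else:
            step *= 0.5  # overshoot: retry with a smaller step

    return low, stress
```

Calling `low, stress = sammon_map(data)` on an array of shape `(n_samples, n_features)` returns two-dimensional coordinates that can be drawn as a scatter plot to look for cluster structure, together with the final stress value, where a lower stress indicates that inter-point distances are better preserved.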
Wang, Jue, "Data Mining Methodology Development: Classification Case Study on Mars Craters Detection and Visualization Analysis Using Sammon's Mapping" (2010). Graduate Masters Theses. 13.