Singapore University of Social Sciences

Fundamentals of Data Mining (ANL303)


This course introduces students to the concepts and applications of data mining. Students will learn the data mining methodology, the data preparation and data exploration processes, and data mining techniques for association, clustering and predictive modelling. The course also shows how these techniques can be applied to business analytics problems.

Level: 3
Credit Units: 5
Presentation Pattern: Every semester


Topics

  • Overview of Data Mining
  • Process of Data Mining
  • Data Exploration
  • Data Visualisation
  • Data Cleaning
  • Data transformation and reduction
  • Association analysis
  • Clustering analysis
  • Decision Trees
  • Model evaluation
  • Business and data phases
  • Model evaluation and deployment

Learning Outcomes

  • Differentiate the various aspects of data mining.
  • Recommend data mining tools for association analysis, clustering and predictive modelling.
  • Assess the following applications: association analysis with Apriori, clustering with K-means, and classification with CART (classification and regression tree).
  • Categorise different aspects of data preparation.
  • Discuss the use of data exploration techniques using summary statistics, dimension reduction and visualisation.
  • Plan the data mining process following the CRISP-DM framework.
  • Prepare data for mining and analysis.
  • Execute techniques such as association analysis with Apriori, clustering with K-means, and classification with CART (classification and regression tree).
  • Analyse data using exploration techniques such as summary statistics, dimension reduction and visualisation.
  • Defend the use of appropriate data mining techniques for different business problems.
  • Interpret the results of a data mining analysis.
  • Evaluate the validity of different data mining models.
  • Apply the above-mentioned data mining tasks using the software package specified in this course, examine the output produced by the software, and infer its implications for the problem(s) under consideration.