Leibniz Universität Hannover
Institut für Verteilte Systeme - Fachgebiet Wissensbasierte Systeme (KBS)

Data Mining I (Lecture) - SS18

Overview

Data mining, also called knowledge discovery in databases (KDD), is the process of discovering interesting and useful patterns and relationships in large volumes of data. The field combines tools from statistics and artificial intelligence (such as neural networks and machine learning) with database management to analyze large digital collections, known as data sets. Data mining is widely used in business (insurance, banking, retail), science research (astronomy, medicine), and government security (detection of criminals and terrorists).

Source: Clifton, Christopher (2010). "Encyclopædia Britannica: Definition of Data Mining". Retrieved 2016-03-15.

Course content

  • Introduction to Data Mining (DM) and Knowledge Discovery in Databases (KDD)
  • Main data mining tasks (supervised learning, unsupervised learning)
  • Datasets, feature spaces, preprocessing and distance functions
  • Frequent itemset mining and association rule mining
  • Classification

    • Different classifiers: decision trees, SVMs, Naive Bayes, kNN, ...
    • Evaluation of classifiers: evaluation setup (training, validation, testing), evaluation measures, overfitting and underfitting

  • Clustering

    • Different clustering methods: partitioning methods (kMeans, kMedoids), density-based methods (DBSCAN), hierarchical methods, hybrid methods (bisecting kMeans), model-based methods (EM)
    • Cluster validity measures (internal, external) & parameter selection

  • Outlier detection

    • Distance-based approaches, statistical approaches, density-based approaches (LOF), clustering-based approaches

ECTS points: 5 

Schedule

  • Lecture: every Wednesday 12:15 - 13:45, starting from 11.04.2018
  • Tutorials: directly after the lecture, 14:00-15:30
  • Room: Multimedia-Hörsaal (3703 - 023), Appelstraße 4, 30167 Hannover

Teaching team

Literature

About the bonus

  • Two data mining projects will run throughout the semester, giving you the opportunity to explore a topic in depth and gain hands-on experience with data mining methods.
  • The projects are optional, but they will count towards your grade.

Exam

  • Written exam (90 minutes)

Please check the Stud.IP page for announcements, material, and up-to-date information on the course.