Data Mining I (Lecture) - SS18
Overview
“Data mining, also called knowledge discovery in databases (KDD), is the process of discovering interesting and useful patterns and relationships in large volumes of data. The field combines tools from statistics and artificial intelligence (such as neural networks and machine learning) with database management to analyze large digital collections, known as data sets. Data mining is widely used in business (insurance, banking, retail), science research (astronomy, medicine), and government security (detection of criminals and terrorists).” Clifton, Christopher (2010). "Encyclopædia Britannica: Definition of Data Mining". Retrieved 2016-03-15.
Course content
- Introduction to Data Mining (DM) and Knowledge Discovery in Databases (KDD)
- Main data mining tasks (supervised learning, unsupervised learning)
- Datasets, feature spaces, preprocessing and distance functions
- Frequent itemset mining and association rule mining
- Classification
  - Different classifiers: decision trees, SVMs, Naive Bayes, kNN, etc.
  - Evaluation of classifiers: evaluation setup (training, validation, testing), evaluation measures, overfitting and underfitting
- Clustering
  - Different clustering methods: partitioning methods (kMeans, kMedoids), density-based methods (DBSCAN), hierarchical methods, hybrid methods (bisecting kMeans), model-based methods (EM)
  - Cluster validity measures (internal, external) & parameter selection
- Outlier detection
  - Distance-based approaches, statistical approaches, density-based approaches (LOF), clustering-based approaches
ECTS points: 5
Schedule
- Lecture: every Wednesday 12:15 - 13:45, starting from 11.04.2018
- Tutorials: directly after the lecture, 14:00 - 15:30
- Room: Multimedia-Hörsaal (3703 - 023), Appelstraße 4, 30167 Hannover
Teaching team
- Responsible Professor: Prof. Dr. Eirini Ntoutsi
- Assistance: Damianos Melidis, Tai Le Quy
Literature
- Tan/Steinbach/Kumar: Introduction to Data Mining; Pearson 2006 (or later editions).
- Han/Kamber: Data Mining - Concepts and Techniques; 3rd ed., Morgan Kaufmann Publ., 2011 (or later editions)
- C. Aggarwal: Data Mining: The Textbook; Springer, 2015.
- Witten/Frank/Hall: Data Mining: Practical Machine Learning Tools and Techniques; 3rd ed., Morgan Kaufmann Publ., 2011 (or later editions)
About the bonus
- Two data mining projects will run throughout the semester, giving you the opportunity to explore a topic in depth and to gain hands-on experience with data mining methods.
- The projects are optional, but they will count towards your grade.
Exam
- Written exam (90 minutes)