Machine Learning
Overview
Machine learning shares many methods with data mining. A distinction is made between supervised, unsupervised and reinforcement learning. Sophisticated methods such as deep learning are used for regression and classification. The handling of image and text data, and their quantitative processing in image recognition and text mining, is becoming increasingly relevant.
Examples:
- Supervised learning
  - Predictive vs. explanatory modelling, performance measurement
  - Regularization (ridge, lasso, elastic net) and feature engineering
  - Tree-based modelling (CART, bagging, random forests)
  - Artificial neural networks (ANN), recurrent neural networks and deep learning
  - Support vector machines (SVM) and Bayes classifiers
  - Ensemble methods and super learners (boosting, stacking)
  - Interpretable machine learning (LIME, Shapley)
- Unsupervised learning
  - Clustering and pattern detection
  - Advanced clustering techniques
  - PCA as a dimension reduction technique
- Basics of reinforcement learning
- Text mining
- Basics of image processing (recognition) and CNN
Data
A thesis can also start from a data set. Here are a number of possible data set sources:
Requirements
All topics should have, in addition to the theoretical foundations (i.e., model building and model assumptions), an empirical part in which a real-world, topic-related data set is evaluated using a programming language (Python or R).
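As a rough illustration of what a very small empirical part might look like in Python, the following sketch fits a one-feature ridge regression and measures out-of-sample error. It is only a sketch under stated assumptions: the data are synthetic placeholders (a thesis would load a real-world data set instead), and all function names, the penalty values and the 75/25 split are hypothetical choices made for this example, not requirements.

```python
import random
import statistics


def make_synthetic_data(n=200, seed=0):
    """Placeholder data (assumption): y = 2x + 1 + Gaussian noise.

    A thesis would replace this with a real-world, topic-related data set.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


def train_test_split(xs, ys, test_fraction=0.25, seed=0):
    """Shuffle the indices and hold out a fraction of observations for testing."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])


def fit_ridge_1d(xs, ys, lam):
    """Ridge regression with one feature and an unpenalized intercept.

    Minimizes sum((y - a - b*x)^2) + lam * b^2, which gives
    b = Sxy / (Sxx + lam) and a = mean(y) - b * mean(x).
    """
    x_mean, y_mean = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared prediction error of the fitted line."""
    return statistics.fmean(
        (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)
    )


xs, ys = make_synthetic_data()
x_tr, y_tr, x_te, y_te = train_test_split(xs, ys)

# Compare several penalty strengths on the held-out test set.
results = {}
for lam in (0.0, 1.0, 100.0):
    slope, intercept = fit_ridge_1d(x_tr, y_tr, lam)
    results[lam] = mse(x_te, y_te, slope, intercept)
    print(f"lambda={lam:6.1f}  slope={slope:.3f}  "
          f"intercept={intercept:.3f}  test MSE={results[lam]:.3f}")
```

Evaluating on held-out observations rather than on the training data reflects the predictive view of modelling; in practice the penalty would more likely be chosen by cross-validation, and an established library in Python or R would replace the hand-written functions.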
Literature
- Backhaus et al., 2011, Multivariate Analysemethoden – eine anwendungsorientierte Einführung, Springer
- Backhaus et al., 2011, Fortgeschrittene Multivariate Analysemethoden – eine anwendungsorientierte Einführung, Springer
- James et al.; An Introduction to Statistical Learning - with Applications in R; 2013; Springer
- Hastie et al.; The Elements of Statistical Learning – Data Mining, Inference and Prediction; 2009; Springer
- Rencher, Methods of multivariate analysis, 2002, John Wiley & Sons Inc.
- Nisbet et al., 2009, Handbook of Statistical Analysis and Data Mining Applications, Academic Press
- Hand et al., 2001, Principles of Data Mining, The MIT Press
- Runkler, 2010, Data Mining: Methoden und Algorithmen intelligenter Datenanalyse, Vieweg+Teubner
- Bishop, Pattern Recognition and Machine Learning, 2006, Springer
- Fahrmeir et al., Regression – Modelle, Methoden und Anwendungen, 2007, Springer
- Tutz, Regression for Categorical Data, 2012, Cambridge University Press
- Toutenburg, Lineare Modelle – Theorie und Anwendungen, 2003, Physica-Verlag
- Kaufman, Rousseeuw; Finding Groups in Data – An Introduction to Cluster Analysis; 1990; John Wiley & Sons
- Breiman et al., Classification and Regression Trees, 1998, Chapman & Hall
- …