Advanced Topics in Text Mining [Summer 2018] | ||||
---|---|---|---|---|

Code | IATM |
Name | Advanced Topics in Text Mining |

Credit points | 4 CP |
Duration | one semester |
Frequency | irregular (every second or third summer semester) |

Teaching format | Lecture (2 SWS) + exercise course (1 SWS) |
Workload | 120 h, of which 30 h lectures, 20 h exam preparation, and 70 h self-study and work on assignments/projects (optionally in groups) |
Applicability | B.Sc. Angewandte Informatik, M.Sc. Angewandte Informatik, M.Sc. Scientific Computing |

Learning objectives |
Students
- can apply and evaluate methods of data preparation
- know the advantages and drawbacks of different data representations
- can apply and evaluate selected text mining methods
- understand the theoretical background of machine learning methods well enough to choose parameters and adapt an algorithm to a given problem
- can evaluate and compare text mining models and patterns |

Content |
The lecture introduces the fundamentals of text mining as well as selected advanced topics:
- fundamentals of data modeling and preprocessing, in particular for textual data
- statistical and algorithmic foundations of the analysis methods
- basics of computational linguistics and natural language processing for textual data (e.g., morphological analysis, part-of-speech tagging, named entity recognition)
- selected current focus topics such as classification, cluster analysis, sequence pattern mining, association rule mining, topic modeling, and embeddings, with an emphasis on their application to textual data |

Prerequisites |
Recommended: Algorithmen und Datenstrukturen (IAD), Knowledge Discovery in Databases (IKDD), Einführung in die Wahrscheinlichkeitstheorie und Statistik (MA8) |

Examination |
Assignments and a final written exam. At least 50% of the assignment points must be obtained to be admitted to the final written exam. Students may additionally work on an ungraded project. |

Literature |
- Charu C. Aggarwal and ChengXiang Zhai (eds.): Mining Text Data. Springer, 2012.
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze: Introduction to Information Retrieval. Cambridge University Press, 2008.
- Trevor Hastie, Robert Tibshirani and Jerome H. Friedman: The Elements of Statistical Learning. Springer, 2001.
- Bing Liu: Web Data Mining (2nd edition). Springer, 2011. |