[IATM] - [de] - [Advanced Topics in Text Mining]

Advanced Topics in Text Mining [Summer 2018]
Credit points: 4 LP
Duration: one semester
Frequency: irregular (every 2nd to 3rd summer semester)
Teaching format: Lecture 2 SWS + Exercise course 1 SWS
Workload: 120 h, thereof
- 30 h lectures
- 20 h preparation for examination
- 70 h self-study and working on assignments/projects (optionally in groups)
Applicability: B.Sc. Angewandte Informatik, M.Sc. Angewandte Informatik, M.Sc. Scientific Computing
Learning objectives: Students
- can apply and evaluate methods of data preparation
- know advantages and drawbacks of different data representations
- can apply and evaluate selected methods of text mining
- understand the theoretical background of machine learning methods well enough to choose parameters and adapt an algorithm to a given problem
- can evaluate and compare text mining models and patterns
Content: The lecture introduces the fundamentals as well as selected advanced topics from the domain of text mining.
- fundamentals of data modeling and preprocessing, in particular for textual data
- statistical and algorithmic foundations of the analysis methods
- basics of computational linguistics and natural language processing for textual data (e.g., morphological analysis, part-of-speech tagging, named entity recognition)
- selected current focus topics such as classification, cluster analysis, sequence pattern mining, association rule mining, topic modeling, and embeddings, with an emphasis on their application to textual data
Prerequisites: recommended are Algorithmen und Datenstrukturen (IAD), Knowledge Discovery in Databases (IKDD), and Einführung in die Wahrscheinlichkeitstheorie und Statistik (MA8)
Examination: assignments and a final written exam. At least 50% of the points for the assignments must be obtained to be eligible for the final written exam. Students can also work on a (non-graded) project.
Literature:
- Charu C. Aggarwal and ChengXiang Zhai: Mining Text Data. Springer, 2012.
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze: Introduction to Information Retrieval. Cambridge University Press, 2008.
- Jerome H. Friedman, Robert Tibshirani and Trevor Hastie: The Elements of Statistical Learning. Springer, 2001.
- Bing Liu: Web Data Mining (2nd Edition). Springer, 2011.