Advanced Topics in Text Mining [Summer Semester 2019]
Code: IATM
Name: Advanced Topics in Text Mining
Credit points (LP): 4 LP
Duration: one semester
Offered: irregularly (every second to third summer semester)
Format: Lecture 2 SWS + exercise course 1 SWS
Workload: 120 h, of which
- 30 h lectures
- 20 h preparation for examination
- 70 h self-study and working on assignments/projects (optionally in groups)
Applicability: B.Sc. Angewandte Informatik, M.Sc. Angewandte Informatik, M.Sc. Scientific Computing
Language:
Lecturers:
Examination scheme:
Learning objectives: Students
- can apply and evaluate methods of data preparation
- know advantages and drawbacks of different data representations
- can apply and evaluate selected methods of text mining
- know the theoretical background of machine learning methods in sufficient depth to choose parameters and adapt an algorithm to a given problem
- can evaluate and compare text mining models and patterns
Course content: The lecture introduces the fundamentals of text mining as well as selected advanced topics from the field.
- fundamentals of data modeling and preprocessing, in particular for textual data
- statistical and algorithmic foundations of the analysis methods
- basics of computational linguistics and natural language processing for textual data (e.g., morphological analysis, part-of-speech tagging, named entity recognition)
- selected current focus topics such as classification, cluster analysis, sequential pattern mining, association rule mining, topic modeling, and embeddings, with an emphasis on their application to textual data
Prerequisites: recommended are Algorithmen und Datenstrukturen (IAD), Knowledge Discovery in Databases (IKDD), Einführung in die Wahrscheinlichkeitstheorie und Statistik (MA8)
Award of credit points and final module grade: successful completion of the assignments (students can also work on a non-graded project); passing the module exam
Useful literature:
- Charu C. Aggarwal and ChengXiang Zhai: Mining Text Data. Springer, 2012.
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze: Introduction to Information Retrieval. Cambridge University Press, 2008.
- Trevor Hastie, Robert Tibshirani and Jerome H. Friedman: The Elements of Statistical Learning. Springer, 2001.
- Bing Liu: Web Data Mining (2nd Edition). Springer, 2011.