***Cancelled***
Text Mining with R is now expected to take place in the winter term 2020/21.
DS406 Text Mining with R
Lecturer: | Dr. Gregor Wiedemann (Universität Hamburg) |
Course description: | DS406 |
Language: | English |
Recommended for this semester or higher: | 1 |
ECTS-Credits: | 6 |
Course can be taken as part of the following programs/modules: | Data Science in Business and Economics; Economics and Finance; European Management; General Management; International Business; International Economics; Economics; Management and Economics |
Prerequisites: | Good programming skills in R |
Course Type: | Lecture (2 weekly lecture hours), block course |
Date: | Block course: Monday, April 6, 2020, 9 am s.t. - 5 pm; Tuesday, April 7, 2020, 9 am s.t. - 5 pm; Wednesday, April 8, 2020, 9 am s.t. - 4 pm. All sessions take place in the PC Lab, ground floor, Nauklerstr. 47. |
Registration: | Registration in ILIAS is required. Registration opens on Monday, March 2, 2020 (originally March 16) and closes on March 29, 2020 at 23:55. Students in the M.Sc. Data Science in Business and Economics have preferred access; remaining places are open to students from all programs. If the number of registrations exceeds the remaining available places, participants will be selected at random. Link is announced here. |
Downloads: | ILIAS |
Method of Assessment: | Assignment. Deadline: May 31, 2020, 8 pm s.t., upload in ILIAS (more details in the first lecture) |
Content: | The course gives an overview of text mining, covering data acquisition (basics of web scraping), text preprocessing and methodological integration using the statistical programming language R. In sessions alternating between lectures and tutorials, we teach theoretical and methodological foundations, introduce exemplary studies and get hands-on with programming to implement different analyses. We will cover a range of text mining methods, from basic lexicometric measures such as word frequencies, key term extraction and co-occurrence analysis to more complex machine learning approaches such as topic models. |
Objectives: | Students will know how (1) to perform web scraping of textual data from websites for corpus creation, (2) to apply fundamental text preprocessing techniques and assess how they affect outcomes, (3) to perform basic quantitative text analysis, and (4) to perform topic modeling on large text corpora. |
Literature: | Grimmer, J. & Stewart, B. (2013). Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts. Political Analysis 21 (3), 267–297. doi:10.1093/pan/mps028; Ignatow, G. & Mihalcea, R. F. (2017). An Introduction to Text Mining: Research Design, Data Collection, and Analysis. SAGE; Lemke, M. & Wiedemann, G. (Eds.). (2016). Text Mining in den Sozialwissenschaften. Grundlagen und Anwendungen zwischen qualitativer und quantitativer Diskursanalyse. Wiesbaden: Springer VS; Welbers, K., van Atteveldt, W. & Benoit, K. (2017). Text Analysis in R. Communication Methods and Measures 11 (4), 245–265. doi:10.1080/19312458.2017.1387238 |