We encourage students to use data in their projects.
This course is a gentle, programming-free combination of lectures and practical demonstrations of real-life data workflows in various Social Studies and Humanities (SSH) research areas. It aims to motivate SSH students to improve their digital literacy by going on to more advanced data analytics courses. The curriculum is a joint effort of Charles University (CU), University of Warsaw (UW), and Sorbonne University (SU).
This course does not require any prior data analysis or computer science experience. All you need to get started is basic computer literacy.
You will learn how to tell data stories and captivate your future audiences with Tableau Public, how to use the Transkribus and Pero systems to digitize historical documents, and how to annotate texts in TEITOK. We will acquaint you with André Mazon's digitized correspondence archive and with the migrant stories published at i am a migrant.
We cordially invite you to the 2023 workshop.
No. | Date | Topic | Teaching materials |
---|---|---|---|
1. | Feb 14 | Introduction (CU): course organization, motivation, outline; basic terminology | npfl134-lec-1.pdf |
2. | Feb 21 | Collection of André Mazon's correspondence I (SU): Mazon's correspondence; digitization | npfl134-lec-2-part-1.pdf; npfl134-lec-2-part-2.pdf |
3. | Feb 28 | Beginner's guide to data analysis with Google Sheets (CU): Titanic dataset; pivot tables, box plots, histograms; missing values, duplicates | npfl134-lec-3.pdf; Titanic data set in Google Sheets (url); Titanic train.csv at Kaggle (url) |
4. | Mar 7 | Collection of André Mazon's correspondence II: analysis of metadata using the Tableau system | Lecture slides: npfl134_lec-4-TableauMazonMetadata.pdf; Mazon metadata; Tableau tutorials in a mind map: https://www.orgpad.com/s/RVO0h1pEGYd |
5. | Mar 14 | Collection of André Mazon's correspondence III: analysis of letters (images and transcriptions); Optical Character Recognition, Handwritten Text Recognition; Transkribus and Pero systems | Presentation about OCR and HTR, slides: npfl134-lec-5-transkribus.pdf |
6. | Mar 21 | Introduction to the Universal Dependencies framework & Corpus Linguistics for Information Extraction | Presentation, slides in pdf: UD_infoextr_handouts_big.pdf; Strudel paper: https://onlinelibrary.wiley.com/doi/10.1111/j.1551-6709.2009.01068.x |
7. | Mar 28 | Collection of André Mazon's correspondence IV: annotating data; linguistic processing using the UDPipe and NameTag tools; searching and querying data in TEITOK | Introduction slides to the class; follow the search examples in two corpora in TEITOK |
8. | Apr 4 | Quantitative textual analysis in Sociology: Migrant stories | lecture-2023-04-04-hajek.pdf |
9. | Apr 11 | Quantitative textual analysis in Sociology: Computer-assisted qualitative data analysis software; reQual tool | |
10. | Apr 18 | Network analysis of Migrant Stories: visualization in Gephi, part I | lecture video; Presentation in html: https://cunicz-my.sharepoint.com/:u:/g/personal/50243070_cuni_cz/EZsPFYz... To view speaker's notes, put your cursor on the presentation in your web browser and press s. |
11. | Apr 25 | Network analysis of Migrant Stories: visualization in Gephi, part II | lecture video |
12. | May 2 | Introduction to Machine Learning | |
13. | May 9 | Student presentations | |
14. | May 16 | Sharing data in repositories | npfl134-lec-14.pdf; lecture video from 2021/22 |
Courtesy of DataCamp, you will receive six months of access to their e-learning materials. These will help you master Tableau Public to the level you wish.
The dataset of André Mazon's correspondence is available for the course's activities under the Partnership Agreement between the Center of Slavic Studies (Sorbonne University) and the Institute of Formal and Applied Linguistics (Charles University).
This course is funded by the 4EU+ Alliance under grant agreement No. 2021_F3_10.