Lecture notes

Date Materials & Topics Homework
Oct 4
  • Practicalities (see below, this page)
  • R Big Picture
    • Our scope within "Programming"
    • Where R is strongest
  • Navigation in R Studio
    • Manage files
    • Install packages 
    • Functions
    • Data types (digits, letters, etc.)
    • Vectors & Data Frames
    • Variable assignment
  • Presentation file
  • Exercise file
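
A minimal sketch of the first lecture's building blocks (assignment, data types, vectors, function calls, data frames). All names and values are invented for illustration and are not taken from the presentation or exercise file.

```r
# Variable assignment with the arrow operator
n_students <- 12L          # integer
course     <- "R basics"   # character (letters)
pass_mark  <- 0.6          # double (digits with a decimal point)

class(n_students)  # "integer"
class(course)      # "character"

# A vector holds several values of ONE type
scores <- c(0.55, 0.80, 0.95)

# Calling a function: its name, then the arguments in parentheses
mean_score <- mean(scores)

# A data frame is a table whose columns are vectors of equal length
students <- data.frame(
  name  = c("Ann", "Bob", "Eva"),   # invented example data
  score = scores
)
students
```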

Deadline: before next lecture

https://app.datacamp.com/learn/courses/free-introduction-to-r , Chapters 1 and 2

 

Oct 11
  • Lectures moved to B103B, for good!
  • Live Quizzes:    https://quest.ms.mff.cuni.cz/class-quiz/
  • Logical operators
  • Vectors
    • select (subset) by position or condition (logical operators)
  • Functions - how to read help, argument order, defaults
  • Inspect a file with readLines.
  • Read a tabular file with the right read.... function
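
The topics above can be sketched roughly as follows. The data and the temporary file are made up so that the snippet runs anywhere; the lecture's own files are not used here.

```r
scores <- c(ann = 55, bob = 80, eva = 95)   # invented data

scores[2]                 # by position: second element
scores[c(1, 3)]           # several positions at once
scores > 60               # a logical operator returns a logical vector

passed     <- scores[scores > 60]                # subset by condition
top_or_low <- scores[scores > 90 | scores < 60]  # | is OR, & is AND

# Write a tiny tab-separated file to a temporary location, then read it back
path <- tempfile(fileext = ".tsv")
writeLines(c("name\tscore", "ann\t55", "bob\t80"), path)

raw_lines <- readLines(path, n = 2)   # inspect first: is there a header? which separator?
tab <- read.delim(path)               # defaults: sep = "\t", header = TRUE
tab
```

Looking at the raw lines first tells you which `read....` function and which arguments fit the file; `?read.delim` lists the argument order and the defaults.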

Deadline: before next lecture

DataCamp, Introduction to R, Chapter 5 - Data Frames.

And: tasks at the end of the presentation from today's lecture, after the quizzes:

  • copy today's lecture folder from cinkova's folder to your home programmatically
  • Check out the functions readLines and read.delim (or one of its friends) on the file Transatlantyk... (in the lecture folder)
  • Play around with the str_c (or paste) function in the Madly difficult exercise slide. (NB: the following (last) slide contains the clue.) To be able to use str_c, remember to call library(stringr) first!
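
If you get stuck on the first and third task, here is one possible sketch. The paths are hypothetical stand-ins created in a temporary directory (replace them with the real lecture folder and your home), and `paste` is used instead of `str_c` so that no package is needed.

```r
# Stand-ins for the shared lecture folder and your home (hypothetical paths)
base <- tempfile("demo_")
src  <- file.path(base, "lecture_src", "2024-10-11_02")
home <- file.path(base, "my_home")
dir.create(src, recursive = TRUE)
dir.create(home)
writeLines("hello", file.path(src, "notes.txt"))

# recursive = TRUE copies the directory itself, with its contents, into `home`
copied <- file.copy(from = src, to = home, recursive = TRUE)
list.files(home, recursive = TRUE)

# Gluing strings in base R; stringr::str_c() works roughly like paste0()
greeting <- paste("Hello", "world", sep = ", ")
file_id  <- paste0("file_", 1:3, ".txt")
```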

Optional, but highly recommended at any time: Chapter 4 - Factors.

2024-10-11_02.zip presentation, data files and exercises

Presentation converted from revealjs to pdf via pptx, rather well-preserved

Oct 18
  • No statistics lecture. We must leave B103B at 10:50, exceptionally.
  • Optional individual work and consultations with SC 11:00-11:30 in C420, possible extension to 12:20.
  • Recap + solution of tasks in last lecture's slides (readLines, read in a tabular file with read.delim). Show read_lines and read_delim, too.
  • Data frames
    • str, summary, ncol, nrow
    • column names
    • subsetting the base R way (positions, column names, …)
    • filtering and selecting with dplyr
    • read a table from a file
    • factors
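
A small sketch of the data frame topics listed above, on an invented table. The dplyr part runs only if the package is installed; the base R lines do not depend on it.

```r
df <- data.frame(
  city  = c("Prague", "Brno", "Prague", "Ostrava"),   # invented data
  year  = c(2023, 2023, 2024, 2024),
  count = c(10, 4, 12, 7)
)

str(df)        # compact structure: column types and first values
summary(df)    # per-column summaries
ncol(df); nrow(df)
names(df)      # column names

# Base R subsetting: df[rows, columns]
df[1:2, ]                                               # rows by position
df[, c("city", "count")]                                # columns by name
prague <- df[df$city == "Prague", c("year", "count")]   # rows by condition

# A factor stores a categorical variable with a fixed set of levels
df$city <- factor(df$city)
levels(df$city)

# The same filter and select with dplyr, if the package is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  prague_dplyr <- dplyr::select(dplyr::filter(df, city == "Prague"), year, count)
}
```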

 Exercises in the lecture

Homework assignment for next time:

!!!! Everyone must finish DataCamp, Introduction to R, Chapter 5 - Data Frames.

And, a new one: Introduction to the tidyverse, Chapter 1 - Data wrangling.

https://campus.datacamp.com/courses/introduction-to-the-tidyverse/data-wrangling-1?ex=1

Oct 25 Dean's day at Faculty of Social Sciences  
Nov 1
Nov 8

Finishing the worksheets quiz from the previous lecture

You should have practiced data wrangling with dplyr by now. Here comes a real case study with data from the wild: group collaboration in DataLab, DataCamp's collaborative IDE. Here is a link to the assignment. Make your own copy of the worksheet, share it with the group, and you are all set to work:

https://www.datacamp.com/datalab/w/f78bedf6-526a-4951-a796-97eaa0575d0e/edit

Part of the work is brainstorming over possible visualizations, but the coding concerns only tables.

DataCamp Introduction to Data Visualization with ggplot2, first three chapters (Data, Aesthetics, Geometries). YOU MUST HAVE COMPLETED IT FOR THE LECTURE ON NOVEMBER 22!!!!

Feel free to experiment with ggplot2 to illustrate your research on British traffic accidents.

Nov 15 Lecture cancelled (Silvie away on a conference)  
Nov 22

Flipped class on plotting with ggplot2. ONLY UNDERSTANDABLE IF YOU HAVE DONE YOUR GGPLOT2 HOMEWORK ASSIGNMENT!!!!

We have worked through the British traffic accidents data with an abundantly commented script, data_wrangling_task.pdf, extracting metadata from the metadata file and mapping the labels of the sex_of_casualty values onto the data. The students were encouraged to practice this procedure with other variables. We determined that each row stands for one person killed or injured rather than for one accident, and we counted the casualties in each accident. Here is the entire code folder.
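
For orientation, the two steps (mapping labels onto codes and counting casualties per accident) could look roughly like this in base R. This is not the lecture script: the miniature tables are invented, the column name accident_index and the labels are assumptions, and only sex_of_casualty is taken from the text above.

```r
# Invented miniature of the casualty table: one row per casualty
casualties <- data.frame(
  accident_index  = c("A1", "A1", "A2", "A3", "A3", "A3"),  # assumed column name
  sex_of_casualty = c(1, 2, 1, 2, 2, 1)
)

# Invented miniature of the metadata: code -> label
sex_labels <- data.frame(
  code  = c(1, 2),
  label = c("Male", "Female")   # assumed labels, check the real metadata file
)

# match() finds, for each code in the data, its row in the lookup table
casualties$sex_label <- sex_labels$label[
  match(casualties$sex_of_casualty, sex_labels$code)
]

# Each row is one casualty, so counting rows per accident counts casualties
per_accident <- as.data.frame(table(casualties$accident_index),
                              stringsAsFactors = FALSE)
names(per_accident) <- c("accident_index", "n_casualties")
per_accident
```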

 

Think of what you have learned so far. Try to formulate questions about the topics or cases that you have trouble with. In the next lecture, we will elect the top n most interesting questions and delve deeper into them together.

If you want more information about loops in R, as used in the guided code, please work through this tutorial on DataCamp.
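
As a quick reminder of the syntax, here is a generic for loop sketch. It is not the loop from the guided code, and the second variable name is hypothetical.

```r
# Loop over the elements of a vector
variables <- c("sex_of_casualty", "age_of_casualty")  # second name is hypothetical
for (v in variables) {
  message("Processing variable: ", v)
}

# Collecting results: preallocate a vector, then fill it by index
squares <- numeric(5)
for (i in seq_along(squares)) {
  squares[i] <- i^2
}
squares
```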

Nov 29 Still ggplot2, with dplyr preparation of tables: code folder. Homework for next time: complete the course Introduction to the tidyverse on DataCamp.
Dec 6 Joining data frames with dplyr. Homework for next time: complete the course Joining data with dplyr.
Dec 13 tidyr: code folder. Homework for next time: complete the course Reshaping data with tidyr.
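
A rough sketch of the two ideas from the Dec 6 and Dec 13 lectures, joining and reshaping, on invented tables. The base R `merge` always runs; the dplyr and tidyr calls run only if those packages are installed. None of this comes from the code folders.

```r
accidents <- data.frame(id = c("A1", "A2", "A3"),
                        year = c(2022, 2022, 2023))    # invented data
weather   <- data.frame(id = c("A1", "A3"),
                        weather = c("rain", "fine"))   # invented data

# Left join in base R: keep all rows of `accidents`, add matching columns
joined <- merge(accidents, weather, by = "id", all.x = TRUE)

# The same join with dplyr, if installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  joined_dplyr <- dplyr::left_join(accidents, weather, by = "id")
}

# Reshaping from wide to long with tidyr, if installed
wide <- data.frame(id = c("A1", "A2"), male = c(1, 0), female = c(1, 2))
if (requireNamespace("tidyr", quietly = TRUE)) {
  long <- tidyr::pivot_longer(wide, cols = c(male, female),
                              names_to = "sex", values_to = "n")
}
```

Accident A2 has no match in `weather`, so its `weather` value is `NA` after the left join.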
Dec 20

Silvie is sick. Copy today's folder 2024-12-20 to your home from Silvie's home directory or from here: code folder. It is meant as a substitute for a mildly interactive live lecture with a guided code example.

If you are lazy, just glance at the source Rmd or the simple html file, where you will see all the code right away. But then never complain that you have not practiced on real-life examples!

If, on the other hand, you want to synthesize what you have drilled on DataCamp in a real-life project, only open the README.nb.html file. In this file, all code is hidden and to see it, you have to click a button. Start your own file and try to work according to the instructions. Only hit the button when you have given up, accept the help and head on to struggle with the next step. Happy coding!

 

 

 
Jan 10

rectangular data from JSON

Here is the working folder 2025-01-09.zip. This was a guided example of interacting with an API to retrieve a JSON string (as text, not encoded in raw bytes). The focus is on the JSON structure, not the API interaction.

We covered JSON syntax and how object literals and arrays translate to R when using the jsonlite::fromJSON function with two levels of attempted structure simplification. Finally, we showed how to use R to dig out data frames nested in a data frame column so that they become regular columns of the outer data frame.
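
A minimal sketch of the idea, assuming the jsonlite package is installed. The JSON string is an invented sample with a nested structure, not a real API response, and contrasting `simplifyVector = FALSE` with the default is only one plausible reading of the "two levels" of simplification.

```r
# Invented sample with nested objects; not real data
txt <- '[
  {"category": "burglary", "month": "2024-01",
   "location": {"latitude": "52.63", "street": {"id": 1, "name": "High St"}}},
  {"category": "robbery", "month": "2024-01",
   "location": {"latitude": "52.64", "street": {"id": 2, "name": "Low St"}}}
]'

# No simplification: arrays and object literals both become plain R lists
as_list <- jsonlite::fromJSON(txt, simplifyVector = FALSE)

# Default simplification: the array of objects becomes a data frame,
# and the nested "location" objects become a data-frame column
as_df <- jsonlite::fromJSON(txt)
class(as_df$location)      # "data.frame"

# flatten() turns nested data-frame columns into regular columns
flat <- jsonlite::flatten(as_df)
names(flat)
```

The prefix `jsonlite::` is spelled out on purpose, because other packages (for example purrr) also export a function called `flatten`.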

We work with this data set: https://data.police.uk/docs/method/crime-street/

 