# A tibble: 18 × 4
student_id name quiz score
<dbl> <chr> <dbl> <dbl>
1 192297 Alice 1 64
2 192297 Alice 2 96
3 192297 Alice 3 68
4 291857 Bob 1 58
5 291857 Bob 2 91
6 291857 Bob 3 91
7 500286 Charlotte 1 70
8 500286 Charlotte 2 94
9 500286 Charlotte 3 71
10 449192 Dante 1 57
11 449192 Dante 2 85
12 449192 Dante 3 84
13 372152 Ethelburga 1 74
14 372152 Ethelburga 2 91
15 372152 Ethelburga 3 70
16 627561 Felix 1 77
17 627561 Felix 2 86
18 627561 Felix 3 68
Exercise A - (10 min)
Answer the following, consulting the dplyr help files as needed.
1. Run right_join(gradebook, emails). What happens? Explain.
2. Run full_join(gradebook, emails). What happens? Explain.
3. Run inner_join(gradebook, emails). What happens? Explain.
4. Above I ran left_join(gradebook, emails). How could I have used the pipe?
5. Add a column called name to the emails tibble, containing the following names in order: c('Joe', 'Alice', 'Ethelburga', 'Mark', 'Bob'). Then use a left join to merge gradebook with emails. What happens? Now try setting the parameter by = 'student_id'. What changes?
# Part 1
# The result contains students whose ids are in emails. Those with ids
# in gradebook who are *not* in emails are dropped.
right_join(gradebook, emails)
Joining with `by = join_by(student_id)`
# A tibble: 5 × 9
student_id name quiz1 quiz2 quiz3 midterm1 midterm2 final email
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 192297 Alice 64 96 68 81 90 99 alice.liddell…
2 291857 Bob 58 91 91 75 75 79 microsoftbob@…
3 372152 Ethelburga 74 91 70 63 73 96 ethelburga@ly…
4 101198 <NA> NA NA NA NA NA NA unclejoe@whit…
5 918276 <NA> NA NA NA NA NA NA mzuckerberg@g…
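One way to check this reading of right_join() is to rebuild the situation with toy data. The sketch below is only an illustration: `gradebook_toy` and `emails_toy` are hypothetical stand-ins that reuse the student ids shown above, and the email addresses are placeholders, not the real (truncated) ones. It confirms that the ids kept by the right join are exactly the ids in the right-hand table, and that the two ids with no gradebook entry get a missing name.

```r
library(dplyr)

# Hypothetical stand-in tibbles: ids copied from the output above,
# email addresses are placeholders invented for this sketch.
gradebook_toy <- tibble(
  student_id = c(192297, 291857, 500286, 449192, 372152, 627561),
  name = c("Alice", "Bob", "Charlotte", "Dante", "Ethelburga", "Felix")
)
emails_toy <- tibble(
  student_id = c(101198, 192297, 372152, 918276, 291857),
  email = paste0("student", 1:5, "@example.com")
)

# Naming the key explicitly avoids the "Joining with" message.
right_result <- right_join(gradebook_toy, emails_toy, by = "student_id")

# Every id in emails_toy survives; ids found only in gradebook_toy are dropped.
kept_all_emails <- setequal(right_result$student_id, emails_toy$student_id)

# Ids with an email but no gradebook row are padded with NA in name.
unmatched_names <- sum(is.na(right_result$name))
```

Here `kept_all_emails` should be TRUE and `unmatched_names` should be 2, mirroring the two `<NA>` rows in the output above.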
# Part 2
# The result contains everyone whose id appears in *either* dataset. This
# requires lots of padding out with missing values.
full_join(gradebook, emails)
Joining with `by = join_by(student_id)`
# A tibble: 8 × 9
student_id name quiz1 quiz2 quiz3 midterm1 midterm2 final email
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 192297 Alice 64 96 68 81 90 99 alice.liddell…
2 291857 Bob 58 91 91 75 75 79 microsoftbob@…
3 500286 Charlotte 70 94 71 81 70 74 <NA>
4 449192 Dante 57 85 84 83 94 83 <NA>
5 372152 Ethelburga 74 91 70 63 73 96 ethelburga@ly…
6 627561 Felix 77 86 68 78 83 75 <NA>
7 101198 <NA> NA NA NA NA NA NA unclejoe@whit…
8 918276 <NA> NA NA NA NA NA NA mzuckerberg@g…
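The row count of the full join can be reasoned about directly. As a rough sketch, assuming each student_id appears at most once in each table (true of the output above), the full join has one row per gradebook student plus one row per email id that has no gradebook match. The tibbles `gradebook_toy` and `emails_toy` below are hypothetical stand-ins with placeholder email addresses.

```r
library(dplyr)

# Hypothetical stand-in tibbles: ids copied from the output above,
# email addresses are placeholders invented for this sketch.
gradebook_toy <- tibble(
  student_id = c(192297, 291857, 500286, 449192, 372152, 627561),
  name = c("Alice", "Bob", "Charlotte", "Dante", "Ethelburga", "Felix")
)
emails_toy <- tibble(
  student_id = c(101198, 192297, 372152, 918276, 291857),
  email = paste0("student", 1:5, "@example.com")
)

full_result <- full_join(gradebook_toy, emails_toy, by = "student_id")

# Email ids with no gradebook row contribute the extra rows.
n_unmatched_emails <- sum(!emails_toy$student_id %in% gradebook_toy$student_id)
n_full <- nrow(full_result)                    # 6 + 2 = 8
n_expected <- nrow(gradebook_toy) + n_unmatched_emails

# The padding shows up as NA on whichever side had no match.
n_missing_email <- sum(is.na(full_result$email))  # students without an email
n_missing_name  <- sum(is.na(full_result$name))   # emails without a student
```

With these toy tables, `n_full` equals `n_expected`, which matches the 8 rows printed above: three students lack an email and two email ids lack a gradebook entry.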
# Part 3
# The result contains only those whose id appears in *both* datasets. Everyone
# else is dropped.
inner_join(gradebook, emails)