All About That Base (R)

A Crash Course in R Programming

Francis J. DiTraglia

University of Oxford

Introduction

What is this?

Four 80 minute sessions on base R programming.
Base R means no bells or whistles: no packages 🔕
Start by doing things “the hard way” 🔩 🔨 😫
(We have calculators, but kids still need to learn addition.)

⚠️ Health Warning

I assume most of you have some programming knowledge
E.g. you understand variables, if-then, for/while etc.
I do not assume prior knowledge of R.
This means I move quickly / focus on how things work in R.
If your programming background is weak/rusty, you can catch up but need to put in the hours right now.

I can’t cover everything

These lectures are a starting point to help you learn.
Develop skills and confidence to teach yourself more 💪
Key skills: experimentation, getting help
Suggested Readings on Base R Programming:

Part 1 - Getting Started with RStudio

What you’ll see when you load a new project on posit cloud. There are three “panes”: the console (left), environment (top right), and file browser (bottom right)

Type commands at the console and hit return to run them. Objects appear in the environment pane. Here we have one object x and it equals 3.

Commands typed directly into the console are not reproducible, so we’ll store and run our R commands from a script: select File > New > R Script. Alternatively: Ctrl+Shift+N

A new pane has appeared: the editor pane. This is a text editor where we can enter and save R commands. (Yes, there are vim and emacs keybindings!)

A text file with a .R extension containing R commands is called an R script. Make sure to save your work! Your file will appear in the file browser.

Put your cursor on the line you want to run then type Ctrl-Return (Command-Return on Mac) to run it. Alternatively click “Run” or “Source” to run the whole script.

Getting weird results? Clear your environment by entering rm(list = ls()) or clicking on the broom icon on the environment pane. Then re-run your script from the top.

Search the help files using ? or the help tab in the bottom right pane. Be warned: the help files aren’t always user-friendly 😵

Never underestimate the power of googling the error message!

DuckDuckGo is the connoisseur’s choice 🍷

If you’re not already, you will soon become very familiar with StackOverflow. Limit your results to R by starting your search string with [r]

chatGPT is incredibly good at programming but occasionally gives you 💩 so be vigilant!

From now on

I will present R code examples and output on the slides.
Type the code into your R script for today’s lecture, run it and check that you get the same output. (Save your work!)
We will pause periodically so you can work on an exercise and then discuss our solutions as a class.
Raise your hand if you need help.

greeting <- 'Hello core ERM!'
print(greeting)

[1] "Hello core ERM!"

Part 2 - R Basics

A matter of style 😎

Good programming style is like good spelling and grammar: we want to make it easy for others to understand us.
When programming in R, I follow the tidyverse style guide and you should too: this is one of the marking criteria.
Follow my lead when it comes to spacing, indentation, etc. You’ll get the hang of it in no time.

Naming Objects

Can contain letters, digits, dot ., and underscore _
Can only start with a letter or the dot .
Don’t use a name that’s already an R command!
Tidyverse Style Guide
- Use only lowercase letters, numbers, and _
- snake_case using _ to separate words
- Keep it concise; make it meaningful

Comments

Any line of R code that begins with # is a comment
Comments are for humans not the computer!
Tidyverse style guide
- Comment lines should begin with # and then a space
- Capitalize. Put a period if at least two sentences.
- Explain the “why” rather than the “what” or “how”
- Too many comments $\implies$ unclear code; try again!

What does ⬅️ mean?

R assignment operator: less than sign followed by a hyphen.
x <- 3 creates an object named x with the value 3
RStudio Shortcut: ALT “minus”
Advanced R: Names and Values for some subtleties.
Pronounced gets as in “x gets 3”.
Single equals sign (=) also works but it’s bad style.

The Assignment Operator

x <- 3
x

[1] 3

# Bad style, but it works
y = 5
y

[1] 5

# Also bad style, but this works too! 
2 -> z
z

[1] 2

R as an overgrown calculator

Basic arithmetic with +, -, *, / and parentheses

(3 + 5) * 3 - (2 / 7)

[1] 23.71429

⚠️ Implied multiplication doesn’t work; * is mandatory

x <- 3
2x

Error: <text>:2:2: unexpected symbol
1: x <- 3
2: 2x
    ^

2 * x

[1] 6

R as an overgrown calculator

Exponentiation with ^ and $e^x$ with exp(x)

2^8

[1] 256

exp(1)

[1] 2.718282

$\log_b a$ with log(a, base = b), defaults to natural log

log(100, base = 10)

[1] 2

log(100)

[1] 4.60517

Data Types in R

Numeric Types
- Double: “default” numeric type, for example x <- 3
- Integer: create with L, for example x <- 3L
- Read my blog post about floating point arithmetic!
Character: a chunk of text within single/double quotes
- x <- 'hello world!' or y <- "ERM rocks!"
Logical: TRUE or FALSE
typeof() tells you the type

Examples with Types

x <- 3.14159
typeof(x)

[1] "double"

y <- 10L
z <- 10
typeof(y)

[1] "integer"

typeof(z)

[1] "double"

economist_buried_nearby <- 'Francis Ysidro Edgeworth'
typeof(economist_buried_nearby)

[1] "character"

typeof(TRUE)

[1] "logical"

Logical Operators

NOT: ! turns a TRUE into a FALSE and vice-versa
AND: x & y is TRUE iff both x and y are TRUE
OR: x | y is TRUE iff at least one of x and y is TRUE
XOR: xor(x, y) is TRUE iff exactly one of x and y is TRUE

Logical Operators: Examples

!TRUE

[1] FALSE

TRUE & FALSE

[1] FALSE

TRUE & TRUE

[1] TRUE

TRUE | FALSE

[1] TRUE

TRUE | TRUE

[1] TRUE

xor(TRUE, FALSE)

[1] TRUE

xor(TRUE, TRUE)

[1] FALSE

Relational Operators

These operators return TRUE or FALSE
<, >, <=, and >= all mean what you think they do
== tests for equality; don’t confuse it with =
!= tests for lack of equality

Relational Operators: Examples

5 > 3

[1] TRUE

2 < 2

[1] FALSE

7 <= 7

[1] TRUE

9 >= 9

[1] TRUE

-1 == -1

[1] TRUE

5 == 2

[1] FALSE

0 != 0

[1] FALSE

75 != 42

[1] TRUE

⚠️ `==` and `!=` can be Dangerous

If you need a single TRUE / FALSE value use identical()
Testing equality of doubles can give unexpected results.
To test for near equality use all.equal()

`NA`s and `Inf`s and `NaN`s

NA means not available / missing
Inf means infinity and -Inf means minus infinity
NaN means not a number
You can do math with all of these in R

`NA`s, `Inf`s, and `NaN`s: Examples

NA + 1 # NAs are *contagious*

[1] NA

2 * Inf

[1] Inf

-1 / 0

[1] -Inf

1 / Inf

[1] 0

0 / 0

[1] NaN

Testing the type of an object

is.integer(3L)

[1] TRUE

is.integer(3)

[1] FALSE

is.na(4)

[1] FALSE

is.na(NA)

[1] TRUE

is.character('Hi mom!')

[1] TRUE

is.character(NA)

[1] FALSE

is.double(3)

[1] TRUE

is.double(3L)

[1] FALSE

💪 Exercise A - (5 min)

Why does this code throw an error? Try to fix it.

x <- 3; x > 2 & < 9

Does (NA & TRUE) equal (NA | TRUE)? Explain.
Does (Inf - Inf) equal (Inf - 1)? Explain.
Run the following. What happens? (further reading)

y <- (1 - 0.8); z <- 0.2
y == z; y < z; all.equal(y, z); identical(y, z)

Why do I use double quotes here?

important_message <- "The harder you try, the more you'll learn."

Part 3 - Atomic Vectors

R’s Simplest Data Structure

Atomic vector: one-dimensional, stores data of a single type
Futher reading: Chapter 3 - “Vectors” from Advanced R
Key types: double, integer, character, logical
c() “concatenate” to create an atomic vector
typeof() to find out the type of an atomic vector
length() to find the length of an atomic vector

Atomic Vectors: Some Examples

x <- c(2.3, -1.7, 5, 0, 6.2, 7)
y <- c('core ERM', 'is', 'the best course', 'in history') 
z <- c(TRUE, FALSE, TRUE) 

length(x)

[1] 6

length(y)

[1] 4

length(z)

[1] 3

typeof(x)

[1] "double"

typeof(y)

[1] "character"

typeof(z)

[1] "logical"

💪 Exercise B - (1 minute)

Predict the result that you will obtain if you use typeof() to find the type of each of the following atomic vectors. Then check to see if you were right!

foo <- c('1', '2', '3')
bar <- c('TRUE', 'FALSE')

Subsetting Atomic Vectors

Use brackets [] to access elements of an atomic vector.
Various ways to subset:
- With an index x[2] or vector of them x[c(2, 5, 7)]
- With a logical vector of the same length as x
Negative indices remove elements x[-1]
Repeated indices are allowed x[c(2, 2)]

Warning

R (like Julia) indexes from one, unlike Python and C/C++ which index from zero.

Subsetting Atomic Vectors: Example

y <- c('Keble', 'LMH', 'Univ', 'Merton') 
y[1]

[1] "Keble"

y[c(1, 4)]

[1] "Keble"  "Merton"

y[c(FALSE, TRUE, FALSE, TRUE)]

[1] "LMH"    "Merton"

y[-c(1, 4)]

[1] "LMH"  "Univ"

y[-3]

[1] "Keble"  "LMH"    "Merton"

y[c(-1, -4)]

[1] "LMH"  "Univ"

y[c(1, 2, 1, 2)]

[1] "Keble" "LMH"   "Keble" "LMH"

R has no scalars; only 1-vectors

Did you notice the [1] that keeps appearing everywhere?

2^8

[1] 256

w is a vector; [1] denotes its first (and only) element.

w <- 2^8
length(w)

[1] 1

Here the first element [1] is 30 and the 26th [26] is 55:

r <- 30:70 # shorthand for the vector c(30, 31, ..., 69, 70)
r

 [1] 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
[26] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70

💪 Exercise C - (5 minutes)

y <- c('Keble', 'LMH', 'Univ', 'Merton')

Enter the command y[5]. What result do you get? Why?
I want to extract the second and fourth elements of y so I enter y[2,4]. What happens? Can you fix it? How?
Select 'Keble' and 'Univ' two different ways.
Below is a vector of sales in $ over several months. Using [], length() and -, compute the monthly growth rates in %

sales <- c(100, 120, 90, 110, 105, 130, 140, 135, 125, 145, 150, 160)

Doing math with vectors

Mathematical operations in R are vectorized and operate element by element:

x <- 5:9 # shorthand for c(5, 6, 7, 8, 9)
y <- 0:4 # shorthand for c(0, 1, 2, 3, 4)
x + y

[1]  5  7  9 11 13

x - y

[1] 5 5 5 5 5

x * y

[1]  0  6 14 24 36

y / x

[1] 0.0000000 0.1666667 0.2857143 0.3750000 0.4444444

x^y

[1]    1    6   49  512 6561

Some vectorized functions

Nearly all R Functions are vectorized: they accept vector input

x <- (-2:2)  # shorthand for c(-2, -1, 0, 1, 2)
sum(x) # sum

[1] 0

prod(x) # product

[1] 0

abs(x) # absolute value

[1] 2 1 0 1 2

sqrt(x^2) # (positive) square root

[1] 2 1 0 1 2

min(x) # minimum

[1] -2

max(x) # maximum

[1] 2

Relational / Logical Operators are Vectorized

x <- c(1, 2, 3, 4)
y <- c(-1, 3, 2, 6)
x > y

[1]  TRUE FALSE  TRUE FALSE

z <- c(TRUE, FALSE, TRUE)
w <- c(FALSE, TRUE, TRUE)
z | w

[1] TRUE TRUE TRUE

z & w

[1] FALSE FALSE  TRUE

any(x > y) # checks if *any* element is true

[1] TRUE

all(x > y) # checks if *all* elements are true

[1] FALSE

♻️ Recycling

Allows operations with vectors of different lengths, e.g. “scalars” with vectors:

x <- (-2:2) # shorthand for c(-2, -1, 0, 1, 2)
x + 3

[1] 1 2 3 4 5

x - 1

[1] -3 -2 -1  0  1

2 * x

[1] -4 -2  0  2  4

x / 2

[1] -1.0 -0.5  0.0  0.5  1.0

x^3

[1] -8 -1  0  1  8

♻️ Recycling Rules

If any vector has length zero, the result will too.

x <- numeric(length = 0) # creates a zero length numeric vector
x

numeric(0)

length(x)

[1] 0

x + -2:2

numeric(0)

Otherwise, length of result equals length of longer vector.

c(1, 2) + c(5, 6, 7, 8) # equivalent to c(1, 2, 1, 2) + c(5, 6, 7, 8)

[1]  6  8  8 10

⚠️ Recycling Can Be Dangerous

R warns you if the length of the longer vector isn’t a multiple of the length of the shorter one:

c(1, 2, 3) + c(5, 6, 7, 8) # equiv. to c(1, 2, 3, 1) + c(5, 6, 7, 8)

Warning in c(1, 2, 3) + c(5, 6, 7, 8): longer object length is not a multiple
of shorter object length

[1]  6  8 10  9

Advice: avoid recycling except for “scalar with vectors” case unless you really know what you are doing.

💪 Exercise D - (3 min)

The probability mass function of a Binomial$(n, p)$ random variable is given by \[ \mathbb{P}(X=x) = \binom{n}{x} p^x (1 - p)^{n-x} \] Use vectorized mathematical operations and the choose() function to calculate the pmf of a Binomial$(5, 0.3)$ random variable in one fell swoop.

Replacing elements of a vector

LHS: describe values you want to modify
RHS: use <- to overwrite

x <- c(1, 3, 7, 8, -4)
x

[1]  1  3  7  8 -4

x[4] <- 5
x

[1]  1  3  7  5 -4

x[x < 0] <- NA
x

[1]  1  3  7  5 NA

x[c(1, 3)] <- c(-4, 7)
x

[1] -4  3  7  5 NA

Naming elements of a vector

Method 1: create first, then use names()

Frank <- c(1983, 40, 1)
names(Frank) <- c('birth year', 'age', '#siblings')
Frank

birth year        age  #siblings 
      1983         40          1

Method 2: name when creating

Frank <- c('birth year' = 1983, 'age' = 40, '#siblings' = 1)
Frank

birth year        age  #siblings 
      1983         40          1

You can rename with names():

Frank[3] <- 0
names(Frank)[3] <- '#Nobel Prizes'

Accessing Elements by Name

Frank <- c('birth year' = 1983, 'age' = 40, '#siblings' = 1)
Frank['age']

age 
 40

Frank['birth year']

birth year 
      1983

Frank[c('age', 'age', 'birth year')]

       age        age birth year 
        40         40       1983

Frank[c('salary', 'zodiac sign')]

<NA> <NA> 
  NA   NA

Set Operations

NULL is the empty set. You can assign it, e.g. x <- NULL
union(A, B) $\equiv A \cup B$
intersect(A, B) $\equiv A\cap B$
setdiff(A, B) $\equiv A \setminus B \equiv A - B \equiv A \cap B^{c}$
setequal(A, B) is TRUE iff $A \subseteq B$ and $B \subseteq A$
A %in% B returns a vector of length(A) with TRUE for each element of A that is contained in B, FALSE otherwise

Set Operations: Examples

A <- 1:5; B <- 7:3; C <- NULL
union(A, B)

[1] 1 2 3 4 5 7 6

union(A, C)

[1] 1 2 3 4 5

intersect(A, B)

[1] 3 4 5

intersect(A, C)

NULL

setdiff(A, B)

[1] 1 2

setequal(A, B)

[1] FALSE

A %in% B

[1] FALSE FALSE  TRUE  TRUE  TRUE

Coercion

Remember: an atomic vector stores a single type
If you try to put in multiple types, R converts: coercion
Precedence: character > numeric > logical
Highest precedence type “wins”; everything else coerced:

x <- c(1, TRUE, '1')
x

[1] "1"    "TRUE" "1"

Note

To coerce manually: as.character(), as.numeric(), as.logical()

Examples of Coercion

y <- c(1, TRUE, 5, 0)
y

[1] 1 1 5 0

as.character(1:5)

[1] "1" "2" "3" "4" "5"

as.numeric(c('1', '2', '3'))

[1] 1 2 3

# Handy for mathematics:
x <- c(TRUE, FALSE, TRUE)
sum(x)

[1] 2

prod(x)

[1] 0

💪 Exercise E - (5 min)

Replace all of the 999s in this vector with NAs

x <- c(5, 10, 3, 7, 999, 2, 999, 17, 0)

In a deck of Italian playing cards, the face cards are fante (Knave), cavallo (Knight), and re (King). In the game Scopa, fante is worth 8, cavallo 9, and re 10. Convert cards to the appropriate numeric values.

cards <- c('re', 'cavallo', 're', 'fante', 'cavallo', 'fante', 're')

This code throws an error. Coerce y to make it work.

y <- c('1', '2', '3')
sum(y)

What happens if you run as.logical(-2:2)? Can you figure out the coercion rule for numeric to logical?

Part 4 - Functions and Control Flow

📚 References

Start with Hands-On Programming with R. For more:

Functions: What, Why and When?

Function: a chunk of code that encapsulates a particular task, so you never have to program it “from scratch” again!
Abstraction: the user (including the future you) only needs to know the inputs and outputs, not how the function works.
Advantages:
- Increased productivity: don’t repeat yourself! Automate!
- Fewer copy-paste errors, code that is easier to understand

A good rule of thumb is to consider writing a function whenever you’ve copied and pasted a block of code more than twice. – Hadley Wickham

Example: Computing a z-score

Recall the definition of a sample z-score: \[ z_i \equiv \frac{x_i - \bar{x}}{s}, \quad \bar{x} \equiv \frac{1}{n} \sum_{i=1}^n x_i, \quad s \equiv \sqrt{\frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1}} \]
R has a function called scale() to compute z-scores.
But suppose it didn’t have such a function!
We compute z-scores all the time.
Avoid doing it from scratch constantly

Function Ingredients: z-score

z_score <- function(x) {
# Center and standardize a numeric vector x, returns z-scores
  (x - mean(x)) / sd(x)
}

Give it a meaningful name: z_score <-
- “Hey R, I’m creating something called z_score.”
Specify arguments: function(x)
- “And this something is a function with one input: x.”
Enclose function body in curly brackets: { ... }
- Notice the space after function() and the linebreaks.
- Whitespace in R is not syntactic, but this is good style.

Function Ingredients: z-score

z_score <- function(x) {
# Center and standardize a numeric vector x, returns z-scores
  (x - mean(x)) / sd(x)
}

Write the function body
- This is the code that actually does the work
- Good to add comment explaining purpose, inputs, output
By default, R returns the last evaluated expression
- return() is bad style; reserve for “early returns”

Using the `z_score()` function

z_score <- function(x) {
# Center and standardize a numeric vector x, returns z-scores
  (x - mean(x)) / sd(x)
}

example_data <- c(-2, 6, 3, -1, 7, 8, 0, 4, 3, -5)
z <- z_score(example_data)
z

 [1] -1.0195160  0.8772580  0.1659677 -0.7824193  1.1143547  1.3514514
 [7] -0.5453225  0.4030645  0.1659677 -1.7308062

Function Defn: z_score <- function(x) { ... }
Function Call: z_score(example_data)

`\()` is Shorthand for `function()`

R Version 4.1 introduced \() shorthand for function()

z_score <- \(x) {
  (x - mean(x)) / sd(x)
}

z_score(example_data)

 [1] -1.0195160  0.8772580  0.1659677 -0.7824193  1.1143547  1.3514514
 [7] -0.5453225  0.4030645  0.1659677 -1.7308062

Convenient when creating anonymous functions, i.e. functions with no name. (More in a later lecture)

A Function with Multiple Arguments

The $k$th raw moment of a random variable is $\mathbb{E}[X^k]$. The sample analogue is $\frac{1}{n} \sum_{i=1}^n x_i^k$.

raw_moment <- function(x, k) {
  mean(x^k)
}

example_data <- c(-2, 6, 3, -1, 7, 8, 0, 4, 3, -5)

raw_moment(example_data, 1)

[1] 2.3

raw_moment(example_data, 2)

[1] 21.3

raw_moment(example_data, 3)

[1] 105.5

Matching Args by Name vs Position

raw_moment(example_data, 1)

[1] 2.3

raw_moment(x = example_data, k = 1)

[1] 2.3

raw_moment(k = 1, x = example_data)

[1] 2.3

raw_moment(1, example_data) # Oops!

[1] 1

💪 Exercise F - (10 min)

Call z_score(w) where w <- c(1, 2, NA). What happens? See ?mean().
Test out this function. What happens? Now try adding return(z) at the bottom of the function body. Explain your results.

bad_z_score <- function(x) {
  z <- (x - mean(x)) / sd(x)
}

Write a function to compute skewness using sum(), length(), mean() and sd(). \[ \text{Skewness} \equiv \frac{1}{n} \sum_{i=1}^n\left( \frac{x_i - \bar{x}}{s}\right)^3. \]
Use sum(), length() and is.na() to write a function called my_var() that drops NAs and then computes the sample variance.
Write a function called summary_stats() that returns a named vector with two elements: the sample mean and standard deviation.

dot-dot-dot `...`

Special arg that can be included in function definitions
Allows user to supply any number of “extra” args in call
These args can be “passed through” to functions in the body
See Advanced R Chapter 6.6 for more details.

raw_moment <- function(x, k, ...) {
  mean(x^k, ...)
}
some_data <- c(1, 2, 3, NA)

raw_moment(some_data, 1)

[1] NA

raw_moment(some_data, 1, na.rm = TRUE)

[1] 2

🔭 Scope

What objects can a given chunk of code “see”?
This is subtle and complicated but important.
I’ll show you some examples of the key distinctions.
For more details, consult:
- Hands-On Programming with R, Chapter 8
- Advanced R, Chapter 7.

🔭 Scope: Examples

k is defined in the “global environment” so f() “can see it”

k <- 3
f <- \(x) {
  x^k
}  
f(2)

[1] 8

m is defined inside g() so the global environment “can’t see it”

g <- \(x) {
  m <- 2
  x^m
}
5^m

Error in eval(expr, envir, enclos): object 'm' not found

🔭 Scope: Another Example

x <- 0.5
h <- \(x) {
  sin(pi * x) # pi is a built-in constant in R
}
h(2) # Returns sin(2 * pi), not sin(pi * 0.5)

[1] -2.449294e-16

Here there are two objects named x.
h() looks inside the function first and finds x
Since it found x it stops looking.
Only checks global env if it doesn’t find x in h()

`if ()` statements

If LOGICAL_CONDITION is TRUE, run code inside { ... }

if (LOGICAL_CONDITION) {
  # Your code goes here
}

Examples:

if (3 > 5) {
  print('Everything you know is wrong!')
}
my_name <- 'Frank'
if (identical(my_name, 'Frank')) {
  print('Hi Frank!')
}

[1] "Hi Frank!"

Warning

LOGICAL_CONDITION must be length one: an individual TRUE of FALSE value.

Early Returns

“break out” of function early: before completing everything

f <- function(x) {
  if (length(x) < 2) {
    return('Cannot compute variance for fewer than two observations')
  }
  mean((x - mean(x))^2) / (length(x) - 1)
}

f(1:3)

[1] 0.3333333

f(2)

[1] "Cannot compute variance for fewer than two observations"

`if ()...else` adds “default case”

if (LOGICAL_CONDITION) {
  # This code runs if LOGICAL_CONDITION is TRUE
} else {
  # This code runs if LOGICAL_CONDITION is FALSE 
}

Examples:

if (3 > 5) {
  print('Everything you know is wrong!')
} else {
  print('The laws of mathematics continue to apply.')
}

[1] "The laws of mathematics continue to apply."

my_name <- 'Sam'
if (identical(my_name, 'Frank')) {
  print('Hi Frank!')
} else {
  print('You should change your name to Frank.')
}

[1] "You should change your name to Frank."

`if ()...else if ()...else()`

if (A) {
  # Run if A is TRUE
} else if (B) {
  # Run if A is FALSE but B is TRUE
} else if (C) {
  # Run if A, B are both FALSE but C is TRUE
} else if (D) {
  # Run if A, B, C are all FALSE, but D is TRUE 
} else {
  # Run if A, B, C, D are all FALSE
}

Block: chunk of code inside of {...}
R starts at the top, looks for the first TRUE condition
Once it finds a TRUE, R skips the remaining blocks
If all conditions are FALSE, R runs else block, if present

Example: 3 Mutually Exclusive Cases

f <- \(x, y) {
  if (x > y) {
    print('x is bigger than y')
  } else if (x < y) {
    print('y is bigger than x') 
  } else {
    print('x is equal to y')
  }
}

f(x = 1, y = 2)

[1] "y is bigger than x"

f(x = 3, y = 4)

[1] "y is bigger than x"

f(x = 5, y = 5)

[1] "x is equal to y"

🌳 An overgrown `if ()` tree

get_value <- function(x) {
  if (identical(x, 'queen')) {
    9
  } else if (identical(x, 'rook')) {
    5
  } else if (identical(x, 'knight')) {
    3 
  } else if (identical(x, 'bishop')) {
    3
  } else if (identical(x, 'pawn')) {
    1
  } else {
    NA
  }
}

🌳 Lookup Tables vs `if ()` trees

get_value2 <- function(x) {
  values <- c(9, 5, 3, 3, 1)
  names(values) <- c('queen', 'rook', 'knight', 'bishop', 'pawn')
  values[x]
}

get_value('queen')

[1] 9

get_value2('queen')

queen 
    9

Note

if () trees are best for running different code in each branch; lookup tables are best for assigning different values in each branch.

💪 Exercise G - (8 min)

What happens if you run the following code? Why?

x <- c(TRUE, TRUE)
if (x) {
  print('hello world!')
}

What happens if you run this code? Try to fix it.

if (3 > 5) {
 print('3 is greater than 5') 
}
else {
 print('3 is not greater than 5') 
}

Write a function called mycov() that calculates the sample covariance between x and y. Use an early return to print an error message when x and y have different lengths.
Consult ?trunc(). Then use trunc() to write a function called myround() that rounds x to the nearest integer.

Part 5 - Iteration

`for ()` loops

Basic syntax:

for (INDEX in SOME_ATOMIC_VECTOR) {
  # Run this chunk of code 
  # (Typically involves the elements of SOME_ATOMIC_VECTOR)
}

Example:

for (i in 1:3) {
  print(i^2)
}

[1] 1
[1] 4
[1] 9

`for ()` loop details

for (INDEX in SOME_ATOMIC_VECTOR) {
  # Run this chunk of code 
  # (Typically involves the elements of SOME_ATOMIC_VECTOR)
}

Each “pass” through the loop assigns next value to INDEX
R creates INDEX if it doesn’t exist; overwrites if it does.

i <- c('John', 'Paul', 'Ringo', 'George')
i

[1] "John"   "Paul"   "Ringo"  "George"

for (i in 1:2) {
  print(1 / i)
}

[1] 1
[1] 0.5

[1] 2

More `for()` loop details

for (INDEX in SOME_ATOMIC_VECTOR) {
  # Run this chunk of code 
  # (Typically involves the elements of SOME_ATOMIC_VECTOR)
}

INDEX created in environment where loop was called

i <- c('John', 'Paul', 'Ringo', 'George')

f <- \() { # A function with no arguments!
  for (i in 1:2) {
    print(1 / i)
  }
} 

f()

[1] 1
[1] 0.5

[1] "John"   "Paul"   "Ringo"  "George"

Even more `for ()` loop details

for (INDEX in SOME_ATOMIC_VECTOR) {
  # Run this chunk of code 
  # (Typically involves the elements of SOME_ATOMIC_VECTOR)
}

for () can iterate over any type of atomic vector
We can call the index variable anything we like

personnel <- c('Ana', 'Frank', 'Inbar', 'Sam')
for (person in personnel) {
  print(paste0('Hi, ', person, '!'))
}

[1] "Hi, Ana!"
[1] "Hi, Frank!"
[1] "Hi, Inbar!"
[1] "Hi, Sam!"

“What happens in a `for ()` loop stays in the `for ()` loop.”

Why doesn’t anything happen?!

for (i in 1:5) {
  i^2
}

Store the results somewhere to access later:

# Create an empty atomic vector to store results
squares <- vector(length = 5) 

for (i in 1:5) {
  # Fill in the element i of the vector squares  
  squares[i] <- i^2
}

squares

[1]  1  4  9 16 25

`while ()` loops

while (CONDITION) {
  # As long as CONDITION remains TRUE:
  # keep repeating this block of code
}

Use while () when you don’t know in advance how many iterations you’ll need.
Use for () when you do know in advance how many iterations you’ll need.

Example: How many coin flips are needed to get our first heads?

set.seed(1234) # We'll discuss this in Week 3

coin_flip <- 'Tails'
n_flips <- 0

while (identical(coin_flip, 'Tails')) {
  coin_flip <- sample(c('Heads', 'Tails'), size = 1)
  n_flips <- n_flips + 1
}

n_flips

[1] 5

We’ll usually avoid explicit loops

Vectorized functions / math operations replace many loops
Functional Programming: cleaner approaches to iteration
Loops “under the hood” but we don’t write them explicitly.
That said: sometimes there’s no way around writing a loop.

Are Loops in R Slow?

A loop itself is rarely the problem: it’s everything else
- Needless work inside the loop
- Lots of function calls
- Making copies of objects
- Nested loops
Writing faster code:
- Preallocate objects you’ll “fill” with a loop
- Think about what can be computed in advance of the loop
- Prefer vectorized functions / matrix operations to loops

Recall from above:

get_value <- function(x) {
  if (identical(x, 'queen')) {
    9
  } else if (identical(x, 'rook')) {
    5
  } else if (identical(x, 'knight')) {
    3 
  } else if (identical(x, 'bishop')) {
    3
  } else if (identical(x, 'pawn')) {
    1
  } else {
    NA
  }
}

get_value2 <- function(x) {
  values <- c(9, 5, 3, 3, 1)
  names(values) <- c('queen', 'rook', 'knight', 'bishop', 'pawn')
  values[x]
}

A case study in writing faster code

Generate a character vector of 1 million chess pieces:

set.seed(1234) # We'll discuss in Week 3
chess_pieces <- sample(c('queen', 'rook', 'knight', 'bishop', 'pawn'),
                       size = 1e6, replace = TRUE)

Consider three methods to assign these pieces numeric values:

A for () loop that repeatedly calls get_value() and doesn’t pre-allocate any memory to store the result.
A for () loop that repeatedly calls get_value(), but does pre-allocated memory to store the result.
Vectorized solution without loops or function calls: get_value2()

A case study in writing faster code

method1 <- function(x) {
  results <- NULL
  for (i in 1:length(x)) {
    results[i] <- get_value(x[i])
  }
  results
}

method2 <- function(x) {
  results <- vector(length = length(x))
  for (i in 1:length(x)) {
    results[i] <- get_value(x[i])
  }
  results
}

Note

Method 3 is simply get_value2() so I don’t need a third function.

Vectorized approach is much faster

system.time(
  results1 <- method1(chess_pieces)
)

   user  system elapsed 
  0.849   0.034   0.884

system.time(
  results2 <- method2(chess_pieces)
)

   user  system elapsed 
  0.701   0.001   0.702

system.time(
  results3 <- get_value2(chess_pieces)
)

   user  system elapsed 
  0.008   0.000   0.009

names(results3) <- NULL
identical(results1, results2) & identical(results2, results3)

[1] TRUE

💪 Exercise H - (8 min)

The Fibonacci Sequence is defined by $F_1 = 1$, $F_2 = 1$ and $F_n = F_{n-1} + F_{n-2}$ for $n > 2$. Write a function that uses a for () loop to compute first n Fibonacci numbers.
Come up with a way to generate the same output as f() without using a loop or if () ... else.

f <- \(x) {
  for (j in 1:length(x)) {
    if (x[j] > 0) {
      x[j] <- x[j]^3 + x[j]
    } else {
      x[j] <- x[j]^2 - x[j]
    } 
  }
  x
}

Part 6 - Matrices and Arrays

Attributes

“Extra information” that can be “attached” to an R object.
- attributes(x) to view the attributes of x
- attributes(x) returns NULL if x has no attributes
- names() are an example of an attribute

x <- 1:5
attributes(x)

NULL

Frank <- c('birth year' = 1983, 'age' = 40, '#siblings' = 1) 
attributes(Frank)

$names
[1] "birth year" "age"        "#siblings"

A Matrix is a Vector with Dimensions

Another important attribute is dimension: dim()
Atomic vectors have no dimensions:

x <- 1:6
dim(x)

NULL

If we give a vector dimensions, it becomes a matrix:

dim(x) <- c(3, 2) # ([number of rows], [number of columns])
x

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

Key Points About Matrices / Arrays

A matrix is just an atomic vector with dimensions!
Matrices inherit key properties of atomic vectors:
- Only store elements of a single type
- Element-wise mathematical operations
- Vectorized functions often work on matrices
- Recycling rules
An array is a higher-dimensional generalization of a matrix.
Matrix / array operations are usually very fast

Whatever happened to class?

Adding / changing dimensions doesn’t change the type

x <- 1:6
typeof(x)

[1] "integer"

dim(x) <- c(3, 2)
typeof(x)

[1] "integer"

But it can change the class

class(1:6)

[1] "integer"

class(x)

[1] "matrix" "array"

⚠️ A Vector is Not a $(1\times n)$ Matrix

x <- 1:6
x

[1] 1 2 3 4 5 6

class(x)

[1] "integer"

dim(x) <- c(1, 6) # ([number of rows], [number of columns])
x

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    4    5    6

class(x)

[1] "matrix" "array"

Warning

Some R functions / operations only work with matrices. A $(n\times 1)$ or $(1 \times n)$ matrix is not equivalent to an atomic vector. Remember: attributes and class.

R fills matrices by column.

One way to create a matrix is by setting dimensions:

x <- 1:9
dim(x) <- c(3, 3) # ([number of rows], [number of columns])
x

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

Another is by using matrix()

matrix(data = 1:9, nrow = 3, ncol = 3)

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

Filling a matrix by row

Set byrow = TRUE in matrix()

matrix(data = 1:9, nrow = 3, ncol = 3)

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

matrix(data = 1:9, nrow = 3, ncol = 3, byrow = TRUE)

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

`ncol()` and `nrow()`

These tell us how many rows / columns a matrix has:

pieces <- c('Queen', 'Rook', 'Knight', 'Bishop', 'Pawn', 'King')
M <- matrix(pieces, 2, 3)
M

     [,1]    [,2]     [,3]  
[1,] "Queen" "Knight" "Pawn"
[2,] "Rook"  "Bishop" "King"

nrow(M)

[1] 2

ncol(M)

[1] 3

This is the same information as dim()

dim(M)

[1] 2 3

Diagonal Matrices

For vector x, diag(x) constructs a diagonal matrix

M <- diag(1:4)
M

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    2    0    0
[3,]    0    0    3    0
[4,]    0    0    0    4

For matrix M, diag(M) extracts the main diagonal

diag(M)

[1] 1 2 3 4

For integer k, diag(nrow = k) is the identity matrix $I_k$

diag(nrow = 4)

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1

Accessing Elements of a Matrix

Same idea as vectors but two dimensions [row, col]

M <- matrix(1:16, 4, 4)
M

     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

M[3, 3]

[1] 11

M[1:2, 3:4]

     [,1] [,2]
[1,]    9   13
[2,]   10   14

M[c(TRUE, FALSE, FALSE, TRUE), c(FALSE, TRUE, TRUE, FALSE)]

     [,1] [,2]
[1,]    5    9
[2,]    8   12

Accessing Elements of a Matrix

Empty means everything from this dimension

     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

M[1, ]

[1]  1  5  9 13

M[, 4]

[1] 13 14 15 16

M[, c(FALSE, TRUE, FALSE, TRUE)]

     [,1] [,2]
[1,]    5   13
[2,]    6   14
[3,]    7   15
[4,]    8   16

`rbind()` and `cbind()`

Create / expand a matrix by binding rows or columns

x <- 1:3 
y <- 4:6
z <- 7:9
M <- cbind(x, y, z)
M

     x y z
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

R <- rbind(x, y, z)
R

  [,1] [,2] [,3]
x    1    2    3
y    4    5    6
z    7    8    9

cbind(M, diag(nrow = 3))

     x y z      
[1,] 1 4 7 1 0 0
[2,] 2 5 8 0 1 0
[3,] 3 6 9 0 0 1

rbind(M, diag(1:3))

     x y z
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
[4,] 1 0 0
[5,] 0 2 0
[6,] 0 0 3

💪 Exercise I - (8 min)

Create a $5\times 5$ matrix called A, each of whose rows contains the elements 1:5. Hint: see ?rep.
Display all elements of A except row 3 and column 2.
Form a matrix B by stacking the $(4\times 4)$ identity matrix on top of itself.
Display the seventh row of B.
Write a function that uses a for() loop to construct the $(n\times n)$ exchange matrix $J_n$.

Matrix Selection Gone Wrong

A failed attempt to produce the $(3\times 3)$ identity matrix:

n <- 3
eye <- matrix(0, n, n)
eye

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0

eye[1:n, 1:n] <- 1 
eye

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

Oops! We wrote over the entire matrix by mistake!

Indexing a Matrix with Another Matrix

Instead: subset using a matrix of indices for the “target” matrix

n <- 3
eye <- matrix(0, 3, 3)

# 1st column of elements specifies the *row indices* of eye
# 2nd column of elements specifies the *col indices* of eye
elements <- cbind(1:n, 1:n) 
elements

     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    3

eye[elements] <- 1
eye

     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

Elementwise Matrix Operations

Because a matrix is a vector with dimensions, +, -, *, and / are elementwise, just as they are for atomic vectors:

M <- matrix(1:4, 2, 2)
M

     [,1] [,2]
[1,]    1    3
[2,]    2    4

R <- matrix(c(0, -1, 1, 2), 
            2, 2)
R

     [,1] [,2]
[1,]    0    1
[2,]   -1    2

M + R

     [,1] [,2]
[1,]    1    4
[2,]    1    6

M - R

     [,1] [,2]
[1,]    1    2
[2,]    3    2

M * R

     [,1] [,2]
[1,]    0    3
[2,]   -2    8

M / R

     [,1] [,2]
[1,]  Inf    3
[2,]   -2    2

Matrix Products

(More on matrix algebra in R in future lectures)

v <- 1:2
A <- diag(v)
A

     [,1] [,2]
[1,]    1    0
[2,]    0    2

B <- matrix(-1, 2, 2)
B

     [,1] [,2]
[1,]   -1   -1
[2,]   -1   -1

Element-wise product: *

A * B

     [,1] [,2]
[1,]   -1    0
[2,]    0   -2

Usual matrix product: %*%

A %*% v # v is "promoted"

     [,1]
[1,]    1
[2,]    4

Outer product: %o%

v %o% v

     [,1] [,2]
[1,]    1    2
[2,]    2    4

Matrices can have names

temps <- matrix(c(5, 2, 2,
                  12, 7, 8), nrow = 2, ncol = 3, byrow = TRUE)
rownames(temps) <- c('Aberdeen', 'Plymouth')
colnames(temps) <- c('Mon', 'Tues', 'Weds')
temps

         Mon Tues Weds
Aberdeen   5    2    2
Plymouth  12    7    8

rownames(temps)

[1] "Aberdeen" "Plymouth"

colnames(temps)

[1] "Mon"  "Tues" "Weds"

Subsetting Matrices by Name

temps

         Mon Tues Weds
Aberdeen   5    2    2
Plymouth  12    7    8

temps['Aberdeen', ]

 Mon Tues Weds 
   5    2    2

temps[, 'Weds']

Aberdeen Plymouth 
       2        8

Warning

There’s something funny about the second example: look closely!

`drop()` deletes “extra” dimensions

temps[, 'Weds']

Aberdeen Plymouth 
       2        8

temps[, 'Weds', drop = FALSE]

         Weds
Aberdeen    2
Plymouth    8

M <- matrix(1:4, 1, 4)
M

     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4

dim(M)

[1] 1 4

class(M)

[1] "matrix" "array"

v <- drop(M)
v

[1] 1 2 3 4

dim(v)

NULL

class(v)

[1] "integer"

💪 Exercise J - (15 min)

Write a function to constructs the $(n\times n)$ exchange matrix $J_n$ without using a loop.
Compute the element-wise product of $J_3$ with itself, and the square of $J_3$, i.e. the ordinary matrix product $J_3 J_3$.
Let $X$ be a Bernoulli$(0.2)$ and $Y$ be a Binomial$(2, 0.5)$ RV. Construct a matrix p_XY that represents the joint pmf of $X$ and $Y$, under the assumption that $X$ and $Y$ are independent. Name the rows and columns.
Consult ?rowSums() and ?colSums(). Then extract the marginal pmfs of $X$ and $Y$ from the matrix p_XY.

Part 7 - Lists and Dataframes

Atomic Vectors & Matrices / Arrays

Advantages
- Mathematically convenient
- Fast operations
- Efficient use of memory
Disadvantages
- Only store elements of one type
- “Shape” must be rectangular

Lists: R’s Most General Data Structure

Reach for a list when you need to store:
- more than one type
- something with non-rectangular “shape”
Lists can store anything:
- atomic vectors
- matrices / arrays
- even other lists!
Important special case of a list: data frame

List Basics

list() creates a list, just like c() creates an atomic vector

my_list <- list(c(TRUE, FALSE, FALSE), pi, diag(nrow = 2))
my_list

[[1]]
[1]  TRUE FALSE FALSE

[[2]]
[1] 3.141593

[[3]]
     [,1] [,2]
[1,]    1    0
[2,]    0    1

str() tells us what’s inside:

str(my_list)

List of 3
 $ : logi [1:3] TRUE FALSE FALSE
 $ : num 3.14
 $ : num [1:2, 1:2] 1 0 0 1

Accessing List Elements

This is complicated because lists are hierarchical
Single brackets []
- Always returns a list
- Otherwise same behavior as atomic vectors
Double brackets [[]]
- Extract an object from the list by position
- Doesn’t, in general, return a list
After extracting an object, index into it “as usual”

Examples of Accessing List Elements

my_list[1:2] # returns a list

[[1]]
[1]  TRUE FALSE FALSE

[[2]]
[1] 3.141593

my_list[3] # returns a list

[[1]]
     [,1] [,2]
[1,]    1    0
[2,]    0    1

my_list[[3]] # returns a matrix

     [,1] [,2]
[1,]    1    0
[2,]    0    1

my_list[[3]][1, 1] # (1, 1) element of the matrix my_list[[3]]

[1] 1

Named Lists

When creating a list, you can name the elements as with c()

core_ERM <- list('lecturer' = 'Frank', n_students = 70,
                 'TAs' = c('Ana', 'Inbar', 'Sam'))

Now we can access objects by name

core_ERM['lecturer'] # returns a list

$lecturer
[1] "Frank"

core_ERM[['lecturer']] # returns a vector

[1] "Frank"

$NAME_HERE is a shortcut for [['NAME_HERE']]

core_ERM$lecturer

[1] "Frank"

Data Frames

A special kind of list that is almost like a matrix.
Each element of the list is an atomic vector
These vectors don’t have to contain the same type
But they do have to be of the same length.
It’s basically a spreadsheet

Creating a Data Frame

A data frame is has type list and class data.frame

students <- data.frame('name' = c('Xerxes', 'Xanthippe', 'Xanadu'),
                       'age' = c(19, 23, 21),
                       'grade' = c(65, 70, 68),
                       'favorite_color' = c('blue', 'red', 'orange'))
students

       name age grade favorite_color
1    Xerxes  19    65           blue
2 Xanthippe  23    70            red
3    Xanadu  21    68         orange

typeof(students)

[1] "list"

class(students)

[1] "data.frame"

Accessing Elements of a Dataframe

We can mix-and-match selection rules for lists and matrices:

students

       name age grade favorite_color
1    Xerxes  19    65           blue
2 Xanthippe  23    70            red
3    Xanadu  21    68         orange

students[1, 2] # The element in row 1 of column 2

[1] 19

students[, 2] # Everything in column 2

[1] 19 23 21

students[, 'favorite_color']

[1] "blue"   "red"    "orange"

students$name

[1] "Xerxes"    "Xanthippe" "Xanadu"

students[students$name == 'Xerxes', ]

    name age grade favorite_color
1 Xerxes  19    65           blue

Why so few slides on data frames?

Data frames are a real pain!
We’ll work with something better: tibbles.
More in our next lecture.

💪 Exercise K - (7 min)

I used students$name == 'Xerxes' above. Why didn’t I instead use identical(students$name, 'Xerxes')?
Use the following code chunk to construct the employees data frame. Then display it.

employees <- data.frame(
  name = c("Alice", "Bob", "Cathy", "David", "Eva", 
           "Frank", "Grace", "Hank", "Ivy", "Jack"),
  age = c(25, 31, 28, 40, 35, 23, 30, 45, 33, 29),
  department = c("HR", "IT", "Finance", "IT", "HR", 
                 "Finance", "IT", "HR", "Finance", "IT"),
  salary = c(50000, 60000, 55000, 70000, 53000, 
             51000, 62000, 71000, 57000, 59000)
)

Display the age column of employees.
Display the sixth row of employees.
Display the employee record for Eva.
Display employee records for everyone in the IT department.
Repeat the preceding, restricted to people with a salary of at least 60,000.

All About That Base (R)

Introduction

What is this?

⚠️ Health Warning

I can’t cover everything

Part 1 - Getting Started with RStudio

From now on

Part 2 - R Basics

A matter of style 😎

Naming Objects

Comments

What does ⬅️ mean?

The Assignment Operator

R as an overgrown calculator

R as an overgrown calculator

Data Types in R

Examples with Types

Logical Operators

Logical Operators: Examples

Relational Operators

Relational Operators: Examples

⚠️ == and != can be Dangerous

NAs and Infs and NaNs

NAs, Infs, and NaNs: Examples

Testing the type of an object

💪 Exercise A - (5 min)

Part 3 - Atomic Vectors

R’s Simplest Data Structure

Atomic Vectors: Some Examples

💪 Exercise B - (1 minute)

Subsetting Atomic Vectors

Subsetting Atomic Vectors: Example

R has no scalars; only 1-vectors

💪 Exercise C - (5 minutes)

Doing math with vectors

Some vectorized functions

Relational / Logical Operators are Vectorized

♻️ Recycling

♻️ Recycling Rules

⚠️ Recycling Can Be Dangerous

💪 Exercise D - (3 min)

Replacing elements of a vector

Naming elements of a vector

Accessing Elements by Name

Set Operations

Set Operations: Examples

Coercion

Examples of Coercion

💪 Exercise E - (5 min)

Part 4 - Functions and Control Flow

📚 References

Functions: What, Why and When?

Example: Computing a z-score

Function Ingredients: z-score

Function Ingredients: z-score

Using the z_score() function

\() is Shorthand for function()

A Function with Multiple Arguments

Matching Args by Name vs Position

💪 Exercise F - (10 min)

dot-dot-dot ...

🔭 Scope

🔭 Scope: Examples

🔭 Scope: Another Example

if () statements

Early Returns

if ()...else adds “default case”

if ()...else if ()...else()

Example: 3 Mutually Exclusive Cases

🌳 An overgrown if () tree

🌳 Lookup Tables vs if () trees

💪 Exercise G - (8 min)

Part 5 - Iteration

for () loops

for () loop details

More for() loop details

Even more for () loop details

“What happens in a for () loop stays in the for () loop.”

while () loops

Example: How many coin flips are needed to get our first heads?

⚠️ `==` and `!=` can be Dangerous

`NA`s and `Inf`s and `NaN`s

`NA`s, `Inf`s, and `NaN`s: Examples

Using the `z_score()` function

`\()` is Shorthand for `function()`

dot-dot-dot `...`

`if ()` statements

`if ()...else` adds “default case”

`if ()...else if ()...else()`

🌳 An overgrown `if ()` tree

🌳 Lookup Tables vs `if ()` trees

`for ()` loops

`for ()` loop details

More `for()` loop details

Even more `for ()` loop details

“What happens in a `for ()` loop stays in the `for ()` loop.”

`while ()` loops

`ncol()` and `nrow()`

`rbind()` and `cbind()`

`drop()` deletes “extra” dimensions