# Running a Simulation Study - Solutions

# Exercise A - (3 min)

Recall this code snippet from our earlier lecture on simulation:

```
dice_sum <- \() {
  # Roll a pair of fair, six-sided dice and return their sum
  die1 <- sample(1:6, 1)
  die2 <- sample(1:6, 1)
  die1 + die2
}
sims <- map_dbl(1:10000, \(i) dice_sum())
```

- Use your new-found knowledge of `purrr` to explain line 7.
- Write a `for` loop to replace line 7.

## Solution

The anonymous function `\(i) dice_sum()` has one argument: `i`. But this argument isn’t used in any way! Regardless of the value of `i`, we simply call `dice_sum()`. This is just a sneaky way of getting `map_dbl()` to *repeatedly* call `dice_sum()` a total of `10000` times. It is equivalent to the following `for()` loop:

```
nreps <- 10000
sims <- rep(NA_real_, nreps)
for(i in seq_along(sims)) {
  sims[i] <- dice_sum()
}
```

# Exercise B - (35 min)

- Write a function called `get_avg_after_streak()`. Given a vector `shots`, it should construct `after_streak` with \(k = 3\) and return `mean(shots[after_streak])`. Test your function using `c(0, 1, 1, 1, 1, 0, 0, 0)`.
- Write a function called `draw_shots()` that simulates a sequence of `n_shots` iid Bernoulli trials, each with probability of success `prob_success`.
- Amos knows that Liz makes 1/2 of her shots on average. He watches her take 100 shots and then computes \(T =\) `get_avg_after_streak()`, where a streak is defined by \(k = 3\). Amos argues: “If Liz does **not have a hot hand**, then each shot is independent of the others with probability of success 1/2. This means the expected value of \(T\) will be 1/2.” Carry out a simulation with 10,000 replications to check Amos’ argument.
- Repeat the preceding over a parameter grid with `n_shots` \(\in \{50, 100, 200\}\) and `prob_success` \(\in \{0.4, 0.5, 0.6\}\). What do you conclude from your simulation?

## Solution

### Part 1

```
get_avg_after_streak <- function(shots) {
  # shots should be a vector of 0 and 1; if not STOP!
  stopifnot(all(shots %in% c(0, 1)))
  n <- length(shots)
  after_streak <- rep(NA, n) # Empty vector of length n

  # The first 3 elements of shots by definition cannot
  # follow a streak
  after_streak[1:3] <- FALSE

  # Loop over the remaining elements of shots
  for(i in 4:n) {
    # Extract the 3 shots that precede shot i
    prev_three_shots <- shots[(i - 3):(i - 1)]
    # Are all three of the preceding shots equal to 1?
    # (TRUE/FALSE)
    after_streak[i] <- all(prev_three_shots == 1)
  }
  # shots[after_streak] extracts all elements of shots
  # for which after_streak is TRUE. Taking the mean of
  # these is the same as calculating the prop. of ones
  mean(shots[after_streak])
}
get_avg_after_streak(c(0, 1, 1, 1, 1, 0, 0, 0))
```

`[1] 0.5`
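As an optional cross-check (a sketch we are adding, not part of the original solution), the same quantity can be computed without an explicit loop by using `stats::filter()` to take a rolling sum of the \(k\) most recent shots; the helper name `get_avg_after_streak_vec` is ours:

```
# Vectorized cross-check: shot i follows a streak iff the k shots
# ending at position i - 1 sum to k. Assumes length(shots) > k.
get_avg_after_streak_vec <- function(shots, k = 3) {
  stopifnot(all(shots %in% c(0, 1)))
  roll <- stats::filter(shots, rep(1, k), sides = 1) # rolling sum of k shots
  after_streak <- c(rep(FALSE, k), roll[k:(length(shots) - 1)] == k)
  mean(shots[after_streak])
}
get_avg_after_streak_vec(c(0, 1, 1, 1, 1, 0, 0, 0)) # 0.5, matching the loop
```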

### Part 2

```
draw_shots <- function(n_shots, prob_success) {
  rbinom(n_shots, 1, prob_success)
}
set.seed(420508570)
mean(draw_shots(1e4, 0.5))
```

`[1] 0.4993`
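For intuition, `rbinom(n_shots, 1, prob_success)` draws the same distribution as repeatedly flipping a weighted coin. A minimal alternative sketch using `sample()` (the name `draw_shots_alt` is ours, not part of the original solution):

```
# Equivalent formulation: each draw equals 1 with probability prob_success
draw_shots_alt <- function(n_shots, prob_success) {
  sample(c(0, 1), size = n_shots, replace = TRUE,
         prob = c(1 - prob_success, prob_success))
}
mean(draw_shots_alt(1e4, 0.5)) # should likewise be close to 0.5
```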

### Part 3

The average value of \(T\) appears to be noticeably *less* than 0.5:

```
library(tidyverse)
nreps <- 1e4
sim_datasets <- map(1:nreps, \(i) draw_shots(100, 0.5))
sim_estimates <- map_dbl(sim_datasets, get_avg_after_streak)
mean_T <- mean(sim_estimates)
mean_T
```

`[1] 0.4647162`

To check whether this discrepancy is statistically meaningful, we can construct a worst-case 99.7% confidence interval for \(\mathbb{E}(T)\) as follows. The standard error of a sample proportion is \(\sqrt{p(1 - p)/n}\), where \(p\) is the true proportion. But the whole problem is that we don’t know \(p\): that’s why we want a standard error in the first place! A simple calculation shows, however, that the standard error is *maximized* if \(p = 0.5\). Hence, \(1/(2\sqrt{\texttt{nreps}})\) is the *worst-case* standard error. Three times this quantity is the worst-case margin of error for a 99.7% confidence interval based on the central limit theorem.
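Spelling out the maximization step: since \(\frac{d}{dp}\, p(1 - p) = 1 - 2p\) vanishes at \(p = 1/2\),

\[
\sqrt{\frac{p(1 - p)}{n}} \;\le\; \sqrt{\frac{1/4}{n}} \;=\; \frac{1}{2\sqrt{n}}.
\]

Applying this margin of error with \(n = \texttt{nreps}\):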

```
mean_T + c(-1, 1) * 3 / (2 * sqrt(nreps))
```

`[1] 0.4497162 0.4797162`

This shows that we can rule out \(\mathbb{E}(T) = 0.5\) with *extremely high confidence* based on the number of simulation replications that we used. Amos is **wrong**. In this simulation the shots are iid coin flips, and yet we found evidence of a **cold hand**. In fact, this approach is biased *against* finding evidence of a hot hand even if one exists.
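As a further sanity check (not part of the original solution), we could also estimate the simulation standard error directly from the replications rather than using the worst-case bound:

```
# Direct estimate of the Monte Carlo standard error of mean_T
se_hat <- sd(sim_estimates) / sqrt(nreps)
mean_T + c(-1, 1) * 3 * se_hat
```

Since this uses the observed spread of the simulated \(T\) values instead of the worst case, the resulting interval is narrower and should lead to the same conclusion.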

### Part 4

The bias appears to shrink as the length of the sequence or the probability of success increases:

```
sim_params <- expand_grid(n_shots = c(50, 100, 200),
                          prob_success = c(0.4, 0.5, 0.6))
run_sim <- \(n_shots, prob_success, nreps = 1e4) {
  map(1:nreps, \(i) draw_shots(n_shots, prob_success)) |>
    map_dbl(get_avg_after_streak)
}
sim_results <- pmap(sim_params, run_sim)

# In very short sequences there may be no streaks, leading to an NaN.
# Drop these, so we effectively condition on there being at least one
# streak in the sequence.
summary_stats <- map_dbl(sim_results, mean, na.rm = TRUE)
summary_stats <- sim_params |>
  bind_cols(mean_T = summary_stats) |>
  mutate(bias = mean_T - prob_success)
summary_stats |>
  knitr::kable(digits = 2)
```

| n_shots | prob_success | mean_T | bias |
|---|---|---|---|
| 50 | 0.4 | 0.29 | -0.11 |
| 50 | 0.5 | 0.43 | -0.07 |
| 50 | 0.6 | 0.55 | -0.05 |
| 100 | 0.4 | 0.33 | -0.07 |
| 100 | 0.5 | 0.46 | -0.04 |
| 100 | 0.6 | 0.58 | -0.02 |
| 200 | 0.4 | 0.37 | -0.03 |
| 200 | 0.5 | 0.48 | -0.02 |
| 200 | 0.6 | 0.59 | -0.01 |
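One way to see this pattern at a glance is to plot the bias (a visualization sketch we are adding, not part of the original solution; it assumes `summary_stats` from the chunk above and the already-loaded `tidyverse`):

```
# Bias of T against sequence length, one line per success probability
summary_stats |>
  ggplot(aes(x = n_shots, y = bias, color = factor(prob_success))) +
  geom_point() +
  geom_line() +
  labs(x = "Number of shots", y = "Bias of T", color = "P(success)")
```

The bias is negative throughout but moves toward zero as `n_shots` or `prob_success` grows, consistent with the table.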