Problem Set - Collatz Conjecture
One of the most famous unsolved problems in mathematics is the Collatz conjecture. The problem sounds simple enough: consider a positive integer \(n\). Then,
- if \(n\) is even: divide it by \(2\)
- if \(n\) is odd: multiply by \(3\) and add \(1\)
The Collatz conjecture holds that repeatedly applying this Collatz function eventually transforms any positive integer \(n\) into 1. No matter the input, the output is always the same!
This has been shown to hold for integers up to \(2.95 \times 10^{20}\), but mathematicians have not found a general proof.
We’ll test this conjecture using Base R.
Exercises
- Define the function
collatz_sequence
that takes an integern
as input and returns the sequence of numbers leading to1
. To organize your thinking, please proceed step-by-step as follows:- To warm up consider \(n=3\). Write an
if
-else
-statement that uses%%
to check whether 3 is even, performs the correct operation, and prints out the next integer in the sequence. Hint: You’ll need to use the R modulo operator%%
. For example,16
divided by3
equals5
with a remainder of1
, so16 %% 3
returns1
. Experiment with%%
to make sure you understand how it works. If necessary, read the R help file “Arithmetic Operators”. - Now instead of printing out the result, store it.
seq_3
should contain the first two integers of the Collatz sequence for 3: store 3 as its first value usingc()
, then usec()
again to append the next element to the sequence. Print outseq_3
to check your work. - Now write a
while
loop that builds on your code from (b). The loop should repeat the Collatz function until the sequence reaches 1. Start with \(n=5\) this time. Store the sequence inseq_5
and print it out to check your work. Hint: You need to initialise your running variable andseq_5
before your while loop and later update both of them inside yourwhile
loop.
- You’ve written some handy code! But now suppose you want to compute the Collatz sequence for another integer. Copying-and-pasting your existing code is tedious and error prone. Instead, you can define a function building on your code from (c).
collatz_sequence
should take a positive integern
, iteratively apply the Collatz function, and return the Collatz sequence. Test your function for \(n=5\). Your result should look like this:
collatz_sequence(5) # Expected output: 5 16 8 4 2 1
- To warm up consider \(n=3\). Write an
- You will now use your function
collatz_sequence
to check if the Collatz conjecture holds for all integers in the range \(\{a, a+1, ..., b-1, b\}\). You will define the functioncollatz_holds
that takes integersa
andb
as inputs and returns a vector containingTRUE
orFALSE
for each integer, depending on whether the Collatz conjecture holds for all integers in this range. To organize your thinking, please proceed step-by-step as follows:- Use a
for
loop to iterate over all integers between \(a=3\) and \(b=10\). Print out the Collatz sequences for these integers usingcollatz_sequence
. - Now write a function that can do the same for arbitrary \(a\) and \(b\). Test your function for \(a=3\) and \(b = 10\). Hint: look up
tail()
! - Modify your function from the preceding step so that it prints out only the last element of each sequence rather than the whole sequence. Test your function for \(a=3\) and \(b = 10\). Hint: look up
tail()
in the R help files. - Modify your function from the preceding step as follows. Rather than printing out the last element of each sequence, use an
if
-else
-statement to check whether this element equals one and store the appropriateTRUE
/FALSE
value in a vector. Then return this vector. Hint: Notice that you need to initialize an empty vector to store yourTRUE
/FALSE
values within the function and append your results to it. This will look similar to what you did in question 1 to store the Collatz sequence. Test your function for \(a=3, b = 10\). Your output should look something like this:
collatz_holds(3, 10) # Expected output: TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
- Use a
- You saw in 2(a) that the number of steps it takes for the Collatz sequence to reach 1 varies a lot. Write a function called
longest_collatz
that returns two numbers for any starting integer \(n \in \{a, ..., b\}\): the integer with the longest Collatz sequence, and the length of that sequence. Your function’s structure will be similar tocollatz_holds
. In yourfor
loop, keep track of thelength()
of the sequence instead of the last integer. You can then use the Base Rmax()
andwhich.max()
to identify the integer with the longest Collatz sequence. Check your function for \(a=6\), \(b = 100\). Hint: When identifying the integer with the longest Collatz sequence, don’t forget that you are iterating starting froma
, not from 1. You should also try using a named vector for your function output to make it more human-readable. Aim to replicate the following output:
longest_collatz(6, 100)
# Expected output:
# n steps
# 97 119
- Write a function
get_collatz_steps()
that takes one argumentn_max
and returns a tibble with two columns:n
andsteps
. In a given row of the tibble,n
is the starting value for the Collatz sequence whilesteps
is the number of steps needed to reach one. The tibble should containn_max
rows, withn
ranging from1:n_max
. Based on our results from above, we know that ifn_max
is at least97
then the 97th row of this tibble will haven
equal to97
andsteps
equal to119
. Use your function to create a tibble withn_max = 500
. Then useggplot2
to create a scatter plot withn
on the x-axis andsteps
on the y-axis to visualize your results.