Problem Set - Collatz Conjecture
One of the most famous unsolved problems in mathematics is the Collatz conjecture. The problem sounds simple enough: consider a positive integer \(n\). Then,
- if \(n\) is even: divide it by \(2\)
- if \(n\) is odd: multiply by \(3\) and add \(1\)
The Collatz conjecture holds that repeatedly applying this Collatz function eventually transforms any positive integer \(n\) into 1. No matter the input, the output is always the same!
This has been shown to hold for integers up to \(2.95 \times 10^{20}\), but mathematicians have not found a general proof.
We’ll test this conjecture using Base R.
Exercises
- Define the function
collatz_sequencethat takes an integernas input and returns the sequence of numbers leading to1. To organize your thinking, please proceed step-by-step as follows:- To warm up consider \(n=3\). Write an
if-else-statement that uses%%to check whether 3 is even, performs the correct operation, and prints out the next integer in the sequence. Hint: You’ll need to use the R modulo operator%%. For example,16divided by3equals5with a remainder of1, so16 %% 3returns1. Experiment with%%to make sure you understand how it works. If necessary, read the R help file “Arithmetic Operators”. - Now instead of printing out the result, store it.
seq_3should contain the first two integers of the Collatz sequence for 3: store 3 as its first value usingc(), then usec()again to append the next element to the sequence. Print outseq_3to check your work. - Now write a
whileloop that builds on your code from (b). The loop should repeat the Collatz function until the sequence reaches 1. Start with \(n=5\) this time. Store the sequence inseq_5and print it out to check your work. Hint: You need to initialise your running variable andseq_5before your while loop and later update both of them inside yourwhileloop.
- You’ve written some handy code! But now suppose you want to compute the Collatz sequence for another integer. Copying-and-pasting your existing code is tedious and error prone. Instead, you can define a function building on your code from (c).
collatz_sequenceshould take a positive integern, iteratively apply the Collatz function, and return the Collatz sequence. Test your function for \(n=5\). Your result should look like this:
collatz_sequence(5) # Expected output: 5 16 8 4 2 1 - To warm up consider \(n=3\). Write an
- You will now use your function
collatz_sequenceto check if the Collatz conjecture holds for all integers in the range \(\{a, a+1, ..., b-1, b\}\). You will define the functioncollatz_holdsthat takes integersaandbas inputs and returns a vector containingTRUEorFALSEfor each integer, depending on whether the Collatz conjecture holds for all integers in this range. To organize your thinking, please proceed step-by-step as follows:- Use a
forloop to iterate over all integers between \(a=3\) and \(b=10\). Print out the Collatz sequences for these integers usingcollatz_sequence. - Now write a function that can do the same for arbitrary \(a\) and \(b\). Test your function for \(a=3\) and \(b = 10\). Hint: look up
tail()! - Modify your function from the preceding step so that it prints out only the last element of each sequence rather than the whole sequence. Test your function for \(a=3\) and \(b = 10\). Hint: look up
tail()in the R help files. - Modify your function from the preceding step as follows. Rather than printing out the last element of each sequence, use an
if-else-statement to check whether this element equals one and store the appropriateTRUE/FALSEvalue in a vector. Then return this vector. Hint: Notice that you need to initialize an empty vector to store yourTRUE/FALSEvalues within the function and append your results to it. This will look similar to what you did in question 1 to store the Collatz sequence. Test your function for \(a=3, b = 10\). Your output should look something like this:
collatz_holds(3, 10) # Expected output: TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE - Use a
- You saw in 2(a) that the number of steps it takes for the Collatz sequence to reach 1 varies a lot. Write a function called
longest_collatzthat returns two numbers for any starting integer \(n \in \{a, ..., b\}\): the integer with the longest Collatz sequence, and the length of that sequence. Your function’s structure will be similar tocollatz_holds. In yourforloop, keep track of thelength()of the sequence instead of the last integer. You can then use the Base Rmax()andwhich.max()to identify the integer with the longest Collatz sequence. Check your function for \(a=6\), \(b = 100\). Hint: When identifying the integer with the longest Collatz sequence, don’t forget that you are iterating starting froma, not from 1. You should also try using a named vector for your function output to make it more human-readable. Aim to replicate the following output:
longest_collatz(6, 100)
# Expected output:
# n steps
# 97 119- Write a function
get_collatz_steps()that takes one argumentn_maxand returns a tibble with two columns:nandsteps. In a given row of the tibble,nis the starting value for the Collatz sequence whilestepsis the number of steps needed to reach one. The tibble should containn_maxrows, withnranging from1:n_max. Based on our results from above, we know that ifn_maxis at least97then the 97th row of this tibble will havenequal to97andstepsequal to119. Use your function to create a tibble withn_max = 500. Then useggplot2to create a scatter plot withnon the x-axis andstepson the y-axis to visualize your results.