1. Random Number Generators #

1.1 Uniform distribution on the unit interval #

An LCRNG produces integers $x_i$ in the set $\{0, \dots, m-1\}$ . In applications it is more useful to have a uniform distribution on $[0, 1]$ . To obtain reals $\omega_i$ in $[0, 1]$ we can scale in a number of different ways, such as:

$\omega_i = x_i / (m - 1)$
$\omega_i = x_i / m$
$\omega_i = (x_i + \frac 1 2) / m$

Discuss advantages and disadvantages of the different scalings.

1.2 Periods of poorly chosen iterations #

Compute the periods of the following LCRNGs. Coding is not needed. Reminder: the period is the shortest cycle over all possible seeds $x_0$ .

$(7x + 3) \bmod{10}$
$5x \bmod{13}$, and also of $7x \bmod{13}$
$(3x + 4) \bmod{60}$ . Note that the period can be small, even for large $m$ .

1.3 Getting started with the LCRNG implementation #

Use the C project from the set-up. The goal of this exercise is to write functionality that computes pseudo-random numbers efficiently for the example LCRNGs. Make a new file, uniform.c, for this and the following exercises.

Coding tips #
Starting with your first coding exercise, here are some tips for good code:
Don’t Repeat Yourself (DRY). Avoid code duplication at all costs.
Keep the main() clean. main() is the program’s entry point. Code in the main cannot easily be reused within your program.
Use comments. Using // (single line) and /* ... */ (multiline). Describe what is not obvious from reading the code, but do not repeat code in words.

Design a struct named LCRNG, and let the struct members be the parameters of the LCRNG. Initialize a struct for Sample bad, as a global (outside the main()). To compute powers, e.g. $2^{31}$, don’t use pow(int a, int b), this works with floating point numbers under the hood, and is slow. Instead, use the bitshift operator (hint: what number is 1 << p?).
Write code for a function unsigned int next(struct LCRNG * rng, unsigned int x) to implement the next random number $x_{i+1}$ from $x_i$. The modulo operator is %.
Write a function double scale(struct LCRNG * rng, unsigned int x) that converts an $x$ to $\omega \in [0, 1]$ .
As users of the code, we would like a frontend to generate uniform numbers $\omega$. However, our current code requires keeping track of $x_i$, and scaling. Instead, we would like to make the process of generating $\omega$ easier by writing an abstraction that allows the following:
```
int main() {
    struct LCRNG randu = {...};  // choose a generator
    unsigned int seed = ...;
    struct UniformDistribution distr = {...}; // make a distribution

    // draw from the distribution without handling x, next() or scale()
    for (int i = 0; i < 1000; i++)
        printf("%f\n", draw(&distr));

    return 0;
}
```
Note that the code above, from the user-perspective, is much more intuitive: the user needs only to initialize a UniformDistribution distr and use draw(&distr) to proceed. To make this possible, design the struct UniformDistribution, and the function draw(...).
A compiler has many options to translate C code into CPU instructions, and some optimization flags can improve the runtime of the program considerably. See the GCC manual. To see this, write code that draws $n$ numbers and prints $\omega_n$. Make $n$ progressively larger until the program takes at least a couple of seconds to complete. Then run the compiler with optimization flags -O0 (no optimization) and -O3 (aggressive optimization), for a few choices of $n$, and time the outcome. For example:
```
gcc uniform.c -o uniform -O3
time ./uniform
```
Report the findings. Tip: Don’t use printf other than to print the last number, it’s slow.

Writing tips If you haven’t already, this is probably a good moment to start adding results to a draft of the report. Don’t focus too much on a storyline yet.

1.4 Schrage’s trick and the Park-Miller generator #

LCRNGs may produce large numbers through the multiplication $a\cdot m$. If $a$ and $m$ are stored in an unsigned int, and the result becomes larger than what fits in 32 bits, an overflow occurs. In this situation the overflowing bits are truncated, and the outcome of the multiplication becomes what is left in the 32 bits. Schrage’s trick is a procedure to compute the result of $ax \bmod m$ while avoiding an explicit computation of $a \cdot x$. Note that these are LCRNGs with $c = 0$: Schrage’s trick thus only applies to multiplicative generators.

Schrage’s trick #
Factorise $m$ as $m = aq + r$, using $q := m\, \text{div}\,{a}$ and $r := m \bmod{a}$ . Here $\text{div}$ is defined as the quotient. Subsequently, define \[b := a (x\bmod{q}) - r (x \,\text{div}\, q).\] Schrage proved that \[\begin{cases} ax \bmod{m} = b & \text{if } b \geq 0 \\ ax \bmod{m} = b + m & \text{otherwise.} \end{cases}\]
And, if \[r < q,\] none of the quantities to compute $b$ exceeds $m$.

Describe, for each generator, if overflow may occur. Which generators allow the application of Schrage’s trick?

Coding tips #
Defensive programming. In C/C++ a program should not continue running in undefined situations (unlike Python’s Easier to Ask for Forgiveness than Permission). Think of checks that you can add to code to reassure this. if statements are for ordinary checks that can be recovered from, assert is a check for something that should never occur, but checks your implementation. Use assert as much as you like: the compiler can be told to ignore them, and they have no effect on the performance.
“Premature optimization is the root of all evil”. Donald Knuth’s famous statement. Forget about early, small efficiency gains. Focus on writing well-organized code first, and identify optimization opportunities later.

Extend the code from previous exercise to use Schrage’s trick. Now next() must work for generators that would potentially cause integer overflow. $\text{div}$ can be implemented by dividing two ints. Test with the Park-Miller generator.

An alternative implementation of next(...), that prevents overflow as well, is to compute the LCRNGs using the double type, rather than with unsigned integers. While % is the modulo operation for unsigned integers, fmod is the implementation for floating-point types.

    #include <math.h>  // needs compilation with `-lm`
    ...

    // convert a, x, c, and  m to `double` first
    fmod(a * x + c, m)

Implement a second next(...) which uses this alternative approach. Make a compile option that enables either the original next(...), with Schrage’s trick, or the alternative. For this use C preprocessing directives. For example #define ALTERNATIVE_NEXT, and with #ifdef ALTERNATIVE_NEXT, #else and #endif. Compile twice, once with the definition enabled, once disabled: the numbers that are generated with both methods should be the same.
Compare Schrage’s trick and the previous approach, which is more efficient? Design and run a few experiments to validate your hypothesis, and explain the outcome.

1.5 The efficient Quick generator #

In last exercise it was shown that integer overflow had to be avoided with $a\cdot x$. In this exercise, overflow is used to our advantage by using truncation as an implicit modulo operation. This avoids using the slow, “expensive”, modulo operation %.

Describe how, when computing with 32 bits integers and $m=2^{32}$, the unsigned int type leads to an implicit modulo operation. Does this also work for the type of int (signed integer)?
Extend the code to draw from LCRNGs with $m=2^{32}$, using overflow for efficiency. This is challenging, because it is of course not possible to store $2^{32}$ as a number $m$ in the LCRNG structure. You’ll need to find a way around that, and you are likely to implement special cases in next(...) and scale(...) to draw from $m=2^{32}$-generators. To test if your implementation works use the Quick generator.

A similar trick as the overflow trick can also be used for generators that have $m$ as a power of $2$. For example, when $m=2^{30}$, the modulo operator % essentially truncates (zeroes out) the left two bits. In this case, a faster method than % is a single operation of the bitwise AND & and a specially crafted binary number.

Modify your code to support such a fast-modulo computation for generators with $m$ a power of two. Implement the RANDU generator in your code.

1.6 Header files and testing #

In this exercise we improve the LCRNG project by turning uniform.c into a source-header pair. We will then write two programs to access the header: a tests.c, and a efficiency.c.

First create a header file uniform.h that exposes the struct LCRNG, struct UniformDistribution and draw() functionality. This keeps next() and scale() internal. If there is a main() in uniform.c, this can probably be removed.

Coding tips #
Testing In the following exercise, we write software to test software. This is called testing. Sometimes, it’s possible to write the tests firsts, and then develop the software to satisfy the tests afterwards. That is called test-driven development. The test coverage is the percentage of code, or possible execution pathways, that are covered by the tests. High coverage, however, is not always realistic with numerical code.

To test if the written software works as expected we’ll create a few simple software tests: small pieces of code that test the output of a program. Create a new file, tests.c. Use #include "uniform.h" in tests.c. Compile with:
```
cc tests.c uniform.c -o tests
```
Design and implement a few tests, such as void test_randu(), a void test_parkmiller() and void test_quick() that check if the implementations of the LCRNGs behave correctly (e.g., produce the correct numbers). You will have to think about appropriate tests yourself.
Now write a second program, efficiency.c, to evaluate the efficiency of several RNGs. Generate timings for a sufficiently large amount of numbers. What is the fastest/slowest LCRNG? Does it help to compile with optimization flags?

1.7 Statistical tests and TestU01 #

One flaw of multiplicative LCRNGs is that random numbers fall mainly in hyperplanes (Marsaglia, 1968). To test this, draw a series of numbers and plot $(\omega_{3i}, \omega_{3i+1}, \omega_{3i+2})$ for $i=0, 3, 6, \dots$ as points in a 3D volume. If the points exhibit a pattern, the sequence is not random. Do this for RANDU, Park-Miller and Quick (in MatLab, Python, or your favorite language). Which of the LCRNGs puts points in 3D hyperplanes?

Download and install TestU01. This is a C library to test random number generators. Windows users with the WSL should install this into their (Ubuntu) Linux. Try to install from source, which is a bit more complicated than installing an already-compiled library with sudo apt-get install ..., but very instructive to understand compilation. Your installation procedure should be a bit like this:

# get the file (or just open a browser and find the download link)
cd ~/Downloads
wget http://simul.iro.umontreal.ca/testu01/TestU01.zip
unzip TestU01.zip
cd TestU01-1.2.3/
# There is a README file 

# prepare compilation script, I choose to install in ~/local/TestU01-1.2.3
./configure --prefix=/home/<yourusernamehere>/local/TestU01-1.2.3
# MacOS: ./configure --prefix=/Users/<yourusernamehere>/local/TestU01-1.2.3
# compile, and then install (copy the build to directory above)
make install 

# check if `include` and `lib` directory are there
ls ~/local/TestU01-1.2.3/include
ls ~/local/TestU01-1.2.3/lib

The include/ directory should contain headers (bbattery.h, unif01.h), and the lib/ directory should contain a libtestu01.so. To see if you can use the library, you could try compiling a snippet like this:

#include "bbattery.h"
#include "unif01.h"
int main() { return 0; }

and the following shouldn’t return any error.

# MacOS: replace /home/ by /Users/
gcc snippet.c -o snippet \
    -ltestu01 \
    -I/home/<yourusernamehere>/local/TestU01-1.2.3/include/ \
    -L/home/<yourusernamehere>/local/TestU01-1.2.3/lib/ \
    -Wl,-rpath \
    -Wl,/home/<yourusernamehere>/local/TestU01-1.2.3/lib/

Note: -Wl,-rpath builds in a hardcoded path to the shared library, so that during the run (./snippet) the library is found.

Write a new C file (executable) to test the RANDU, Park-miller and Quick, on the bbattery_SmallCrush. For this consult the documentation, page 15/16 on external generators. The library asks you to pass a pointer to a function, but the function function cannot take any arguments, so you cannot give in draw. This might require a bit of research of your own: you’ll need to understand function pointers and add global variables. After you have implemented and ran tests, report the outcome of the tests. It is not necessary to explain the outcome of (individual) test results.

1.8 “A bit of a problem” #

The UNIX LCRNG, traditionally the default of the BSD Unix operating system, is known to have bad randomization in the rightmost bits. Even though it has a large period, the generator is not very good, since successive numbers are correlated.

Install GSL (GNU Scientific Library). Either compile yourself (as in the previous exercise) or download the pre-compiled library with the system’s package manager. MacOS users can either compile from source (similar as previous exercise) or use brew.sh.
Consult the documentation on how to use the generator gsl_rng_rand. Write and compile a small program, using the GSL library, that prints numbers from this generator in binary. Can you spot the problem?
Draw numbers with the current C implementation of rand(). Is the problem still present?
The SUN generator has the same parameters as the UNIX generator, but fixes the problem in the following way:
```
unsigned int next(struct LCRNG *rng, unsigned int x) {
    return rng->a * x + rng->c;
}

double scale(struct LCRNG *rng, unsigned int x) {
    // 0x7fff is hexadecimal for ... 0111 1111 1111 1111
    double omega = (x >> 16) & 0x7fff; 
    return omega / 0x7fff;
}
```
How does this generator work, and does it solve the problem? Would you still consider this to be a LCRNG? In your own project, extend the struct LCRNG and/or next(...) and draw(...) to support the SUN random number generator.
For completeness in your report, you may also want to time, test (with TestU01), and make e.g. a histogram or 3D scatter plot for the generator.

Writing tips To finish up the code:
make sure that the code is nicely formatted;
no debugging printf statements/tests/old functions;
use comments where helpful, but not superfluous;
good division between source and header files;
don’t leave old binary/object files in your project.
In your report:
put (special) compilation instructions in an Appendix;
make sure to mention any creative work clearly in the introduction/discussion.