2. Distributions

2. Distributions #

CLion users: Start a new project for this mini-report. Preferably choose the C++ ‘20 standard. In Settings -> Build, Execution, Deploy make sure to have a working C++ compiler, CMake installation, and debugger.

Everybody:

  1. Add directories app/, include/, src/.
  2. Create app/rng.cpp with a main() function. CLion users may remove the default main.cpp.
  3. Create a source-header pair: src/rng.cpp and include/rng.hpp.
  4. Make/modify the CMakeLists.txt:
    CMakeLists.txt
    cmake_minimum_required(...)
    project(project_two LANGUAGES CXX)
    set(CMAKE_CXX_STANDARD 20) 
    
    # Library target called `distributions`,
    add_library(distributions
    STATIC
    src/rng.cpp
    # src/foo.cpp # add like this when necessary
    # src/bar.cpp 
    # src/baz.cpp
    )
    # and, tell the linker that the headers of `distributions` are in `include`
    target_include_directories(distributions PUBLIC include)
    
    # Executable target called `rng`
    add_executable(rng app/rng.cpp)
    # and, tell the linker to add `distributions` when compiling `rng`
    target_link_libraries(rng PUBLIC distributions)
    
    # add_executable(foo app/foo.cpp) # add another executable like this
    # target_link_libraries(foo PUBLIC distributions)
    

From here on, add extra targets (usually additional executables) whenever you add files to the app/ directory. When you add new files to src/, add those to the distributions library target.

Coding tips #

  • Keep filenames lowercase. Whenever you create new files, opt for lowercase naming: this maximizes portability between different file systems.

2.1 RNGs with a common interface #

For the inversion method, we need to be able to draw uniform numbers. We will therefore implement a LinearCongruentRng and a subclass, as well as a XorShift generator. These generators have in common that they can all produce a next() number, and have a maximum() number. The abstract class, or interface below enforces this. By having a common interface for all generators, we allow easily swapping of one generator for another in other parts of the code.

#pragma once

class Rng {
public:
    /* Next random number. */
    virtual unsigned next() = 0;
    /* Maximum M of the range [0, M) of the generator. */
    virtual unsigned maximum() const = 0;
};

Coding tips #

  • Conventions It is common convention to postfix class members (functions and variables) with an underscore whenever they are protected or private, for example: int foo_;. This is just notational convention. You might also encounter Hungarian notation, in which variables are prefixed with types, such as m_iFoo, meaning “class member, integer foo”. With modern IDEs the need for such a discriptive style seems diminishing.
  1. LinearCongruentRng implementation Add the Rng class above to rng.hpp. Similar to the struct LCRNG we’ve used previously, write a class LinearCongruentRng : public Rng { ... }; to satisfy the interface. It must have:

    • a constructor, taking input parameters $a$, $c$, $m$ and a seed, and initialize the members of the class;
    • four members that are protected, and store $a$, $c$, $m$ and the current value of $x$ (which of these members can be const?);
    • a method unsigned next() override, that always returns the next $x$;
    • a method unsigned maximum() const override, returning the number to scale with.

    Reminder: class definitions go into include/rng.hpp, and member implementations go into src/rng.cpp. You may use app/rng.cpp to initialize an object of the type LinearCongruentRng, and generate a few numbers from it.

  2. LCRNG subclass implementation Choose your favorite LCRNG from the LCRNG project and re-implement this as a subclass of LinearCongruentRng by fixing a few of the parameters as constants in the class. E.g., class Quick : public LinearCongruentRng. To do so, call the constructor of the parent class from the subclass’ initializer list. If necessary, override the next() and or maximum() functions a second time.

  3. XorShift implementation The 13,17,5-XorShift is a random number generator with a maximum of $2^{32}-1$, and a next function

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
    

    Write a generalized $l$,$m$,$n$-XorShift, that satisfies the interface Rng, with three shifts. Keep $l=13$, $m=17$, $n=5$ as default arguments in the constructor.

2.2 Uniform distributions #

Now that we have a few random number generators, we can proceed with writing classes for $\Omega$, the uniform distribution. Next to that, we will write separate classes that allow sampling (drawing) from $\Omega$. For the latter we will use the base class Rng.

  1. We would like to write two new classes, one for a uniform distribution on $[a, b]$, and one for $[0, 1]$. The classes should represent the “mathematical definition” of $\Omega$, but not the functionality for drawing. What members should the classes contain? Which one, do you think, should be the parent class, and which one the child?

Coding tips #

  • Capitalization conventions In your code, choose a style for naming variables, functions, and types. For instance, you might want to name identifiers as int snake_case, and types using PascalCase. There is no formal convention, the only rule is that you need to be consistent.
  1. Add a distribution.cpp and distribution.hpp to your project. Write an abstract class class Distribution for the parent. Add pure virtual function(s) that you think all possible distributions must be able to implement.

  2. Write the class definitions for class UniformDistribution and class UnitUniformDistribution. Make sure to implement Distribution, and use inheritance. Have the child class reuse the parent class as much as possible.

  • Pass-by-reference vs. pass-by-value In C++, like in C, calling constructors and functions incurs an overhead when copies need to be made. In C, the way to avoid this was to pass a pointer. The pointer was then copied (by value), which made the function call inexpensive. In C++, a copy can instead be avoided by passing a reference. It is up to the compiler to use a pointer, or do something more intelligent.
  • Pass scalars by value It’s generally not faster to pass a scalar by reference. Allowing your compiler to copy the scalar may even improve the performance in some situations.

We proceed by writing a class UniformSampler, to draw from a UniformDistribution, using any Rng. Thanks to the pure virtual functions in Rng, any concrete subclass of Rng is guaranteed to have a next() and a maximum() implementation.

  1. Add a sampler.cpp and sampler.hpp to your project. Write a class UniformSampler with constructor UniformSampler(UniformDistribution & U, Rng & rng). Note this uses the pass-by-reference principle. Store the references as protected members in the class. Then define a double draw() inside the class that generates uniform random numbers on $[a, b]$. Make an executable C++ file in the app/ folder to test whether your implementation works well.

2.3 Correctness of the inversion method #

The inversion method applies when we have an analytic expression of the inverse of a cdf $F$. To prove that the inversion method works, takes a bit of care. Note that, due to finite precision on computers, numbers from from $X$ are discrete, and that $F$ is therefore not a continuous function. An exact inverse of $F$, i.e., $F^{-1}$, in the formal sense does not exist. This motivates the notion of a generalized inverse.

Definition 2.1.1 Let $F$ be a cdf. Define the generalized inverse, $F^\dagger$ to be

$$
  F^\dagger(\omega) = \inf\left\{x \in \mathbb{R} \, | \,  F(x) \geq \omega \right\} \quad (0 < \omega < 1). \tag{2.1}

$$
Lemma 2.1.2

Lemma 2.1.2 Let $F$ be a right-continuous function. For all $x \in \mathbb{R}$ and $\omega \in (0, 1)$ we have:

$$
  \omega \leq F(x) \iff F^\dagger (\omega) \leq x \tag{2.2}
$$

Proof: Note $F^\dagger(\omega) \leq x$ follows immediately from definition 2.1.1. Now we want to show $\omega \leq F(x)$. Since $F$ is right-continuous, for any $C \subseteq \mathbb{R}$,

$$F(\inf \left\{  y \,|\, y \in C \right\}) = \inf\left\{F(y) \,|\, y \in C \right\}.$$

Therefore,

$$F(F^\dagger(\omega)) = \inf\{F(y) \, | \, F(y) \geq \omega \} \geq \omega.$$

Since a cdf is an increasing function, $F(F^\dagger(\omega)) \leq F(x)$ when $\omega \leq F(x)$. $\square$

  1. With Lemma 2.1.2, prove Theorem 2.1.3.

    Theorem 2.1.3 (Inversion method). Let $X$ be a distribution, with pdf $f$, and cdf $F$. Furthermore, let $\omega$ be a random variable on $\Omega$, a uniform distribution on $[0, 1]$. Then \(x=F^{\dagger}(\omega)\) is distributed like $X$.

  2. Let $\omega \in [0, 1]$.

    • Let $\lambda > 0$. Verify that the cdf for an exponential distribution, \[ F(x) = 1 - e^{-\lambda x},\] has generalized inverse \(F^{\dagger}(\omega) = -1/\lambda \log(1 - \omega)\) .
    • Let $\sigma > 0$. Verify that the cdf for a Cauchy distribution, \[ F(x) = 1/2 + 1/\pi \arctan(x / \sigma),\] has generalized inverse \(F^{\dagger}(\omega) = \sigma \tan(\pi(\omega - 1/2))\) .

2.4 Nonuniform distributions and the inversion method #

With the uniform distribution, we are now finally ready for implementations of the inversion method and rejection method. In this exercise, you will make an implementation plan for both methods, and then implement the inversion method.

  1. First, identify all different concepts/entities, and how they relate to eachother.
    • What are all the different distributions, and how can they be categorized?
    • What are the different methods? Which distributions can use which methods?
    • How does one method need to make use of the other?

Coding tips #

  • Prefer composition over inheritance Newcomers to OOP are often overusing inheritance as a strategy for sharing code between two (or more) classes. To see how this can go wrong, consider the class Pizza. It may be tempting to make subclasses as TomatoPizza and MushroomPizza. However, as soon as another kind of variation of Pizza shows up, there is a problem. If Pizza is also either Italian or American, the possible subclasses multiply: ItalianTomatoPizza, AmericanTomatoPizza, etc. This easily becomes unmaintainable. The solution? Make a class Ingredient with subclasses, and PizzaStyle with subclasses, and let Pizza compose a Ingredient & and compose PizzaStyle & as members. As a rule of thumb, composition is often more appropriate than inheritance!
  • Use std::numbers In #include <numbers> are constants, such as std::numbers::pi, std::numbers::inv_pi ($1 / \pi$), std::numbers::sqrt2 and std::numbers::e. See numbers for a list.
  1. Sketch out a class design, incorporating the different distributions and different methods. What members do the different classes have? What arguments should their constructors take? Make sure that it is easy to reuse the inversion method for new distributions. Check your class design with the supervisor (not graded).

  2. Using your class design, implement the the inversion method, the Cauchy distribution, and Exponential distribution. Draw numbers from the classes, and verify your results using a file in the app/ directory.

Writing tips This is probably a good time to make a few plots of the distributions and drawn numbers.

2.5 The rejection method #

Using your class design, proceed with the rejection method. From here on, the exercises assume that you will know how to organize files and classes in your project.

  1. Implement the rejection method for a sampler of $X$ and a distribution $Y$. Also write a class for the normal distribution.

  2. As an experiment, we would like to sample from the Normal distribution ($Y$), using the Cauchy distribution ($X$). The requirement is that $g(x) \leq c f(x)$ for all $x$. Using default parameters for both distributions, and the fact that \[\frac{1}{2 \pi} \exp(-x^2 / 2) \leq \frac{1}{\pi (2 + x^2)}\] find a small(est) $c \in \mathbb{R}$ for which $g(x) \leq cf(x)$.

  3. In your code, keep count of the number of draws and rejections. How many draws are rejected using the Cauchy bound? Is this what you expected?

In some cases, there may not be an appropriate distribution $X$ to draw from. An alternative is then to draw from the uniform distribution $\Omega$ on $[a, b]$, using an oversampling factor $c/(b-a)~\geq~\max_x{g(x)}$. However, since this distribution has finite support, it may not cover the entire domain of $g(x)$.

  1. Describe the problem/side-effects of sampling with the uniform distribution on a distribution with infinite extent. One possible solution is to artificially generate extra samples on the remaining domain of $g(x)$. What other solutions can you think of? What information would you need from the distribution $Y$ to implement these?

  2. Extend your code to counteract the side-effect with the suggested solution or an alternative of your own. Test the code for a distribution of choice.

2.6 Distribution arithmetic and operator overloading #

Operator overloading is a language concept that allows an operation (e.g., =, +, * >, but also language operators) to be customized for user-defined types. This allows us to write arithmetic operators for distributions, for example:

CauchyDistribution cauchy {1.23};
NormalDistribution normal {4.56};
auto funky = cauchy * 0.2 + normal * 0.8;
// allow sampling from `funky`, with the rejection method
  1. Design a class ScaledDistribution : public Distribution with a constructor that takes a const Distribution &a and multiplication value double b. Make sure that the class implements all methods from Distribution. Note that the ScaledDistribution is strictly not a distribution, as the probability density does not integrate to one.

  2. Similarly to (1.), design a class ComposedDistribution : public Distribution that adds two distributions.

  3. Overload the operator+ and operator* for the Distribution class, in such a way that adding two distributions yields a ComposedDistribution, and right-multiplying a distribution by a double yields a ScaledDistribution. Note: inside a class the current object can be obtained through *this. Test the result, for instance by seeing if the code sample above works with the rejection method and uniform upper bound.

When two distributions with invertible cdf are added, the resulting distribution has again an invertible cdf. A distribution that is composed (and/or scaled) should therefore also be processable by the inversion method.

  1. Write additional operator overloads so that added and/or multiplied invertible distributions are still processable by the inversion method. Start by writing a class InvertibleComposedDistribution : public InvertibleDistribution and InvertibleScaledDistribution : public InvertibleDistribution and do not inherit from ComposedDistribution or ScaledDistribution, even though this duplicates a bit of code. Test the code by making plots for a few composed distributions using the inversion method.

Coding tips for optional exercise 5. #

  • Multiple inheritance In C++ it is possible to inherit methods from multiple parent classes. This is a powerful concept that is easily misused. Programmers tend to overuse inheritance (remember: favor composition over inheritance). Typically, multiple inheritance would be used to derive from so-called God classes: large, messy, base classes full with unrelated functionality. However, a InvertibleComposedDistribution is-a ComposedDistribution but also is-a InvertibleDistribution: this is a good reason for multiple inheritance.
  • Diamond problem According to Wikipedia, the diamond problem is an ambiguity that arises when two classes B and C inherit from A, and class D inherits from both B and C. If D does not override a method, but the method is overridden in both B and C, it is not clear which version needs to be inherited. In C++ that is solved with virtual inheritance and requires the virtual public specifiers.
  1. Optional Use multiple inheritance to avoid code duplication between InvertibleComposedDistribution and ComposedDistribution (and similarly so for scaling). Hint: to use a Distribution member from the base class as an invertible distribution may require a run-time type conversion using dynamic_cast.

2.7 The standard RNG #

To use the built-in C random number generator in your distributions library, one could consider writing a class that wraps the built-in functions srandom() and random(), and implements the Rng interface:

#include <cmath>

class StdRng : public Rng {
public:
    StdRng(unsigned seed) {
        // Seeds the RNG. If not called, the seed defaults to 1.
        srandom(seed);
    }

    unsigned next() override {
        return random();
    }
    
    unsigned maximum() const override {
        return RAND_MAX;
    }
};
  1. What is problematic about this class? Consider making multiple object instances of the StdRng type. Hint: what happens to the random numbers drawn from a first object of StdRng when a second object of StdRng is created?

  2. Modify the class (or start over), and use a constructor that does not cause this problem. Allow setting the seed through a static member setSeed(unsigned seed). It should now be clear to users of the class that setting the seed applies to all objects of StdRng.

  3. Modify the class with a method getSeed() that allows the user of the class to get the current seed.

Coding tips for optional exercise 4. #

  • Singleton pattern A singleton class is a class that permits having only a single object instance of its type. The implementation is the following: for a class T that is singleton, keep a single static object of T in the class (either as a static member, or static in a function). The class T should keep the constructor private (so new objects cannot be constructed), and have a function static T &getInstance() that returns its object.
  1. Optional For the StdRng type, it is not necessary to have multiple object instances – they would occupy memory but all perform the same task. Write a SingletonStdRng class, following the Singleton pattern.