3. Monte Carlo #
Start a new C++ project for this mini-report.
3.1 One-dimensional Monte Carlo #
In this exercise, the goal is to implement Hit-or-miss Monte Carlo and Simple-sampling Monte Carlo for real-valued integrands on one-dimensional domains. The exercise is to start the project from scratch.
- Use object-oriented programming, with interfaces and polymorphism. Use the `app/` directory to demonstrate the written code.
- Write classes with value members (not references), using move semantics (`std::move`) where necessary to avoid making copies.
- Use `std::uniform_real_distribution` to draw uniform random numbers. A usage example is shown at the bottom of the link.
- Research `std::function` and lambda functions. Like function pointers, these allow passing functions (integrands) into Monte Carlo methods.
Experiments #
Experiment 1: #
Knowing that $\int_0^1 \sqrt{1 - x^2}\,\text{d}x$
describes the area of a quarter circle with unit radius, estimate the value of $\pi$. How do accuracy and efficiency trade off between the two different Monte Carlo methods?
Experiment 2: #
For hit-or-miss, show that the unbiased sample variance can conveniently be computed as:
$$
\frac{1}{N-1} \sum_{i=1}^N\left(b_i - \bar{B}_N\right)^2
= \frac{N_A}{N - 1}\left(1 - \frac{N_A}{N}\right).
$$
Using the previous experiment, demonstrate the sample mean and variance for different sample sizes $N$, and compare with their theoretical counterparts.
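One possible route to the identity above (assuming the $b_i \in \{0, 1\}$ are the per-sample hit indicators, so that $b_i^2 = b_i$ and $\bar{B}_N = N_A / N$):
$$
\sum_{i=1}^N\left(b_i - \bar{B}_N\right)^2
= \sum_{i=1}^N b_i^2 - N \bar{B}_N^2
= N_A - \frac{N_A^2}{N}
= N_A\left(1 - \frac{N_A}{N}\right),
$$
and dividing by $N - 1$ gives the stated form.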
Experiment 3: #
For a probability $\beta \in [0, 1]$, a minimum sample size can be computed to achieve an estimate of the average value $\frac{1}{b-a} \int_a^b f(x)\,\text{d}x$ within tolerance $\delta$. Using a sharp bound $c = \text{max}_x(f)$, for hit-or-miss the minimum sample size is:
$$
N \geq \left[\frac{\Phi^{-1} \left(\frac{\beta + 1}{2}\right) \cdot c}{2\delta}\right]^2.
$$
where $\Phi^{-1}$ denotes the inverse cdf of the standard normal distribution. For simple-sampling, the required sample size is instead
$$
N \geq \left[\frac{\Phi^{-1} \left(\frac{\beta + 1}{2}\right) \cdot S}{\delta}\right]^2,
$$
where $S$ needs to be chosen as half the range of $f$, i.e., half the difference between the maximum and minimum value that $f$ attains on the integration interval: $S := (\text{max}_x(f) - \text{min}_x(f)) / 2$. For a derivation, see Determining Sample Sizes for Monte Carlo Integration (Neal D., 1993).
Estimate the average value of $f(x) = \ln(x^2 -4x + 6)$
on $[0, 6]$
, with probability at least $\beta=0.95$ and a tolerance of $\delta=0.1$.
3.2 Multi-dimensional Monte Carlo #
In this exercise, we extend last week’s Monte Carlo methods to functions defined on multi-dimensional domains, approximating
$$
\int f(\mathbf x)\,\text{d}\mathbf x \approx \frac{1}{N} \sum_{i=0}^{N-1} \alpha_i f\left(\mathbf x_i\right),
$$
where now $f : \mathbb{R}^d \rightarrow \mathbb{R}$. For convenience later on, we denote $\mathbf x = (x_1, \dots, x_d)$. Henceforth $\mathbf x_i$ denotes the $i$-th integration point, and $x_j$ the $j$-th vector component. The previously discussed methods extend rather naturally to higher dimensions.
3.2.1 Multi-dim hit-or-miss #
For hit-or-miss, points are sampled from a $(d+1)$-dimensional hyperrectangle. Practically, one first samples a domain vector $\mathbf x = (x_1, \dots, x_d)$ from a $d$-dimensional hyperrectangle, and evaluates $f(\mathbf x)$. The main difference with the 1-dim. method is that now a point can also miss the domain: if $\text{dom}(f)$ does not cover the entire rectangle, $f$ may throw an exception outside it – or simply be defined to return $0$. If the domain was not missed (no throw or zero), the algorithm may proceed with drawing the $(d+1)$-th coordinate. When this is under the graph, it counts as a hit, and otherwise as a reject.
Theory The theoretical extension is straightforward. Given a hyperrectangle $[a_1, b_1] \times \dots \times [a_d, b_d] \times [0, c]$, the $\text{area}\,A$ remains the integral, whereas $\text{area}\,R$ simply becomes the volume $c(b_d - a_d)\cdots(b_1 - a_1)$. We therefore have:
$$
\int_{[a_1, b_1] \times \cdots \times [a_d, b_d]} f(\mathbf x)\,\text{d}\mathbf x \approx \frac{N_A}{N} c(b_d - a_d)\cdots(b_1 -a_1).
$$
Implementation The multi-dim method takes an integrand $f$ (that throws or outputs zero outside its domain), the bounds of the domain $a_1, \dots, a_d$ and $b_1, \dots, b_d$, and a $c$ for the range $[0, c]$.
3.2.2 Multi-dim simple sampling #
For simple-sampling, we would like to sample, again, a $d$-dim. point from an integration domain in $\text{dom}(f)$. The problem is again that $\text{dom}(f)$ may not be rectangular, and not easy to sample from. Rather than guessing points – letting $f$ handle the misses is inefficient when $\text{dom}(f)$ is small but high-dimensional – we may be able to rewrite the integral to the form
$$
\int_{\mathbf x \in \text{dom}(f)} f(\mathbf x)\,\text{d}\mathbf x = \int_{a_1}^{b_1} \cdots \int_{a_d}^{b_d} f(\mathbf x)\,\text{d}x_d\dots\text{d}x_1.
$$
The integral bounds $a_j$ and $b_j$ effectively describe the domain, and each bound potentially depends on any not-yet-integrated-out variables, i.e., $a_j \equiv a_j(x_1,\dots, x_{j-1})$ and similarly for $b_j$.
Theory By uniformly sampling $x_j \in [a_j, b_j]$ for each $j$, the integral approximates
$$
\int f(\mathbf x)\,\text{d}\mathbf x \approx \frac{1}{N} \sum_{i=0}^{N-1} \alpha_i(\mathbf x_i) f(\mathbf x_i),
$$
where an appropriate $\alpha_i$
must be chosen to compensate for the sampling density of $\mathbf x_i \in \text{dom}(f)$
. Calculating this weight together with $\mathbf x_i$
is the exercise of Experiment 1.
Implementation In C++, one should provide the integrand $f$ and the bounding functions $a_1, \dots, a_d$ and $b_1, \dots, b_d$. For the latter, it may be easier to supply a single function `a(std::vector<double>& x, size_t j)` (and similar for $b$), rather than supplying individual functions for every $a_j$ and $b_j$.
3.2.3 Project goals #
- Implement a multi-dimensional hit-or-miss and simple-sampling method:
  - use Hit-or-miss with the similarly named strategy for the domain;
  - use Simple-sampling using the Domain sampling technique (further detailed in experiment 1).
- Allow pseudo-random and low-discrepancy sampling for the $\mathbf x_i$ generated in the domain (experiment 2).
- Use `std::vector<double>` (similar to lists in Python, `a = [3.0, 4.0]`) and range-based for loops (similar to Python’s `for x in [...]`) where possible.
- Avoid duplication of code or principles with the 1-dim exercises last week. Hint: can you change the “backend” of the 1-dim case to work as a special instance of the $d$-dim case? If your OOP design of last week was sound, the interface towards your experiments in the `app/` folder can be maintained!
Experiments #
Experiment 1: Nonrectangular domains #
We illustrate the previous two multi-dim integration ideas with the unit hypersphere. Denote the $d$-dimensional unit hypersphere by $S_d$. In (domain) hit-or-miss style, the volume $V_d$ of $S_d$ can be written as:
$$
V_d := \int_{[-1, 1]^d} \mathbb{I}_{S_d} \,\text{d}\mathbf x,
$$
where $\mathbb{I}_{S_d}$ is the indicator function. In (domain) sampling style, $V_d$ can be computed as repeated 1-dimensional integration over a constant function $f(\mathbf x) := 1$, but this introduces integration-variable-dependent boundaries. To see this, generalize
$$
V'_2 := 4 \int_0^1 \int_0^{\sqrt{1-x_1^2}} 1 \,\text{d}x_2\,\text{d}x_1
$$
to higher order. In fact, it is possible to mix the two styles, but we will not pursue this.
- Explain why, for $V'_d$, Monte Carlo simple-sampling will give exactly the same results as Monte Carlo hit-or-miss.
- For 1D simple-sampling, integration points $x_i$ were uniform randoms on $[a, b]$ with constant weights $\alpha_i = b - a$. For $d$-dim. integration, formulate an algorithm to compute an integration point $\mathbf x_i \in \text{dom}(f)$ and associated weight $\alpha_i \in \mathbb{R}$. Hint: for $V_d'$, one would first draw $x_1 \in [0, 1]$, then generate a uniform number $x_2 \in [0, \sqrt{1 - x_1^2}]$. Generalize this idea to arbitrary dimensions and domains.
- Aside from the $d$-dim. unit sphere, discuss general advantages and disadvantages of the two approaches. Are both always applicable? Which one would be more efficient? Which one approximates the integral better?
- An analytic expression for the volume of $S_d$ is given by:
  $$ V_d := \frac{\pi^{d/2}}{\Gamma(\frac{d}{2} + 1)}. $$
  Compare integration of the unit sphere for several dimensions $d$ with both approaches. Include runtime and accuracy in your discussion.

Gamma function (not required, for your convenience)
/**
 * Gamma function
 * https://rosettacode.org/wiki/Gamma_function#C
 * Content is available under GNU Free Documentation License 1.2
 * https://www.gnu.org/licenses/old-licenses/fdl-1.2.html
 */
double sp_gamma(double z)
{
  const int a = 12;
  static double c_space[a];
  static double *c = NULL;
  int k;
  double accm;

  if (c == NULL) {
    double k1_factrl = 1.0; /* (k - 1)!*(-1)^k with 0!==1 */
    c = c_space;
    c[0] = sqrt(2.0 * M_PI);
    for (k = 1; k < a; k++) {
      c[k] = exp(a - k) * pow(a - k, k - 0.5) / k1_factrl;
      k1_factrl *= -k;
    }
  }
  accm = c[0];
  for (k = 1; k < a; k++)
    accm += c[k] / (z + k);
  accm *= exp(-(z + a)) * pow(z + a, z + 0.5); /* Gamma(z+1) */
  return accm / z;
}
Experiment 2: Quasi-Monte Carlo #
A second topic is that of sampling. Points need not be sampled from uniform random vectors. Instead, they can be taken from nonuniform distributions (importance sampling, not treated) or be generated from deterministic vectors that attempt to cover the space uniformly (low-discrepancy sequences). In this experiment, the goal is to compare pseudo-random Monte Carlo methods with a partially deterministic, quasi-Monte Carlo (QMC) method. A recommended resource is A Practical Guide to Quasi-Monte Carlo Methods (Kuo F. and Nuyens, D., 2016). A more technical and detailed read on the topic is given by High-dimensional integration: The quasi-Monte Carlo way (Dick J., Kuo F. and Sloan I.H., 2013).
One of the many QMC methods is the so-called rank-1 lattice rule. Rather than with pseudo-random numbers, $\mathbf x_i$ is computed with the help of an integer generating vector $\mathbf z \in \mathbb{N}^d$:
$$
\mathbf x_i = \left\{\frac{i \mathbf z \bmod N}{N} + \Delta\right\},
$$
where $\Delta \in \mathbb{R}^d$
denotes a shift, $i$ is not the imaginary unit, and $\{\cdot \}$
reduces a number to its fractional part, e.g., $\{(1.3, 2.5)\} = (0.3, 0.5)$
. Note that the rule depends on a pre-specified number of samples, $N$, and is deterministic. Therefore, $\text{dom}(f)$ is only covered well when all $N$ points are drawn. If, however, your code allows calls that increase the number of samples, you may want to consult the reading by Kuo and Nuyens (Section 2.4) to implement an extensible lattice rule (this is a bit more difficult).
Test function #
The following integrand test function is taken from Computational Higher Order Quasi-Monte Carlo Integration (Gantner R. and Schwab C., 2016):
$$ f(\mathbf x) := \exp\left(\alpha \sum_{j=1}^{d} x_j j^{-\beta}\right) = \prod_{j=1}^d \exp(\alpha x_j j^{-\beta}). $$
with $\alpha \neq 0$ and decay parameter $\beta \geq 0$. The integral over $[0, 1]^d$ admits an analytic solution:
$$ \int_{[0,1]^d} f(\mathbf x)\,\text{d}\mathbf x = \prod_{j=1}^d \frac{\exp(\alpha j^{-\beta}) -1}{\alpha j^{-\beta}}. $$
Important: `std::exp(x) - 1` is inaccurate for very small arguments. Use `std::expm1(x)` instead.
- For the test function above, choose $\alpha = 1$ and $\beta = 2$, and compute the solution with Monte Carlo simple-sampling. Then, for the same problem, compute a solution with the lattice rule, $\Delta = \mathbf 0$, and the provided $\mathbf z$ below.
A lattice generating vector $\mathbf z$
up to 250 dimensions
/** The following description and vector is taken from
* the work of Dirk Nuyens "Magic Point Shop".
* https://people.cs.kuleuven.be/~dirk.nuyens/qmc-generators/
*
* Some default lattice sequence with a maximum of 250 dimensions and 2^20
* points.
*
* Generating vector from the following article:
*
* Cools, Nuyens, Kuo. Constructing embedded lattice rules for
* multivariate integration. SIAM J. Sci. Comput., 28(6), 2162-2188,
* 2006.
*
* The maximum number of points was set to 2^{20}, maximum number of
* dimensions is 250 constructed for unanchored Sobolev space with order
* dependent weights of order 2, meaning that all 2-dimensional
* projections are taken into account explicitly (in this case all choices
* of weights are equivalent and this is thus a generic order 2 rule).
*/
std::array<unsigned, 250> z = {
1, 182667, 469891, 498753, 110745, 446247, 250185, 118627, 245333, 283199, 408519, 391023, 246327, 126539,
399185, 461527, 300343, 69681, 516695, 436179, 106383, 238523, 413283, 70841, 47719, 300129, 113029, 123925,
410745, 211325, 17489, 511893, 40767, 186077, 519471, 255369, 101819, 243573, 66189, 152143, 503455, 113217,
132603, 463967, 297717, 157383, 224015, 502917, 36237, 94049, 170665, 79397, 123963, 223451, 323871, 303633,
98567, 318855, 494245, 477137, 177975, 64483, 26695, 88779, 94497, 239429, 381007, 110205, 339157, 73397,
407559, 181791, 442675, 301397, 32569, 147737, 189949, 138655, 350241, 63371, 511925, 515861, 434045,
383435, 249187, 492723, 479195, 84589, 99703, 239831, 269423, 182241, 61063, 130789, 143095, 471209, 139019,
172565, 487045, 304803, 45669, 380427, 19547, 425593, 337729, 237863, 428453, 291699, 238587, 110653,
196113, 465711, 141583, 224183, 266671, 169063, 317617, 68143, 291637, 263355, 427191, 200211, 365773,
254701, 368663, 248047, 209221, 279201, 323179, 80217, 122791, 316633, 118515, 14253, 129509, 410941,
402601, 511437, 10469, 366469, 463959, 442841, 54641, 44167, 19703, 209585, 69037, 33317, 433373, 55879,
245295, 10905, 468881, 128617, 417919, 45067, 442243, 359529, 51109, 290275, 168691, 212061, 217775, 405485,
313395, 256763, 152537, 326437, 332981, 406755, 423147, 412621, 362019, 279679, 169189, 107405, 251851,
5413, 316095, 247945, 422489, 2555, 282267, 121027, 369319, 204587, 445191, 337315, 322505, 388411, 102961,
506099, 399801, 254381, 452545, 309001, 147013, 507865, 32283, 320511, 264647, 417965, 227069, 341461,
466581, 386241, 494585, 201479, 151243, 481337, 68195, 75401, 58359, 448107, 459499, 9873, 365117, 350845,
181873, 7917, 436695, 43899, 348367, 423927, 437399, 385089, 21693, 268793, 49257, 250211, 125071, 341631,
310163, 94631, 108795, 21175, 142847, 383599, 71105, 65989, 446433, 177457, 107311, 295679, 442763, 40729,
322721, 420175, 430359, 480757,
};
By sampling with a few lattices, and shifting each lattice randomly with a uniform $\Delta \in [0, 1]^d$, the space might be covered better than with a single large lattice rule.
- Let $Q \in \mathbb{N}^+$ denote a modest amount of random shifts, e.g., $Q := 2^3$. Compare random numbers, a single lattice, and randomized lattices for different dimensions $d$ and increasing numbers of samples. Make sure that $N_{\text{PRNG}} = N_\text{big-lattice} = QN_{\text{small-lattice}}$ to have a fair comparison. What convergence speed is observed?
3.3 Introduction to template metaprogramming #
In this exercise, we will use templates for three illustrative use cases with increasing difficulty: replacing `std::vector` by a fixed-size array, a compile-time alternative to `virtual` functions, and compiler-optimized vector expressions.
Writing This exercise is purely technical and does not require mentioning in your report.
- The code of the third experiment should be included in the archive that you send in with the report.
Experiments #
Experiment 1: Replacing `std::vector<double>` by `std::array<double, Dim>` #
Using a template parameter `size_t Dim`, convert all classes that use $d$-dimensional vectors to $d$-dimensional arrays. Using `std::array`, which is a C++ wrapper around the C-style array `[]`, has several (smaller) advantages. First, the data in this container is stored in-place (on the stack for local objects), which avoids slow memory allocation and cache misses on the heap. Furthermore, since such arrays are of fixed size, the compiler may make certain optimizations.
To get started, begin by replacing one of the vectors in your classes with an array of size `Dim`. You’ll find that the class requires a template parameter to pass `Dim`, and that `Dim` in this way propagates through your class structure. E.g.:
template<size_t Dim>
class HitOrMiss : public MonteCarlo<Dim> {
protected:
std::array<double, Dim> a_; // replaced vector
};
Code tips
- Keep all templated classes and members in include/. Template types are inferred by the compiler. When templates are used in .cpp files, only the specialized templates (for example, for a `Dim=3` and `Dim=5`, when you called those) are compiled. By moving and merging source code from your .cpp into the .hpp file, you allow everybody to compile their own specializations of the template.
Experiment 2: Avoiding `virtual` functions #
We have used `virtual` member functions to allow function overriding, and this has allowed for polymorphism. For example, a function `HitOrMiss::integrate()` may call a virtual function `virtual next()` that is either implemented by `PseudoRandom::next()` or by `Lattice::next()`. A downside of virtual functions is that the dynamic dispatch mechanism is slow, due to the indirection with vpointers and vtables. A larger problem, however, is that the compiler cannot infer which of the implemented functions is going to be called during the program’s run. The compiler therefore cannot merge (inline) the functions `integrate()` and `next()`, leading to slower code.

A technique to avoid `virtual` is to substitute a class member with a templated type. An oversimplified example:

template<typename T>
class HitOrMiss {
    double integrate() {
        ...
        x_i = sampler_.next(); // calls `T` without `next()` being virtual
    }
    T sampler_; // stores a member of type `T`, not the base class
};

HitOrMiss<Lattice> foo {...};      // compiles `HitOrMiss`+`Lattice` into one type
HitOrMiss<PseudoRandom> bar {...}; // similar
HitOrMiss<Something> baz {...};    // works whenever `Something` has a `next()`
In your own code, use this template strategy for the relation between the Monte Carlo methods and the sampling/lattice-rule method.
Code tips
- Duck typing The compiler does not check up front if the template argument (`T` in the above) is valid. This is called duck typing: “If it walks like a duck and it quacks like a duck, then it must be a duck”.
- Enforcing template types From C++20 onwards, it is possible to use concepts and constraints to check if templates are a valid substitution. Before C++20, programmers would put certain assertions in the code to ensure valid types. We do not treat that in this course.
Experiment 3: Compiler-optimized computations on vectors #
The following is an exercise with templates that can be performed on the side of your Monte Carlo project. For the lattice generating rule (and maybe other places in your code), you may have added vectors, such as $\mathbf z + \Delta$, using a loop. Many mathematical and scientific software projects have implemented special vector types, which are not just data structures like `std::vector`, but also allow vector addition, multiplication, etc., like what NumPy does with `np.array`. For example, consider the vector sum:
$$ \mathbf a + 2 \mathbf b + \mathbf c $$
Of course, we would like to write a `class Vector` with `operator+` and `operator*` overloaded, so that we could have elegant addition:

Vector a = {...};
Vector b = {...};
Vector c = {...};
auto result = a + 2*b + c;

and where `result` would be computed with efficient binary code. However, such an expression would evaluate as `((a + (2 * b)) + c)`, and each addition would yield intermediate vectors, which need temporary storage. Moreover, each `+` or `*` would loop over the entire vector. Instead, if the compiler would find a single loop with `a[j] + 2 * b[j] + c[j]`, the CPU would not need to store any intermediate values in RAM. That would be much faster!

With a templating technique called expression templates, we help the compiler to derive such efficient calculations. The main idea is to have classes for the operators and operands in a nested templated form. For example, `a + b + c` could be stored as an object of type `Sum<Sum<Vector, Vector>, Vector>` – an equation that is not yet evaluated. During compilation, the compiler sees this expression as one object of that type. All the nested, substituted functions from `Vector` and `Sum` in the compiler-generated type will collapse into a single fast computation if the compiler is instructed with an optimization flag.
For the purpose of illustration, we will build a simple `Vector` and `Sum` type with templates. The `Vector` class wraps a `std::vector`, and the `Sum` class allows a compile-time sum of a `Vector`/`Sum` with another `Vector`/`Sum`.
- Design a `class Vector` that holds a `std::vector` by value.
  - The constructor should be initialized with a `std::vector<double>`.
  - `operator[](size_t i) const` should return the $i$-th element of the internal vector.
- Design a `template<typename S, typename T> class Sum` that holds constant references to objects of type `S` and `T`. Note that these can be `Vector`s, `Sum`s, or even something totally different.
  - The constructor should take an object of `S` and an object of `T`. What is better: taking these by reference or by value?
  - `operator[](size_t i) const` should return the sum of the $i$-th elements of the internal objects.
- Write overloads of `operator+` for `Sum` and `Vector`.
  - Write a `template <typename T> Sum<Vector, T> operator+(const T &t)` for the `Vector` class.
  - Write a similar style overload for the `Sum` class.
At this point, it should be possible to build expressions of the form `Sum<Sum<Vector, Vector>, Vector>`:
Vector a {std::vector<double>(10, 1.)}; // a size-10 vector filled with 1.0
Vector b {std::vector<double>(10, 2.)};
Vector c {std::vector<double>(10, 3.)};
std::cout << (a + b + c)[2] << std::endl; // "6.0" was computed efficiently
- Our templates, up to now, describe objects that contain a to-be-computed expression: unless `operator[]` is called, no computations are performed. To force the evaluation of the entire expression, we would like to assign the expression to a vector, e.g., `Vector result = a + b + c;`. This requires us to write a converting constructor that takes the expression and calculates all elements to form a new vector:
  - First, write `size_t size() const` members for `Vector` and `Sum` that return the dimension of the vector or sum. We need `size()` in the following bullet.
  - Now, for `Vector`, write a templated constructor `template <typename T> Vector(const T& t)` that initializes the vector from an object of type `T`. Regardless of whether `t` is a `Vector` or a `Sum`, you can call `t[]` and `t.size()`.
If all went well, your templated vectors should be much faster than a naive vector implementation. If you want (but you don’t have to), you may use the templated vectors in your own code (with a modification for `std::array`), or run the example below for a performance test.
CLion users To see a performance difference, do not compile with the Debug profile, which builds without optimization (`-O0`). Instead, go to Settings, Build, Execution, Deployment -> CMake, press the + to add a Release profile. Close settings, and set the profile to Release in the top-right dropdown.
Timing application
#include <iostream>
#include <chrono>
#include "expr.hpp"
class NaiveVector {
public:
NaiveVector(std::vector<double> v) : vec_(std::move(v)) {}
NaiveVector operator+(const NaiveVector &other) {
auto copy = std::vector<double>(vec_); // makes a copy of `vec_`
for (size_t i = 0; i != copy.size(); ++i)
copy[i] += other[i]; // add current
return NaiveVector(copy); // wrap in the right type
}
double operator[](size_t i) const {
return vec_[i];
}
size_t size() const {
return vec_.size();
}
protected:
std::vector<double> vec_;
};
template<typename T>
T test(T a, T b, T c) {
return a + b + c;
}
int main() {
using std::chrono::high_resolution_clock;
const int N = 1e4;
const int dim = 100;
// how fast are our vectors with templates?
Vector a{std::vector(dim, 12.34)};
Vector b{std::vector(dim, 23.45)};
Vector c{std::vector(dim, 34.56)};
auto t1 = high_resolution_clock::now();
for (int i = 0; i < N; i++) a = test(a, b, c);
auto t2 = high_resolution_clock::now();
    std::cout << "Expr. templs.: "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count()
              << " ms\n";
// versus our naive implementation...
NaiveVector a2{std::vector(dim, 12.34)};
NaiveVector b2{std::vector(dim, 23.45)};
NaiveVector c2{std::vector(dim, 34.56)};
auto t3 = high_resolution_clock::now();
for (int i = 0; i < N; i++) a2 = test(a2, b2, c2);
auto t4 = high_resolution_clock::now();
    std::cout << "Naive vectors: "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t4 - t3).count()
              << " ms\n";
return 0;
}