Monte Carlo Simulations
A nifty algorithm to exploit probability and skip tricky maths

28 March 2018

Say you had a twenty-sided dice, and you wanted to know the probability of rolling a twenty. I play Dungeons and Dragons fairly regularly, and if you do as well, you’ll recognise this as the chance that you land a critical hit in combat, or do something else exceptionally well.

The naive answer that you’ll encounter in introductory statistics classes is that the chance is 1 in 20. If you have an ideal dice, then I’d agree. Realistically, though, your dice isn’t perfect; it’s just good enough for your use case of having fun with some friends.

So, given a dice, how would you know if it actually rolls all of the numbers equally often? One approach might be to do expensive scans of the dice to find the relative density of the plastic at different points, measure the volume of the lettering etched in, and do some hairy integrals to figure out exactly where the centre of mass is on your little icosahedron.

The simpler solution is to roll it a whole bunch and record the results.
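
To make that concrete, here is a minimal sketch of the “roll it a whole bunch” approach in Rust, using the same rand = "0.4.2" crate as the longer example later in this post. The number of rolls is an arbitrary choice, and since the computer’s random number generator is effectively fair, the estimate should land close to 1 in 20; with a physical dice you’d be tallying real rolls instead.

// cargo-deps: rand="0.4.2"
extern crate rand;

use rand::distributions::{IndependentSample, Range};

fn main() {
    let mut rng = rand::thread_rng();

    // a simulated d20: uniform integers from 1 to 20 inclusive
    let d20 = Range::new(1u32, 21);

    // arbitrary; more rolls give a steadier estimate
    let rolls: u64 = 1_000_000;
    let mut twenties: u64 = 0;

    for _ in 0..rolls {
        if d20.ind_sample(&mut rng) == 20 {
            twenties += 1;
        }
    }

    // for a fair dice this should come out close to 1/20 = 0.05
    println!("Estimated chance of rolling a 20: {}",
             twenties as f64 / rolls as f64);
}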

What are Monte Carlo simulations?

If you go searching for information on Monte Carlo, you’ll find it’s a district of Monaco with a world-famous casino. If your search engine has decided you’re particularly mathematically inclined, or you add “simulation” to your search terms, you’ll find a class of algorithms that are inspired by gambling.

The trick is that if you can phrase your problem in the form of “if I just repeat this many, many times, the average case will approach the correct answer”, then you might be able to skip over some complicated analysis and still find an accurate answer. In the dice example, rolling the dice many times to find out the probability of any particular side coming up is a Monte Carlo method.

Typically, these algorithms come up in the context of a computer simulating something. A computer can try many different random numbers very quickly, which makes the method far more practical, and in these cases we talk about running a Monte Carlo simulation.

An example: Finding Pi

One of the classic examples of a Monte Carlo simulation is a method for finding the value of Pi. Pi is a really useful numerical constant that comes up whenever you’re working with the geometry of circles. If you know the radius of a circle, you can find both its area and circumference using Pi.

\begin{equation} AreaOfCircle = \pi \times Radius^2 \end{equation}

So, how are we going to figure out what Pi is? Let’s start with a 2 by 2 square that has a circle in the middle. To simplify the maths, we’re going to say that the centre of the circle is the point (0, 0), and the circle has a radius of 1. This makes the area of this circle Pi.

[Figure: a unit circle inscribed in the 2 by 2 square (/assets/posts/monte-carlo-circle.png)]

If I generate a random x and y point in the square, it could be inside the circle or outside the circle. Thanks to Pythagoras’ theorem, we can easily figure this out for any point: if the distance from the point to the centre is 1 or less, it’s in the circle. The ratio of random points inside the circle to random points inside the square will be directly related to the ratio of the area of the circle to the area of the square.

\begin{equation} \frac{PointsInCircle}{PointsInSquare} = \frac{AreaOfCircle}{AreaOfSquare} \end{equation}

Plugging in what we know, that the area of the unit circle is Pi and the area of the square is 4, we can rearrange the equation to get this:

\begin{equation} \pi = 4 \times \frac{PointsInCircle}{PointsInSquare} \end{equation}

If I write it up in Rust, it looks something like this:

// cargo-deps: rand="0.4.2"
extern crate rand;

use rand::distributions::{IndependentSample, Range};
use rand::Rng;
use std::time::Instant;

struct Point {
    x: f64,
    y: f64
}

impl Point {

    // our random point is just two random numbers uniformly chosen
    // between -1 and 1
    fn random<R: Rng>(rng: &mut R) -> Point {
        let distribution = Range::new(-1f64, 1.);

        Point {
            x: distribution.ind_sample(rng),
            y: distribution.ind_sample(rng)
        }
    }

    // Pythagoras' theorem at work
    fn is_in_unit_circle(&self) -> bool {
        self.x * self.x + self.y * self.y <= 1.
    }
}

fn calculate_pi<R: Rng>(rng: &mut R, total: u64) -> f64 {
    // this will generate the random points and count the number that
    // fall in the unit circle
    let in_circle = (0..total)
        .map(|_| Point::random(rng))
        .filter(|p| p.is_in_unit_circle())
        .count();

    // knowing that, we can solve for pi
    let square_area = 4.;
    square_area * (in_circle as f64) / (total as f64)
}

fn main() {
    let mut rng = rand::thread_rng();

    // more iterations are more accurate, but will take longer to run
    for i in 0..10 {
        let base: u64 = 10;
        let iterations = base.pow(i);

        let start_time = Instant::now();
        let pi = calculate_pi(&mut rng, iterations);
        let duration = start_time.elapsed();

        println!("Iterations: {:10}, Pi: {:10.8}, Time: {}s{:09}ns",
                 iterations, pi, duration.as_secs(), duration.subsec_nanos());
    }
}
Running it gives output something like this:

Iterations:          1, Pi: 4.00000000, Time: 0s000000330ns
Iterations:         10, Pi: 3.60000000, Time: 0s000000211ns
Iterations:        100, Pi: 3.28000000, Time: 0s000001134ns
Iterations:       1000, Pi: 3.10400000, Time: 0s000015639ns
Iterations:      10000, Pi: 3.16200000, Time: 0s000194181ns
Iterations:     100000, Pi: 3.14104000, Time: 0s002184030ns
Iterations:    1000000, Pi: 3.14088800, Time: 0s020392041ns
Iterations:   10000000, Pi: 3.14161120, Time: 0s193885646ns
Iterations:  100000000, Pi: 3.14159600, Time: 1s942411880ns
Iterations: 1000000000, Pi: 3.14168716, Time: 19s518056721ns

As we add more iterations, we generally get a more and more accurate value for Pi, although not perfectly steadily, since every run is random. It also, of course, takes longer to run.
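
As a rough rule of thumb (this is the standard convergence rate for Monte Carlo estimates, not something derived in this post), the statistical error shrinks in proportion to one over the square root of the number of iterations:

\begin{equation} Error \propto \frac{1}{\sqrt{Iterations}} \end{equation}

In practice that means each extra digit of accuracy costs roughly a hundred times more iterations, which is why the run times in the table above grow so much faster than the accuracy improves.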

How many iterations are enough?

In our example, we just chose a fixed number of iterations. In a real application, you’re more likely to have a target accuracy that you want to achieve. You might want to know Pi accurate to 4 decimal places, for instance.

In general terms, you want to make sure that your Monte Carlo simulation has converged on an answer.

How do you know that you’ve converged? Run the simulation again with the same number of iterations and check that you get the same answer. Since the Monte Carlo simulation is based on random numbers, you won’t get exactly the same result. The difference between the two results gives you an indication of how accurate your current estimate is.
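
As a rough sketch of that idea (reusing the calculate_pi function and imports from the example above; the function name, starting iteration count, tolerance, and factor of 10 are all my own arbitrary choices), it might look something like this:

// a sketch of the "run it twice and compare" check, reusing calculate_pi
// from the example above
fn calculate_pi_to_tolerance<R: Rng>(rng: &mut R, tolerance: f64) -> f64 {
    let mut iterations: u64 = 1000;

    loop {
        // two independent runs with the same number of iterations
        let first = calculate_pi(rng, iterations);
        let second = calculate_pi(rng, iterations);

        // if the two estimates agree closely enough, treat it as converged
        if (first - second).abs() < tolerance {
            return (first + second) / 2.;
        }

        // otherwise, repeat with more iterations
        iterations *= 10;
    }
}

Agreement between two runs isn’t a hard guarantee of accuracy, but it is a useful practical indicator.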

Where else can I use Monte Carlo simulations?

Monte Carlo simulations can be applied to many other situations where a large enough collection of random samples will represent the whole. They can be used to model physical systems. They can be used to model financial systems.

It is also relevant in artificial intelligence for games. The bot in your game may not have time to work out all of the possible states that a move could lead to, but it can probably check enough possibilities at random to get an idea of whether a move is a good one or not.

Really, it’s a generally useful tool to know about. Hopefully, now, when you hit a problem where it fits, you’ll remember to give Monte Carlo simulations a try.