Discretely Distributed Random Numbers in C#

posted in The Curiously Recurring Gimlet Pattern

Published October 02, 2014

Random numbers generated from a discrete distribution are commonly needed in game development.

By "discrete distibution" we just mean the roll you get from something like an unfair die, e.g. you want a random number from 0 to 5 but you want 4 and 5 to be twice as likely as 0, 1, 2, or 3. If we think of each possible random value as having a weight, in this case 0, 1, 2, and 3 would have a weight of 1 and 4 and 5 would have a weight of 2.

A simple way to generate these kinds of random values is the following. Given some n such that we want random values ranging from 0 to n-1 where for each 0 <= i < n we have a weight w(i):

Build a data structure mapping cumulative weight to each value i. By cumulative weight we mean for each i the sum of w(0), ... , w(i-1).
Generate a random number r from 0 to W-1 inclusive where W is the total weight i.e. the sum of w(i) for all i.
If r is a cumulative weight in our data structure return the value associated with it; otherwise, find the value v that is the first item in the data structure that has a cumulative weight greater than r and return v-1.

Obviously the data structure in the above could just be an unordered array but a better way to do it is to use a binary search tree because 3. will then be O(log n) rather than linear. In Java you can do this with a TreeMap. In C++ you can do this with an std::map (or in C++ you can do the whole thing with boost::random::discrete_distribution).

However, in C# you can't just use a SortedDictionary, which is a binary search tree under the hood. You can't use a SortedDictionary because it does not expose the equivalent of C++'s std::lower_bound and std::upper_bound or the equivalient of Java's TreeMap.floorEntry(...) and TreeMap.ceilingEntry(...). In order to perform the "otherwise" part of 3. above you need to efficiently be able to find the spot in the data structure where a key would go if it was in the data structure when it is in fact not in the data structure. There is no efficient way to do this with a SortedDictionary.

However, C#'s List does support a BinarySearch method that will return the bitwise complement of the index of the next element that is larger than the item you searched for so you can use that.

The downside of this whole approach is that there will be no way to efficiently add or remove items to the discrete distribution, but often you don't need this functionality anyway and the code to do the whole algorithm is very concise:
[code=csharp:0]class DiscreteDistributionRnd{ private List m_accumulatedWeights; private int m_totalWeight; private Random m_rnd; public DiscreteDistributionRnd(IEnumerable weights, Random rnd = null) { int accumulator = 0; m_accumulatedWeights = weights.Select( (int prob) => { int output = accumulator; accumulator += prob; return output; } ).ToList(); m_totalWeight = accumulator; m_rnd = (rnd != null) ? rnd : new Random(); } public DiscreteDistributionRnd(Random rnd, params int[] weights) : this(weights, rnd) { } public DiscreteDistributionRnd(params int[] weights) : this(weights, null) { } public int Next() { int index = m_accumulatedWeights.BinarySearch(m_rnd.Next(m_totalWeight)); return (index >= 0) ? index : ~index - 1; }}where usage is like

DiscreteDistributionRnd rnd = new DiscreteDistributionRnd(3,1,2,6);int[] ary = new int[4] {0,0,0,0};for (int i = 0; i < 100000; i++)    ary[rnd.Next()]++;System.Diagnostics.Debug.WriteLine(    "0 => {0}, 1 => {1}, 2 => {2}, 3 => {3}",    (float)(ary[0] / 100000.0),    (float)(ary[1] / 100000.0),    (float)(ary[2] / 100000.0),    (float)(ary[3] / 100000.0));

Source

Previous Entry Cocos2d-x + Box2d Breakout Updated

Next Entry What I am working on

3 likes 2 comments

Comments

Bacterius

Nice, this general algorithm is called inverse transform sampling by the way. There is also the alias method for sampling a discrete distribution in constant time given some preprocessing work (useful if you will be drawing from the same distribution many times) but it does have some edge cases.

October 03, 2014 07:15 AM

jwezorek

Yeah, I spent some time looking into this algorithm when I realized I couldn't use a SortedDictionary in C# and was looking for some way of doing this in better than linear time that didn't involve me pulling the red-black tree implementation out of the Mono codebase's implementation of SortedDictionary.

I discovered the alias method and considered using it but then I found that List<T> has a binary search that returns the upper bound of misses so the sampling could be done in 10 lines of c# code... and didnt have a need to pursue the alias algorithm. Never heard of the alias method until now, looks interesting.

October 03, 2014 04:12 PM

You must log in to join the conversation.

Don't have a GameDev.net account? Sign up!

jwezorek

Author

🎉 Celebrating 25 Years of GameDev.net! 🎉

Discretely Distributed Random Numbers in C#

Comments

jwezorek

Latest Entries

Triangle Plasma

Triangular Life

Lifelike, or the Joy of Killing Time via Breeding Little Squiggles

draak update

My Code for Doing Two Things That Sooner or Later You Will Want to do with Bezier Curves

Untitled Escher Game Update

An Application for Creating Novel Escher-like Tessellations of the Plane

What I am working on

Discretely Distributed Random Numbers in C#

Cocos2d-x + Box2d Breakout Updated

🎉 Celebrating 25 Years of GameDev.net! 🎉

Discretely Distributed Random Numbers in C#

Comments

jwezorek

Latest Entries

Triangle Plasma

Triangular Life

Lifelike, or the Joy of Killing Time via Breeding Little Squiggles

draak update

My Code for Doing Two Things That Sooner or Later You Will Want to do with Bezier Curves

Untitled Escher Game Update

An Application for Creating Novel Escher-like Tessellations of the Plane

What I am working on

Discretely Distributed Random Numbers in C#

Cocos2d-x + Box2d Breakout Updated

Reticulating splines