Technology
Random distributions in programming and video games
Uniform distribution
If you roll a 20-sided die (commonly called a d20), there’s an equal chance of rolling any number between 1 and 20. Over the course of infinite rolls, you’ll roll about the same number of 1s, 2s, 3s, and so on. That’s called a “uniform distribution.” The d20, popularized by Dungeons and Dragons, is the best and worst thing to happen to tabletop gaming: the best because it creates moments of triumph and despair in otherwise ho-hum games, and the worst because it creates periods of boredom in otherwise exciting games.
I’ve heard the d20 described as “swingy.” That’s a good way to think about it: you’re equally as likely to roll a 20 (extraordinary success) as a 1 (devastating failure) or a 9 (mediocre result). Your “luck” is constantly swinging around. In fact, luck is barely relevant. If you roll three 20s in a row you might feel unstoppable, but three 20s is no less likely than three 10s. You could go a whole session without seeing a 20, or it could be the only thing you roll. True randomness is only uniform on a large scale; one person rolling one die at one table isn’t guaranteed any kind of fairness. No wonder tabletop gamers develop so many superstitions.
A d20 is easy to simulate in basically every programming language:
final rnd = Random();
final d20 = rnd.nextInt(20) + 1;
If you graph the distribution, it looks like this:
You should be booing and hissing right now, but I’ll get to that.
Video games took this mechanic and ran with it. Random Number Generation, or RNG, is so hot right now. It’s always been part of games, but it’s more prominent and popular than ever, partially (I suspect) because it’s so easy to code. Upwards of 2,000 roguelike games were released on Steam last year—“roguelike” meaning you’re intended to play the game repeatedly, and certain elements, like enemies, powerups, and level layouts, are randomly chosen on each playthrough. Roguelike junkies like me will trudge through 20 disappointing rounds to feel the rush of a superb one, one where you get all the best powerups and your foes are helpless against you.
It’s okay for the dice to be swingy when the stakes are low: in a roguelike, you rarely lose more than 20 minutes because of a bad roll. You can always just start again. But when the stakes are high, it becomes a problem.
XCOM 2 and percent chance to hit
It’s time to make myself unpopular.
XCOM 2 is a bad video game. This is despite the fact that everything about it is awesome: the concept, the art, the story, the characters, everything. It’s a bad game because when you set up your soldiers to shoot at the aliens, it gives you a certain chance of hitting them, expressed as a percent. If your luck is bad, it doesn’t matter if your percentages are all 70% and up: you’ll miss all your shots, get slaughtered, and lose the level. And when your soldiers die in XCOM, they’re permanently dead. A few minutes of bad luck can spoil hours of careful gameplay.
Someone’s about to send me a strongly-worded email about how I’m “not playing the game right.” Let me save you the trouble: I am playing it right, and it’s a bad game.
Don’t get me wrong, I’m not opposed to random elements in games. They can add a lot of depth and replayability for relatively little effort. What I am opposed to is uniform random distributions in games where they don’t make sense.
What other choice is there? Well, with some math, you can get any distribution you want.
Triangular distribution
With a triangular distribution, you’re most likely to get values in the middle of the range (like 10 on a d20), and progressively less likely to get values toward the top and bottom of the range (like 1 and 20). It looks like this:
A triangular distribution is what you get if you roll two d10 dice instead of a single d20. The highest possible result is still 20, but you’re ten times as likely to roll an 11, because there’s only one way for the dice to add up to 20 (10 + 10) and ten ways for them to add up to 11 (10 + 1, 9 + 2, etc.). This feels more natural. Just like in real life, it’s rare for things to go as poorly as possible or as well as possible. Most of the time they’re closer to mediocre. A triangular distribution still “feels” random, but it also feels fair.
For my money, some version of this should be the default in games. Middling results should be the norm. Spectacular results, for better or worse, should be rare. You should be able to strategize based on the mediocrity principle.
This is a little more complicated to code, but it introduces a technique you can use for any distribution you like: rejection sampling. In short, you plot your desired distribution on a two-dimensional graph, then generate two (uniform) random numbers, x and y, as coordinates on the same graph. If they’re outside the distribution, you reject them and try again. If they’re inside it, you keep x.
Of course, for one kind of triangular distribution you can just generate 2d10 and add them together, but that technique doesn’t generalize.
Let’s say you want a random decimal number between 0 and 1, but you want the value 0.5 to be twice as likely as either 0 or 1, with linear probability in between. You could plot the distribution curve like this:
Think of this as two intersecting lines. The first one has a slope of 1, so its formula is y = 1x + 0.5 or simply y = x + 0.5. The second has a slope of -1, so its formula is y = -x + 1.5. Now you can code your RNG:
final rnd = Random();
double nextDoubleTriangular() {
double x, y;
do {
(x, y) = (rnd.nextDouble(), rnd.nextDouble());
} while (y > (x + 0.5) || y > (-x + 1.5));
return x;
}
This is admittedly less efficient than just calling rnd.nextDouble(), and it admittedly invokes the halting problem, but it’s quick and reliable in practice.
Parabolic distribution
What if you want a smooth curve between the top and bottom of the range, so the center value is still the most likely, but the difference is less dramatic across the middle of the range? Like this:
This is similar, you just have to tweak the rejection clause. The formula for this curve is y = -2(x - 0.5)² + 1.
Thanks to my wife for helping me derive this formula.
final rnd = Random();
double nextDoubleTriangular() {
double x, y;
final plot = (x) => (-2 * pow(x - 0.5, 2)) + 1;
do {
(x, y) = (rnd.nextDouble(), rnd.nextDouble());
} while (y > plot(x));
return x;
}
This version is nice because you can tweak the coefficient -2 to make the curve drop off more or less sharply on each side, without affecting its centerpoint and maximum.
Micro-uniform distribution
I don’t know if there’s an official math name for this one, I’m just a programmer.
Suppose a triangular or parabolic distribution isn’t right for your game. You really do need every result to be equally likely. But you still want it to feel fair—no one should get stuck rolling 1s, and no one should get stuck rolling 20s.
One way to do this is to take the gambler’s fallacy to an extreme: make it so across 20 rolls, every value from 1 to 20 comes up once:
List<int> possibleValues = [];
int nextD20Constrained() {
if (possibleValues.isEmpty) {
// Add all the integers from 1 to 20, inclusive, in random order
possibleValues = [for (var i = 1; i <= 20; i++) i].shuffled();
}
return possibleValues.removeLast(); // You may know this function as pop()
}
It’s still a uniform distribution. It’s still, for most intents and purposes, random. But normal random number generators sometimes produce bullshit like 1 1 1 1 1 1 1 1 1 1 1 1, which isn’t fun for anyone, whereas this one makes sure your luck balances out in the short term.
If you want a slightly less constrained version, there are a couple ways to do it:
- Double up the starting array so each number comes up twice per cycle. You’ll still get an even distribution across 40 calls, but a streak of good or bad luck can last a bit longer.
- Don’t wait for the array to be empty before regenerating it; once you’re down to 5 items or so, go ahead and start over. Depending on the number you use, this can help you find a sweet spot between balanced results and true randomness.
Gaussian distribution
Here’s where I admit defeat. I know it’s possible to generate random numbers according to a bell curve, like this:
And I know all the formulas are right there on Wikipedia. But I haven’t taken a math class since high school calculus, so I can’t read them, and all the relevant Stack Overflow answers assume you know how to invert a CDF, which sounds like a sex thing. If you can write a function that accepts a σ and returns a normally-distributed random double, please send it over, I’d love to use it.
Returning to XCOM 2
I know XCOM 2 isn’t actually a bad game. It encourages a specific play style which I just can’t make myself adopt. But I think it would be an even better game if it used a curved distribution to determine hits and misses, even if the displayed percentages were curved as well and the difference were purely cosmetic. When a game says you have a 90% chance of success, failure should feel rare—or, at least, it should only happen 1 in 10 times in practice, not just in mathematical theory.
That’s my wish for game devs: stop using the uniform RNG built into your programming language unless you can articulate why. Triangular distributions aren’t hard to generate, and they feel more natural and fair. And if you really do need a uniform distribution, make sure it balances out on a reasonable timescale.