Dicing with chaos
January 20, 2021. Why are dice and coins good sources of randomness? The word “symmetry” is bandied about, but symmetry is not enough to explain why starting with very similar initial conditions and evolving deterministically leads to random outcomes. I explore the relevant factors—chaos and jitter—and use them to build deterministic dice.
Introduction
Each time I roll a dice, my muscles try to do the same thing. But even if I hold the dice the same way, and throw in what feels like the same way, I seem to get any side of the dice with equal probability. If you ask a mathematician why any face is equally probable despite the apparent similarity of the rolls, they might say symmetry. Apart from the pips indicating the value [1], the sides are indistinguishable and therefore must have equal probability of landing right side up. This is true and sounds like a nice explanation. But it’s only half the story!
To see why, suppose I take a very large dice, so heavy that I can only drop it. In this case, it’s unlikely to roll, and whatever side happens to be facing up when I “roll” will be facing up when it stops [2]. If I pick this face randomly, I will get a random outcome, but the point of the dice is to outsource the randomness! Symmetry is important, as we will see, but using dice is fundamentally a roll-playing game.
The Butterfly Effect
To illustrate our ideas, we’ll use the even simpler example of flipping a coin. Once again, ignoring the small effect of markings, the coin has a symmetry between the two sides, heads and tails. And similarly, if I have a very large coin I can only drop, then I will need to randomly choose the initial conditions for tossing if I want the coin flip to be random. This defeats the purpose of the coin! So, how does a deterministic process like throwing a coin generate an effectively random and uniform outcome? There are two main ingredients, as I see it: chaos and jitter.
We’ll focus on chaos first, and a particular characterisation of chaos
called the “butterfly effect”, aka sensitivity to initial
conditions.
Suppose a system has some space of configurations $X$, along with a deterministic rule for evolving configurations in time. Time here need not be continuous, but could come in discrete steps, with the configuration $x_n$ at step $n$ determining the next one, $x_{n+1} = f(x_n)$ [3].
We can see how this operates in a very simple chaotic system called the doubling map. The configuration space is the half-open interval $X = [0, 1)$, and the evolution rule doubles the configuration and throws away the whole number part: $x_{n+1} = 2x_n \bmod 1$. It’s easy to see this system is chaotic, since if we take two nearby points separated by a small distance $\delta$, the separation doubles with each step, growing to $2^n \delta$ after $n$ steps, provided the difference remains smaller than $1$, the size of the space.
Exploring configuration space
At some point the doubling means that the difference gets so large
it is no longer doubling with each step, since we are
throwing away whole numbers.
To see how this happens, let’s do a specific example. Take an initial separation of $\delta = 0.001$, say. The difference doubles cleanly for nine steps, reaching $0.512$; at the tenth step it would reach $1.024$, but throwing away the whole number leaves a difference of just $0.024$. This does not mean the butterfly effect has “stopped”. Rather, this is the point at which the effect has fanned the two trajectories out so they explore the whole system!
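To watch this happen, here is a minimal standalone sketch (not part of the appendix code; the two starting points are illustrative) that iterates the doubling map on nearby points and prints their separation:

# Iterate the doubling map on two nearby points and track their separation.
x, y = 0.2, 0.201  # initial conditions a distance 0.001 apart

for n in range(12):
    print(f"step {n:2d}: separation = {abs(y - x):.3f}")
    x = 2*x % 1  # doubling map: double and discard the whole number
    y = 2*y % 1

The separation doubles cleanly up to step nine, then wraps around at step ten, matching the numbers above.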
We can estimate the time this happens as follows.
Suppose the configuration space has size $1$, and small separations grow exponentially, as $\delta(t) = \delta(0) e^{\lambda t}$ for some rate $\lambda$ (the Lyapunov exponent). An initial separation $\ell$ then spreads across the whole space after a time $t_\text{exp}$ satisfying $\ell e^{\lambda t_\text{exp}} \approx 1$, i.e., $t_\text{exp} = \log(1/\ell)/\lambda$.
In our example, the separation doubles with each step, so $\lambda = \log 2$, which is consistent with our discrete-time result that $2^n \ell \approx 1$ after $n \approx \log_2(1/\ell)$ steps.
We’ll call $t_\text{exp}$ the exploration time, since it measures how long the butterfly effect takes to fan an uncertainty of size $\ell$ across the whole configuration space.
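As a quick sanity check, here are exploration times for a few interval sizes, computed with the same formula as the explore function in the appendix:

import numpy as np

def explore(ell, lmbda = np.log(2)): # exploration time log(1/ell)/lambda
    return np.log(1/ell)/lmbda

for ell in [0.5, 0.1, 0.01, 0.001]:
    print(f"ell = {ell}: about {explore(ell):.1f} steps")

For instance, $\ell = 0.1$ gives $t_\text{exp} \approx 3.3$ steps, while $\ell = 0.001$ gives $t_\text{exp} \approx 10$ steps.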
Chaotic coins
This effective randomness is why we roll dice and throw coins.
Symmetry is still important, since it splits the configuration space into regions of equal size, one for each outcome; once the trajectories have spread out uniformly, equal sizes mean equal probabilities.
This sounds like a mathematical theorem, and perhaps there is something we can rigorously prove here.
But technical results about probability, dynamics and mixing involve averaging over very long times [4], while a dice roll only lasts a few exploration times. So instead of proving theorems, let’s check numerically that a few exploration times are enough.
To start with, let’s make a coin [5].
Heads will correspond to the left half of the interval, $x < 1/2$, and tails to the right half, $x \geq 1/2$. We choose an initial condition uniformly at random from a small interval $[0, \ell)$, apply the doubling map, and record the outcome. Below, we plot the proportion of tails for $\ell = 0.1$, averaged over many flips, as a function of the number of exploration times elapsed.
The code for generating this and subsequent plots is given in the appendix.
The proportion of tails increases from highly biased towards heads (where our initial conditions start) to fair, after a few
exploration times, just like we expect.
There is nothing special about $\ell = 0.1$. Below, we plot the same experiment for $\ell = 0.001$ (black), $0.01$ (red), $0.1$ (blue) and $0.5$ (orange). Although the curves look somewhat different when plotted in actual steps, they all approach fairness after a few exploration times.
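For reference, a single chaotic coin flip boils down to a few lines; this is a stripped-down variant of the appendix’s coin function:

import numpy as np

def chaotic_coin(ell = 0.1, steps = 10):
    x = np.random.uniform(0, ell) # jittered initial condition in [0, ell)
    x = x*(2**steps) % 1 # apply the doubling map `steps` times
    return "tails" if x >= 0.5 else "heads"

print([chaotic_coin() for _ in range(10)]) # ten flips, about three exploration times each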
Chaotic dice
Naturally, we can use the same method to create a dice.
Instead of splitting the space into two equal halves, we split it into six equal portions, with outcome $k$ when the evolved point lands in $[(k-1)/6, k/6)$.
To measure fairness, we use Pearson’s chi-squared statistic: for $N$ rolls with observed counts $O_k$ and expected counts $E_k = N/6$, we compute $\chi^2 = \sum_k (O_k - E_k)^2/E_k$, which
approaches a $\chi^2$ distribution with five degrees of freedom when the dice is fair [6].
To make sense of these numbers, we need the
critical values
of the $\chi^2$ distribution: with five degrees of freedom, a fair dice gives a statistic below $\approx 11$ around $95\%$ of the time, so much larger values indicate bias. Below, we plot the statistic for $\ell = 0.001$ (black), $0.01$ (red), $0.1$ (blue) and $0.5$ (orange) against the number of exploration times; in each case it drops to small values after a few exploration times, indicating a fair dice.
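In code, the statistic is short. Here is a standalone sketch, equivalent to the loop inside the appendix’s dice_bias (the example counts are made up):

import numpy as np

def pearson(counts): # chi-squared statistic for a list of outcome counts
    counts = np.asarray(counts)
    expected = counts.sum()/len(counts) # expected count for a fair dice
    return ((counts - expected)**2/expected).sum()

print(pearson([160, 172, 165, 171, 158, 174])) # fair-looking counts: statistic ~ 1.3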
There is one subtlety.
We only plotted up to five exploration times because, for small $\ell$, going much further runs into the limits of floating-point arithmetic. A double-precision number carries around $52$ binary digits, and each application of the doubling map discards one of them; once the number of steps exceeds the precision, multiplying by $2^\text{steps}$ and taking the fractional part just returns zero.
We have eaten up all of the initial data! This is an artefact of the way numbers are stored in a computer rather than a property of chaos.
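You can check the artefact directly. The step at which the data runs out depends on the initial condition, but for example:

x = 0.001 # a double carries about 52 significant binary digits
print(x*(2**40) % 1) # still has fractional data
print(x*(2**80) % 1) # exactly 0.0: the initial data is gone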
Deterministic jitter
The story so far is that the effective randomness of a dice is the
result of tiny jitters amplified by chaos.
In this last section, we’ll talk about the source of jitter.
In the examples above, we used a computer to generate initial conditions uniformly on an interval of size $\ell$. For a real throw, we might try to trace this jitter down to something fundamentally random, like quantum mechanics.
However, we don’t need quantum mechanics to get effectively random dice throws; jitter can be a perfectly deterministic phenomenon. An illustrative example is snipers, who (if films are to be believed) must take account of their own breathing to ensure the accuracy of a shot. In this case, the “jitter” is the result of a natural and completely deterministic cycle which the sniper may not be aware of or able to control. When rolling a dice, there are all sorts of natural cycles that affect the roll, from breathing, blood flow and the twitching of muscle fibres to the potentiation of brain cells, most of which are not under the roller’s control.
Some of these cycles are in sync, for instance blood flow and
breathing in the cardiac cycle, but most are independent.
These uncontrolled, uncorrelated cycles lead to ineliminable and
deterministic jitter in the roll of a dice.
We can model this as a high-frequency oscillation we select from,
periodically, but with a period that is unrelated to the oscillation.
We then imagine the oscillation sweeping back and forth in the space
of initial conditions $[0, \ell]$, as $f(t) = (\ell/2)(1 + \sin(\omega t))$ for some frequency $\omega$.
Our clumsy human operator “samples” at deterministic times $t = 0, 1, 2, \ldots$, giving initial conditions $x_n = (\ell/2)(1 + \sin(\omega n))$. Since the sweep’s period $2\pi/\omega$ is incommensurate with the sampling period, the samples scatter over the interval instead of repeating.
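Concretely, with the same jitter the appendix code uses ($\omega = 10$, sampled at integer times):

import numpy as np

ell, omega = 0.1, 10
samples = [(ell/2)*(1 + np.sin(omega*n)) for n in range(8)]
print(np.round(samples, 4)) # deterministic, but hops around [0, ell] without repeating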
We now plug this in to our dice and check the results are fair, once again using Pearson, but only for $\ell = 0.001$ (black), $0.01$ (red) and $0.1$ (blue) to start with.
All three quickly approach a fair dice.
We need to put the $\ell = 0.5$ dice on a separate plot, since it takes many more exploration times to become fair.
We can compute the number of exploration times it takes for the dice to become fair (as measured
by the Pearson test) as a function of $\ell$. The onset of fairness appears to grow roughly linearly with $\ell$.
I’m not sure why this happens [7], but from this “onset-of-fairness” curve, we can select parameters to ensure our dice is statistically fair.
Conclusion
The randomness of a dice is not merely the consequence of symmetry,
but rather, the combination of jitter and chaos.
The dynamics amplify small jitters exponentially, so that
even after a short dice roll, they thread through configuration space
approximately uniformly.
The jitter of initial conditions may be genuinely random, but can also
arise deterministically from, e.g., two cycles which are out of sync.
This leads to a simple, deterministic model of a dice roll,
where the parameters (interval size $\ell = 0.01$, jitter frequency $\omega = 10$, and $22$ doubling steps, a few exploration times for this $\ell$) are chosen from the onset-of-fairness curve so that the outcomes are statistically fair:
import numpy as np

def dice_roll(n):
    f_n = (0.01/2)*(1 + np.sin(10*n)) # deterministic jitter, ell = 0.01
    x = f_n*2**22 % 1 # evolve with the doubling map for 22 steps
    return int(6*x) + 1 # read off the face
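As a quick check, we can roll this dice a few times by feeding in successive sample indices:

print([dice_roll(n) for n in range(12)]) # a scattered sequence of faces 1 to 6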
It’s difficult to rigorously show that dice exhibit chaotic dynamics. But hopefully our proofs-of-concept suggest that in deterministic games of chance, you really are dicing with chaos.
[1] The pips do create a bias unless they are drilled, then filled with black paint of the same density as the dice. This is standard practice for "precision" dice used at casinos.
[2] In physics parlance, the choice of initial condition—the way you hold the dice—"spontaneously breaks" the symmetry of the dice itself. This is the fancy reason that symmetry of the dice isn't enough to explain the symmetry of the outcomes.
[3] Note that this evolution may not be deterministic in reverse, i.e., distinct configurations can evolve to the same one. The doubling map is an example: $x$ and $x + 1/2$ always map to the same point.
[4] Specifically, I'm referring to ergodic theory, which concerns how long time averages mimic probability distributions. Although there are some connections to chaos, rigorous results on short time averages seem very hard. The model I'm describing here is a physical ansatz.
[5] I don't think a coin is truly chaotic. Rather, we flip with large jitter, and these jitters get amplified linearly (I expect). But if you try to minimise jitter, and don't let the coin flip too many times, it's very easy to bias the outcome. A similar example is playing "s/he loves me, s/he loves me not" with flower petals, where uncertainty is the result of there being too many petals to eyeball. But we can easily bias the results by counting the petals!
[6] We need many samples so that each term in the sum approaches a unit normal distribution by the central limit theorem. The number of degrees of freedom is reduced by one because the outcome counts must add up to the total number of rolls.
[7] Presumably, it involves the new factor our deterministic jitter introduces: the jitter and choice oscillations are initially in phase. But I don't know why the effect increases linearly in $\ell$.
Appendix: code
I will get round to putting this in a repo at some point, but in the meantime, here is how to make all the plots. First up, here is our code for making the plots for a chaotic coin:
import matplotlib.pyplot as plt
import numpy as np

plt.style.use('seaborn-whitegrid')

def coin(ell, steps, numrolls = 100000): # simulate many coin flips
    tails = 0
    for i in range(numrolls):
        rand = np.random.uniform(0, ell) # uniformly select an initial condition
        evol = rand*(2**steps) % 1 # evolve chaotically
        if evol >= 0.5: # check whether heads or tails
            tails += 1
    return tails/numrolls

def explore(ell, lmbda = np.log(2)): # compute exploration timescale
    return np.log(1/ell)/lmbda

def coin_data(ell, multiples = 10, numrolls = 100000): # generate data for plotting
    timescale = np.ceil(explore(ell)) # makes timescale an integer
    data = []
    for n in range(1, multiples+1):
        bias = coin(ell, n*timescale, numrolls = numrolls) # evolve for n timescales and compute bias
        data.append(bias) # add bias to data
    return data

data = coin_data(0.1, 10)
plt.plot(range(1, len(data)+1), data)
plt.show()
plt.clf()

data1 = coin_data(0.001, 6)
data2 = coin_data(0.01, 6)
data3 = coin_data(0.1, 6)
data4 = coin_data(0.5, 6)
plt.plot(range(1, len(data1)+1), data1, color='black')
plt.plot(range(1, len(data2)+1), data2, color='red')
plt.plot(range(1, len(data3)+1), data3, color='blue')
plt.plot(range(1, len(data4)+1), data4, color='orange')
plt.show()
Now for our chaotic dice with random jitter:
def dice_bias(ell, steps, sides = 6, numrolls = 100000): # simulate many dice rolls
    outcomes = [0] * sides
    pearson = 0
    for i in range(numrolls):
        rand = np.random.uniform(0, ell) # uniformly select an initial condition
        evol = rand*(2**steps) % 1 # evolve chaotically
        outcome = int(sides*evol) # calculate outcome
        outcomes[outcome] += 1
    for k in range(sides):
        pearson += numrolls*sides*(outcomes[k]/numrolls - 1/sides)**2 # compute Pearson test statistic
    return pearson

def dice_data(ell, start_mult = 1, end_mult = 10, sides = 6, numrolls = 100000): # generate data for plotting
    timescale = np.ceil(explore(ell)) # makes timescale an integer
    data = []
    for n in range(start_mult, end_mult + 1):
        bias = dice_bias(ell, n*timescale, sides, numrolls = numrolls) # evolve for n timescales and compute bias
        data.append(bias) # add bias to data
    return data

data5 = dice_data(0.001, 2, 5)
data6 = dice_data(0.01, 2, 5)
data7 = dice_data(0.1, 2, 5)
data8 = dice_data(0.5, 2, 5)
plt.plot(range(2, 6), data5, color='black')
plt.plot(range(2, 6), data6, color='red')
plt.plot(range(2, 6), data7, color='blue')
plt.plot(range(2, 6), data8, color='orange')
plt.show()
Here are chaotic dice with deterministic jitter:
def det_bias(ell, steps, sides = 6, numrolls = 100000): # simulate many dice rolls
    outcomes = [0] * sides
    pearson = 0
    for i in range(numrolls):
        init = (ell/2)*(1 + np.sin(10*i)) # deterministic jitter
        evol = init*(2**steps) % 1 # evolve chaotically
        outcome = int(sides*evol) # calculate outcome
        outcomes[outcome] += 1
    for k in range(sides):
        pearson += numrolls*sides*(outcomes[k]/numrolls - 1/sides)**2 # compute Pearson test statistic
    return pearson

def det_data(ell, start_mult = 1, end_mult = 10, sides = 6, numrolls = 100000): # generate data for plotting
    timescale = np.ceil(explore(ell)) # makes timescale an integer
    data = []
    for n in range(start_mult, end_mult + 1):
        bias = det_bias(ell, n*timescale, sides, numrolls = numrolls) # evolve for n timescales and compute bias
        data.append(bias) # add bias to data
    return data

dataA = det_data(0.001, 3, 5)
dataB = det_data(0.01, 3, 5)
dataC = det_data(0.1, 3, 5)
plt.plot(range(3, 6), dataA, color='black')
plt.plot(range(3, 6), dataB, color='red')
plt.plot(range(3, 6), dataC, color='blue')
plt.show()
plt.clf()

dataD = det_data(0.5, 10, 20)
plt.plot(range(10, 21), dataD, color='orange')
plt.show()
Finally, here is the code for measuring the onset of fairness:
def fair_time(ell, freq = 10, sides = 6, numrolls = 100000):
    pearson_check = 100 # initialise to be bigger than 10
    steps = 0
    while pearson_check > 10: # 10 sits just below the 95% critical value (~11.07) for five degrees of freedom
        steps += 1
        outcomes = [0] * sides
        pearson = 0
        for i in range(numrolls):
            init = (ell/2)*(1 + np.sin(freq*i)) # deterministic jitter
            evol = init*(2**steps) % 1 # evolve chaotically
            outcome = int(sides*evol) # calculate outcome
            outcomes[outcome] += 1
        for k in range(sides):
            pearson += numrolls*sides*(outcomes[k]/numrolls - 1/sides)**2
        pearson_check = pearson
    return steps/explore(ell) # first fair step, in units of the exploration time

ells = np.linspace(0.01, 0.5, 10) # check ell between 0.01 and 0.5
fair_data = [fair_time(ell, numrolls = 10000) for ell in ells] # find onset of fairness
plt.plot(ells, fair_data, color='purple')
plt.show()