euler_436

Project Euler Problem 436 - Unfair Wager

Spoiler Alert! This blog entry gives away the solution to problem 436 of Project Euler. Please don’t read any further if you have yet to attempt to solve the problem on your own. The information is intended for those who failed to solve the problem and are looking for hints or test data to help them track down bugs. It is posted not before the problem has a hundred solvers already.

First I provide references of the major theoretical background that you probably need to know to solve the problem. The last section presents the solution from start to finish.

The post reflects my approach to the problem. Even though the final outcome was accepted by Project Euler doesn’t mean the information is correct or elegant. Algorithms won’t necessarily be the most efficient ones, but they are guaranteed to run within the time limit of one minute.

References

This is a pen and paper problem that requires very little knowledge of maths. You mainly need to calculate conditional probabilities and use the distribution of the sum of uniform variables.

Notation

I’ll use the following notation:

  • $u_i$: The ith uniform number.
  • $s_i$: The sum after i draws
  • $n$: The number of draws to get the sum over 1
  • $m$: The number of draws to get the sum over 2
  • $p$: The first sum over 1, i.e. $p=s_n$
  • $x$: The number that the first player gets.
  • $y$: The number that the second player gets.

The capital letters denote the corresponding random variables and $f_X(x)$ denotes the probability density function of $X$ at $x$ .

Solution

I use the following steps to solve the problem:

  1. Sum after i picks
  2. First Player – Distribution of x
  3. Second Player – Distribution of y
  4. Joint distribution of x and y
  5. Probability of y bigger than x

Sum after i picks

Let

$$ \begin{align} S_i(s) &= \sum_1^i U_i & U_i \sim U(0,1) \end{align} $$

The probability density function (pdf) for the sum after one step corresponds to the pdf of the continuous uniform distribution:

$$ \begin{align} f_{S_1}(s) &= \mathbf{1}_{[0,1]} \end{align} $$

where $\mathbf{1}_{A}$ denotes the indicator function.

For each subsequent pick, the sum $S_n$ is obtained by adding a uniform random variable to $S_{n-1}$ . The uniform random variable is independent of the sum $S_{n-1}$ , and therefore the pdf of $S_{n}$ is obtained by convolution of $f_{S_{n-1}}$ and $f_U$ :

$$ \begin{align} f_{S_n}(s) &= f_{S_{n-1}+U}(s) \\ &= (f_{S_{n-1}} * f_U)(s)\\ &= \int_{-\infty}^{\infty} f_{S_{n-1}}(t) \cdot f_U(s-t) dt\\ &= \int_{0}^{\infty} f_{S_{n-1}}(t) \cdot 1 dt \end{align} $$

By induction we get
$$ f_{S_n}(s) = \begin{cases}\frac{s^{n-1}}{(n-1)!}&0\leq s \leq 1\\ \cdots & \cdots\end{cases} $$

(We will only need the distribution below one).

First Player – Distribution of x

Let $n$ denote the number of picks it takes to get the sum over 1. Consequently, $s_n$ is the first sum over 1, i.e., the sum where the second player takes over. The last summand of $s_n$ is the number $x^{(n)}$ that the first player gets if she draws $n$ numbers:

$$ \begin{align} s_{n-1} + x^{(n)} &= s_{n} \\ s_{n-1} &\leq 1 \\ s_{n} &> 1 \end{align} $$

Because

$$ \begin{equation} s_{n-1} = s_{n}-x^{(n)} \end{equation} $$

and because $s_{n-1} \leq 1$ and $s_n > 1$ , we know that

$$ \begin{equation} 1 < s_n \leq 1 + x^{(n)} \end{equation} $$

so the joint probability density distribution of $x^{(n)}$ and $S_n$ is

$$ \begin{align} f_{X^{(n)},S_{n}}(x,s) &= \begin{cases}f_{S_{n-1}}(s-x) & 1 < s \leq 1 + x \\ 0 & \text{otherwise} \end{cases}\\ &= \begin{cases}\frac{(s-x)^{n-2}}{(n-2)!} & 1 < s \leq 1 + x \\ 0 & \text{otherwise}\end{cases} \end{align} $$

The previous density applies to a given $n$ . Let $x$ be the number of the first player, and let $p$ be the first sum over 1. To get the unconditional probability $f_{P,X}$, we sum up $f_{X^{(n)},S_{n}}(x,s)$ (2) over all potential $n$ :

$$ \begin{align} f_{P,X}(p,x) &= \sum_{n=2}^\infty \frac{(p-x)^{n-2}}{(n-2)!} & 1 < p \leq 1 + x \\
&= \sum_{n=0}^\infty \frac{(p-x)^n}{n!} & 1 < p \leq 1 + x \\ &= e^{p-x} & 1 < p \leq 1 + x \end{align} $$

Second Player – Distribution of y

The first player leaves the sum at $p$ an the second player needs to get the sum over 2. I prefer to shift the values to the point of origin: Let the second player start at 0 and let $y$ be the first value over $2-p$ . Also let $s_{i}‘$ be the sum of the first i number that the second player picked. In other words, if the first player picked $n$ numbers that add up to $s_n$ , and the second player picks $m$ numbers that bring the total to $s_{n+m}$ , then $s_{m}’ = s_{n+m} - s_n$ .

We already know the distribution of $s’_i$ below $2-p$ from the first player from Equation 1:

$$ \begin{equation} f_{S’_i}(s) = \frac{s^{n-1}}{(n-1)!} \mathbf{1}_{[0,2-p]} \end{equation} $$

To get to the distribution of $y$ we have to distinguish two fundamentally different cases.

Case 1: y is less or equal than 2-p

If $y$ is smaller than $2-p$ , then the second player needed at least two picks to get the sum over $2-p$ , and the penultimate sum is at greater than $2-p-y$ (since adding $y$ brings the sum over $2-p$ ) and at most $2-p$ (because if the sum would be bigger than that, it would be the last sum already).

Assume that the second player needs $m > 1$ picks and his last number is $y^{(m)}$ . Therefore, we have

$$ \begin{align} f_{Y^{(m)}|P,y>2-p,m\geq 2}(y) &= \int_{2-p-y}^{2-p} f_{S_{m-1}} ds \\ &= \int_{2-p-y}^{2-p} \frac{s^{n-2}}{(s-2)!} ds \\ &= \left. \frac{s^{m-1}}{(m-1)!} \right|_{2-p-y}^{2-p} \\ &= \frac{(2-p)^{m-1}}{(m-1)!} - \frac{(2-p-y)^{m-1}}{(m-1)!} \end{align} $$

Hence, the pdf of $y$ is:

$$ \begin{align} f_{Y|P,y>2-p}(y) &= \sum_{m=2}^{\infty} \frac{(2-p)^{m-1}}{(m-1)!} - \frac{(2-p-y)^{m-1}}{(m-1)!}\\ &= \sum_{m=1}^{\infty} \frac{(2-p)^{m}}{m!} - \sum_{m=1}^{\infty} \frac{(2-p-y)^{m}}{m!} \\ &= e^{2-p} - e^{2-p-y} \end{align} $$

Case 2: y is larger than 2-p

The second case is special, because this time the second player has a chance to get her number with the first pick. We can cover this case by defining

$$ \begin{equation} f_{S_0}(s) := \delta(s) \end{equation} $$

which means the sum of zero uniform random value is always 0 and the corresponding pdf is 1 at 0, and 0 for any other value.

Secondly, the lower bound of the integration needs to be adjusted: since $2-p-y$ is always smaller than zero, and the pdfs of $S_n$ are zero for negative values, we start integrating at 0.

The rest of the calculation is the same, except this time we include $m=1$ :
As before, let’s assume that the second player needs $m > 1$ picks and his last number is $y^{(m)}$ . Therefore, we have

$$ \begin{align} f_{Y^{(m)}|P,y\leq 2-p,m\geq 1}(y) &= \int_{0}^{2-p} f_{S_{m-1}} ds \\ &= \begin{cases}\int_{0}^{2-p} \frac{s^{n-2}}{(s-2)!} ds & m \geq 2 \\ \int_{0}^{2-p} \delta(s) ds & m = 1 \end{cases} \\ &= 1 + \left( \left. \frac{s^{m-1}}{(m-1)!} \right|_{0}^{2-p} \right) \\ &= \frac{(2-p)^{m-1}}{(m-1)!} \end{align} $$

Hence, the pdf of y is:

$$ \begin{align} f_{Y|P,y\leq 2-p} &= \sum_{m=1}^{\infty} \frac{(2-p)^{m-1}}{(m-1)!}\\ &= \sum_{m=0}^{\infty} \frac{(2-p)^{m}}{m!} \\ &= e^{2-p} \end{align} $$

So all in all the pdf of y given P is:

$$ \begin{align} f_{Y|P} &= \begin{cases}e^{2-p} - e^{2-p-y} & y < 2 - p \\ e^{2-p} & y \geq 2 - p \end{cases}\\ &= e^{2-p} - e^{2-p-y}\cdot \mathbf{1}_{[0,2-p[}(y) \end{align} $$

Joint distribution of x and y

The joint distribution of the picks of the first and second player is:

$$ \begin{align} f_{X,Y}(x,y) &= \int_1^{1+x} f_{X,P}(x,p) f_{Y|P}(y|p) dp\\ &= \int_1^{1+x} e^{p-x} \cdot \left( e^{2-p} - e^{2-p-y}\cdot \mathbf{1}_{[0,2-p[}(y)\right) dp \end{align} $$

Since $L\leq 1 + x$ we have

$$ \begin{align} 2-p \leq 2 - (1+x) = 1-x \end{align} $$

The indicator is therefore always one if $y < 1-x$ and

$$ \begin{align} f_{X,Y|Y<1-X}(x,y|y<1-x) &= \int_1^{1+x} e^{p-x} \big( e^{2-p} - e^{2-p-y}\cdot\big) dp \\ &=\left. e^{2-x-y} \left(-1+e^y\right) p \right|_1^{1-x} \\ &= e^{2-x-y} \left(-1+e^y\right) x \end{align} $$

If, on the other hand, $x + y \geq 0$ , then the indicator is only one if $L < 2 - y$ :

$$ \begin{align} f_{X,Y|Y\geq1-X}(x,y|y\geq1-x) &= \int _1^{1+x}e^{p-x}\left(e^{2-p}\right)dp+\\ &,,,\int _1^{2-y}e^{p-x}\left(-e^{2-p-y}\right)dp \\ &= e^{2-x} x-e^{2-x-y} (1-y) \end{align} $$

And we have:

$$ f_{X,Y}(x,y) = \begin{cases}x e^{2-x} (1-e^{-y}) & x+y < 1 \\ e^{2-x-y}(xe^y-1+y) & x+y \geq 1 \end{cases} $$

Probability of y bigger than x

Player two wins if $y$ greater than $x$ :

$$ \begin{align} P[y\geq x] &= \int_0^1 \int_{x}^1 f_{X,Y}(x,y) dy dx \end{align} $$

First, we tackle the case $x+y<1$ . Since also $y>x$ we have $0 \leq x < 1/2$ and therefore:

$$ \begin{align} P[y\geq x | x+y < 1] &= \int _0^{1/2}\int _x^{1-x}\left(e^{2-x} x\left(1-e^{-y}\right)\right)dydx \\ &= \frac{1}{8} \left(5+40 \sqrt{e}-26 e\right) e \end{align} $$

There are two possibilities for the case $x+y\geq 1$ :

  1. $0 \leq x \leq 1/2$ and $1-x \leq y \leq 1$
  2. $1/2 \leq x \leq 1$ and $x \leq y \leq 1$

Therefore,

$$ \begin{align} P[y\geq x | x+y \geq 1] &= \int _0^{1/2}\int _{1-x}^1\left(x e^{2-x}-(1-y)e^{2-x-y}\right)dydx \\ &,,,+\int _{1/2}^1\int _x^1\left(x e^{2-x}-(1-y)e^{2-x-y}\right)dydx \\ &= \frac{1}{4}+\frac{23 e}{8}-5 e^{3/2}+2 e^2 \end{align} $$

So finally we have from Equations 3 and 4:

$$ \begin{align} P[y\geq x] &= P[y\geq x | x+y < 1] + P[y\geq x | x+y \geq 1] \\ &= \frac{1}{8} \left(5+40 \sqrt{e}-26 e\right) e+\frac{1}{4}+\frac{23 e}{8}-5 e^{3/2}+2 e^2 \\ &= \frac{1}{4} \left(1+14 e-5 e^2\right) \end{align} $$

Archived Comments

Note: I removed the Disqus integration in an effort to cut down on bloat. The following comments were retrieved with the export functionality of Disqus. If you have comments, please reach out to me by Twitter or email.

euler Sep 24, 2013 06:06:46 UTC

We hope that you enjoyed solving this problem. Please do not deprive others of going through the same process by publishing your solution outside Project Euler. If you want to share your insights then please go to thread 436 in the discussion forum.

Thomas Sep 24, 2013 21:58:59 UTC

I got a good deal of the way into this myself, but I got in over my head about 65% of the way through. Your solution is elegant and I commend you.

While I was working through the math over the last week, I quickly came to the conclusion that this isn't a CS problem but a math problem. A computer isn't need, nor is an algorithm. A simple calculator can easily give this answer once you have the probability distribution equation solved. So, my question is: how does one use a computer to solve this without using the equation? A Monte Carlo simulation cannot realistically achieve more than the 7th decimal place of precision. Since it is a continuous probability distribution, brute force is also (seemingly) out of the question. Is there a pure algorithm approach or is the Euler Project just a joke on CS people by math professors?

Reginald Sep 27, 2013 18:18:45 UTC

I also tried to solve this algorithmically. The chances that the sum of more than 10 random variables is larger than 1 is neglectable, so I tried to work with fixed number of draws. It got ugly very quickly and it was a mess. If you check the forum you see a solution that does a similar thing with numerical integration.

This was a very frustrating problem because there doesn't seem to be elegant way to solve it. Even the differential equation approach kinda sucks.

Please folks at Euler give us more programming challenges! I wish there were an other website like euler but with more focus on programming.

CuongVM Oct 01, 2013 20:58:24 UTC

(sorry if my bad English annoy you :) )
can't say that I admire your work when I actually did :) I used Monte Carlo simulation for about half billion times but to produce 12 decimal digits is something that it could never reach... I first thing it could relate to Lesbeg integral but I couldn't find the way to solve it... Thanks to your work, everything is clear now!