Saturday, 1 September 2012

Rating by Maximum Likelihood

In this article, I highlight some limitations of simple rating systems and investigate an improvement.  I apply this improved method to Michael de la Maza’a last 19 games, and find that my previous calculation significantly underestimates his final performance. (See my earlier article Michael de la Maza Statistics.)  Practical rating systems typically:

(1).  Calculate the average rating of the opposition faced by a player.
(2).  Add a difference based on the player’s score against that opposition.

This approach is likely to be reasonably accurate, if all the opponents have ratings close to that of the player concerned.  It can, however, be inaccurate if there are large rating differences.  This limitation can be avoided by using the standard technique of maximum likelihood estimation.

Simple Elo Calculation

The USCF version of the Elo rating system estimates the score sA (as a fraction between 0 and 1) of player A with rating RA against an opponent with a rating RB using the formula:

sA = 1 / (1 + 10^[(RB-RA)/400])


RB - RA = 400*log10(1/sA  - 1)


If our score as against opponent i is si, our overall score against N opponents is:

s = Σi {si} / N

If we have a rating R, and our opponents have ratings Ri, the average of these ratings is:

Rave = Σi {Ri} / N

Our rating R can be estimated as:

R = Rave - 400*log10(1/s - 1)


If the outcome of a game was always a win or a loss, the Elo formula would give the probability of winning an individual game.  If we can count a draw as one point, and a win as two points, we can interpret the Elo formula as giving the probability of winning a single point.  With this assumption, the probability that we will win a single point is:

1 / (1 + 10^[(Ri-R)/400])

The probability that we will lose this point is then:

1 / (1 + 10^[(R-Ri)/400])


Si = 1 -- if we win point i, and

Si = -1 -- if we lose point i.

The probability of obtaining our result for point i is then:

Pi = 1 / (1 + 10^[Si*(Ri-R)/400])

On the assumption that each point is statistically independent, the probability of obtaining our complete set of results is:

P = Πi {Pi} = Πi {1 / (1 + 10^[Si*(Ri-R)/400])}

Maximum Likelihood

We can now find the value of our rating R that maximises P:

ln P = -Σi {ln(1+ 10^[Si*(Ri-R)/400])}

(1/P) dP/dR = -Σi {(1 / (1 + 10^[Si*(Ri-R)/400]) * ln10 * 10^[Si*(Ri-R)/400]) * Si / 400}

(N.B. if y = 10^x, ln y = x ln10, y = e^(x ln10) and dy/dx = ln10 * e^(x ln10) = ln 10 * 10^x.)

P is at its maximum when the derivative dP/dR is zero, i.e. when:

Σi {Si * (1 / (1 + 10^[Si*(R-Ri)/400])} = 0

What a remarkably simple result!  Here is a typical graph of this function:

Numerical Solution

We can solve this equation numerically using Newton’s method.  Let:

f(R) = Σi {Si * (1 / (1 + 10^[Si*(R-Ri)/400])}

f’(R) = Σi {Si * (-1 / (1 + 10^[Si*(R-Ri)/400])^2 * ln10 * 10^[Si*(R-Ri)/400] * Si / 400}

= Σi {(-ln10 / [(1+ 10^[(R-Ri)/400])*(1+ 10^[(Ri-R)/400])] / 400}

Another simple result!  We can use the approximate formula discussed at the beginning of this article to obtain a first approximation for R.  Newton’s Method then gives the next approximation as:

R’ = R + h

f(R) + h * f’(R) = 0

h = -f(R) / f’(R)

We can use this formula successively until sufficient accuracy is obtained.  (N.B. This method does not always converge when the initial approximation is poor, but it doubles the number of significant figures on each iteration near the solution.)

More General Calculation

For a more general calculation, where there are no reliable ratings at the outset, see:   For the Bradley-Terry algorithm to solve this problem, see:

Michael de la Maza

Here are the results for Michael de la Maza’s last 19 games:

1873    1          1607    1
1810    1          2281    0
1854    1          1836    1
1936    1          1813    1
1925    1          1821    1
1878    0.5        1952    1
1960    1          1853    0.5
1977    1          1948    0.5
1991    0.5        2531    0
1629    1

The graph above shows f(R) for these results.  Here are my results from Newton‘s Method:
        1            2            3            4            5
R   2149.349349  2186.589424  2188.682595  2188.689059  2188.689059
h   37.24007498  2.093170523  0.006463885  6.15177E-08  -2.44907E-13

The initial approximation given by the simple calculation is 2149, and the value given by the maximum likelihood method is 2189, which is significantly higher. MDLM's final performance was very nearly at the US National Master level of 2200.


  1. Unexpecting things can happen during a tournament. An opponent can resign at move 5 and decide to go to the dentist. Your opponent can get a seizure and lose on time. Or they are called away for family problems. An unreasonable share of your high rated opponents are so impressed by your smell that they want to get it over with. Causing you to be temporarily overrated. I guess something like that has happened. For some people being overrated is a good moment to stop :)

  2. The chances of that happening to you for 19 games are astronomical! I do not believe his story.

  3. Tempo said:
    An unreasonable share of your high rated opponents are so impressed by your smell ...
    The smell of MDLM had to be impressiv with a coat on at that heat: