Comments on Empirical Rabbit: Rating Points Revisited

Geoff Fergusson (2011-10-04 11:24):

His exam may be based on the performance of reliably rated players on his test, rather than on assigning ratings to problems and using the Elo formula. If so, 0.4x looks plausible, but I would expect the multiplier to be larger for beginners and smaller for GMs, and larger for tactical players and smaller for positional players.

With my method, the tactical rating directly reflects your probability of solving a single tactical problem within the time available. However, if you find a single tactic that your opponent misses, you will not necessarily win, and if you fail to find it, you will not necessarily lose. To convert your probability of finding a single tactic into your probability of winning rather than losing a whole point, you need to know how many tactical chances there are per game.
(Easy chances that both players nearly always spot do not count, and neither do chances that neither player has much chance of spotting.)

The methods used by the tactical servers have the same limitations as mine, plus the limitation of muddling up many different time limits in an opaque way!

By the way, I have just found my spam folder, and in it an old comment of yours about Chess Hero, which should now be on the blog.

AoxomoxoA wondering (2011-10-04 10:16):

"The main remaining uncertainty is the average number of times per game that a potentially half-point winning or losing tactic occurs."

Igor Khmelnitsky wrote an award-winning book, Chess Exam (http://www.amazon.com/Chess-Exam-Training-Guide-Yourself/dp/0975476122).

In this book Khmelnitsky gives a rating for several "subskills" of chess ability, such as strategy, tactics, openings and so on. His formula indicates that a gain of x points in (slow) tactics will increase your Elo by 0.4 * x.

His exam is now online for free: http://www.chessik.com/index.htm

Geoff Fergusson (2011-10-03 02:31):

The last two comments are out of sequence because of the time difference between England and Germany!

The main remaining uncertainty is the average number of times per game that a potentially half-point winning or losing tactic occurs. The games of weak club players often have many changes of fortune and many missed tactical opportunities, but tactics rarely decide GM games.
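The conversion Geoff describes (a per-tactic find probability turned into an expected score per game) can be sketched in Python. This is my illustration, not code from the blog: the function name is invented, and it assumes the tactical chances are independent and each worth half a point.

```python
def expected_score_shift(p_me, p_opp, chances=2, value=0.5):
    """Expected points gained per game, assuming each of `chances`
    independent tactical opportunities is worth `value` points and is
    decided only when one player spots it and the other misses it."""
    per_chance = value * (p_me * (1 - p_opp) - p_opp * (1 - p_me))
    return chances * per_chance

# Equal spotters gain nothing; a 60% vs 50% spotter nets about 0.1 points/game.
print(round(expected_score_shift(0.5, 0.5), 3))
print(round(expected_score_shift(0.6, 0.5), 3))
```

Note that the per-chance term simplifies to `value * (p_me - p_opp)`: only the difference in spotting probability matters under these assumptions.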
At the 2000 level, two half-point tactical opportunities per game looks plausible.

Geoff Fergusson (2011-10-02 23:19):

I go to a lot of trouble to find problem sets in which the difficulty range is as narrow as possible. The rationale is that every game contains easy tactics that both 2000-rated players will spot, and difficult tactics that neither will spot. What decides games are the tactics that one player spots and the other misses. If the problem set is easy, I set a short time limit, and if it is harder, a longer one, to make the problems realistic differentiators.

Suppose that 25% of the problems are impossible, and that my percentage on the remaining problems, solved under the time limit, increases from 50% to 60%. My percentage on the problem set as a whole then increases from 0.75 * 50% = 37.5% to 0.75 * 60% = 45%, a 7.5% increase rather than a 10% one. Nonetheless, I still get a rough estimate of my improvement.

If I used average solution times, this problem would be much worse, because the very large solution times for the very difficult problems would dominate the average. This is one reason why I do not use average solution times!

AoxomoxoA wondering (2011-10-02 22:45):

If the problems are all of about the same complexity, then the rating is OK.
That should be the case in "good" books.

AoxomoxoA wondering (2011-10-02 13:36):

So Argument 1 is wrong if you measure on an "unknown" set D. Arguments 2 and 3 don't work directly, since you start at 50%. But I still think the rating improvement depends on the distribution of problem complexity; I think it is parametric:

In my training, my average "end" speed is roughly two or more times quicker than the first time I see the problems. If the problems are "close" together (low "variance") in their "complexity" (defined roughly as the time I need for them), then I score 50% at the beginning and 100% at the end. But if the complexity is not close together (high variance), if there are many problems of extremely high "complexity", then those will stay unsolved. If, for example, 25% of the problems are solvable only by computers, then it is not possible to score higher than about 75%.

Geoff Fergusson (2011-10-02 12:41):

Suppose that I got 50% of the problems in batch A right in under T seconds, on my first pass through that batch. Suppose also that I got 60% of the problems in batch D right in under T seconds, on my first pass through that batch. The value of d for batch A is -400 * log10(1/0.5 - 1) = 0. The value of d for batch D is -400 * log10(1/0.6 - 1) = 70.
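Geoff's d values can be reproduced by inverting the standard Elo expected-score formula. The sketch below is mine (the function name is invented); it uses base-10 logarithms, which is consistent with the figures he quotes.

```python
import math

def elo_diff(p):
    """Rating difference implied by a success rate p (0 < p < 1),
    inverting the Elo expected-score formula: d = -400 * log10(1/p - 1)."""
    return -400 * math.log10(1 / p - 1)

print(round(elo_diff(0.5)))  # batch A: 0
print(round(elo_diff(0.6)))  # batch D: 70
```

A 50% score always maps to a difference of zero, which is why batch A serves as the baseline; the 60% score on batch D then reads directly as an estimated gain of about 70 points.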
I therefore estimate that I improved by 70 Elo points as a result of practicing batches A, B and C.

Any improvement that I make by practicing batches A, B and C will not be fully reflected in an improvement at solving batch D (assuming that batch D does not contain any of the problems in batches A, B and C). Indeed, if batches A, B and C are chess and batch D is checkers, practicing A, B and C probably will not help at all with batch D. However, if batches A, B, C and D are all randomly selected from chess games, practicing A, B and C should help with batch D to some extent.

Actually, my problems are randomly selected from problem books rather than from chess games. The problems in problem books are usually less alike than those randomly selected from chess games, but, as I have said, Bain has a worryingly high level of duplication.

An alternative approach is to have a large test set that I solve with a very long repetition interval (say one year). I can then argue that I will have forgotten anything that I learned on the previous passes. There are problems with both methods!

I will make the text more explicit.

AoxomoxoA wondering (2011-10-02 02:10):

Hi,
I still have my problems with your rating technique. A rating on "known" problems is, in my opinion, of very limited value. (It might help to monitor the training, though.)

Argument 1:

Thesis 1: If I do better and quicker than anyone else, then my rating should be higher than anyone else's.

Thesis 2: I can learn a (small) set of problems so that I can solve them quicker than anyone else (unprepared, at this time).

Conclusion: My rating on this set is suddenly 2800+. (?)

Argument 2:

Step 1: I create a (small) set of problems where I need t+1 seconds to solve every one of them.
Step 2: I train on the set until I solve every problem in t-1 seconds.
Step 3: I calculate a rating with the cut-off time t.

Conclusion: My rating jumps from 0 towards infinity. (?)

Argument 3:

Step 1: I create a (small) set of problems where I need 100 seconds to solve every one of them.
Step 2: I train on the set until I solve every problem in 30 seconds.
Step 3: I calculate a rating with the cut-off time 29.

Conclusion: My rating stays the same. (?)

I did some analysis on the relation between speed and rating (for example here: http://aoxomoxoa-wondering.blogspot.com/2011/07/how-speed-influence-rating.html), but I do not have a concrete "result" for how to calculate it.

I think the time-score functions of CTS and CT were just "selected" and are not based on empirical data. CTS was the first tactics server anyway; how could they have known what the real speed-rating relation is?

With a lot of work it should be possible to extract the necessary data from CT, but I have found my own solution to this problem: I look at my performance on problems I have never seen before at Chess Tempo.
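The ceiling effect that both commenters describe, where unsolvable problems cap the measurable score, is easy to illustrate. This is a sketch of the arithmetic from Geoff's example, not code from the blog; the function name is mine.

```python
def measured_score(p_solvable, unsolvable_frac):
    """Observed success rate on a mixed set: only the solvable fraction
    (1 - unsolvable_frac) of problems can be answered, at rate p_solvable."""
    return (1 - unsolvable_frac) * p_solvable

# Geoff's example: 25% impossible, skill on the rest rises from 50% to 60%,
# but the measured score only rises from 37.5% to 45%.
print(round(measured_score(0.5, 0.25), 4))
print(round(measured_score(0.6, 0.25), 4))
```

Even perfect play on the solvable problems would read as only 75% here, which is the cap AoxomoxoA points out; the measured gain (7.5 percentage points) understates the true gain on the solvable problems (10 points) by exactly the solvable fraction.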