GameKnot related: How accurate is gameknot chess rating
gregbrill
20-May-10, 03:55

How accurate is gameknot chess rating
How accurate is the GameKnot rating?
Does anyone have a rating outside of GameKnot? If so, how does it match up?
algol
20-May-10, 09:26

...
From Alex Brunetti (brunetti), who did some work on this in 2002. I believe that GK has changed its rating system since then, so the remark about the deflation may be void now. But his main point is still valid: do not compare the numbers, only the ranking.

Although the ratings of two organizations may not be compared, because the starting pool of players and their starting ratings were different, Elo ratings tend to become stable in the time, with a deflation factor much lower than in GK ratings.

As you may notice, the differences in the rankings from GK system and Elo lists are very small, and this proves that GK system, despite many complaints, works quite well. Obviously one doesn't have to look at absolute rating values, which are meaningless, but at the relative differences from a player to another one.

Reference gameknot.com

See also gameknot.com
algol
20-May-10, 09:43

...
Maybe brunetti's work was what prompted GK to adapt its rating system to match the Elo system more closely. I read somewhere that in 2003 cyrano's rating got over 3000.

Here are some more GK threads covering the topic you are interested in:

gameknot.com
gameknot.com
gameknot.com
gameknot.com
algol
20-May-10, 11:52

...
gameknot.com
baronderkilt
22-May-10, 04:25

I would say ....
The argument that OTB Chess and Corr Chess are nearly two different games is a very valid one. They do take different skills, or at least emphasis on different skills compared to each other.
I would also say that Postal Chess and online corr of today are quite different than those of 20 years ago. The emphasis on computer analyses, most recently, and game d-bases before that (and still) make it so.
**
There will always be persons like myself that will be better at corr than otb because of ability to research and create new lines being greater than ability to retain and recall at need as in otb. Also analytical abilities can vary widely between the two forms. At my best I followed a win by combination about 20 "half moves" in otb (and was rated nearly 2100; an otb "Expert" rating with a personal emphasis to tactics creating probably a Master level analysis skill, with deficiencies coming other places such as winning "won" endings... okok And for BOGG I will also admit to Not Drawing an ending "a C player Would draw" LOL )
**
Postally, I went over 2200 in every org I played in long enough (excepting GK... maybe it's just not long enough yet...?! LOL. No actually I do not play as well now, nor try to. It was a Lot of work, for me. Hours every day. A part-time job with very low pay, except for self-satisfaction). And got to play 3 of 3 draws with persons in the WC cycle as well as lose to USA Champions Taylor & Milbratz, but manage to break even, or better, all other Champs played, and a Plus score vs APCT Top 30... which is all to say, I don't think I ever reached my potential rating. Let me guess at somewhat over 2300
**
There will also always be those who are great in otb, but lack patience or some skill of corr chess and will be lower in it compared to their otb. And again.... We may call BLITZ CHESS yet another form~! And often do find a separate rating pool for it, even on the same site.
***
ALL that said. Aw, GO AHEAD & SKIP OVER EVERYTHING ABOVE THIS~!! The important thing is: (tho let me make it longer by pointing to 30+ years of tmt experience,etc & being online for Chess since Compuserve/Leisure Link/& Prodigy were the only ones doing Chess at national level, and charged for it)

GK / GAMEKNOT HAS THE MOST ACCURATE CHESS RATING SYSTEM ON THE INTERNET FOR TURN-BASED CHESS, WHETHER ELO OR OTHER. INSOFAR AS BOTH SELF-CONSISTENCY & CONSISTENCY WITH THE TWO SOURCES WE MIGHT CONSIDER AS "DETERMINANT BY RATING FIAT", THAT ALL OTHERS WOULD COMPARE TO ... FIDE & ICCF, GK IS AT LEAST AS ACCURATE AS ANY SITE THAT IS NOT OF A NATIONAL FEDERATION, and even there will have most increased variance attributable to the fact of large time/out forfeit withdrawals from play. IMO
For what it's worth, I also have the opinion that Nothing I've seen in use elsewhere now or 30 years ago has ever been as accurate as Elo, let alone more so. If it were so, then Elo is a sound system that also allows for adjustment factors. So properly applied, it cannot be exceeded for accuracy, except possibly during Provisional Phase.
* * *
ok, I'm ready now ... "Hey, Executioner . . .
tactical_abyss
22-May-10, 06:12

Deleted by tactical_abyss on 24-May-10, 14:33.
ganstaman
22-May-10, 06:44

"Basically,someones corresp. and/or postal rating will almost always be higher than their OTB rating"

I don't think this is true. Given the same pool of players, your ratings should be the same (if it weren't that some players are actually relatively better at OTB or CC).

And both players have more time to think, so playing CC vs OTB doesn't automatically give you a better chance at winning.

In other words, your rating isn't an absolute number that represents how good your games are, but it's a relative number that tells you how good you are compared to the pool of players.
tactical_abyss
22-May-10, 07:40

Deleted by tactical_abyss on 24-May-10, 14:33.
ganstaman
22-May-10, 08:58

I'm sorry, but your post makes no mathematical sense, and the rating system is strictly a mathematical system. Given the same pool of players, CC and OTB ratings will be different ONLY because some people play one type of game RELATIVELY better. This is just the way ratings work, and no amount of anecdotal evidence from you can change the simple math of the situation.

" 'and both players have more time to think,so playing CC vs OTB dosen't give you a better chance at winning'
Incorrect.

For this very reason a player or both players that have more time to 'think' is the VERY REASON WHY players ratings tend to be higher on average playing CC chess than OTB."

So let's say we have 2 players in all of chess. These 2 players play each other OTB and in CC. The level of play will be higher in CC than OTB, but neither player will be playing RELATIVELY better than the other (that is, the difference in skill at the OTB games translates to the same degree of skill difference in CC). Are you claiming that these 2 players will have higher CC ratings than OTB ratings? Where are these extra points coming from?

And in your stories with more than 2 players, if they all have higher CC ratings than OTB ratings, where are these extra rating points coming from?

Basically, you are trying to say that ratings correlate to an absolute quality of chess playing. But this is wrong. Ratings correlate to the relative quality of your chess playing vs that of the other players in the pool.
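
The "where are these extra points coming from?" question can be made concrete with a minimal sketch of a textbook Elo update. This assumes the standard logistic expected-score curve and the same K-factor for both players; the value k=32 and the function names are illustrative only, and GameKnot's actual formula is not documented in this thread.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def play_game(rating_a, rating_b, score_a, k=32):
    """Return both new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an illustrative K-factor, not a value taken from any site.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the pool total is unchanged.
    return rating_a + delta, rating_b - delta


new_a, new_b = play_game(1500.0, 1700.0, score_a=1.0)
print(new_a, new_b, new_a + new_b)
```

Under these assumptions the sum of the two ratings is the same before and after the game, whatever the quality of the moves. Real systems that use different K-factors per player, rating floors, or bonus points are not exactly zero-sum, so this only illustrates the basic argument.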
tactical_abyss
22-May-10, 09:03

Deleted by tactical_abyss on 24-May-10, 14:33.
tactical_abyss
22-May-10, 09:15

Thanks ganstaman for your input.
All I can tell you is what I hear from other players. They simply tell me that, for example, their rating is 2200 on a chess site and it is 2100 in a USCF OTB rating. Other than that, I cannot be any clearer. I just had conversations with several class "A" players about 3 weeks ago at the club. Class A, as you know, is 1800-1999 USCF. Two out of the 3 players I talked to told me that their corresp rating was "master level" (2200+), and the other player said his OTB and corresp ratings were equal. So right there two players had approx a 200+ difference in OTB compared to corresp. Is this the exception and not the rule? Truly, I do not know, but I can only pass on to you guys what I hear.
kingdawar
22-May-10, 09:19

All ratings are inaccurate
tactical_abyss
22-May-10, 09:38

Deleted by tactical_abyss on 24-May-10, 14:33.
ganstaman
22-May-10, 09:45

Tactical_abyss: As I keep saying, but realize I forgot to emphasize, you'd expect the same ratings only given the same pool of players (and technically, the same rating formula). But the players rated by the USCF are not the same as those rated by FIDE or gameknot. Sure, there is a lot of overlap, but the pools are not identical, so you can expect a rating difference.

Your findings can be due to the reasons you give of picking opponents. Or maybe OTB players that go to chess clubs are the type that play CC relatively better. I don't know for sure.
tactical_abyss
22-May-10, 11:31

Deleted by tactical_abyss on 24-May-10, 14:33.
lighttotheright
22-May-10, 13:43

I did my own extensive guesstimate study on relative rating differences between over the board players and those in correspondence arenas. While there were vast differences between experiences from one player to the next, there seemed to be no consistent formula to predict what those discrepancies would be for any single individual. In other words, some players tended to play better OTB while others played better in CC.

Data was extracted from histories of those who consistently participated in both forms of chess, so that some sort of viable comparison could be drawn. Although highly controversial, there was a quantifiable divergence between the two compared rating systems when averaged over a significant population. Thankfully, the answer is somewhat better than most anecdotal evidence frequently relied upon.

The digitized differential is a real number. The calculation can never be completed because of non-repeating numbers as part of a fraction. So the estimate is that approximately 3.14159265358979323846264338327950288419716939937510...etc. rating points are gained by players in CC above those who play in OTB.

The reasons for this disparity are unknown and beyond the scope of the study. No whys are given nor speculated upon. The focus of the study was purely to see whether there actually was a real difference or an imagined one.

Rather than talking in circles, I'd like to draw you a pi chart but don't know how to do this type of graphic using this forum. Hopefully the # description above will suffice. The actual results were a range instead of a specific number.

If you are inclined to believe that the difference must be somewhat greater than this, there is hope. My findings suggest that the maximum potential for disparity between rating systems is also a real, non-repeating number. An accurate calculation of this limiting figure can be deduced by dividing twenty-two by seven. This is somewhat larger than pi but alas is much ado about little to nothing.

Best of luck in your endeavors. My heart goes out to all those on both sides of this argument.
baronderkilt
22-May-10, 18:34

APPLES & Oranges!? Maybe so ...
But as Light says too "My heart goes out to all those on both sides of this argument."
****
I'm seeing 3 factors that I think may make it other than it "should be".
I've always agreed with Ganstaman that there are different skills & talents that WILL differentiate players. But at the same time there is a certain Type of player that will nearly always be better at corr. That is someone who Knows So Much that they cannot properly apply it to an otb time control. Even if great at blitz play, get still better at otb & still better at corr. Poss EG's Berliner, Dunne, Botvinnik, Keres. Here at GK, there's Bogg & I suspect our friend Tactical_abyss. (Even myself ...but only cuz I'm so Slow in the first place! And for some reason always analyse from 1. Seeing any Sac's instantly, to 2. Eliminating "Stuff that Don't Work", to 3. "What's Left .. Hope it works" ~! This is not very efficient, but comes from desire to "SEE EVERYTHING" in a game. If I overlook something, I feel I just "Got Lucky")
***
Secondly, as mentioned the pool not being identical. A factor, tho compensations & adjustments can doubtless be applied by the mathematician. Anyway, the next two points relate to this:

1. That most start in otb and move to corr in the past (has computer changed this? I don't know). And I believe that most will find they suddenly BEGIN TO IMPROVE their OTB game as well~! ...after starting corr play. I can attribute a LARGE chunk of my going from 1700's to OTB Expert to just this factor. (EG, adding a defense like the French that one is playing at 2400's performance in Postal cannot but HELP the otb in both speed of play and knowledge of that opening!)

2. PERHAPS the BIGGEST FACTOR of all, that I can think of, is this:
Someone who starts in OTB & takes a shot at corr may find it unsuited to personality or have a dismal performance, [particularly if they enter into it with a trappy but slightly less than sound style that is often effective in otb, for eg.] And so at That Point, they often QUIT playing corr Chess ... leaving their rating sitting there 300 pts lower than otb ... but in 6 mo or a year that will disappear from the Rating Pool ... And .... for a short time a much better in otb rated player ... but that goes away. Yet someone who played BETTER in corr, and/or got a Higher Rating there ... just may decide to make it their MAIN FORM of PLAY. Like I did for several years, tho never too far from an otb tmt.
***
I am loath to say it, because then it usually ends up Everyone involved hates you (or at least what you Said. LOL) But I think in aspects, Both outlooks are relevant & hold seeds of truth that depend on who and how we choose to define!?
***
}8-D
So to sum up (and say what I should have said before thinking about everything else):
I see it as a situation where ideally differences are applicable to differing knowledge and talents, as suggested by ganstaman & I; Except when Transitional States Apply. The thing being ... it's Always in transition to some extent.
I think the idea of following players involved in both, thruout those "careers" is a good one, and probably the most informative angle. See if their abilities begin divergent and converge for instance. Or tend to diverge. What still nags at me is that it will take some manner of adjustment factor other than strictly seeking equivalence. (H#% Go as ELO ~!!
ganstaman
22-May-10, 19:47

Well, now that's something I can accept. Should make everyone happy and satisfied, too.
algol
22-May-10, 21:10

...
lighttotheright   baronderkilt touches upon some interesting dynamics too.

I agree with ganstaman. Your rating is a relative number which measures your results within a pool of players. The absolute value does not mean much a priori.

One could imagine a new chess organization starting and giving each new incoming player a rating of 120,000. Over time the ratings will take on some bell shaped distribution with the midpoint somewhere around 120,000. Does that mean that a beginner with a rating of say 10,000 in this system is much stronger than Carlsen with 2810 FIDE ELO?
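
A small simulation sketches the same thought experiment. Everything in it is an assumption made for illustration: a textbook Elo update with k=32, a pool of 50 players, and hidden "true strengths" drawn at random to decide the game results. It does not model any real organization.

```python
import random


def expected_score(rating_a, rating_b):
    """Expected score of A against B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_pool(seed_rating, n_players=50, n_games=5000, k=32, rng_seed=0):
    """Seed every player at seed_rating and play random games in a closed pool."""
    rng = random.Random(rng_seed)
    # Hypothetical hidden strengths; they decide results, not the ratings.
    strength = [rng.gauss(0, 200) for _ in range(n_players)]
    rating = [float(seed_rating)] * n_players
    for _ in range(n_games):
        a, b = rng.sample(range(n_players), 2)
        a_wins = rng.random() < expected_score(strength[a], strength[b])
        score_a = 1.0 if a_wins else 0.0
        delta = k * (score_a - expected_score(rating[a], rating[b]))
        rating[a] += delta
        rating[b] -= delta
    return rating


pool_a = simulate_pool(120_000)  # the imaginary organization
pool_b = simulate_pool(1_500)    # same players, same games, different seed value
print(sum(pool_a) / len(pool_a), sum(pool_b) / len(pool_b))
```

With the same games played, the two pools end up with the same rating differences between players; every number in the first pool is simply 118,500 higher. The seed value fixes where the distribution sits, not who is stronger than whom.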

In organizations that exist longer, there will not be many games between unrated players and this problem of the initial rating is not so pronounced. But at one time it was a determining factor in what the rating numbers would be. I think I read somewhere that for the USCF it was originally set at 1500.

Also the point exchange formula plays a role in the ratings. brunetti's message - referenced at the top of this thread - alludes to a time when GK did not follow the ELO system. cyrano's rating got over 3000 then and at some time GK started using the logistic curve for point exchanges (although I do not know the details of this change).





That said, there is obviously an interest to convert between some of these 'local currencies'.
When I played in Europe, my country's national rating system was adjusted to match the FIDE rating more closely. This was done based on a subset of players who had both a national and a FIDE rating. I believe that is not unusual.

They did not reason that these players were stronger when playing in a FIDE tournament (lack of the local beer?). Instead the reasoning was that our local currency was off and needed to be adjusted upwards.


I think this is not an unusual procedure. Did something like that not happen when the women's ratings in FIDE were deemed too low compared to the men's (judged by the results of women who played both women and men, I would guess)? They awarded all women an extra 100 ELO points except Susan Polgar (she played mostly men, so her rating was already accurate, was the reasoning there).

So also here the judgement was that the local currency (women's ELO) needed on average an extra 100 rating points to match the system the men were using. Now that one FIDE system (for OTB) is used, this problem is gone of course, both men and women are in the same pool.


These examples show that although the absolute ratings of two systems can be quite independent, a sufficiently large number of players who play in both of them provides the 'exchange rate'. It is thus assumed that their skill is constant across the two systems.
The examples I know of were strictly OTB play though. I do not know of any attempt to adjust CC ratings with an OTB set of ratings or vice versa.
If the skills are indeed very different, there is no means to have an 'exchange rate' and the absolute numbers cannot be compared at all.
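
As a rough sketch of how such an 'exchange rate' could be estimated from dual-rated players: the rating pairs below are invented placeholders, not real data, and a constant offset is only the simplest possible model (a federation might well fit something more elaborate).

```python
from statistics import mean, median

# Hypothetical (local_rating, fide_rating) pairs for players rated in both systems.
dual_rated = [(1500, 1600), (1800, 1920), (2000, 2080), (2200, 2300)]


def exchange_offset(pairs):
    """Return the mean and median of (other system - local system) over dual-rated players."""
    diffs = [other - local for local, other in pairs]
    return mean(diffs), median(diffs)


avg_offset, mid_offset = exchange_offset(dual_rated)
print(avg_offset, mid_offset)
```

Comparing the mean with the median is a cheap check on whether a few outliers dominate the estimate. If the offset changes a lot from one rating class to the next, a single number is the wrong model.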




It is hard to get the ratings in two systems for a large number of players (unless you know many players like tactical_abyss does). But there are some places where rating distributions have been published, so we can compare them.

You can see the rating distribution of the USCF system for active players as of June 2008 here.
www.eddins.net

The cumulative distribution shows that the median value is about 1140 (linear interpolation on the graph). Note how the rating classes with most players are around 700 to 800, most likely because of scholastic chess. These many children pull the USCF currency down.
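
For anyone who wants to repeat the reading-off step, here is a minimal sketch of linear interpolation on a cumulative distribution. The points in `curve` are made-up placeholders, not the USCF figures, and `interpolated_quantile` is a helper name chosen for this example.

```python
def interpolated_quantile(points, q=0.5):
    """Rating at cumulative fraction q, by linear interpolation.

    points is a list of (rating, cumulative_fraction) pairs sorted by rating.
    """
    for (r_lo, c_lo), (r_hi, c_hi) in zip(points, points[1:]):
        if c_lo <= q <= c_hi and c_hi > c_lo:
            return r_lo + (q - c_lo) / (c_hi - c_lo) * (r_hi - r_lo)
    raise ValueError("q lies outside the range covered by the points")


# Hypothetical readings taken off a cumulative graph.
curve = [(800, 0.30), (1000, 0.42), (1200, 0.58), (1400, 0.70)]
median_rating = interpolated_quantile(curve)
print(median_rating)
```

With these placeholder points the 50% mark falls halfway between the 1000 and 1200 readings. The same function with q=0.95 or q=0.97 would give the rating above which the top few percent sit, provided the points cover that range.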

It is interesting how there seems to be an extra incentive to be 2200 rather than 2150: Some hard work is done to make it to master class! Also note a bunch up at rating 100, which is the minimum rating awarded by USCF I believe.


Almost a year ago I had a look at the rating numbers for the active players on GK. That was 11 June 2009 and there were 31091 active players listed. I still have the numbers and the median was approximately 1360.

You can still see the GK rating distribution in my profile algol
This distribution looks much more like the logistic function which the ELO system is based upon than the USCF one. There are some heightened classes around 1200 (because new players are seeded there I think).

So the median is some 200 rating points higher in GK compared to USCF. That corresponds well with what tactical_abyss observes.

Of course, this is just for the median. For the high rated players things will probably be somewhat different:

1. GK ratings topped out last year at the 2600 class, while USCF ratings go higher (something is visible in the 2750 class in the upper USCF graph).

2. On GK only about 3% was in class 2000 or higher. From the cumulative distribution of the USCF, it seems like somewhat more than 5% of the pool is rated in class 2000 or higher. So the USCF rating distribution has a heavier tail than we see on GK.

So we have an inverse effect for high rated players compared to the median of the distribution; the problem is not a linear one.

Of course the USCF correspondence chess ratings may be very different from the GK one.
But this exercise illustrates that it is not so easy to compare or convert between different ratings...
algol
22-May-10, 21:53

CC ratings
There are some rating reports of the ICCF here www.iccf.com

I had a look at the second report of 2010; at the bottom of the document is a cumulative distribution. The median is over 2200, and there are no players rated under 1500 in the graph. So you would have to do extremely badly to get down to a rating like 1400 in this system. The rating distribution is shifted quite a bit up from the USCF one.

I did not see the USCF CC distributions, but if they are anything like the ICCF one, it would explain why many players have a higher CC rating... it is simply too hard to get a low one in a manner of speaking  
fmgaijin
22-May-10, 22:08

Traditionally . . .
. . . most players do not play internationally (FIDE, ICCF) unless they are in the top half of their national rating system. That is why until recently both groups did not even HAVE ratings under 2200--you had to perform over 2200 to get a published rating (or get a plus score in the Olympics, etc.) Though that has changed to some extent, very few abysmally weak players spend the money to enter international events (they cost much more than playing on GK, for example). Therefore the international ratings do not follow the "normal" rating distribution of an ELO system (which Arpad designed to center on 1500).
algol
22-May-10, 22:29

...
fmgaijin thanks for the info! I was not aware of that.

There is some information on USCF CC ratings here: main.uschess.org. These ratings do go much lower (down to the 100s) than ICCF ones. There is a group of players with 1300; maybe that is the starting value in that system.
tactical_abyss
23-May-10, 07:05

Deleted by tactical_abyss on 24-May-10, 14:33.
algol
23-May-10, 07:11

...
The USCF CC ratings list which is on the web page referenced in the previous message has 907 players. The median for that rating system is approximately 1750, much higher than USCF OTB. So it is logical that CC ratings will on average be higher than OTB.

There are two local peaks in the USCF CC rating distribution, one at 1300 and an even higher one for the 1400 class. There is also a dip at 1850, otherwise it looks pretty smooth.
tactical_abyss
23-May-10, 07:30

Deleted by tactical_abyss on 24-May-10, 14:34.
algol
23-May-10, 07:37

tactical_abyss
Sorry missed your post. Yes, the charts do not capture everything, true. But this simple model is also what is used to compute future ratings of the players, so it must have some validity.

I agree that the best method is to compare ratings of players who play in several pools. It is only because I do not have that information, that I try something basic to compare the systems. The median of the distribution is of course a crude indicator (an average across the whole pool of players) and things may be totally different in each class like you say.

I have players' names with their ratings from the USCF CC file, which I downloaded and put in Excel (to compute the median and see the shape of the distribution). If similar information were available somewhere for the USCF OTB, then I could match names. That would get the most accurate results, if enough people play in both systems. That would basically be the same thing the people in charge of the rating systems use to adjust the systems (as outlined in an earlier post).
ganstaman
23-May-10, 07:54

"The USCF CC ratings list which is on the web page referenced in the previous message has 907 players. The median for that rating system is approximately 1750, much higher than USCF OTB. So it is logical that CC ratings will on average be higher than OTB."

I think it needs to be emphasized, though, that this is not in any way due to the extra time players have to make moves in CC vs OTB games. It is not in any way due to the higher quality of the games played. It does not reflect that a player plays CC better than they play OTB.

The system could easily and inconsequentially have been set up such that CC ratings were lower than OTB ratings.

Having a higher CC rating does not mean that you play relatively better against your opponents in CC than you do in OTB.
algol
23-May-10, 08:32

ganstaman
I agree with you. The changed conditions hold for all players in the pool. The chart only shows that the average rating is higher in CC, it does not indicate why that is.

But the chart shows that if one encounters a player who participates both in OTB and in CC, chances are very good that his/her CC rating will be higher than his/her OTB rating.

It would also be interesting to see the USCF OTB rating distribution without scholastic chess.
maca
23-May-10, 08:56

I also agree...
The difference is due to the mathematical differences of rating systems, not because CC would be any easier to play than OTB. And whether you agree with that depends completely on your personal taste. I for one think that all the options that are not available in OTB are given to both players in CC, and with a sufficiently large sample these differences in the 'difficulty' of the game should therefore cancel out because the 'difficulty' of the game depends only on the relative strengths of the two players.

From individual standpoint, it is naturally also a matter of getting used to the different environments.


Regards,
MaCa.
tactical_abyss
23-May-10, 09:00

Deleted by tactical_abyss on 24-May-10, 14:34.