
GameKnot Forum
sayb

12/12/2006
19:24:29

Subject: Random pairing

Message:
Some reviewing of the ELO rating system has led me to the conclusion that ratings might be a little bit more meaningful if opponents were paired completely randomly. There seems to be some inflation caused by higher rated players only playing other higher rated players.

I'm not saying this type of behavior is wrong. It's the same way I'd play. It's smart.

But I think the ELO ratings (not for an individual, but for a large group of players) would more rapidly approach truly indicative levels if opponents were paired randomly.

My suggestion is this:

We currently have league games, mini-tourney games, gk-tourney games, team games, and challenge games.

I was thinking maybe on the side we could have "random pairing" games, where you submit your willingness to be randomly paired and then are paired accordingly with another player of equal willingness and the game begins.

It should affect your regular rating but also another stat: I don't know what to call it... maybe some kind of activity rating. The more frequently you do the random pairing, the higher your activity rating. It's not based on ELO; it's simply a measure of how often you participate in the random pairing. So you see someone's ELO and see that their activity rating is high. That means their ELO is more likely an accurate indicator of their strength relative to the GK player pool.

Just a thought. Shoot it down if you don't like it. If you like it, let me know in this thread so I could suggest it to Mike and give him a link to the thread.

Thanks for reading.
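
A minimal sketch of the opt-in pool and the "activity rating" described above, assuming the pool is simply shuffled and paired off, and that the activity rating is the share of a player's rated games that were randomly paired. The function names and the fraction-based definition are illustrative assumptions, not anything GameKnot implements.

```python
import random


def pair_randomly(pool, rng=None):
    """Shuffle the players who opted in and pair them off two by two.

    Returns (pairs, leftover); leftover is the unpaired player when the
    pool has an odd size, otherwise None. Ratings are deliberately ignored.
    """
    rng = rng or random.Random()
    players = list(pool)
    rng.shuffle(players)
    pairs = [(players[i], players[i + 1]) for i in range(0, len(players) - 1, 2)]
    leftover = players[-1] if len(players) % 2 else None
    return pairs, leftover


def activity_rating(random_games, total_games):
    """Hypothetical activity rating: fraction of rated games that were randomly paired."""
    if total_games == 0:
        return 0.0
    return random_games / total_games


pairs, leftover = pair_randomly(["ann", "bob", "cat", "dan", "eve"], random.Random(0))
print(pairs, leftover)
print(activity_rating(5, 20))
```

A raw count of random-pairing games would also fit the description; the fraction is used here only because it makes ratings comparable between players with very different game totals.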


yanm

12/13/2006
02:33:22

makes sense

Message:
Your explanation of the problem behind the current pairing system makes quite a lot of sense to me. However, I'm not sure that random pairing is a good idea per se. Well, maybe it can avoid rating skew, I don't know :/

As for an extreme case of skewed rating, and hence totally meaningless* IMHO, look at players' team ratings. Since the pairing is most often done according to the 'normal' rating, team matches are likely to happen between players of similar ratings. In the end, team players start at 1200T but play against other 1200T players and on average balance out wins and losses, so everybody stays around 1200T (+- 200)...

My two cents,

yanm

* Team rating is maybe not totally useless, since it lets you see how a team player performs in team games. For instance, a rating above (below) 1200 signifies that the team player performs better (worse) in team games than in normal games.


sayb

12/13/2006
05:23:34

yeah

Message:
I agree. In fact it may be a good indicator of sandbagging. Since team pairings are supposed to be fair, a team with a consistently high (and growing) rating is most likely sandbagging. The team captain could claim that he pairs legitimate non-sandbagging players against other legitimate players such that their styles clash in a beneficial way for said team. But the magnitude of being able to consistently pair team members in this way is overwhelming to me. I don't think that would be possible.

katonah

12/13/2006
11:05:53

Interesting Idea

Message:
I like it, so with my Good Housekeeping seal you may proceed accordingly :0~ A twist would be to also make these Fischer Random games !??! Keep the ideas coming, as a static chess site will never move forward!

ganstaman

12/13/2006
12:41:50



Message:
I haven't looked at the Elo system in detail ever, so I can't say anything for certain. However, I'm sure Elo himself knew that most players play other players near-ish to their own rating most of the time. I hope this doesn't sound condescending, but I would like to know if you reached your conclusion based just on 'hmmm, it seems like this would make sense' or on 'from what I've learned in my years of studying statistics and mathematics, I would conclude this'?

It's really just that until someone knowledgeable enough in this field looks at the system and tells us one way or the other, we don't really know for sure if random pairing is actually necessary to keep ratings uninflated and accurate.

I'm not saying you're wrong, I'm just not sure that you are right. I hope I was clear enough without sounding too mean or confused, as I think I am nice and only somewhat confused...


sayb

12/13/2006
14:25:46

no

Message:
You're quite right to be skeptical of my generalization.

And even if I'm wrong about players tending to stick to their +- 200 or so Elo pool, the point would be that with random pairing we'd have a "for sure" way of knowing their rating had been doing some unbiased fluctuations.

For those who already played a wide variety of rating ranges (with an emphasis on those rating ranges that have the highest density of playerbase as in accordance with what you'd need to do to get a true performance rating with respect to the pool in question) their activity rating would rise when they used the random pairing feature and their rating probably wouldn't change much.

For those who stick to their own pool we might see their rating change more substantially or stay the same with a minuscule activity rating.


nottop

12/13/2006
15:39:06

problem with elo

Message:
The problem with elo-type rating systems is that they make it very difficult to win rating points by playing lower rated players. It's not that hard to force a draw (with white) against a player 100-150 above you - then you gain a bunch of rating points and your opponent loses them.
Since this is clear to everyone, what happens next is that people go to lengths to avoid playing players with lower ratings. I guess that is OK - but the drawback (to me this is a drawback) is that lower rated players lose their chance to play and learn against higher rated players.
In addition, some lower rated players are lower rated simply because they haven't played a lot of games.

If a player who should be rated 2300 but is new and only rated 1900 is playing against a 2200 player, the 2200 player has everything to lose and nothing to gain. But it might be a worthwhile game; the rating system should not stop the game from happening.

I guess I favor some modification of the elo system but I don't have a clue as to what that should be.
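
The point swings described above can be put in rough numbers with the textbook Elo update. This is only a sketch: the logistic expected-score curve and the K-factor of 32 are assumptions, since the thread does not say which formula or K value GameKnot actually uses.

```python
def expected_score(rating, opponent_rating):
    """Expected score (win = 1, draw = 0.5, loss = 0) under the logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rating_change(rating, opponent_rating, score, k=32):
    """Points gained (positive) or lost (negative) for a single game.

    k=32 is an assumed K-factor, chosen only for illustration.
    """
    return k * (score - expected_score(rating, opponent_rating))


# A draw against someone rated 150 points higher.
print(round(rating_change(1900, 2050, 0.5), 1))

# The 2200 player facing the newcomer listed at 1900: a win, then a loss.
print(round(rating_change(2200, 1900, 1.0), 1))
print(round(rating_change(2200, 1900, 0.0), 1))
```

Under these assumptions the draw is worth about 6.5 points to the lower rated player, while the 2200 player stands to gain about 5 points for a win and lose about 27 for a loss, which is the asymmetry the post is complaining about.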



danders

12/13/2006
16:25:00

Kind of a cool idea, sort o'...

Message:
...but what exactly will it solve? I don't object to the random pairing idea. It sounds like an excellent way to pair up for an extra game or three, if you limit the range of the rating deviation a bit. +/- 250, maybe?

But why introduce another stat? I thought that's what the 'average rating' stats were for. You can look at those and pick your opponents just as easily as you can look at a 'random pairing' stat to pick 'em.

Ratings fluctuate for a variety of different reasons. Some of those reasons won't have anything to do with Gameknot, so you can't expect the server to be able to control all the fluctuations. Assume, for instance, that I take a year-long hiatus from GK chess. While I'm going about my life, I'm still solving chess puzzles, reading Chess Life, and working through a game on the board every now and then. I'm just not competing as hard or as often. In other words, I'm still possibly improving my chess, even though I'm not playing any rated games. When I come back to GK, my rating is probably going to be too low. By what magnitude? Only time and the next fifty games or so will tell.

ELO is a lagging indicator. That means it shows probable playing strength based on past results. It already shows your strength relative to the player pool. It's never going to work as an up-to-the-minute indicator of a person's chess potential. This is why I don't understand this obsession some people have with controlling the apparently rampant rating inflation/deflation. It seems to miss the whole point of a chess rating.

Random pairing is a pretty neat idea. Expecting it to somehow make the ELO system more accurate seems like asking it to do more than it can do. In fact, it seems like asking the ELO system to do more than it was designed to do.

Maybe I'm just missing something here.


kewms

12/13/2006
18:21:30



Message:
It sounds to me like a solution in search of a problem. The rating system doesn't claim to be more than an estimate of actual strength. If the sample size is sufficiently large, random fluctuations should average out over time. As for the absolute value of the top rating, it's a meaningless number except relative to the GK player pool anyway.

Katherine


sayb

12/13/2006
21:27:25

kewms

Message:
You put it very well when you said it sounds like a solution in search of a problem.

I don't really see a big problem here at GK. I don't see a problem at all, to be honest. I'm fairly certain most ratings are fairly accurate.

I also understand that ELO is a lagging indicator and this is fine. Danders is right in that it is in the spirit of ELO to be a step behind and all the time slowly approximating to your strength.

I think the random pairings would change the overall ratings distribution of the pool though.

I would bet that the average 2300 player here isn't really 2300 relative to the pool. More like 2150 to 2200. But his/her rating is high because he/she takes fewer risks against lower rated players. Their biggest threat is the 1700 - 2100 players who are strong but way behind in rating and hence a big danger in the event of a loss.

With random pairings, everyone plays everyone and the distribution of ratings here on GK would approach a normal distribution.

As I admitted earlier, agreeing wholeheartedly with ganstaman, I may be wrong and the current distribution may in fact be beautifully normal. But I just have a hunch that it isn't. It makes sense. High rated players tend to play high rated players and low rated players tend to stick to low rated players.

Sure, there are kind teachers and eager students, but those are a minority. Not to give any bad connotations to the high rated players who stick to high rated opponents. I would do the same. My own rating is not quite indicative of my true strength and when it is, I will definitely be filtering my opponents to try to stay within +/- 150 ELO of me.

I think what I'm saying makes sense but this post got pretty long. I think with random pairings the distribution of ratings on GK would be more "true". For an individual it probably wouldn't make too big of a difference. The further you were from '1200', the more of a difference the change would make for you (on average, of course).


kewms

12/14/2006
21:36:08



Message:
High-rated players tend to play high-rated players everywhere, not just at GK. When was the last time Kramnik or Topalov played a rated game against someone who wasn't at least a GM-candidate? (Simuls aren't rated and don't count.) But does anyone doubt that they are among the strongest players in the world?

The nature of statistics is that it's really hard to get accurate numbers as the sample size gets smaller. That means ratings at the extremes are likely to be less accurate than ratings at the middle. I don't think random pairings would help, though. Once the rating difference exceeds 200 points, the chance of a win by the (accurately rated) stronger player approaches 100 percent. A 1700 player simply isn't strong enough to demonstrate the difference between a 2300 player and a 2000 player. Random pairings would only have a significant impact if the current ratings were both inaccurate *and* inconsistent: you would have to have both highly overrated strong players and highly underrated average players to get results that were substantially different from those predicted by the rating system. And even then, a player would have to randomly pair most of his games. That's unlikely to happen, since playing people close to your own rating is generally more enjoyable.

An unstated goal of your proposal is to encourage stronger players to play weaker players. That seems to me to be one of the functions of the league. League divisions typically have a rating spread of about 400 points, and league challenges cannot be declined. I've certainly found as many strong opponents as I can stand that way.
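
The claim about 200-point gaps can be checked against the standard logistic Elo expected-score curve. This is a sketch under the assumption that GameKnot's ratings follow that curve, which the thread does not confirm. Note also that an expected score counts a draw as half a point, so it is not quite the same thing as a win probability.

```python
def expected_score(rating_gap):
    """Expected score for the higher rated player, given the gap in rating points.

    Uses the textbook logistic Elo curve with a 400-point scale (an assumption).
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))


for gap in (100, 200, 300, 400, 600):
    print(f"{gap:>4} points: {expected_score(gap):.2f}")
```

On this curve a 200-point gap gives the stronger player an expected score of about 0.76, and it takes roughly 400 points to pass 0.90. A 1700 player would be expected to score about 0.15 against a 2000 player (300-point gap) but only about 0.03 against a 2300 player (600-point gap), so such games do carry some information, though many of them are needed to tell the two apart.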



sayb

12/15/2006
02:39:07



Message:
>High-rated players tend to play high-rated players everywhere, not just at GK.

Never said it was specific to GK and everyone knows it isn't.

>Random pairings would only have a significant impact if the current ratings were both inaccurate *and* inconsistent: you would have to have both highly overrated strong players and highly underrated average players to get results that were substantially different from those predicted by the rating system. And even then, a player would have to randomly pair most of his games. That's unlikely to happen, since playing people close to your own rating is generally more enjoyable.

Our higher rated players are overrated and our lower rated players are underrated. As are players on any chess site or in FIDE or any established rating pool in the game of chess.

As for the 'more enjoyable' bit: that's beside the point. The goal of this (and I'm getting to the point where you presume that I have an unstated goal... a bit uncouth on your part, I would say) is not a matter of enjoyment but simply that the rating system would more accurately portray players in this rating pool. That's all.

>An unstated goal of your proposal is to encourage stronger players to play weaker players.

This would be an unstated but obvious repercussion of this proposal. It was not an unstated goal. The league would no longer have any function. And your league argument is illogical as the rating spread is limited. The point of the random pairing is to see that any two players have as good a chance of being paired as any other two players.


The point of this proposal (no there are no unstated goals) is to let the GK rating distribution get to the point where you can look at one player's rating and another's and estimate with a better level of accuracy the expected score.


kewms

12/15/2006
05:40:55



Message:
Enjoyable play is *not* beside the point. If players do not enjoy their games, they will leave. A smaller player pool inherently gives less accurate ratings.

I continue to believe your proposal is a solution in search of a problem. At best, it's impossible to evaluate without additional data: how inaccurate are ratings now? What fraction of a player's games would need to be randomly paired to achieve the desired goal? And what is the maximum accuracy that could be achieved by this method?




wschmidt

12/15/2006
14:05:48

sayb,

Message:
what about your playing experience at GK has led you to believe that there is really a need to be better able to estimate the expected score than what exists now? You've played 38 games and won every one, almost all against players rated lower than you are. Right now, with a rating 1804, your average opponent rating is 1435. For a player in the top 10%, that's to be expected at this point in your GK experience. However, until you start losing games to players stronger than yourself and get a feel for where your ceiling is, I don't see how you can conclude that the pairing change you suggest would really provide more meaningful information to the day-to-day player. Granted, it may be incrementally more accurate, but is there a need for that additional accuracy? Where is that conclusion coming from?



sayb

12/15/2006
15:02:57



Message:
I started these games when I joined GK. Since I've been winning, my rating has been going up and hence their ratings are lower. You fell for a false correlation there, my friend.

I also stated that the phenomenon this proposal would cure is not specific to GK but to any rating pool.

And I've played here before. Years ago and for years.


leo_london

12/15/2006
18:07:44



Message:
There will always be problems with any rating system, whatever formulas are applied to the games among the rated players.

Although the emphasis is usually on up-to-date ratings, the underlying data are an assortment of old and new.

While some ratings are more or less stable, others are rising rapidly, and interaction of the two sorts causes deflation in the pool at large.

Data samples vary in size considerably, from the few games of the occasional player to the many games of the enthusiast, with resulting variation in sampling error.

Finally, the rating pool itself is changing as players come and go.

The Elo System attempts to keep ratings up-to-date by limiting sample size in its established rating, which becomes a kind of moving average. Unlike ordinary moving averages, where sample size is restricted to the last N games, established ratings are based on attenuated sample weight. The effect of a rated event, having an original sample weight of 1/N, becomes attenuated as more and more data are processed. In theory at least, the effect is never completely lost. The established rating becomes what might be called a weighted moving average. Even though recent data are more heavily weighted, rating changes cannot keep pace with changes in playing strength. Timely adjustments, in short, do not guarantee currency of the data on which they are based.

Personally, I would have thought (with regard to GK) the first step towards more accurate ratings should be to make the provisional rating period far longer.
Ideally, a player's rating should be established before joining the main pool; during that period he/she would be playing fairly random games but only against other "provisional" players. After joining the main pool of players on GK (with an established rating based on a reasonable number of games) I can see no reason why the random element suggested by sayb would be any more accurate.
However, I'm no mathematician and from his profile I note that sayb is a bit of an expert... so I don't intend to really argue the point. ;)
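
The "attenuated sample weight" idea in the post above can be illustrated with a generic exponentially weighted moving average. This is a sketch of the general technique, not Elo's actual rating formula, and the value n=30 is a hypothetical sample size picked for the example.

```python
def update_average(current, result, n=30):
    """Fold one new result into the running average with weight 1/n."""
    return current + (result - current) / n


def attenuated_weights(n_games, n=30):
    """Weight each past game still carries in the current average, oldest first.

    The newest game has weight 1/n; every later game multiplies the weight
    of all earlier games by (1 - 1/n), so old games fade but never vanish.
    """
    alpha = 1.0 / n
    return [alpha * (1 - alpha) ** (n_games - 1 - i) for i in range(n_games)]


weights = attenuated_weights(100)
print(f"newest game: {weights[-1]:.4f}")
print(f"oldest game: {weights[0]:.6f}")
print(f"total weight of the last 100 games: {sum(weights):.3f}")
```

Unlike a plain moving average over the last N games, no game is ever dropped outright; its influence just keeps shrinking, which is the "never completely lost" behaviour the post describes.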



pawntificator

12/15/2006
21:27:23

Interesting

Message:
I don't think it would make a difference in ratings, but it would be nice for weaker players to get to play with stronger ones.

sayb

12/15/2006
21:57:42



Message:
The truth is the only way it would make a significant difference in ratings is if everyone was forced to use this system and to use it for the majority of their games.

That's the fundamental flaw in the idea.

I think maybe this idea would require a separate chess site. A site where everyone is randomly paired and that's it. Would be quite interesting.


chilliman

12/15/2006
23:10:55

kewms

Message:
again you are the voice of reason. I would like to add my thoughts to this thread, but you have made my point, so I will instead just ditto your posts - though if for some reason you were to get banned for those same posts I will perform the quickest j'adoube ever seen!

wschmidt

12/17/2006
14:42:03

sayb

Message:
I specifically said, "Right now, with a rating 1804, your average opponent rating is 1435. For a player in the top 10%, that's to be expected at this point in your GK experience", so I didn't fall for a false correlation.

But you didn't answer my question. What in your experience, your day-to-day play at GK, leads you to believe that additional accuracy resulting from random pairings would enhance the GK experience?


sayb

12/17/2006
16:41:57



Message:
I stated I played here before, explaining how I know that there is a slight rating discrepancy. I was rated much higher than 1800 before.

You didn't ask what would enhance the GK experience; you asked what leads me to the conclusion that GK needs it.

I've stated it's a phenomenon of any rating system and it would be an overall improvement due to being able to better ascertain strength from rating.

If you understand the statistical aspect (normal distribution and the statistical nature of elo) you'd understand why the ratings would be more accurate.

-> en.wikipedia.org

What I'm stating as a proposal here is not necessarily that it would improve the GK experience (that is my opinion, because I feel that more accurate ratings mean being better able to gauge the strength of an opponent, which in turn makes the experience here a little better; that's an opinion) but the fact (yes, it is a fact) that if it was all random pairing, the ratings would be more accurate.

This depends on what you see as accurate of course.

A 2200 rated player on GK is roughly as strong as a 2200 rated player on GK. If you're willing to throw normal distribution out the window, yes you can definitely gauge strength by rating here but it's not a simple formula. It's more like a piece-wise function with arbitrary weights assigned to meaningless rating intervals.

I've previously stated that I realized it is not a good change for GK since it's too large of a change and that I think it would be a good idea for a new chess site. A site where pairing is random always.


kewms

12/17/2006
17:52:15



Message:
Which brings up a philosophical question: what makes a rating system "better?" A statistician might say that the best system is the one in which rating most accurately measures strength, and they would be able to provide a rigorous definition of exactly what that means. But most players aren't statisticians, and aren't paid bonuses and appearance money based on their spot in the FIDE rankings. Most players just want to play enjoyable chess games. For them, the best system is the one that facilitates that goal.

For those players, one of the reasons why ratings matter in the first place is because people don't like random pairings. Ratings are a way to judge whether someone is a suitable opponent, or whether the competition in a tournament section is "fair." So if a more accurate rating can only be achieved at the expense of opponent choice, most players are likely to say it isn't worth it.


sayb

12/17/2006
18:40:48

kewms

Message:
I'd considered that exact reasoning.

"I've previously stated that I realized it is not a good change for GK since it's too large of a change and that I think it would be a good idea for a new chess site. A site where pairing is random always."


danders

12/21/2006
14:48:34

sayb

Message:
The elo system is used to match people of like strengths together, is it not? If you resort to 100% random pairing, you are in effect ignoring everyone's ratings, and any increase in accuracy will be irrelevant, because you aren't using the numbers anyway!

In light of this, I fail to understand what you hope to gain from your increase in accuracy. What you suggest, if I understand correctly, is self-contradictory.

However, I'm wondering... would it help at all if provisionally rated players were given their games from a random pool consisting of other provisionally ranked players AND officially ranked players who choose to be added to the random pool? I could see where that might actually improve the gameknot experience, but I'll leave it up to you statisticians to figure out if it significantly improves the ratings any.

Personally, I think I'd go for an idea like that.





ccmcacollister

12/22/2006
01:31:10

back in the day ...

Message:
when I was playing APCT or just more seriously in general ... I used to like having sections show who has joined up so far, to pounce on .. :) Then you get those great revenge matches, etc. So I guess I like knowing who is there. But the GK Tournaments do pair somewhat randomly within rating groups, don't they?
Also adding randomness, when you join a M/T you don't really know who will join in later.



