Wednesday, January 25, 2012

The Numbers Game

  In organized chess, players are given numerical ratings by whatever governing body is in charge of the rating system for said organization. Ratings are a gauge of the relative strength of chess players. In America, the United States Chess Federation (USCF) assigns ratings for all players in USCF tournaments, the British Chess Federation is in charge of England’s rating system, and FIDE assigns international ratings. I have 3 USCF ratings: a 1663 rating for quick chess play (game in under 60 minutes), a 1706 rating for long play (game over 30 minutes), and a 1591 rating for extremely slow play by email or postal mail. I also have 8 ratings from the Internet Chess Club for games in 1-minute, 2-minute, bullet, blitz, standard, 5-minute, and other chess variants. In addition to these ratings, I have a rating on my Tactics Trainer iPod app and a host of other tactics servers. Ratings are a source of pride for some chess players, an embarrassment for others, and an obsession for most.

  At my Des Moines youth chess tournaments, I have a rated and an unrated section. The thought of getting a rating fascinates many of the unrated players, and on Saturday one of them asked me what ratings meant. I explained a bit about the numbers, but I made sure to close my discussion by saying that a rating is where you’ve been, not where you’re going, and that except for 2 people, everyone has a worse rating than some players and a better rating than others (not exactly true, since plenty of people have the minimum USCF rating of 100). And at least 3 times every tournament, a rated player will ask me what I think their rating will be after the tournament. There's just something about having this little number next to your name that appeals to chess players.

  When I say ratings can be an obsession, I mean it. When I’m at a tournament and a player pulls off a big upset, they rarely say ‘I beat Sam’ or ‘I beat Johnny’; instead they say ‘I beat an 1800!’ And most players don’t talk about losing to Billy, rather bemoaning how they lost to an 1100. I used to get upset when I lost rating points, but after watching my blitz rating oscillate over the last couple of years, I’ve come to the realization that when my rating is low I gain a few more points (or lose a few fewer) from tournaments than when my rating is high, and vice versa, so my rating at any given point is much less important than how I’m playing. Whenever anyone starts telling me about their latest rating swing, I mention this. It normally makes the player who has just lost a lot of points feel better (and sometimes enrages the player who has just gained a lot of points and now feels belittled). The math and logic are indisputable, but I wonder if it is just a coincidence that I’ve only played in one regular rated tournament since I cracked the 1700 rating mark in 2009?

  The USCF and FIDE ratings are based on the calculations of Arpad Elo, a Hungarian immigrant. The theory behind the ratings is used in everything from table tennis to role-playing card games (you can find the math here). There are a few cracks in the USCF rating system that lead to rating inflation. To prevent players from artificially lowering their ratings in order to win cash prizes, the USCF instituted rating floors. With a rating floor, a player’s rating can never fall more than 200 points below their highest rating rounded down to the nearest hundred. For example, a player rated 1865 can never have their rating drop below 1600 unless they petition the USCF for a lower rating. Rating floors are also assigned when a player wins a large cash prize in a section restricted to players under a certain rating. When Iowa player Tim Crouse won a huge cash prize for the best score under an 1800 rating at a big tournament in Chicago, the USCF assigned him a rating floor of 1800 to prevent him from winning another big under-1800 prize in the future.
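For readers curious about Elo’s math, here is a minimal sketch of the core formulas. This is the textbook Elo calculation, not the USCF’s exact implementation, which layers on refinements like rating floors, bonus points, and variable K-factors; the K=32 here is an illustrative choice.

```python
def expected_score(rating_a, rating_b):
    """Elo's expected score: the probability-like share of points
    player A should score against player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update(rating, opponent, score, k=32):
    """New rating after one game. score is 1 (win), 0.5 (draw), or 0 (loss).
    K=32 is an illustrative K-factor, not the USCF's actual schedule."""
    return rating + k * (score - expected_score(rating, opponent))

# A 1600 who beats an 1800 was expected to score only about 0.24,
# so the upset is worth a large gain: roughly 24 points here.
print(round(update(1600, 1800, 1)))
```

Note how the formula makes upsets expensive for the favorite and cheap draws for the underdog profitable, which is exactly why players brag about beating ‘an 1800’ rather than beating Sam.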

  While rating floors are useful for protecting the integrity of cash prizes, they lead to rating inflation when a player’s strength doesn’t match his or her rating floor, whether due to age or to simply never having played at the floored rating obtained from winning a large cash prize. When a player at his or her rating floor is in a slump and not playing at their rating level, lower-rated players who defeat them gain rating points while the losing player's rating stays at the floor. The net effect is an increase in the total pool of rating points, leading to inflation. Another cause of rating inflation is the bonus afforded to players who get a perfect score in a tournament.

  It isn’t uncommon for a player at my Marshalltown Thursday Night blitz tournaments to have a big day and gain a hundred points or more. For example, at the January 5th tournament, Jerry Mason (rated 1155) won all 3 of his games against players rated 1087, 1304, and 1732. It was a great performance, and Jerry’s rating shot up to 1324, a gain of 169 points, but the rest of the playing field lost only 24 points combined. This means 145 points of Jerry’s rating increase didn’t come at the expense of the other players; they were simply added to the system. This made me curious, so I added up the before and after ratings for all my Thursday Night Blitz tournaments and found an increase of 4,453 rating points across the 106 tournaments since September 2009. For my technically-inclined readers, I’ll note that I did not include provisional ratings in this total (the USCF counts ratings as provisional until 25 games have been played and allows them to rise and fall at a faster rate than established ratings).
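The check I ran over the 106 tournaments amounts to nothing more than summing every player’s rating change. A sketch, using Jerry’s tournament (the opponents’ after-ratings below are hypothetical, chosen only so the field’s combined loss matches the 24 points reported above):

```python
def net_point_change(before, after):
    """Total rating points added to (positive) or drained from (negative)
    the rating pool, given each player's before/after ratings."""
    return sum(after[p] - before[p] for p in before)

# January 5th example: Jerry's gain is real; the opponents' exact
# after-ratings are made up to total a 24-point combined loss.
before = {"Jerry": 1155, "A": 1087, "B": 1304, "C": 1732}
after  = {"Jerry": 1324, "A": 1082, "B": 1297, "C": 1720}

print(net_point_change(before, after))  # 169 gained minus 24 lost = 145
```

In a zero-sum rating system this function would return 0 for every tournament; the 4,453-point total is just this number summed over two-plus years of Thursday nights.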

  I researched the tournaments that produced the biggest disparities, and they were all like Jerry’s big January 5th tournament: one player had a great tournament and gained a hundred rating points or more, but not at the expense of the other players. And when these lucky players came back to earth and lost some of their rating points back, the points tended to go to the other Marshalltown players, because our blitz tournaments include a lot of the same players week after week. This ‘closed-loop’ effect was seen most clearly in the case of chess player (and convicted murderer) Claude Bloodgood, who attained a rating of 2700 in 1996 (second in the country at the time) while playing only in prison tournaments.

  The main use I have for ratings is to keep the top players at the tournaments I run from playing each other in the early going. Win TD (the pairing software I use) does this automatically for rated tournaments, but I’ve been having problems with my unrated tournaments. Since the players don’t have ratings, Win TD pairs them randomly and sometimes will pair the two best players in the first round or give the defending champion a first-round bye when there’s an odd number of players. Because I have all the results in a database, I was able to spend some time on Sunday noodling around with various rating formulas to try to solve this, and I discarded all my attempts as flawed. But when I started looking to see if I could ape the USCF system, I found it was nearly as flawed as anything I was coming up with and hundreds of times harder to implement.

  I’ve settled on an ‘unrated’ rating system that takes a player’s winning percentage over their latest 5 tournaments within the past 12 months scaled to 100, gives a 5-point bonus for each tournament win, and adds a point for each game played. The ratings will top out at 140 to 150 for a player who wins their last 5 tournaments, and the lowest possible rating is 1 (for a player who has played only one game and lost it). I’m not sure if I’m going to let the players see their ratings, since they will look low compared to the USCF ratings, which start at 100, but I’m inclined not only to let the players see the ratings, but to publish them on the Internet. After all, who doesn’t want to see that little number next to their name?
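For the curious, the formula can be sketched in a few lines. This is my reading of the rules above, with one assumption flagged: draws count as half a win toward the winning percentage, and the caller has already selected the player’s 5 most recent tournaments from the past 12 months.

```python
def unrated_rating(tournaments):
    """Each tournament is (wins, draws, losses, won_tournament).
    Assumes the list is already the player's latest 5 tournaments
    from the past 12 months; draws as half-wins is my assumption."""
    games = sum(w + d + l for w, d, l, _ in tournaments)
    if games == 0:
        return 0
    points = sum(w + 0.5 * d for w, d, _, _ in tournaments)
    rating = points / games * 100                       # winning pct scaled to 100
    rating += 5 * sum(1 for t in tournaments if t[3])   # 5-point tournament-win bonus
    rating += games                                     # 1 point per game played
    return round(rating)

# One game, one loss -> the minimum rating of 1.
print(unrated_rating([(0, 0, 1, False)]))
# Five 3-round tournament sweeps -> 100 + 25 + 15 = 140, the top of the scale.
print(unrated_rating([(3, 0, 0, True)] * 5))
```

The games-played term is what stretches the ceiling from 140 to 150: a perfect 5-for-5 player picks up 15 points from five 3-round events but 25 from five 5-round events.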