65,053 rows of data of the form:
| Month # | White Player # | Black Player # | Score |
Score is 1 if white wins, 0 if black wins, and 0.5 for a draw.
There are 8,631 players and the month value ranges from 1 to 105, so there is some scope for modelling players' changing skill levels.
The task is to predict outcomes for games for which no score has been revealed; the games to be predicted occurred after the period covered by the training data. The organisers' primary goal was to find a chess player rating system better than the existing Elo rating system.
In a nutshell, I define each player as having a rating value. The predicted outcome of player A against player B depends simply on A's rating minus B's rating. From there I use a gradient descent update rule to find the ratings that best fit the training data. Details as follows...
Some degree of tweaking was necessary to make this competitive. Most notably:
5) W(A) = (Ra - Rb) * 0.04 (found experimentally)
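The rating fit described above might be sketched roughly as follows. This is a minimal illustration, not the code actually used for the competition: the 0.04 scale comes from the text, but the 0.5 baseline, the clamping to [0, 1], the squared-error loss, the learning rate, the epoch count and the names `predict` and `fit_ratings` are all assumptions made for the example.

```python
import random

SCALE = 0.04  # from the text: W(A) = (Ra - Rb) * 0.04


def predict(ra, rb, scale=SCALE):
    """Predicted score for the player rated ra against the player rated rb.

    ASSUMPTION: a 0.5 baseline plus the scaled rating difference, clamped
    to [0, 1]. The write-up only gives the scaled difference itself.
    """
    return min(1.0, max(0.0, 0.5 + scale * (ra - rb)))


def fit_ratings(games, n_epochs=50, lr=1.0, seed=0):
    """Fit one rating per player by stochastic gradient descent.

    games: list of (month, white, black, score) tuples, score in {0, 0.5, 1}.
    ASSUMPTION: squared-error loss; lr and n_epochs are illustrative values.
    """
    ratings = {}
    for _, white, black, _ in games:
        ratings.setdefault(white, 0.0)
        ratings.setdefault(black, 0.0)

    rng = random.Random(seed)
    order = list(games)
    for _ in range(n_epochs):
        rng.shuffle(order)
        for _, white, black, score in order:
            err = score - predict(ratings[white], ratings[black])
            # Gradient step on 0.5 * err**2, ignoring the clamp:
            # d(pred)/d(R_white) = SCALE and d(pred)/d(R_black) = -SCALE.
            step = lr * err * SCALE
            ratings[white] += step
            ratings[black] -= step
    return ratings
```

For example, `fit_ratings([(1, 1, 2, 1.0)] * 20)` should leave player 1 rated above player 2, since player 1 won every game between them.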
All of the above ultimately got me to 10th position on the final leaderboard with an RMSE score of 0.70018 (lower is better), close behind the winning score of 0.69477. A number of interesting developments were also made public on the forums after I had carried out the above work, so I could possibly have improved this score further. Probably the most beneficial change would have been to use a probe data set that more accurately represented the set used to score competition submissions: I used a random scattering of games across the whole data set, while others observed better results using a probe set made up of the most recent games.
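To illustrate the probe-set idea, here is a small sketch of holding out the most recent months rather than a random sample, together with a plain RMSE. The function names, the `n_probe_months` parameter and the per-game RMSE are assumptions for the example; the competition's exact scoring formula may well have differed.

```python
import math


def split_recent(games, n_probe_months):
    """Hold out the games from the last n_probe_months as a probe set.

    games: list of (month, white, black, score) tuples.
    """
    last_month = max(g[0] for g in games)
    cutoff = last_month - n_probe_months
    train = [g for g in games if g[0] <= cutoff]
    probe = [g for g in games if g[0] > cutoff]
    return train, probe


def rmse(predictions, scores):
    """Root mean squared error between predicted and actual scores."""
    total = sum((p - s) ** 2 for p, s in zip(predictions, scores))
    return math.sqrt(total / len(scores))
```

A split like this mirrors the competition setup, where the games to be predicted all come after the training period.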
The final reported position is also somewhat better than my position on the leaderboard in the final weeks of the competition, where I had slipped down to approximately 23rd place after making my final submission some time earlier. The final scores were based on a broader data set, and this seems to have significantly altered the final standings, boosting my entry (entries?) up to 12th place. (My high-water mark was a brief stint at 4th place.) This noise in the scoring system is a problem inherent to small data sets.
(some time in 2010)