#16
Re: paper: Weeks 1-2 Elo Analysis
Quote:
#17
Re: paper: Weeks 1-2 Elo Analysis
It's the same one I PMed you a couple weeks ago, but sure.
https://dl.dropboxusercontent.com/u/5193107/scores.zip
You'll need to install the trueskill package ('pip install trueskill') to use it.
#18
Re: paper: Weeks 1-2 Elo Analysis
So I decided to take this data a little bit further. I took all of these rating calculations, ran them against this year's match data, and tried to predict matches. I also included a modified Elo system that has diminishing returns for large margins of victory (I'm calling this Elo Mod). I got some rather surprising results.
My baseline was just using OPR to predict match outcomes: add up the OPRs of each alliance, pick the higher total as the winner, and compare with the actual result of the match. It predicted about 77.1% of the matches this year. TrueSkill predicted 79.0% of the matches, a pretty good improvement. I need to develop the prediction model a bit better, because it currently doesn't take into account the standard deviation as a measure of certainty. The modified Elo system predicted 79.5% of matches, an improvement over TrueSkill. The baseline, unadulterated Elo system as used in this thread predicted a whopping 81.4% of matches, the best of any of these models.
There is still room for improvement with TrueSkill and the modified Elo. With the modified Elo, there are some constants that can be tuned for better results. But overall, the results are somewhat interesting. It seems that no matter the ranking model used, about 1 in 5 qualification matches will result in an upset.
Here is the spreadsheet I used: https://dl.dropboxusercontent.com/u/...Trueskill.xlsx
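A minimal sketch of the sum-of-ratings baseline described above, for anyone who wants to try it outside a spreadsheet. The function names, the (red, blue, red_score, blue_score) match layout, and the choice to skip tied matches are assumptions made for illustration; this is not the spreadsheet's actual logic.

```python
def predict_winner(ratings, red, blue):
    """Pick the alliance whose summed ratings (e.g. OPRs) are higher."""
    red_total = sum(ratings[team] for team in red)
    blue_total = sum(ratings[team] for team in blue)
    if red_total == blue_total:
        return None  # no prediction when the totals are equal
    return "red" if red_total > blue_total else "blue"


def prediction_accuracy(ratings, matches):
    """Fraction of decided matches where the predicted winner actually won.

    matches: iterable of (red_teams, blue_teams, red_score, blue_score).
    Tied matches are skipped (an assumption, not something stated in the thread).
    """
    correct = 0
    counted = 0
    for red, blue, red_score, blue_score in matches:
        if red_score == blue_score:
            continue
        actual = "red" if red_score > blue_score else "blue"
        counted += 1
        if predict_winner(ratings, red, blue) == actual:
            correct += 1
    return correct / counted if counted else 0.0
```

The same accuracy function works for any per-team rating (OPR, Elo, a TrueSkill mean), as long as summing the ratings across an alliance is a sensible way to combine them.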
#19
Re: paper: Weeks 1-2 Elo Analysis
Quote:
#20
Re: paper: Weeks 1-2 Elo Analysis
No, I guess I should have used a better word than "predict". More like "postdict". I went into it "knowing" an Elo/TrueSkill rating and tested against the data that was used to calculate it. Of course we won't have all the data to calculate the final Elo ratings during the season. My next step is to calculate all the ratings as if it is just before championships and then see how each does with "predicting" championship matches. My SWAG is that the success rate in predictions using the "postdicted" matches is a ceiling for how good we can hope for the predictions to be.
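The difference between "postdicting" and predicting comes down to a chronological holdout: fit ratings only on earlier matches, then score predictions on later ones. Below is a rough sketch under stated assumptions: a textbook Elo update with K = 32, a 400-point scale, a starting rating of 1500, alliance strength taken as the mean of its teams' ratings, and every team on an alliance receiving the same adjustment. None of these constants or names come from the Elo paper this thread is attached to; they are placeholders.

```python
INITIAL = 1500.0  # assumed starting rating
K = 32.0          # assumed K-factor


def alliance_rating(ratings, teams):
    """Mean rating of an alliance; unseen teams get the starting rating."""
    return sum(ratings.get(t, INITIAL) for t in teams) / len(teams)


def update_elo(ratings, red, blue, red_score, blue_score, k=K):
    """Apply one standard Elo update, treating each alliance as one player."""
    red_r = alliance_rating(ratings, red)
    blue_r = alliance_rating(ratings, blue)
    expected_red = 1.0 / (1.0 + 10 ** ((blue_r - red_r) / 400.0))
    if red_score > blue_score:
        actual_red = 1.0
    elif red_score < blue_score:
        actual_red = 0.0
    else:
        actual_red = 0.5
    delta = k * (actual_red - expected_red)
    for t in red:
        ratings[t] = ratings.get(t, INITIAL) + delta
    for t in blue:
        ratings[t] = ratings.get(t, INITIAL) - delta


def holdout_accuracy(train_matches, test_matches):
    """Fit ratings on train_matches only, then score picks on test_matches."""
    ratings = {}
    for red, blue, red_score, blue_score in train_matches:
        update_elo(ratings, red, blue, red_score, blue_score)
    correct = 0
    decided = 0
    for red, blue, red_score, blue_score in test_matches:
        if red_score == blue_score:
            continue  # tied matches are skipped
        # Equal alliance ratings fall through to a blue pick (arbitrary).
        predicted_red = alliance_rating(ratings, red) > alliance_rating(ratings, blue)
        decided += 1
        if predicted_red == (red_score > blue_score):
            correct += 1
    return correct / decided if decided else 0.0
```

Passing the same list as both arguments gives the "postdicted" number; passing pre-championship matches as training and championship matches as test gives the out-of-sample number.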
#21
Re: paper: Weeks 1-2 Elo Analysis
Quote:
#22
Re: paper: Weeks 1-2 Elo Analysis
I've tried the L1 optimization problem, but l1-magic is giving me fits. For some reason, it blows up after 10 iterations.
However, I have done the analysis that I wanted. I calculated all the rating systems using data from events prior to CMP, then I used that data to predict CMP matches. In summary:
OPR: 72.46% Correct
TrueSkill: 69.31% Correct
Elo: 72.90% Correct
Elo Mod: 71.71% Correct
TL;DR: We're okay, but not great at predicting matches. OPR is okay at it, but Elo is better. I'm still somewhat surprised that Elo is slightly better.
Updated Spreadsheet: https://dl.dropboxusercontent.com/u/...skill%202.xlsx
#23
Re: paper: Weeks 1-2 Elo Analysis
Quote:
#24
Re: paper: Weeks 1-2 Elo Analysis
Quote:
#25
Re: paper: Weeks 1-2 Elo Analysis
I'm using the l1eq_pd.m function in the L1-Magic library (http://users.ece.gatech.edu/~justin/l1magic/)
CCWM: 71.41%
#26
Re: paper: Weeks 1-2 Elo Analysis
That's the wrong solver.
Code:
% l1eq_pd.m
%
% Solve
% min_x ||x||_1  s.t.  Ax = b
Secondly, what you want to find is the min L1 norm of the residuals, not of the solution vector itself. For the set of overdetermined linear equations Ax ≈ b, x is the solution vector. The residuals are b-Ax. So you want to find a solution vector x which minimizes the L1 norm of b-Ax.
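To make the distinction concrete, minimizing the L1 norm of b-Ax is a least-absolute-deviations fit. One simple way to approximate it without an LP solver is iteratively reweighted least squares (IRLS): solve a weighted least-squares problem repeatedly, weighting each equation by 1/|residual| from the previous pass. The sketch below is pure Python for illustration only; it is not what l1-magic does, the function names are hypothetical, and it assumes A has full column rank.

```python
def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def weighted_least_squares(A, b, w):
    """Minimize sum_k w[k] * (b[k] - (A x)[k])^2 via the normal equations."""
    rows, n = range(len(A)), len(A[0])
    M = [[sum(w[k] * A[k][i] * A[k][j] for k in rows) for j in range(n)]
         for i in range(n)]
    v = [sum(w[k] * A[k][i] * b[k] for k in rows) for i in range(n)]
    return solve_linear(M, v)


def l1_fit(A, b, iterations=50, eps=1e-8):
    """Approximate argmin_x ||b - A x||_1 with IRLS, starting from the L2 fit."""
    x = weighted_least_squares(A, b, [1.0] * len(b))
    for _ in range(iterations):
        residuals = [b[k] - sum(A[k][j] * x[j] for j in range(len(x)))
                     for k in range(len(b))]
        # eps keeps the weight finite when a residual reaches zero.
        w = [1.0 / max(abs(r), eps) for r in residuals]
        x = weighted_least_squares(A, b, w)
    return x
```

In an OPR setting, each row of A would mark the teams on one alliance and b would hold that alliance's score. The L1 fit is much less sensitive to a single outlier score than the ordinary (L2) least-squares fit.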
#27
Re: paper: Weeks 1-2 Elo Analysis
Attached is a comparison of b-Ax residuals for L2 and L1 OPR. Alliance scores computed from L1 OPR are within +/-10 points of the actual scores 33.5% of the time. Alliance scores computed from L2 OPR are within +/-10 points of the actual scores only 22.4% of the time. It is on that basis that I postulate that L1 OPR might be a better predictor of match outcome.
[EDIT]Cannot add attachments to threads associated with papers. Brandon: can you please change this setting to allow attachments? Thank you.[/EDIT]
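The +/-10 point comparison above amounts to counting how many residuals b-Ax fall inside a tolerance band. A small sketch, with a hypothetical function name and made-up inputs:

```python
def fraction_within(actual_scores, computed_scores, tolerance=10.0):
    """Share of alliance scores whose residual (actual - computed) is within
    +/- tolerance points."""
    residuals = [a - c for a, c in zip(actual_scores, computed_scores)]
    hits = sum(1 for r in residuals if abs(r) <= tolerance)
    return hits / len(residuals)
```

Running it once with scores computed from L2 OPR and once with scores computed from L1 OPR gives the two percentages being compared.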
#28
Re: paper: Weeks 1-2 Elo Analysis
Quote:
Our numbers are very close, but I had expected them to be identical. Here's a link to an XLS spreadsheet.
#29
Re: paper: Weeks 1-2 Elo Analysis
Quote:
#30
Re: paper: Weeks 1-2 Elo Analysis
Quote: