#1
2016 Pre-Champs ELO Ratings
I wrote a small Python script (similar to this thread from 2014) to calculate season-long Elo ratings for all 3000+ FRC teams that competed in the 2016 season. Here's the top 100 (keep in mind this is a fairly untuned model):
Code:
Rank,Team,Elo Rating,# Played,Win %
1,frc2056,2564.8708516684,32,0.84375
2,frc987,2556.7395845976,31,0.8709677419
3,frc148,2513.3702482568,32,0.8125
4,frc2771,2509.4705400282,48,0.75
5,frc1241,2437.9107936136,35,0.8571428571
6,frc1519,2435.0658342396,36,0.9444444444
7,frc133,2432.4950118635,36,0.8888888889
8,frc118,2410.6631929387,28,0.8571428571
9,frc27,2408.0850729022,45,0.8
10,frc359,2393.7669444743,32,0.90625
11,frc1678,2365.1427556916,26,0.8846153846
12,frc1983,2352.9683563234,48,0.7708333333
13,frc1023,2327.1563415467,46,0.8260869565
14,frc1501,2314.8914797357,48,0.7916666667
15,frc1540,2308.9583510968,36,0.8333333333
16,frc4564,2299.9814293025,36,0.8611111111
17,frc225,2286.8436656747,36,0.9166666667
18,frc319,2267.1466845432,48,0.7291666667
19,frc2046,2262.0104039536,48,0.7291666667
20,frc3620,2258.2489588419,36,0.8055555556
21,frc2767,2231.1523749957,36,0.8888888889
22,frc125,2203.9042618114,58,0.6724137931
23,frc217,2193.7623220698,45,0.7111111111
24,frc67,2193.5383205599,36,0.8611111111
25,frc254,2190.187307243,19,0.9473684211
26,frc195,2174.4289095983,36,0.8055555556
27,frc179,2164.4363863313,27,0.9259259259
28,frc2013,2162.9140605587,33,0.7272727273
29,frc33,2155.8558874617,36,0.7777777778
30,frc971,2150.359414525,16,0.9375
31,frc4188,2147.4291381874,46,0.7608695652
32,frc910,2143.8934792287,33,0.7878787879
33,frc2590,2142.9579313632,45,0.6888888889
34,frc25,2142.4412456836,36,0.8055555556
35,frc4450,2139.5744792782,36,0.8055555556
36,frc4967,2138.8294065088,36,0.7222222222
37,frc2468,2133.6054741145,32,0.78125
38,frc1024,2132.5886298519,48,0.7083333333
39,frc2974,2126.6132833861,36,0.8055555556
40,frc1746,2122.4741745816,36,0.8611111111
41,frc1986,2105.5705944366,20,0.95
42,frc4334,2090.8271976938,23,0.8695652174
43,frc1261,2087.3494490483,48,0.7083333333
44,frc3314,2076.4327449113,57,0.701754386
45,frc4488,2076.2542185479,36,0.7222222222
46,frc1533,2072.1356193236,48,0.6458333333
47,frc3230,2071.4881103811,35,0.7142857143
48,frc2481,2068.7886898013,21,0.9523809524
49,frc494,2065.1162537725,36,0.8055555556
50,frc869,2062.6485747098,48,0.6875
51,frc230,2061.1864101023,48,0.6458333333
52,frc5279,2060.1866212576,36,0.7777777778
53,frc238,2057.8208919517,48,0.6875
54,frc5172,2055.2630353144,18,0.8888888889
55,frc2067,2054.8484971032,48,0.6875
56,frc1318,2048.8236076466,48,0.7291666667
57,frc3250,2036.8233126059,30,0.7333333333
58,frc5050,2034.4232765669,36,0.7777777778
59,frc4469,2033.0203537629,36,0.8055555556
60,frc3310,2031.833668903,20,0.95
61,frc1058,2029.9012706186,48,0.6458333333
62,frc330,2023.303386855,20,0.85
63,frc4468,2020.8796496031,36,0.7777777778
64,frc365,2015.2776518452,36,0.75
65,frc868,2013.0139180101,46,0.7173913043
66,frc1418,2004.4858845658,34,0.7352941176
67,frc2415,2002.7041652616,36,0.6944444444
68,frc16,2001.6149194784,19,0.8947368421
69,frc3688,2000.4276164977,36,0.7222222222
70,frc3990,1993.4366845031,21,0.9047619048
71,frc1747,1989.0741465466,48,0.6875
72,frc3357,1987.188575724,48,0.6875
73,frc368,1982.42880364,20,0.95
74,frc5460,1981.5616311026,36,0.6944444444
75,frc3238,1976.6527065089,36,0.7777777778
76,frc525,1976.440984525,18,0.8888888889
77,frc4384,1973.7651648588,36,0.5833333333
78,frc836,1972.003555699,34,0.7352941176
79,frc71,1970.1239091472,36,0.6944444444
80,frc56,1969.5271280769,36,0.6944444444
81,frc1731,1966.9417414622,36,0.75
82,frc1305,1964.9589066452,31,0.7096774194
83,frc1918,1961.9294395633,36,0.75
84,frc3604,1960.9763143508,36,0.7777777778
85,frc41,1959.5598450238,36,0.5833333333
86,frc2474,1957.2471714871,36,0.6666666667
87,frc4063,1956.7118381526,29,0.724137931
88,frc1425,1953.6830324716,36,0.75
89,frc1806,1951.7294052903,19,0.8947368421
90,frc2471,1950.9437092534,36,0.7222222222
91,frc3683,1947.6203673269,23,0.7826086957
92,frc1114,1944.3669483275,23,0.7826086957
93,frc5895,1941.0093857167,36,0.7222222222
94,frc3618,1941.0017737099,48,0.7083333333
95,frc3309,1934.657254773,32,0.6875
96,frc107,1931.3958797577,36,0.6944444444
97,frc1124,1921.5875314542,48,0.6666666667
98,frc4103,1919.8324175315,36,0.6944444444
99,frc3021,1917.1307148459,21,0.8571428571
100,frc85,1914.7890005898,36,0.7222222222
Methodology: I initialized all teams at 1500 at the beginning of the season and had ratings persist between competitions. The only matches considered by the model were qualification matches at official events. I decided to leave out playoff matches because I wanted the ratings to reflect the best robots at an event, not necessarily the best alliances. Adding elimination matches massively inflated the ratings of 2nd picks on very strong alliances, often making them the third-highest rated robot at an event despite that usually not being the case. (I can post ratings with eliminations if people really want them, however.)
I also used a margin of victory multiplier similar to the one used for FiveThirtyEight's NBA Elo Ratings, which rewards underdogs heavily for upsetting higher-rated alliances but gives stronger alliances only a small reward for beating weaker ones. Most of the tuning values I used for these rankings were taken from the 538 values for the NBA, largely due to the rough similarity in scores between the NBA and Stronghold.
Here's the script I used. I added parameters for the k-value, the margin of victory multiplier function, and the match level. (The 'tba.event_get()' method is taken from the TBA wrapper script I use, and essentially just gets the event matches and teams from the TBA API and parses them into a dict using json.loads().)
Code:
import statistics

# `tba` is the TBA wrapper script mentioned above; it is not shown in this post.


def elos(event_key, k=20,
         mov=lambda elodiff, scorediff: ((scorediff + 3) ** 0.8) / (7.5 + 0.006 * elodiff),
         level='qm'):
    event = tba.event_get(event_key)
    elos = {}
    played = {}  # matches played per team
    for team in event.teams:
        elos[team['key']] = 1500
        played[team['key']] = 0
    for match in event.matches:
        # Skip matches outside the requested level ('qm' = qualification).
        if level is not None and match['comp_level'] != level:
            continue
        red = match['alliances']['red']
        blue = match['alliances']['blue']
        # An alliance's rating is the mean of its teams' ratings.
        red_elo = statistics.mean(elos[team] for team in red['teams'])
        blue_elo = statistics.mean(elos[team] for team in blue['teams'])
        # Expected score uses the standard Elo form: 1 / (1 + 10^((opponent - own) / 400)).
        expected_score_red = 1. / (1 + 10 ** ((blue_elo - red_elo) / 400.0))
        expected_score_blue = 1. / (1 + 10 ** ((red_elo - blue_elo) / 400.0))
        red_score = red['score']
        blue_score = blue['score']
        # actual_score is from red's point of view; mov() takes the winner's
        # Elo edge and the winning margin.
        if red_score > blue_score:
            actual_score = 1.0
            margin_mult = mov(red_elo - blue_elo, red_score - blue_score)
        elif red_score == blue_score:
            actual_score = 0.5
            margin_mult = mov(0, 0)
        else:
            actual_score = 0.0
            margin_mult = mov(blue_elo - red_elo, blue_score - red_score)
        for team in red['teams']:
            elos[team] += k * margin_mult * (actual_score - expected_score_red)
            played[team] += 1
        for team in blue['teams']:
            elos[team] += k * margin_mult * (1 - actual_score - expected_score_blue)
            played[team] += 1
    return elos
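For anyone who wants to try the update step without the TBA API, here is a minimal standalone sketch. It is not the script that produced the rankings: `update_match`, `mov_multiplier`, and `DEFAULT_RATING` are hypothetical names, the team keys and scores are made up, and the expected-score line assumes the standard Elo form. The ratings dict is passed in so that values can persist from one event to the next, as described in the methodology.

```python
import statistics

DEFAULT_RATING = 1500.0  # assumed starting rating, matching the post


def mov_multiplier(elo_diff, score_diff):
    """538-style multiplier: winner's Elo edge and winning margin in, scale factor out."""
    return ((score_diff + 3) ** 0.8) / (7.5 + 0.006 * elo_diff)


def update_match(ratings, red, blue, red_score, blue_score, k=20):
    """Update `ratings` in place for one match and return red's rating change.

    Teams missing from `ratings` are treated as DEFAULT_RATING.
    """
    red_elo = statistics.mean(ratings.get(t, DEFAULT_RATING) for t in red)
    blue_elo = statistics.mean(ratings.get(t, DEFAULT_RATING) for t in blue)
    expected_red = 1.0 / (1 + 10 ** ((blue_elo - red_elo) / 400.0))
    if red_score > blue_score:
        actual_red = 1.0
        mult = mov_multiplier(red_elo - blue_elo, red_score - blue_score)
    elif red_score < blue_score:
        actual_red = 0.0
        mult = mov_multiplier(blue_elo - red_elo, blue_score - red_score)
    else:
        actual_red = 0.5
        mult = mov_multiplier(0, 0)
    delta = k * mult * (actual_red - expected_red)
    # The update is zero-sum: blue loses exactly what red gains.
    for t in red:
        ratings[t] = ratings.get(t, DEFAULT_RATING) + delta
    for t in blue:
        ratings[t] = ratings.get(t, DEFAULT_RATING) - delta
    return delta


# Made-up match: reuse the same dict across events to keep ratings persistent.
ratings = {}
delta = update_match(ratings, ["frcA", "frcB", "frcC"], ["frcD", "frcE", "frcF"], 100, 60)
```

Because every team on an alliance receives the same change, this kind of update cannot tell alliance partners apart within a single match; differences only emerge as partners are shuffled over many qualification matches.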
Last edited by wjordan : 04-22-2016 at 07:09 PM. Reason: Fixed code error
#2
Re: 2016 Pre-Champs ELO Ratings
What exactly does ELO stand for? I've never heard of this statistic.
#3
Re: 2016 Pre-Champs ELO Ratings
Quote:
https://en.wikipedia.org/wiki/Elo_rating_system Fivethirtyeight.com uses it for a lot of their analyses and I'm growing to like it as a ranking method.
#4
Re: 2016 Pre-Champs ELO Ratings
It's actually Elo rating, named after its creator, Arpad Elo.
#5
Re: 2016 Pre-Champs ELO Ratings
How do you account for the lack of scoring for breach and capture during quals? That's critical to comparisons for elims.
#6
Re: 2016 Pre-Champs ELO Ratings
Are you talking about the missed 25 and 20 point bonuses in the eliminations by not including the elims? Thanks.
#7
Re: 2016 Pre-Champs ELO Ratings
As stated in the OP, elims results are not accounted for in the model.
#8
Re: 2016 Pre-Champs ELO Ratings
It doesn't account for elimination matches, but I've thought about adding the point bonuses for quals matches. I need to write some sort of prediction accuracy code to see if it's any more predictive.
#9
Re: 2016 Pre-Champs ELO Ratings
One source of error in this system arises because non-district teams play fewer qualifying matches than district teams. More capable teams, such as 16, 254, 330, 971, 2481, etc. would need a few more matches for their Elo ratings to converge from the initial seed (1500) toward a figure that better represents their performance.
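To get a rough feel for how slowly a rating climbs from the 1500 seed, here is a simplified sketch. It makes some loud assumptions that are not in the posted script: every opponent alliance is fixed at 1500, the team wins 90% of its matches on average, the margin-of-victory multiplier is ignored, and each match applies the average-case update rather than a real win or loss. `rating_after` is a hypothetical helper name.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1 + 10 ** ((opponent - rating) / 400.0))


def rating_after(n_matches, win_prob, k=20, start=1500.0, opponent=1500.0):
    """Average-case rating after n matches against fixed-strength opposition."""
    rating = start
    for _ in range(n_matches):
        # Average update: true win probability minus the model's expectation.
        rating += k * (win_prob - expected_score(rating, opponent))
    return rating


# Compare a short regional-style schedule with longer district-style ones.
for n in (16, 36, 48):
    print(n, round(rating_after(n, win_prob=0.9), 1))
```

Under these assumptions the rating keeps rising with every extra match, so a team with 16 matches ends up below an equally strong team with 48. Raising `k`, or seeding teams from a previous season instead of 1500, are two ways to shrink that gap.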
#10
Re: 2016 Pre-Champs ELO Ratings
Quote:
Last edited by wjordan : 04-22-2016 at 05:45 PM.
#11
Re: 2016 Pre-Champs ELO Ratings
Quote:
#12
Re: 2016 Pre-Champs ELO Ratings
I have been developing my own Elo rating system for FRC over the past few years. The way it works differs from the one in this post enough that I thought it might be interesting to compare.
The data in my ratings are based on the history of each team dating back to 2002. At the end of each season, each rating is pulled back toward the starting rating, keeping 80% of its distance from it. Since 0 is the starting rating, a team with a rating of 100 at the end of one season will begin the next season at 80. As this system uses matches from different games, it does not use win margins. A "K factor" of 32 is used in each match except in playoff matches, where the K factor is 16 (I too found that playoff matches seemed less predictive of future matches than qualification matches). At the 2014 FIRST Championship, this system had a 0.190 Brier score, so it at least performs better than flipping a coin!
Anyways, enough methodology. Here are the current rankings:
Code:
Rank | Team | Rating
1. frc254 448
2. frc1519 421
3. frc225 388
4. frc1678 381
5. frc195 373
6. frc2481 368
7. frc118 364
8. frc987 345
9. frc2056 344
10. frc359 340
11. frc1023 336
12. frc1241 333
13. frc1114 333
14. frc525 329
15. frc148 327
16. frc1986 325
17. frc330 323
18. frc67 321
19. frc133 312
20. frc2767 310
21. frc33 306
22. frc3310 300
23. frc16 299
24. frc1806 299
25. frc4564 295
26. frc1501 292
27. frc125 290
28. frc3238 284
29. frc5172 278
30. frc971 278
31. frc368 276
32. frc494 275
33. frc5254 270
34. frc2122 269
35. frc2771 268
36. frc3130 267
37. frc1619 266
38. frc1024 266
39. frc4334 264
40. frc4469 263
41. frc3683 263
42. frc2974 262
43. frc179 259
44. frc3314 259
45. frc27 259
46. frc3230 256
47. frc1540 254
48. frc1318 254
49. frc1983 254
50. frc3990 253
51. frc3339 253
52. frc126 249
53. frc2451 247
54. frc107 245
55. frc1730 245
56. frc3604 244
57. frc2067 243
58. frc3824 243
59. frc2468 243
60. frc233 242
61. frc2590 242
62. frc4103 239
63. frc180 238
64. frc1918 237
65. frc973 236
66. frc1418 236
67. frc4488 236
68. frc25 236
69. frc3688 235
70. frc5050 235
71. frc70 234
72. frc2883 232
73. frc1261 232
74. frc2848 232
75. frc341 229
76. frc1296 228
77. frc1746 227
78. frc177 227
79. frc868 226
80. frc4039 226
81. frc3937 226
82. frc744 225
83. frc1717 222
84. frc2614 222
85. frc4188 221
86. frc85 220
87. frc2137 220
88. frc3309 219
89. frc217 219
90. frc1065 219
91. frc1425 218
92. frc4967 218
93. frc836 218
94. frc1126 217
95. frc1836 216
96. frc2471 216
97. frc3255 215
98. frc2338 215
99. frc231 214
100. frc4003 214
EDIT: Here are the full rankings if anyone is interested.
Last edited by Carl C : 04-22-2016 at 06:06 PM.
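The two rules described in this post (the end-of-season carry-over and the smaller K factor in playoffs) can be sketched as below. This is only an illustration: the function and constant names are hypothetical, the 400-point scale in the expected-score formula is an assumption borrowed from standard Elo, and how team ratings are combined into an alliance rating is not stated in the post, so `update` simply takes two alliance-level ratings.

```python
SEASON_CARRYOVER = 0.8  # fraction of the rating kept between seasons (from the post)
K_QUAL = 32             # K factor for qualification matches (from the post)
K_PLAYOFF = 16          # K factor for playoff matches (from the post)


def start_of_season(rating, start=0.0):
    """Pull last season's final rating back toward the starting rating."""
    return start + SEASON_CARRYOVER * (rating - start)


def update(rating_a, rating_b, a_won, playoff=False):
    """Win/loss-only Elo update (no win margin); returns the new (a, b) ratings.

    The 400-point scale is an assumption, not something stated in the post.
    """
    k = K_PLAYOFF if playoff else K_QUAL
    expected_a = 1.0 / (1 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * ((1.0 if a_won else 0.0) - expected_a)
    return rating_a + delta, rating_b - delta


carried = start_of_season(100.0)               # the post's example: 100 becomes 80
qual = update(0.0, 0.0, a_won=True)            # evenly matched qualification match
elim = update(0.0, 0.0, a_won=True, playoff=True)
```

With two evenly rated alliances the winner gains half the K factor, so the same result moves ratings half as far in playoffs as in qualification.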
#13
Re: 2016 Pre-Champs ELO Ratings
Someone pointed out an error in the script I posted, which made the ratings slightly incorrect. I fixed the error and updated the rankings in the OP.
As for predictive power, this set of rankings had a Brier score of 0.155.
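For reference, the Brier score is the mean squared difference between the predicted win probability and what actually happened (1 for a win, 0 for a loss), so lower is better and always predicting 50% scores exactly 0.25. A minimal sketch follows; `brier_score` is a hypothetical helper name, and the predictions and outcomes are made-up numbers, not data from any event.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted win probabilities and 0/1 results."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    if not predictions:
        raise ValueError("need at least one prediction")
    total = sum((p - o) ** 2 for p, o in zip(predictions, outcomes))
    return total / len(predictions)


# Made-up example: four matches from one alliance's point of view.
results = [1, 0, 1, 1]
coin_flip = brier_score([0.5, 0.5, 0.5, 0.5], results)
model = brier_score([0.8, 0.3, 0.6, 0.9], results)
```

To score an Elo model this way, each prediction should be the expected score computed before the match is used to update the ratings; otherwise the model is graded on results it has already seen.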
#14
Re: 2016 Pre-Champs ELO Ratings
I can see at least one reason that "past performance does not guarantee future results", as the investment prospectus always says.