Paper: FRC Elo 2002-Present

This is a solid summary, I’ll try to remember to link to this in the future.

I’m planning to play around more with OPRm sometime around here, and I’ll post the results as I get them. I don’t actually know off-hand whether it has more or less predictive power than ixOPR, but if it does that’s what I’ll probably use moving forward.


I think OPRm is probably similar in predictive power, but has a bit more “tunability” than xOPR/ixOPR which may make it easier to eke out a few extra percent in performance.

A few points:

  • The weight that xOPR/ixOPR give to the prior OPR estimate depends on how many matches each team plays in an event, which seems odd and arbitrary to me. If you’re playing in a 12-match event, after your first match xOPR/ixOPR add in 11 fake matches, while if you’re playing in a 10-match event, after your first match they add in 9 fake matches. IMHO, after 1 match played your “new” OPR should be the same regardless of how many matches you have left to play in the event. You could instead add a single fake match with a weighting factor to bias the relative importance of the previous estimate and the new match results, and then you basically just end up doing OPRm.

  • xOPR/ixOPR depend on the particular match schedule for unplayed matches. This also seems odd and arbitrary. After 1 match, the xOPR/ixOPR results will be different depending on what the unplayed remaining match schedule is. Change that unplayed schedule and you’ll get different values for xOPR/ixOPR. OPRm doesn’t have this issue.

  • Regressing to the mean definitely helps predictive ability (at least it did when I studied it in my paper from a few years ago). OPRm does this naturally. You could do this with xOPR/ixOPR too by adding a fake match with match scores pulled toward the mean, or by doing that a bit for every fake match you currently use: i.e., instead of making the fake match score exactly the sum of the OPR estimates, make it x*(overall mean score) + (1-x)*(sum of the OPR estimates for the specific teams). (A rough sketch of this idea is below, after this list.)

  • At the end of an event, xOPR/ixOPR are 100% based on the event results. I think Caleb has found that better predictive results can be gained by weighting multiple events, at least in the Elo space. OPRm allows a weighted blending of results from previous estimates and from the current event. This could be done with xOPR/ixOPR by still having 1 or more fake matches added in even after all matches have been played.
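To make the single-weighted-fake-match idea concrete, here is a rough Python sketch of an OPR solve with one weighted prior row per team. This is just my reading of the OPRm idea under the assumptions above, not Caleb’s actual implementation; the function name, the per-team prior rows, and the weight parameter are all illustrative.

```python
import numpy as np

def opr_with_prior(alliances, scores, teams, prior, weight=1.0):
    """Least-squares OPR with one weighted 'fake match' row per team that
    pulls its estimate toward a prior value (regression to the mean, or a
    previous event's estimate).

    alliances: list of team-number tuples, one per alliance score
    scores:    alliance scores, same length as alliances
    teams:     ordered list of team numbers
    prior:     dict team -> prior contribution (e.g. overall mean score / 3,
               or that team's OPR from earlier events)
    weight:    scale factor on the prior rows; larger values pull the
               estimates harder toward the prior
    """
    idx = {t: i for i, t in enumerate(teams)}
    rows, rhs = [], []
    # Real alliance rows: sum of member contributions ~ alliance score.
    for alliance, score in zip(alliances, scores):
        row = np.zeros(len(teams))
        for t in alliance:
            row[idx[t]] = 1.0
        rows.append(row)
        rhs.append(score)
    # One weighted fake row per team: that team's contribution ~ prior[t].
    for t in teams:
        row = np.zeros(len(teams))
        row[idx[t]] = weight
        rows.append(row)
        rhs.append(weight * prior[t])
    solution, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return dict(zip(teams, solution))

# Tiny made-up example: two matches, prior pulls everyone toward 17.5 points.
teams = [111, 222, 333, 444, 555, 666]
alliances = [(111, 222, 333), (444, 555, 666)]
scores = [60.0, 45.0]
print(opr_with_prior(alliances, scores, teams, {t: 17.5 for t in teams}, weight=0.5))
```

With weight = 0 (and enough played matches) this reduces to plain OPR; cranking the weight up holds every estimate near its prior, which is the “tunability” knob mentioned above.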

2 Likes

I have a couple of questions.

First, I see that your prediction percentage averages around 70%. Is this number what you expected? How does this compare to other Elo-rated sports?

Secondly, I read about using Elo ratings for pro sports, and they use modifiers like home team, starting pitcher, and weather. Other than your veteran/rookie modifiers, what other modifiers do you think you could use to improve the prediction percentage? Matches played? Matches since the last match? Difference between the best and worst team in an alliance?

If you weight contributions to the alliance Elo in a non-constant way, can you gain predictive power? My understanding is that right now, you are taking the average.

Many FRC games have diminishing opportunities for scoring beyond two robots (Destination: Deep Space is one of them). As such, it would seem like the two highest-Elo robots on an alliance matter much more than the contribution of the third.

2 Likes

My understanding is probably oversimplified, but doesn’t Caleb call his metric Winning Margin Elo? That seems to imply that preventing opponents from scoring is weighted equally with scoring for your own alliance.

Is 70% what I expected? I don’t really recall anymore, to be honest. I built the first iteration of my system a few years ago and have only nudged the performance up by 2ish percent since then. I don’t know what number I expected exactly, probably around there, as I expected it to be reasonably close to OPR, which was already known to be in the 70% vicinity.

That sounds like a good question for someone else to look up. :grin:
My bs guesses would be something like:
mlb: 55%
nfl: 60%
nba: 65%

According to the 538 forecasts, here are the Brier scores for those 3 (I’m assuming their reliability is very close to 0 since they don’t state it and their charts look well calibrated):
mlb: .242
nfl: .221
nba: .212
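
For anyone not familiar, the Brier score is just the mean squared error between the predicted win probability and the 0/1 outcome; a minimal sketch (the names here are illustrative):

```python
def brier(predictions, outcomes):
    """predictions: win probabilities in [0, 1]; outcomes: 1 if that side won, else 0."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Three matches, each predicted at 70% for the side that ended up winning:
print(brier([0.7, 0.7, 0.7], [1, 1, 1]))  # 0.09
```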

Remember that lower is better with Brier scores. Brier scores from my Elo model are:

year Brier score
2002 0.2355
2003 0.2214
2004 0.2126
2005 0.2085
2006 0.2043
2007 0.2093
2008 0.1956
2009 0.1956
2010 0.1764
2011 0.1679
2012 0.1875
2013 0.1798
2014 0.1892
2015 0.1856
2016 0.1867
2017 0.2094
2018 0.1726
2019 0.1829

So every FRC game from the past few years has been more predictable than these pro sports (or I’m just way smarter than Nate Silver, take your pick).

I can’t really think of any others that I haven’t already attempted. I have tried modifiers for all 3 of the options you mention and none of them provide a large enough consistent improvement in predictive power for me to consider them worth adding. My future work will probably be more in the direction of finding year-specific attributes that can improve predictive power instead of features that improve performance in all years.

1 Like

Yes, you can definitely improve predictive power by giving different weights to the Elos of the strongest, middlest, and weakest teams. The problem is, the best “weights” to use vary drastically year to year. In 2016 for instance, I could improve performance by multiplying the highest team’s Elo by 1.5 and multiplying the middle and weakest teams’ ratings by 1. In 2017 though, this same structure caused a decrease in predictive power, and the best weights were 1,1,1. In 2018, I could very drastically improve predictive power by using the weights 1.5,1,0.5 (as in, the weakest team is only weighted a third as much as the strongest, and only half as much as the middle team). I haven’t tried for 2019, but my rough guess would be that the optimal weights would be around 1.5,1,1.
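For anyone who wants to experiment, here’s a rough sketch of the kind of weighted combination described above, fed through the classic 400-point logistic Elo curve. The weights, scale, and function names are illustrative examples, not the exact constants from my model.

```python
def alliance_rating(elos, weights=(1.0, 1.0, 1.0)):
    """Combine three alliance members' Elos, weighting strongest to weakest."""
    strongest_first = sorted(elos, reverse=True)
    return sum(w * e for w, e in zip(weights, strongest_first))

def red_win_probability(red_elos, blue_elos, weights=(1.0, 1.0, 1.0), scale=400.0):
    """Logistic win probability from the difference of weighted alliance ratings."""
    diff = alliance_rating(red_elos, weights) - alliance_rating(blue_elos, weights)
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))

# 2018-style weights: the strongest robot counts 3x as much as the weakest.
print(red_win_probability([1800, 1500, 1450], [1600, 1550, 1500], weights=(1.5, 1.0, 0.5)))
```

Since the example weights sum to 3, the weighted total stays on roughly the same scale as a raw sum, so the two approaches can be compared directly.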

I’ve tried some other ways of mixing the Elos other than just taking their raw sum, but I haven’t found any alternative arrangement that improves performance in all years. As such, I opt for simplicity and take raw sums. Somewhere down the line I’ll incorporate year-specific features, and that should really help predictive power.

Correct, any way you improve your winning margin will improve your Elo, offense or defense.

1 Like

I’ve created a GitHub repo for my Elo books here:

I also just updated to 2002-2019 v2. There was a bug that prevented the three highest-numbered teams from displaying in the season summary tabs, and this has been corrected.

4 Likes

It seems to me that there must be a limit to the predictive power of any type of rating for events that have an “unpredictable random” element (as opposed to rolling loaded dice, which would have a “predictable random” element). Current Elo predictions appear to run in the low-70% range, but this community continues to explore ways to tweak the algorithm and squeeze another few percent out of the data. Do the predictions asymptotically approach some maximum value? Would it ever be possible to know if/when a model had reached its ideal state?

I would expect that the “ideal” model for any particular game could be identified retroactively (via AI or some automated iterative approach, etc.), but it would change for each game.

I think that there is an inherent uncertainty to every match, and that the more effective way to think about match prediction and its ceiling isn’t in terms of correctly predicting the outcome, but in terms of correctly predicting the uncertainty (probability). Similarly, the error metric of choice shouldn’t be accuracy, but rather mean squared error or log loss, which are better suited to probabilistic models. Predicting a 51% chance of a loser winning is a much smaller failure than predicting a 90% chance of a loser winning.
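
To put numbers on that last sentence, here’s a quick comparison of the per-match penalty under both metrics (a sketch with made-up probabilities):

```python
import math

def brier_term(p_winner):
    """Squared-error penalty for one match, given the probability assigned to the eventual winner."""
    return (1.0 - p_winner) ** 2

def log_loss_term(p_winner):
    """Log-loss penalty for one match, given the probability assigned to the eventual winner."""
    return -math.log(p_winner)

# Winner was given a 49% chance (loser at 51%) vs. only a 10% chance (loser at 90%):
print(brier_term(0.49), log_loss_term(0.49))  # ~0.26, ~0.71  -> mild penalty
print(brier_term(0.10), log_loss_term(0.10))  # ~0.81, ~2.30  -> much harsher
```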

2 Likes

Any chance of getting raw csvs of the data?

Here are CSVs of the season summary tabs; is that what you wanted?
Max Elos.csv (489.6 KB)
End of Season Elos.csv (616.6 KB)
Average Elos.csv (489.6 KB)

I’m guessing that the weights for the first two are close to each other but higher than the third. That’s due to a single defender being able to focus on the best bot, which then requires the second bot to step up to win the match.

1 Like

Per request from @Brian_Maher, I’ve added event keys and an “All Event Elos” sheet to this book. You can find this in v3 here.

Here are the top 100 all time end of playoff Elos I’ve got:

Top 100 End of Event Elos Since 2002
Year Event Name Team Elo
2010 Michigan FIRST Robotics Competition State Championship 469 2224
2010 Greater Toronto Regional 1114 2201
2015 Windsor Essex Great Lakes Regional 1114 2187
2010 Curie Division 469 2187
2015 Waterloo Regional 1114 2175
2015 Silicon Valley Regional sponsored by Google.org 254 2166
2010 Troy FIRST Robotics District Competition 469 2162
2015 Curie Division 1114 2149
2015 Carson Division 254 2136
2010 Waterloo Regional 1114 2134
2010 Curie Division 1114 2111
2013 Greater Toronto West Regional 1114 2107
2008 Galileo Division 1114 2101
2013 Galileo Division 1114 2095
2015 Greater Toronto Central Regional 1114 2091
2007 Greater Toronto Regional 1114 2088
2013 Curie Division 2056 2085
2013 Greater Toronto West Regional 2056 2085
2010 Michigan FIRST Robotics Competition State Championship 67 2067
2013 Waterloo Regional 1114 2063
2012 Michigan FRC State Championship 469 2060
2013 Waterloo Regional 2056 2057
2015 Newton Division 1678 2057
2011 Waterloo Regional 1114 2054
2014 Curie Division 254 2050
2015 Windsor Essex Great Lakes Regional 2056 2047
2008 Greater Toronto Regional 1114 2046
2012 Troy FIRST Robotics District Competition 469 2043
2011 Greater Toronto East Regional 1114 2043
2018 Silicon Valley Regional 254 2043
2014 Greater Toronto East Regional 1114 2039
2018 Einstein Field (Houston) 254 2036
2015 Galileo Division 2056 2035
2016 Einstein Field 1114 2030
2010 Galileo Division 2056 2030
2007 Curie Division 330 2028
2018 Tesla Division 2056 2028
2015 Archimedes Division 1023 2027
2018 Hopper Division 254 2026
2010 Newton Division 67 2024
2013 Michigan FRC State Championship 469 2023
2011 Greater Toronto West Regional 2056 2023
2013 Archimedes Division 987 2023
2007 Galileo Division 25 2022
2011 Einstein Field 1114 2021
2015 FIM District - Bedford Event 1023 2021
2013 Greater Toronto East Regional 2056 2020
2007 San Diego Regional 330 2019
2007 Waterloo Regional 1114 2014
2014 Waterloo Regional 1114 2013
2006 Newton Division 25 2013
2008 Waterloo Regional 1114 2010
2018 New England District Championship 195 2005
2015 Waterloo Regional 2056 2005
2012 Northville FIRST Robotics District Competition 67 2004
2010 Cass Tech FIRST Robotics District Competition 217 2003
2012 Archimedes Division 2056 2003
2010 Pittsburgh Regional 1114 2002
2015 Silicon Valley Regional sponsored by Google.org 1678 2000
2015 Lone Star Regional 118 1997
2011 Pittsburgh Regional 1114 1997
2006 Archimedes Division 233 1996
2018 Wisconsin Regional 2481 1994
2018 Sacramento Regional 1678 1994
2018 Arizona North Regional 254 1994
2016 Silicon Valley Regional presented by Google.org 254 1993
2016 Dallas Regional 148 1993
2015 FIRST in Michigan District Championship 1023 1993
2018 Einstein Field (Houston) 1678 1993
2012 Troy FIRST Robotics District Competition 67 1992
2018 Newton Division 1678 1992
2010 Greater Toronto Regional 2056 1992
2013 San Diego Regional 987 1992
2018 FIRST Mid-Atlantic District Championship 2590 1989
2015 Newton Division 118 1989
2007 Great Lakes Regional 1114 1988
2010 Archimedes Division 254 1988
2013 Greater Toronto East Regional 1114 1987
2009 Lansing FIRST Robotics District Competition 67 1986
2015 Hopper Division 987 1985
2013 Bedford FIRST Robotics District Competition 469 1984
2009 Einstein Field 1114 1984
2013 Archimedes Division 469 1981
2018 FIRST Ontario Provincial Championship 2056 1980
2013 Michigan FRC State Championship 67 1980
2018 FIRST Ontario Provincial Championship - Science Division 2056 1979
2015 Greater Toronto East Regional 2056 1979
2013 Curie Division 67 1978
2006 Waterloo Regional 1114 1977
2012 Newton Division 1717 1977
2016 Hopper Division 1678 1976
2016 Central Valley Regional 254 1976
2015 Las Vegas Regional 148 1974
2011 Einstein Field 469 1973
2018 MAR District Montgomery Event 225 1972
2011 Michigan FIRST Robotics District Competition State Championship 217 1972
2019 ONT District McMaster University Event 2056 1970
2012 Waterloo Regional 2056 1970
2010 Las Vegas Regional 25 1969
2015 Central Valley Regional 254 1969

Teams with >2 entries in top 100:

team count in top 100
1114 27
2056 17
254 11
469 9
67 7
1678 6
25 3
1023 3
987 3
3 Likes

A caveat: some years are clearly much easier to get high Elos in than others:

year count in top 100
2002 0
2003 0
2004 0
2005 0
2006 3
2007 6
2008 3
2009 2
2010 14
2011 7
2012 7
2013 15
2014 3
2015 20
2016 5
2017 0
2018 14
2019 1

What a fitting highest regional Elo for 330 in their final season.

3 Likes

Elo can predict much more than just match outcomes if you look closely enough.

6 Likes

:grimacing:

Maybe it’s just predicting their first Champs win?

Also, the top 16 of all time are just 3 teams, and 11 of them are 1114. It’s preposterous how dominant they were in 2010 and 2015.

4 Likes

That one doesn’t quite seem right, as 1114 wasn’t on Einstein in 2016.
Am I missing something?

3 Likes

Good catch! I had an error in the event key reporting for Einstein years. Here is a fixed book. I had fixed this issue in another version of my book, but apparently I uploaded the wrong one. Here’s also a fixed top 100 list:

Fixed top 100
Year Event Name Team Elo
2010 Michigan FIRST Robotics Competition State Championship 469 2224
2010 Greater Toronto Regional 1114 2201
2015 Windsor Essex Great Lakes Regional 1114 2187
2010 Curie Division 469 2187
2015 Waterloo Regional 1114 2175
2015 Silicon Valley Regional sponsored by Google.org 254 2166
2010 Troy FIRST Robotics District Competition 469 2162
2015 Curie Division 1114 2149
2015 Carson Division 254 2136
2010 Waterloo Regional 1114 2134
2010 Curie Division 1114 2111
2013 Greater Toronto West Regional 1114 2107
2008 Galileo Division 1114 2101
2013 Galileo Division 1114 2095
2015 Greater Toronto Central Regional 1114 2091
2007 Greater Toronto Regional 1114 2088
2013 Curie Division 2056 2085
2013 Greater Toronto West Regional 2056 2085
2010 Michigan FIRST Robotics Competition State Championship 67 2067
2013 Waterloo Regional 1114 2063
2012 Michigan FRC State Championship 469 2060
2013 Waterloo Regional 2056 2057
2015 Newton Division 1678 2057
2011 Waterloo Regional 1114 2054
2014 Curie Division 254 2050
2015 Windsor Essex Great Lakes Regional 2056 2047
2008 Greater Toronto Regional 1114 2046
2012 Troy FIRST Robotics District Competition 469 2043
2011 Greater Toronto East Regional 1114 2043
2018 Silicon Valley Regional 254 2043
2014 Greater Toronto East Regional 1114 2039
2018 Einstein Field (Houston) 254 2036
2015 Galileo Division 2056 2035
2015 Einstein Field 1114 2030
2010 Galileo Division 2056 2030
2007 Curie Division 330 2028
2018 Tesla Division 2056 2028
2015 Archimedes Division 1023 2027
2018 Hopper Division 254 2026
2010 Newton Division 67 2024
2013 Michigan FRC State Championship 469 2023
2011 Greater Toronto West Regional 2056 2023
2013 Archimedes Division 987 2023
2007 Galileo Division 25 2022
2010 Einstein Field 1114 2021
2015 FIM District - Bedford Event 1023 2021
2013 Greater Toronto East Regional 2056 2020
2007 San Diego Regional 330 2019
2007 Waterloo Regional 1114 2014
2014 Waterloo Regional 1114 2013
2006 Newton Division 25 2013
2008 Waterloo Regional 1114 2010
2018 New England District Championship 195 2005
2015 Waterloo Regional 2056 2005
2012 Northville FIRST Robotics District Competition 67 2004
2010 Cass Tech FIRST Robotics District Competition 217 2003
2012 Archimedes Division 2056 2003
2010 Pittsburgh Regional 1114 2002
2015 Silicon Valley Regional sponsored by Google.org 1678 2000
2015 Lone Star Regional 118 1997
2011 Pittsburgh Regional 1114 1997
2006 Archimedes Division 233 1996
2018 Wisconsin Regional 2481 1994
2018 Sacramento Regional 1678 1994
2018 Arizona North Regional 254 1994
2016 Silicon Valley Regional presented by Google.org 254 1993
2016 Dallas Regional 148 1993
2015 FIRST in Michigan District Championship 1023 1993
2018 Einstein Field (Houston) 1678 1993
2012 Troy FIRST Robotics District Competition 67 1992
2018 Newton Division 1678 1992
2010 Greater Toronto Regional 2056 1992
2013 San Diego Regional 987 1992
2018 FIRST Mid-Atlantic District Championship 2590 1989
2015 Newton Division 118 1989
2007 Great Lakes Regional 1114 1988
2010 Archimedes Division 254 1988
2013 Greater Toronto East Regional 1114 1987
2009 Lansing FIRST Robotics District Competition 67 1986
2015 Hopper Division 987 1985
2013 Bedford FIRST Robotics District Competition 469 1984
2008 Einstein Field 1114 1984
2013 Archimedes Division 469 1981
2018 FIRST Ontario Provincial Championship 2056 1980
2013 Michigan FRC State Championship 67 1980
2018 FIRST Ontario Provincial Championship - Science Division 2056 1979
2015 Greater Toronto East Regional 2056 1979
2013 Curie Division 67 1978
2006 Waterloo Regional 1114 1977
2012 Newton Division 1717 1977
2016 Hopper Division 1678 1976
2016 Central Valley Regional 254 1976
2015 Las Vegas Regional 148 1974
2010 Einstein Field 469 1973
2018 MAR District Montgomery Event 225 1972
2011 Michigan FIRST Robotics District Competition State Championship 217 1972
2019 ONT District McMaster University Event 2056 1970
2012 Waterloo Regional 2056 1970
2010 Las Vegas Regional 25 1969
2015 Central Valley Regional 254 1969
1 Like