I was lazy and just used 1500 as the average Elo, even though that’s not exact, as it fluctuates between 1485 and 1515 each year. I would guess that’s where the discrepancy lies.
Interesting. I might have guessed 2010 would be higher, but 2015 and especially 2007 really surprise me. I had a related conversation on another channel, but I’ll copy the thoughts here.
Essentially, I’m seeing some serious shortcomings of using average Elo as a holistic measurement of a team’s season. I originally decided to use average Elo because that’s what 538 did and because I’ve recently been sick of seeing teams’ end of season Elos get artificially inflated by one late season match. However, neither of these is an adequate justification on its own.
The biggest problem I’m seeing with average Elos is that they don’t do a particularly great job of isolating a single season’s performance, which is potentially why 1114 has so many years better than 2008. For the multi-year ranges, this is less of a problem, since it’s not a big deal if part of one year’s “rating” actually comes from the previous year, but for single year strengths it’s a much larger issue. I did some rough calculations using the assumption that each match changes a team’s Elo by about 5%, and got the following breakdowns of how much impact 2012 and 2013 Elos would have on a hypothetical team’s 2014 average Elo (again, very crude):
Obviously as teams get more matches, the dependency on the previous years’ Elos goes down, but not super quickly. In contrast, if we used end of season Elo, it looks more like this:
There’s much less dependence on past years, and the drop-off happens more quickly as matches accumulate.
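For what it’s worth, that 5%-per-match assumption can be sketched numerically. This is just a toy model of the carryover (the 0.05 update fraction and the match counts are illustrative, not the actual rating code):

```python
# Toy model: if each match moves a team's Elo ~5% of the way toward "new
# information," then after k matches the start-of-season Elo (inherited from
# prior years) still carries a (1 - 0.05)**k weight.

def prior_season_weight(n_matches, k=0.05):
    """Fraction of the END-of-season Elo still attributable to the
    start-of-season (prior years') value."""
    return (1 - k) ** n_matches

def prior_season_weight_avg(n_matches, k=0.05):
    """Same fraction, but for the season-long AVERAGE Elo: the mean of the
    per-match weights (1-k)**1 ... (1-k)**n."""
    return sum((1 - k) ** m for m in range(1, n_matches + 1)) / n_matches

for n in (10, 20, 40):
    print(n, round(prior_season_weight_avg(n), 2), round(prior_season_weight(n), 2))
```

Under these assumptions, a 40-match season’s average Elo still gets roughly 40% of its value from prior seasons, while end of season Elo is down to roughly 13%, which matches the “drops more quickly” observation above.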
I need to do some proper investigation to see which is actually superior. The best option will very likely be some blend of average, end of season, and max Elos, but I’d need to determine which weights to use. Another thing I want to look into is computing the average Elo only after the team has played its first X matches (I’ve seen 538 ignore the first 4 NFL games, for example). This would help remove some of the dependence on prior seasons’ performances.
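A sketch of what that trimming might look like (toy numbers: the 5%-per-match convergence, the 1400/1600 ratings, and the 4-match skip are all assumptions for illustration, not the real model):

```python
def trimmed_average(values, skip=4):
    """Average of a season's per-match Elo ratings, ignoring the first
    `skip` matches (analogous to 538 skipping early NFL games)."""
    tail = values[skip:]
    return sum(tail) / len(tail)

# Hypothetical team that starts the season rated 1400 but "should" be 1600:
elos, rating = [], 1400.0
for _ in range(40):
    rating += 0.05 * (1600 - rating)  # each match closes 5% of the gap
    elos.append(rating)

plain = sum(elos) / len(elos)         # dragged down by the slow start
trimmed = trimmed_average(elos, 4)    # sits closer to the team's true level
print(round(plain), round(trimmed))
```

Even dropping only 4 matches nudges the average toward the team’s current strength; dropping more would shrink the prior-season dependence further, at the cost of using fewer data points.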
paper: FRC Elo 2008-2016
I’m really curious what that chart would look like if end of season Elo were used here.
Whoa! What an awesome job of statistical processing.
I’ll buy the argument that this doesn’t take championships or trophies or such into account… you laid out your assumptions and on the basis of those assumptions your data makes for a very convincing case. There are different ways to define “dynasty” but given your definition… which is at least as valid as any other arbitrary definition… I don’t think there is any argument. The only argument can be with the definition of ‘dynasty’ and there is no way to prove that, statistically or otherwise.
More significantly, perhaps, when I look at the data, not just world-wide, but for the regions that I’ve been familiar with since '04 or '05, the data ‘feels’ right. I didn’t see any data that made me wonder, “Who are THEY and why are THEY here?” Every team I saw listed was always a team I would rather have as an alliance partner than as an opponent. (Although it is nice to play against the ‘greats’ occasionally, because every now and then you score a memorable win.)
Now just to show you that I actually read the data, I was caught a bit off guard by the data for British Columbia, a region I know very well because… um… for the period of 2004-2010 it was pretty much one team. Digging into the raw data, I discovered that the data was correct, but that the team (okay, heck… it was my team) 1346 wasn’t listed as being from British Columbia. Fair enough… the team wouldn’t be in any recent databases because it shut down in 2010.
So that’s the biggest critique I have… some of your location data for defunct teams is not complete. And I hope you can hear the smile in my typing as I point this out… otherwise you’d be entirely within your rights to think, “Seriously dude? THAT’s your critique?” Sorry, but it’s the best I’ve got… everything else looks really sound.
I’ll leave it to others to debate what a dynasty is, but I’ll express my admiration for the thoroughness with which you have pursued your definition, and the very interesting conversations that have stemmed from the results!
Just a suggestion, but maybe you could try a logarithmic scale of win margin to tame those crazy outliers.
When I looked at the OPRs in 2008, 1114 had the greatest margin in standard deviations over the next team that I calculated. In 2015, 1114’s standard deviations over the mean was almost as great, but about a half dozen teams had nearly as high values, so 1114 was not as dominant.
Off season events should be excluded because 1) teams have differential access to off season events and 2) teams generally do not treat off season events as seriously. In California at least, many teams even bring multiple robots to off season events, which splits their competitiveness.
This is probably due to Caleb’s use of 1500 as a base average instead of calculating it individually for each season.
In addition, teams typically don’t use their competition season drive team in an effort to let members with other roles have fun.
Here you go my man
Same general trends as the average Elo graph. Biggest change is that the 1 and 2 year streaks are more diverse, which I think is more reasonable. 2056 and 254 gain a little ground on 1114 as well.
End of season Elo is possibly a better representation of everyone’s memories of who the best teams were as well, since teams who had deep champs playoff runs have an additional chance to improve their Elo before the season ends.
That 1717 was the dominant robot in 2012 is more consistent with the perceived situation. Also, 254 was clearly the top robot in 2014, not 1114. That 2056 surpassed 1114 after 2015 is also more consistent, and we see 469’s run up to 2012. 1114’s dynasty run peters out after 2015 in this analysis, and 254’s emergence is more incremental.
Can we get end of season Elo for all regions and with team lookup in a similar spreadsheet? Thanks!
Fun fact: 7/8 of time-spans where 1114 was the best team in the world contained 2008.
How about a logistic function? May have better results than logarithmic.
I might give it a shot, but I don’t think the specific function I use will help things out much, as I tried both an “Elo change maximum” and a logarithm approach in the linked thread, and both were similarly unsuccessful.
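For comparison’s sake, here’s roughly what those transforms look like side by side (the cap of 50 and the logistic scale of 30 are made-up parameters for illustration, not values from the actual model):

```python
import math

def capped(margin, cap=50):
    """'Elo change maximum' style: clip the winning margin at a hard cap."""
    return min(margin, cap)

def log_scaled(margin):
    """Logarithmic compression: large margins still grow, but slowly."""
    return math.log1p(margin)

def logistic_scaled(margin, scale=30.0):
    """Logistic squashing: saturates smoothly toward 1 instead of clipping.
    Maps margin >= 0 into [0, 1)."""
    return 2 / (1 + math.exp(-margin / scale)) - 1

for m in (5, 50, 200):
    print(m, capped(m), round(log_scaled(m), 2), round(logistic_scaled(m), 2))
```

The qualitative difference is at the top end: the cap throws away all distinction between a 50-point and a 200-point win, the log keeps a little, and the logistic smoothly flattens, which is why the choice among them may matter less than how much blowouts should count in a given game.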
I think the broader issue is that the amount of “information” that we can take away from blowout matches varies drastically depending on the game. For example, in 2018 blowouts might not tell you as much because the positive feedback loops in the game design made them “easier” to achieve. In 2017 in contrast, it was much harder to get blowout victories due to the diminishing point return for scoring gears, so the fact that an alliance destroyed another alliance is actually a pretty useful data point that you probably don’t want to give less weight than you give to tighter victories.
The best solution would obviously be to have the Elo calculation method vary between years depending on game specific features, but I’m not there yet.
This is an interesting way to look at these charts, finding the year which is in the most ranges for each team. We can think of this as a “dynasty making” year. Here are the dynasty making years for teams with at least 5 entries:
1114: 2008 (63/72)
254: 2015 (18/26)
2056: 2012 (19/20)
233: 2004 or 2005 (17/19)
469: 2010 (5/5)
254’s dynasty looks to be the least “centered” of these by a solid margin.
Sometime in the next few days.
Here’s a similar book that uses end of season Elos:
I made it public to let people fill in colors if they want (although if you do you should also add them here)
Here’s a private sheet in case someone screws up the public one:
I do think I like end of season Elos better than average Elos for the single year ranges. I’m still not sure about the multi-year ranges, but I’d guess they should be better there too if the single year ranges are better. Maxes are probably the best, though.