A couple of months ago, someone asked me why I felt that Elo was a worthwhile metric for comparing teams from isolated regions. I did some quick calculations then to prove my point, but I wanted to look into this question more. Intuitively, we all know some regions are stronger than others (and we all think our region is under-rated), but does Elo actually run into regional bias problems like this?
The best method I could think of to answer this question was to compare the Elo ratings of every team attending championships before and after they competed on the international stage. Since championships are by a large margin the most diverse events we have, this seemed like a good place to find out whether teams from some regions consistently over- or under-perform their Elo expectations. Since Elo is a zero-sum algorithm (during the season at least), we would expect to consistently see the Elos of teams from heavily under-rated regions increase, and the Elos of teams from heavily over-rated regions decrease, at the championship event.
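To make the comparison concrete, here is a minimal sketch of the bookkeeping: group teams by region and average each team's Elo change across championships. The rows and the region codes "AA" and "BB" are made up for illustration and are not taken from the spreadsheet, and `mean_change_by_region` is a hypothetical helper name, not something from the original analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (region, Elo before championships, Elo after championships).
# These numbers are invented for illustration only.
rows = [
    ("AA", 1600, 1590),
    ("AA", 1550, 1530),
    ("BB", 1500, 1512),
    ("BB", 1650, 1658),
]

def mean_change_by_region(rows):
    """Return {region: average (after - before) Elo change} over all rows."""
    changes = defaultdict(list)
    for region, before, after in rows:
        changes[region].append(after - before)
    return {region: mean(deltas) for region, deltas in changes.items()}

summary = mean_change_by_region(rows)
for region, avg in sorted(summary.items()):
    print(region, avg)
```

A consistently negative average would suggest a region went into championships over-rated, and a consistently positive one would suggest it was under-rated. Whether an average is distinguishable from noise is a separate question that needs a significance test.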
I have attached the results of this endeavor to this post. The spreadsheet contains the raw data, a summary of Elo ratings of every region from 2008-2017, and a condensed summary that looks at the year 2017 in isolation as well as the period 2012-2017. I chose the period 2012-2017 because I wanted to give the Elo model a few years to account for regional differences on its own.
If you don’t want to look at the document, here is the summary:
The only regions that were significantly (p < 0.05) over-rated according to Elo in 2017 were RI (p = 0.024) and OK (p = 0.035). Considering that I was testing 61 different regions, and that none of them had a p-value below 1/61, we are severely lacking good evidence that Elo was consistently over-rating any region in 2017 alone.
No regions were found to be significantly under-rated according to Elo in 2017.
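As a rough illustration of why two hits out of 61 tests is weak evidence, the sketch below works out how many "significant" results chance alone would produce. It assumes the 61 tests are independent and that no region is truly biased, which is a simplification and not something the spreadsheet establishes.

```python
# Numbers taken from the post: 61 regions tested at the 0.05 level.
n_tests = 61
alpha = 0.05

# Expected number of false positives if no region were truly mis-rated
# (assumes independent tests, which is a simplification).
expected_false_positives = n_tests * alpha

# Chance of seeing at least one p < alpha purely by luck.
p_at_least_one = 1 - (1 - alpha) ** n_tests

# The 2017 p-values quoted in the post, checked against the 1/61 cutoff.
p_values_2017 = {"RI": 0.024, "OK": 0.035}
below_cutoff = [r for r, p in p_values_2017.items() if p < 1 / n_tests]

print(expected_false_positives)
print(round(p_at_least_one, 3))
print(below_cutoff)
```

Under those assumptions you would expect about three regions to cross p < 0.05 by luck, so seeing only two is no cause for concern.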
For the period 2012-2017, 8 regions were found to be significantly over-rated by Elo. Those regions were:
Region (average Elo change at championships):
RI (-55)
OK (-24)
NY (-10)
Brazil (-41)
Mexico (-16)
TX (-9)
MO (-11)
LA (-23)
For reference, a 7-point Elo change corresponds to roughly a 1 percentage point change in the likelihood of winning an otherwise even match.
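For anyone who wants to check that conversion, here is a small sketch using the standard logistic Elo formula with a 400-point scale. That scale is an assumption on my part; the post does not state the exact parameters of the model, and `win_probability` is just an illustrative name.

```python
def win_probability(elo_diff, scale=400.0):
    """Standard logistic Elo expectation for a rating advantage of elo_diff.

    The 400-point scale is the conventional choice and is assumed here;
    the model behind the spreadsheet may use different parameters.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / scale))

# Rating gaps mentioned in the post: 7 (reference point) and 24 (OK's average drop).
for diff in (0, 7, 24):
    shift = win_probability(diff) - 0.5
    print(diff, round(100 * shift, 1), "percentage points above even")
```

With these assumptions, a 7-point gap moves an even match by about 1 percentage point and a 24-point gap by a bit over 3.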
Likewise for 2012-2017, 4 regions were found to be significantly under-rated by Elo. Those regions were:
MI (14)
CT (15)
SD (81)
ON (9)
Here is my take on the results:
The most extreme Elo changes for 2012-2017 came from regions that sent only a single-digit number of teams during that period: RI with 8, Brazil with 9, and SD with 2. Even though these regions did have significant changes, they were sending fewer than two teams to championships each year during this time, so their large Elo changes could very likely be explained by other factors.
Excepting those three, the largest consistent Elo change we see in any region during this period comes from OK at -24. This means that, in an otherwise even match, we would expect Elo to overestimate an OK team’s win likelihood by about 3%. As another reference point, one of your climbs in 2017 was worth about 11 Elo points to every team on your alliance and -11 Elo points to every team on the opposing alliance.
The regional differences found in this endeavor were not big surprises to me and, if anything, were small enough that they have given me more confidence in Elo as a rating system. Any major biases in Elo will self-correct over time. The only region that I would feel comfortable predicting Elo will keep under-rating is MI, considering that they have been under-rated since 2009, they have a large sample size to work with, their rookie growth for the past few years has been abnormal relative to the rest of FRC, and they will likely continue to be one of the most heavily isolated regions into the near future.
Also, MN just barely wasn’t significantly over-rated for the period 2012-2017 (p = 0.057), so that makes me happy sort of.
Elo region comparison.xlsx (1.52 MB)