paper: Miscellaneous Statistics Projects

statistics

#1

Thread created automatically to discuss a document in CD-Media.

Miscellaneous Statistics Projects
by: Caleb Sykes

A collection of small projects that will be explained in the associated thread.

I frequently work on small projects that I don’t believe merit entire threads on their own, so I have decided to upload them here and make a post about them in an existing thread. I also generally want my whitepapers to have instruction sheets so that anyone can pick them up and understand them. However, I don’t want to bother with this for my smaller projects.

IRI seeding projections.xlsx (5.13 MB)
Elo and OPR comparison.xlsx (2.84 MB)
2017 Chairman’s predictions.xlsm (738 KB)
2018_Chairman’s_predictions.xlsm (266 KB)
Historical mCA.xlsx (428 KB)
Greatest upsets.xlsx (6.12 MB)
surrogate_results.xlsx (53 KB)
2017 rest penalties.xlsx (5.79 MB)
2018_Chairman’s_predictions v2.xlsm (379 KB)
auto_mobility_data.xlsx (6.1 MB)
2018 start of season Elos.xlsx (137 KB)
2018 start of season Elos v2.xlsx (140 KB)
most competitive events since 2005.xlsx (9.64 MB)
most competitive divisions since 2005.xlsx (6.69 MB)
OPR seed investigator.xlsx (367 KB)
serpentine_valley.xlsx (15.3 MB)


#3

In this post, Citrus Dad asked for a comparison of my Elo and OPR match predictions for the 2017 season. I have attached a file named “Elo and OPR comparison” that does this. Every qual match from 2017 is listed, along with the Elo projection, the OPR projection, and the average of the two. The squared error of each projection is also shown, and these squared errors are averaged to get a Brier score for each of the three models.

Here are the Brier score summaries of the results.

Total Brier scores
OPR	Elo	Average
0.212	0.217	0.209

Champs only Brier scores
OPR	Elo	Average
0.208	0.210	0.204

The OPR and Elo models have similar Brier scores (lower is better), with OPR holding a slight edge. This is directly in line with results from other years. However, match predictions this year were much less accurate than in any year since at least 2009, likely because of the non-linear, step-function-like aspects of scoring in the 2017 game. My primary prediction method last season actually used a raw average of the Elo and OPR predictions, which provided more predictive power than either method alone.
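For anyone who wants to reproduce this kind of comparison outside the spreadsheet, here is a minimal Python sketch of the Brier score and the raw-average blend. The function names and the example numbers are made up for illustration; they are not taken from the workbook.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted win probabilities and results.

    predictions: probabilities (0 to 1) that a given alliance wins each match.
    outcomes: 1 if that alliance won, 0 if it lost.
    """
    squared_errors = [(p - o) ** 2 for p, o in zip(predictions, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def average_predictions(first, second):
    """Blend two models by taking the raw average of their probabilities."""
    return [(a + b) / 2 for a, b in zip(first, second)]


# Invented example values, only to show the calls.
elo_predictions = [0.70, 0.40, 0.55]
opr_predictions = [0.60, 0.30, 0.65]
results = [1, 0, 1]

blended = average_predictions(elo_predictions, opr_predictions)
print(brier_score(elo_predictions, results))
print(brier_score(opr_predictions, results))
print(brier_score(blended, results))
```

As a reference point, a model that always predicts 50% scores exactly 0.25, so the values in the tables above should be read against that baseline.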


#4

Thanks


#5

I am currently working on a model which can be used to predict who will win the Chairman’s Award at a regional or district event. I am not covering district championship Chairman’s or Championship Chairman’s because of their small sample sizes. The primary inputs to this model are the awards data of each team at all of their previous events, although previous season Elo is also taken into account.

The model essentially works by assigning a value to every regional/district award a team wins. I call these points milli-Chairman’s Awards, or mCA points. I defined a Chairman’s win in the current season at a base event of 50 teams to be worth 1000 mCA. Thus, every award value can be interpreted as a fraction of a Chairman’s Award. Award values and model parameters were the values found to provide the best predictions of 2015-2016 Chairman’s wins. At each event, a logistic distribution is used to map a team’s total points to their likelihood of winning the Chairman’s Award at that event. Rookies, HOF teams, and teams that won Chairman’s earlier in the season are assigned a probability of 0%.
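A rough Python sketch of the last step, mapping mCA totals to win probabilities at one event, might look like the following. The `midpoint` and `scale` values are placeholders and not the fitted parameters from the workbook, and normalizing the logistic values so the event sums to 100% is an assumption made for this sketch.

```python
import math


def chairmans_probabilities(teams, midpoint=3000.0, scale=1500.0):
    """Map each team's mCA total to a probability of winning Chairman's.

    teams: list of dicts with keys 'team', 'mca', and 'eligible'.
    midpoint, scale: hypothetical logistic parameters (not the real ones).
    """
    weights = {}
    for entry in teams:
        if not entry["eligible"]:
            # Rookies, HOF teams, and earlier winners this season get 0%.
            weights[entry["team"]] = 0.0
        else:
            z = (entry["mca"] - midpoint) / scale
            weights[entry["team"]] = 1.0 / (1.0 + math.exp(-z))

    total = sum(weights.values())
    if total == 0:
        return weights
    # Assumption: rescale so the probabilities at the event sum to 1.
    return {team: weight / total for team, weight in weights.items()}


# Hypothetical event with made-up teams and mCA totals.
event = [
    {"team": "A", "mca": 6000, "eligible": True},
    {"team": "B", "mca": 2500, "eligible": True},
    {"team": "C", "mca": 9000, "eligible": False},  # e.g. a HOF team
]
print(chairmans_probabilities(event))
```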

I have attached a file named 2017_Chairman’s_predictions.xlsm which shows my model’s predictions for all 2017 regional and district events, as well as a sheet which shows the key model parameters and a description of each. The model used for these predictions was built on data from 2008-2016, with tuning specifically for 2015-2016, so the model did not know any of the 2017 results before “predicting” them.

Key takeaways:
  • The mean reversion value of 19% is right in line with the 20% mean reversion value I found when building my Elo model. It intrigues me that two very different endeavors led to essentially equivalent values.
  • It was no surprise to me that EI was worth 80% of a Chairman’s Award. I was a bit surprised though to find that Dean’s List was worth 60% of a Chairman’s Award, especially because two are given out at each event. That means that the crazy teams that manage to win 2 Dean’s List Awards at a single event (2 × 60% = 120%) are better off than a team that won Chairman’s in terms of future Chairman’s performance.
  • I have gained more appreciation for certain awards after seeing how strongly they predict future Chairman’s Awards, in particular the Team Spirit and Imagery awards.

More work to come on this topic in the next few hours/days.


#6

I have added another workbook named “2018 Chairman’s Predictions.” This workbook can be used to predict Chairman’s results for any set of teams you enter. The model used here has the same base system as the “2017 Chairman’s Predictions” model, but some of the parameter values have changed. These parameters were found by minimizing the prediction error for the period 2016-2017.

Also in this book is a complete listing of teams and their current mCA values. The top 100 teams are listed below.

team	mCA
1718	9496
503	9334
1540	9334
2834	9191
1676	8961
1241	8941
68	8814
548	8531
2468	8112
2974	8092
27	8047
1885	7881
1511	7786
1023	7641
1305	7635
2614	7568
245	7530
1629	7381
2486	7100
66	7027
3132	6748
1816	6742
1086	6551
1311	6482
1710	6263
2648	6241
125	6223
558	6155
141	6083
1519	6082
1983	6060
4039	5985
33	5851
2771	5780
1902	5582
624	5578
1011	5496
118	5470
2137	5461
1218	5424
2169	5390
910	5382
3284	5353
3478	5344
771	5321
75	5306
2557	5291
233	5287
987	5224
1868	5215
3309	5175
1714	5158
932	5147
1986	5144
537	5138
597	5077
604	5068
2056	5059
2996	5054
4613	5042
399	5029
1477	5010
2220	4994
2337	4955
3618	4896
4125	4823
217	4816
1730	4803
359	4784
2655	4714
2500	4706
694	4695
1923	4667
708	4662
1622	4661
1987	4655
2642	4655
1671	4630
4013	4627
772	4626
2415	4622
4063	4604
540	4501
433	4440
4525	4426
384	4412
3476	4384
2485	4333
3008	4325
303	4307
1711	4288
2590	4266
3142	4264
3256	4260
836	4251
3880	4250
1678	4244
2471	4237
230	4230
78	4224

If I make an event simulator again next year, I will likely include Chairman’s predictions there.


#7

I got a question about historical mCA values for a team, so I decided to post the start of season mCA values for all teams since 2009. This can be found in the attached “Historical_mCA” document.


#8

I was wondering if alliances with surrogates were more or less likely to win than comparable alliances without surrogates. To investigate this, I found all 138 matches since 2008 in which opposing alliances had an unequal number of surrogates. I threw out the 5 matches in which one alliance had 2 surrogates more than the other alliance.

I started by finding the optimal Elo rating to add to the alliance that had more surrogates in order to minimize the Brier score of all 133 matches. This value was 25 Elo points. The Brier score improved by 0.0018 with this change. This means that, in a match between two otherwise even alliances, the alliance with the surrogate team would be expected to win about 53.5% of the time. This potentially implies that it is advantageous to have alliances which contain surrogates.
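To show where a figure like this comes from, here is a small Python sketch that converts an Elo difference into a win probability. It assumes the standard logistic Elo curve with a 400-point scale, which is not stated explicitly in this post.

```python
def elo_win_probability(elo_diff):
    """Win probability for the alliance that is ahead by elo_diff points.

    Assumes the conventional Elo curve: 1 / (1 + 10^(-diff / 400)).
    """
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400.0))


# A 25-point bonus for the alliance with the extra surrogate.
print(round(elo_win_probability(25), 3))
```

Under that assumption, a 25-point edge works out to a win probability of roughly 0.536, in line with the figure quoted above.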

To see if this was just due to chance, I ran 10 trials in which I randomly either added or subtracted 25 Elo points from each alliance. The mean Brier score improvement with this method was -0.00005, and the standard deviation of the Brier score improvement was 0.0028. Assuming the Brier score improvements to be normally distributed, we get a z-score of -0.62, which provides a p-value of 0.54. This is nowhere near significant, so we lack any good evidence that it is either beneficial or detrimental to have a surrogate team on your alliance.
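The last step, going from a z-score to a p-value, can be checked with a few lines of Python. This sketch assumes a two-sided test under a normal distribution, which appears to match the numbers quoted above; the function name is made up.

```python
import math


def two_sided_p_value(z_score):
    """Two-sided p-value for a z-score under a standard normal distribution."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z_score) / math.sqrt(2.0))


print(round(two_sided_p_value(-0.62), 2))
```

Only the magnitude of the z-score matters for a two-sided test, so the sign convention used for "improvement" does not change the p-value.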

Full data can be found in the “surrogate results” spreadsheet. Bolded teams are surrogates.


#9

If you have not read The Signal and the Noise by Nate Silver (the guy who made FiveThirtyEight), I highly recommend it. I have no affiliation with the book, other than that I read it and liked it. I would recommend it to anyone who is interested in these statistics and prediction-related projects.


#10

Definitely this.

I actually read that book quite a while back. At the time, I thought it was interesting, but quickly forgot much of it. It was only relatively recently that I realized that the world is full of overconfident predictions, and that humans are laughably prone to confirmation bias. I now have a much stronger appreciation for predictive models, and care very little for explanatory models that have essentially zero predictive power.


#11

I decided to investigate how important breaks between matches were for team performance. If the effect of rest is large enough, I thought I might add it into my Elo model. I was originally going to use the match start times as the basis, but after finding serious problems with this data set, I switched to using scheduled start times.

Essentially, what I did was to give each team on each alliance an Elo penalty which was determined by how much “rest” they have had since their last match. I tried both linear and exponential fits, and found that exponential fits were far better suited to this effort. I also used the scheduled time data to build two different models. In the first, I looked at the difference in scheduled start times for each team between their last scheduled match and the current match. In the second, I sorted matches within each event by start time and gave each match an index corresponding to its placement on this list (e.g. Quals 1 has index 1, Quals 95 has index 95, quarterfinals 1-1 has index 96, quarterfinals 2-2 has index 101, etc…).

The best fits for each of these cases were the following:
Time difference: Elo penalty per team = -250 * exp(-(t_current_match_scheduled_time - t_previous_match_scheduled_time) / (5 minutes))
Match index difference: Elo penalty per team = -120 * exp(-(current_match_index - previous_match_index) / 0.9)

Both of these models provide statistically significant improvements to my general Elo model. However, the match index method provides about 7 times as much improvement as the time difference method (Brier score improvement of 0.000173 vs 0.000024). This was surprising to me, since I would have expected the finer resolution of the times to provide better results. My guess is that the indexing method is superior because of the time differences between quals and playoff matches. I used the same model for both of these cases, and perhaps the differences in start times are not nearly as important as the pressure of playing back-to-back matches in playoffs.

I have attached a table summarizing how large of an effect rest has on matches (using the match index model).

Playing back-to-back matches clearly has a strong negative impact on teams. This generally only occurs in playoff matches between levels. However, its effect is multiplied by 3, since all three alliance members experience the penalty. A 3-team alliance that just played receives an 80 Elo penalty relative to a 3-team alliance that played 2 matches ago, and a 108 Elo penalty relative to one that played 3 matches ago. 108 Elo points corresponds to 30 points in 2017, and the alliance that receives this penalty would be expected to win only 35% of matches against an otherwise evenly matched opposing alliance.
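The alliance-level numbers can be approximately reproduced with a short Python sketch of the match index model. Two things here are assumptions: the penalty is taken to decay with rest (a negative exponent, with amplitude 120 and decay constant 0.9 from the fit above), and Elo differences are converted to win probabilities with the standard 400-point logistic curve. The function names are made up.

```python
import math


def rest_penalty(index_gap, amplitude=120.0, decay=0.9):
    """Per-team Elo penalty as a function of matches since the last one.

    Assumes the penalty shrinks as the gap grows (negative exponent).
    """
    return -amplitude * math.exp(-index_gap / decay)


def alliance_penalty(index_gap, team_count=3):
    """Total penalty when every team on the alliance has the same gap."""
    return team_count * rest_penalty(index_gap)


def elo_win_probability(elo_diff):
    """Standard Elo curve (assumed): 1 / (1 + 10^(-diff / 400))."""
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400.0))


back_to_back = alliance_penalty(1)
gap_vs_two = alliance_penalty(2) - back_to_back    # disadvantage vs. a gap of 2
gap_vs_three = alliance_penalty(3) - back_to_back  # disadvantage vs. a gap of 3

print(round(gap_vs_two, 1), round(gap_vs_three, 1))
print(round(elo_win_probability(-108), 2))
```

With unrounded values this gives gaps of roughly 79.5 and 105.7 Elo points, close to the 80 and 108 quoted above (the small difference is presumably rounding in the summary table), and a 108-point deficit maps to a win probability of about 35%.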

The match index method ended up providing enough improvement that I am seriously considering adding it into future iterations of my Elo model. One thing holding me back is that it relies on the relatively new data of scheduled times. At 4 years old, this data isn’t nearly as dubious as the actual time data (1.5 years old), but it still has noticeable issues (like scheduling multiple playoff replays at the same time).

You can see the rest penalties for every 2017 match in the “2017 rest penalties” document. The shown penalties are from the exponential fit of the match index model.


#12

I’m a bit skeptical, because there are some effects of alliance number on amount of rest during playoffs (e.g. #1 alliances that move on in two matches will always have maximal rest, and are typically dominant). Not sure if you can think of a good way to parse that out, though.


#13

I don’t quite follow. My rest penalties are an addition onto my standard Elo model, which already accounts for general strength of alliances. 1 seeds were already heavily favored before I added rest penalties because the 1 seed almost always consists of highly Elo-rated teams. In my standard Elo model, the red alliance (often 1 seed, but not always) was expected to win the first finals match 57% of the time on average. With my rest penalties added in, the red alliance is expected to win the first finals match 62% of the time on average.


#14

Because the first seed is so often in the SF/finals with maximum rest, you could be quantifying any advantage the first seed has (except how good they are as based on quals) as opposed to just rest. To use a dumb example, if the top alliance is favored by referees, that would show up here.


#15

Got it, that is an interesting take. Let me think for a little bit on how/if it is possible to separate alliance seeds from rest.


#16

Another factor beyond what will be recognized from ELO is nonlinear improvements due to good scouting, alliance selection, and strategy. I would expect these to affect playoffs far more than quals.


#17

A first pass could be to compare cases where Alliance #X would be advantaged by rest versus disadvantaged. Would give you an idea of the relative strength of the rest effect as compared to the various other things. Gus’s examples are definitely important effects.


#18

I love the upsets paper - it’s fun to look at these games and see the obviously massive disadvantage the winning side had. I’m looking back at games I had seen/participated in, and remembering the pandemonium those games created on the sidelines and behind the glass. Super cool!


#19

I’ve spent a fair bit of time off and on for the past month looking into this more, and since I have other things I would prefer to work on, I’m going to stop working on this for the foreseeable future. I would like to retract all the information in the quoted post. I’m undecided on whether I should delete the spreadsheet.

Essentially, my rest penalty model actually decreased my Elo prediction performance for the year 2016 when I applied the same methodology to that year. This probably either means:

  • 2016 and 2017 rest penalties were drastically different
  • My 2017 rest penalties were an overfitting of the data, and do not actually represent any real phenomenon
  • Scheduled time data are unreliable for 2016 and/or 2017
  • There is a bug in my code somewhere I am completely unable to find

If any of the first three are true, I’m not that interested in pursuing rest penalties more, and I have given up looking for bugs for the time being. This also means that I will not be looking at alliance seed affecting playoff performance for now.

When I originally created the rest penalties, I never really applied them to years other than 2017 (for which I was optimizing). This meant that I made the mistake I often criticize others for of not keeping training and testing data separate. I incorrectly believed that my statistical significance test would be sufficient in place of testing against other data, and am still baffled as to how my model could so easily pass a significance test without having predictive power in other years.

So anyway, sorry if I misled anyone. I won’t make this same mistake again.


#20

Now that we actually have team lists for events, I thought I would revisit my 2018 Chairman’s Predictions workbook since it is the most popular download of mine. It turns out that I did not have support for 2018 rookies, resurrected teams, or new veterans in these predictions.

I have attached a new workbook titled “2018_Chairman’s_predictions_v2” which provides support for these groups. I also have added an easy way to import team lists for events simply by entering in the event key. If you have additional knowledge of events (or if you want to make a hypothetical event), you can still add teams to the list manually. I have also switched to using the TBA API v3, so this should hopefully still work after Jan 1.
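For anyone who would rather pull a team list outside of the workbook, a rough Python equivalent is sketched below. It is not the workbook's own code. It uses the TBA API v3 `event/{event_key}/teams/keys` endpoint with the `X-TBA-Auth-Key` header; the event key and auth key in the usage comment are placeholders you would replace with your own.

```python
import json
import urllib.request

TBA_BASE_URL = "https://www.thebluealliance.com/api/v3"


def build_team_keys_request(event_key, auth_key):
    """Build the HTTP request for the team keys attending an event."""
    url = f"{TBA_BASE_URL}/event/{event_key}/teams/keys"
    return urllib.request.Request(url, headers={"X-TBA-Auth-Key": auth_key})


def parse_team_keys(team_keys):
    """Convert keys like 'frc254' into sorted integer team numbers."""
    return sorted(int(key[len("frc"):]) for key in team_keys)


def fetch_team_numbers(event_key, auth_key):
    """Download and parse the team list for one event (needs network access)."""
    request = build_team_keys_request(event_key, auth_key)
    with urllib.request.urlopen(request) as response:
        return parse_team_keys(json.load(response))


# Usage (placeholders, not real credentials or a real event key):
# teams = fetch_team_numbers("2018xxxx", "YOUR_TBA_AUTH_KEY")
```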

Let me know if you notice any bugs with this book.

