




A collection of small projects that will be explained in the associated thread.
I frequently work on small projects that I don't believe merit entire threads on their own, so I have decided to upload them here and make a post about them in an existing thread. I also generally want my whitepapers to have instruction sheets so that anyone can pick them up and understand them. However, I don't want to bother with this for my smaller projects.
Attachments:
- IRI seeding projections.xlsx
- Elo and OPR comparison.xlsx
- 2017 Chairman's predictions.xlsm
- 2018_Chairman's_predictions.xlsm
- Historical mCA.xlsx
- Greatest upsets.xlsx
- surrogate_results.xlsx
- 2017 rest penalties.xlsx
- 2018_Chairman's_predictions v2.xlsm
- auto_mobility_data.xlsx
- 2018 start of season Elos.xlsx

07-16-2017, 12:10 AM
Caleb Sykes
I frequently work on small projects that I don't believe merit entire threads on their own, so I have decided to upload them here and make a post about them in an existing thread. I also generally want my whitepapers to have instruction sheets so that anyone can pick them up and understand them. However, I don't want to bother with this for my smaller projects.
07-24-2017, 09:12 PM
Caleb Sykes
In this post, Citrus Dad asked for a comparison of my Elo and OPR match predictions for the 2017 season. I have attached a file named "Elo and OPR comparison" that does this. Every qual match from 2017 is listed. Elo projections, OPR projections, and the average of the two are also shown for each match. The square errors for all projections are shown, and these square errors are averaged together to get Brier scores for the three models.
Here are the Brier score summaries of the results:

              OPR     Elo     Average
Total         0.212   0.217   0.209
Champs only   0.208   0.210   0.204
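For reference, the Brier score used here is the mean squared error between predicted win probabilities and actual outcomes. A minimal sketch with made-up match data (not the real 2017 numbers):

```python
def brier_score(predictions, outcomes):
    # Mean squared error between predicted probabilities and results
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Hypothetical P(red wins) from each model for three matches
elo_preds = [0.62, 0.55, 0.30]
opr_preds = [0.58, 0.60, 0.25]
outcomes = [1, 0, 0]  # 1 = red won, 0 = red lost

# The "Average" model simply averages the two probabilities per match
avg_preds = [(e + o) / 2 for e, o in zip(elo_preds, opr_preds)]

print(brier_score(avg_preds, outcomes))
```

Lower is better: a perfect predictor scores 0, and always guessing 50% scores 0.25.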
07-25-2017, 06:02 PM
Citrus Dad
Quote (Caleb Sykes): "In this post, Citrus Dad asked for a comparison of my Elo and OPR match predictions for the 2017 season. [...]"
07-28-2017, 08:01 PM
Caleb Sykes
I am currently working on a model which can be used to predict who will win the Chairman's Award at a regional or district event. I am not covering district championship or Championship Chairman's Awards because of their small sample sizes. The primary inputs to this model are the awards data of each team at all of their previous events, although previous-season Elo is also taken into account.
The model essentially works by assigning value to every regional/district award a team wins. I call these points milliChairman's Awards, or mCA points. A Chairman's win in the current season at a base event of 50 teams is worth 1000 mCA, so all award values can be interpreted as the fraction of a Chairman's Award they are worth, in thousandths. Award values and model parameters were the values found to provide the best predictions of 2015-2016 Chairman's wins. At each event, a logistic distribution is used to map a team's total points to their likelihood of winning the Chairman's Award at that event. Rookies, HOF teams, and teams that won Chairman's earlier in the season are assigned a probability of 0%.
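A logistic mapping of the sort described could be sketched like this (the location and scale parameters, and the mCA inputs, are placeholders, not the model's tuned values):

```python
import math

def chairmans_win_likelihood(team_mca, location, scale):
    # Logistic mapping from a team's total mCA points to a raw
    # likelihood of winning Chairman's at an event. 'location' and
    # 'scale' stand in for the tuned model parameters, which differ.
    return 1.0 / (1.0 + math.exp(-(team_mca - location) / scale))

# A team well above the location parameter gets a high raw likelihood;
# rookies, HOF teams, and earlier in-season winners are forced to 0%.
print(chairmans_win_likelihood(5000, location=3000, scale=1500))
```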
I have attached a file named 2017_Chairman's_predictions.xlsm which shows my model's predictions for all 2017 regional and district events, as well as a sheet which shows the key model parameters and a description of each. The model used for these predictions was built by running over the period 2008-2016, with tuning specifically for the period 2015-2016, so the model did not know any of the 2017 results before "predicting" them.
Key takeaways:
07-29-2017, 08:35 PM
Caleb Sykes
I have added another workbook named "2018 Chairman's Predictions." This workbook can be used to predict Chairman's results for any set of teams you enter. The model used here has the same base system as the "2017 Chairman's Predictions" model, but some of the parameter values have changed. These parameters were found by minimizing the prediction error for the period 2016-2017.
Also in this book is a complete listing of teams and their current mCA values. The top 100 teams are listed below (team: mCA):

1718: 9496   503: 9334    1540: 9334   2834: 9191   1676: 8961
1241: 8941   68: 8814     548: 8531    2468: 8112   2974: 8092
27: 8047     1885: 7881   1511: 7786   1023: 7641   1305: 7635
2614: 7568   245: 7530    1629: 7381   2486: 7100   66: 7027
3132: 6748   1816: 6742   1086: 6551   1311: 6482   1710: 6263
2648: 6241   125: 6223    558: 6155    141: 6083    1519: 6082
1983: 6060   4039: 5985   33: 5851     2771: 5780   1902: 5582
624: 5578    1011: 5496   118: 5470    2137: 5461   1218: 5424
2169: 5390   910: 5382    3284: 5353   3478: 5344   771: 5321
75: 5306     2557: 5291   233: 5287    987: 5224    1868: 5215
3309: 5175   1714: 5158   932: 5147    1986: 5144   537: 5138
597: 5077    604: 5068    2056: 5059   2996: 5054   4613: 5042
399: 5029    1477: 5010   2220: 4994   2337: 4955   3618: 4896
4125: 4823   217: 4816    1730: 4803   359: 4784    2655: 4714
2500: 4706   694: 4695    1923: 4667   708: 4662    1622: 4661
1987: 4655   2642: 4655   1671: 4630   4013: 4627   772: 4626
2415: 4622   4063: 4604   540: 4501    433: 4440    4525: 4426
384: 4412    3476: 4384   2485: 4333   3008: 4325   303: 4307
1711: 4288   2590: 4266   3142: 4264   3256: 4260   836: 4251
3880: 4250   1678: 4244   2471: 4237   230: 4230    78: 4224
08-04-2017, 02:24 PM
Caleb Sykes
I got a question about historical mCA values for a team, so I decided to post the start-of-season mCA values for all teams since 2009. These can be found in the attached "Historical_mCA" document.
09-26-2017, 04:45 PM
Caleb Sykes
I was wondering if alliances with surrogates were more or less likely to win than comparable alliances without surrogates. To investigate this, I found all 138 matches since 2008 in which opposing alliances had an unequal number of surrogates. I threw out the 5 matches in which one alliance had 2 more surrogates than the other alliance.
I started by finding the optimal Elo rating to add to the alliance that had more surrogates in order to minimize the Brier score over all 133 matches. This value was 25 Elo points. The Brier score improved by 0.0018 with this change. This means that, in a match between two otherwise even alliances, the alliance with the surrogate team would be expected to win about 53.5% of the time. This potentially implies that it is advantageous to have alliances which contain surrogates.
To see if this was just due to chance, I ran 10 trials in which I randomly either added or subtracted 25 Elo points for each alliance. The mean Brier score improvement with this method was 0.00005, and the standard deviation of Brier score improvement was 0.0028. Assuming the Brier score improvements to be normally distributed, we get a z-score of 0.62, which gives a p-value of 0.54. This is nowhere near significant, so we lack any good evidence that it is either beneficial or detrimental to have a surrogate team on your alliance.
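Two pieces of this analysis can be sketched in code: the standard Elo win probability that turns a 25-point edge into the 53.5% figure, and the sign-flip randomization used as a null check (simulated matches stand in for the 133 real ones):

```python
import random

def win_prob(elo_diff):
    # Standard Elo logistic: P(win) for a given rating advantage
    return 1 / (1 + 10 ** (-elo_diff / 400))

def brier(preds, outcomes):
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(preds)

# A 25-point Elo edge for the surrogate alliance:
print(round(win_prob(25), 3))  # 0.536, i.e. about 53.5%

# Null check: randomly add or subtract the 25-point bonus and record
# the Brier "improvement" each time. Simulated data, not the real matches.
random.seed(1)
matches = [(random.gauss(0, 100), random.random() < 0.5) for _ in range(133)]
baseline = brier([win_prob(d) for d, _ in matches], [w for _, w in matches])
improvements = []
for _ in range(10):
    preds = [win_prob(d + random.choice([-25, 25])) for d, _ in matches]
    improvements.append(baseline - brier(preds, [w for _, w in matches]))
```

Comparing the real +25 improvement against the spread of these random-sign improvements is what yields the z-score above.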
Full data can be found in the "surrogate results" spreadsheet. Bolded teams are surrogates.
09-26-2017, 11:26 PM
Bryce2471
Quote (Caleb Sykes): "I frequently work on small projects that I don't believe merit entire threads on their own [...]"
09-26-2017, 11:50 PM
Caleb Sykes
If you have not read The Signal and the Noise by Nate Silver (the guy who made FiveThirtyEight), I highly recommend it. I have no affiliation with the book, other than that I read it and liked it. I would recommend it to anyone who is interested in these statistics and prediction related projects.
10-02-2017, 11:29 PM
Caleb Sykes
I decided to investigate how important breaks between matches are for team performance. If the effect of rest is large enough, I thought I might add it into my Elo model. I was originally going to use the actual match start times as the basis, but after finding serious problems with that data set, I switched to using scheduled start times.
Essentially, what I did was give each team on each alliance an Elo penalty determined by how much "rest" they have had since their last match. I tried both linear and exponential fits, and found that exponential fits were far better suited to this effort. I also used the scheduled time data to build two different models. In the first, I looked at the difference in scheduled start times for each team between their last scheduled match and the current match. In the second, I sorted matches within each event by start time and gave each match an index corresponding to its placement on this list (e.g. Quals 1 has index 1, Quals 95 has index 95, Quarterfinal 1-1 has index 96, Quarterfinal 2-2 has index 101, etc.).
The best fits for each of these cases were the following:

Time difference: Elo penalty per team = 250*exp(-(t_current_match_scheduled_time - t_previous_match_scheduled_time)/(5 minutes))

Match index difference: Elo penalty per team = 120*exp(-(current_match_index - previous_match_index)/0.9)
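Plugging the match-index fit into code shows how the penalty decays and roughly reproduces the alliance-level figures quoted in this post (the small differences from 80 and 108 suggest the published constants are rounded):

```python
import math

def rest_penalty_per_team(index_gap):
    # Match-index model: penalty decays exponentially with the number
    # of match slots since the team's previous match
    return 120 * math.exp(-index_gap / 0.9)

# Totals for a full three-team alliance
just_played = 3 * rest_penalty_per_team(1)
two_ago = 3 * rest_penalty_per_team(2)
three_ago = 3 * rest_penalty_per_team(3)

print(round(just_played - two_ago))    # close to the ~80-point figure
print(round(just_played - three_ago))  # close to the ~108-point figure
```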
Both of these models provide statistically significant improvements to my general Elo model. However, the match index method provides about 7x more of an improvement than the time difference method (Brier score improvement of 0.000173 vs 0.000024). This was surprising to me, since I would have expected the finer resolution of the times to provide better results. My guess as to why the indexing method is superior is the time differences between quals and playoff matches. I used the same model for both of these cases, and perhaps the differences in start times are not nearly as important as the pressure of playing back-to-back matches in playoffs.
I have attached a table summarizing how large of an effect rest has on matches (using the match index model).
Playing back-to-back matches clearly has a strong negative impact on teams. This generally only occurs in playoff matches between levels. However, its effect is multiplied by 3, since all three alliance members experience the penalty. A three-team alliance that just played receives an 80 Elo penalty relative to a three-team alliance that played 2 matches ago, and a 108 Elo penalty relative to a three-team alliance that played 3 matches ago. 108 Elo points corresponds to 30 match points in 2017, and the alliance that receives this penalty would only be expected to win 35% of matches against an otherwise evenly matched opposing alliance.
The match index method ended up providing enough improvement that I am seriously considering adding it into future iterations of my Elo model. One thing holding me back is that it relies on the relatively new scheduled-time data. At 4 years old, this data isn't nearly as dubious as the actual time data (1.5 years old), but it still has noticeable issues (like scheduling multiple playoff replays at the same time).
You can see the rest penalties for every 2017 match in the "2017 rest penalties" document. The shown penalties are from the exponential fit of the match index model.
10-02-2017, 11:37 PM
Basel A
I'm a bit skeptical, because there are some effects of alliance number on amount of rest during playoffs (e.g. #1 alliances that move on in two matches will always have maximal rest, and are typically dominant). Not sure if you can think of a good way to parse that out, though.
10-03-2017, 09:52 AM
Caleb Sykes
Quote (Basel A): "I'm a bit skeptical, because there are some effects of alliance number on amount of rest during playoffs [...]"
10-03-2017, 10:16 AM
Basel A
Quote (Caleb Sykes): "I don't quite follow. My rest penalties are an addition onto my standard Elo model, which already accounts for general strength of alliances. 1 seeds were already heavily favored before I added rest penalties because the 1 seed almost always consists of highly Elo-rated teams. In my standard Elo model, the red alliance (often the 1 seed, but not always) was expected to win the first finals match 57% of the time on average. With my rest penalties added in, the red alliance is expected to win the first finals match 62% of the time on average."
10-03-2017, 10:36 AM
Caleb Sykes
Quote (Basel A): "Because the first seed is so often in the SF/finals with maximum rest, you could be quantifying any advantage the first seed has (other than how good they are based on quals) as opposed to just rest. To use a dumb example, if the top alliance were favored by referees, that would show up here."
10-03-2017, 10:53 AM
GeeTwo
Another factor beyond what will be recognized from Elo is nonlinear improvement due to good scouting, alliance selection, and strategy. I would expect these to affect playoffs far more than quals.
10-03-2017, 01:19 PM
Basel A
Quote (Caleb Sykes): "Got it, that is an interesting take. Let me think for a little bit on how/if it is possible to separate alliance seeds from rest."
10-04-2017, 01:48 PM
microbuns
I love the upsets paper! It's fun to look at these games and see the obviously massive disadvantage the winning side had. I'm looking back at games I had seen or participated in, and remembering the pandemonium those games created on the sidelines and behind the glass. Super cool!
10-26-2017, 11:36 PM
Caleb Sykes
Quote (Caleb Sykes): "I decided to investigate how important breaks between matches were for team performance..."
10-31-2017, 11:11 AM
Caleb Sykes
Now that we actually have team lists for events, I thought I would revisit my 2018 Chairman's Predictions workbook, since it is my most popular download. It turns out that I did not have support for 2018 rookies, resurrected teams, or new veterans in these predictions.
I have attached a new workbook titled "2018_Chairman's_predictions_v2" which provides support for these groups. I have also added an easy way to import team lists for events simply by entering the event key. If you have additional knowledge of events (or if you want to make a hypothetical event), you can still add teams to the list manually. I have also switched to using the TBA API v3, so this should hopefully still work after Jan 1.
Let me know if you notice any bugs with this book.
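For anyone curious what the event-key import involves, The Blue Alliance API v3 exposes an event's team list at a path like the one below. The auth key is a placeholder you would replace with your own TBA read key:

```python
import json
import urllib.request

TBA_BASE = "https://www.thebluealliance.com/api/v3"

def event_teams_url(event_key):
    # v3 endpoint returning just the team keys for an event
    return f"{TBA_BASE}/event/{event_key}/teams/keys"

def team_keys_for_event(event_key, auth_key):
    # 'auth_key' is a placeholder; supply your own TBA read key
    req = urllib.request.Request(
        event_teams_url(event_key),
        headers={"X-TBA-Auth-Key": auth_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # a list of team keys like "frc254"
```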
10-31-2017, 10:25 PM
Caleb Sykes
For nearly all statistics that can be obtained from official data, one of our biggest issues is separating individual team contributions from data points that actually describe the entire alliance. However, there was one statistic last season that was granular to the team level: auto mobility. Referees were responsible in 2017 for marking mobility points for each team individually, so these data points should have little to no dependence on other teams. Unfortunately, auto mobility was a nearly negligible point source for this game, and, combined with the extremely high average mobility rates, this made it a generally unimportant characteristic for describing teams. Still, I thought it would be interesting to take a deeper look into these data to see if we can learn anything interesting from them.
I have uploaded a workbook titled "auto_mobility_data" which provides a few different ways of understanding mobility points. The first tab of this book contains raw data on mobility for every team in every match of 2017. The second tab contains a breakdown by team, listing each team's season-long auto mobility rate as well as each team's first match where they missed mobility (for you to check if you don't believe your team ever missed auto mobility). Overall, about 25% of teams never missed their mobility points in auto, and another 18% had mobility rates above 95%. The teams with the most successful mobilities without a single miss (top 10, including ties) are:

Team   Successful mobilities
2337   86
195    85
4039   85
27     84
3663   82
2771   73
3683   73
1391   72
1519   71
2084   71
4391   71
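Deriving each team's season-long mobility rate from the raw per-match tab amounts to a simple tally; a sketch with toy records (hypothetical teams and values, not the real data):

```python
from collections import defaultdict

# Toy stand-in for the raw tab: (team, achieved auto mobility?) per match
records = [
    ("frc2337", True), ("frc2337", True),
    ("frc9999", False), ("frc9999", True),
]

attempts = defaultdict(int)
successes = defaultdict(int)
for team, moved in records:
    attempts[team] += 1
    successes[team] += moved  # True counts as 1

rates = {team: successes[team] / attempts[team] for team in attempts}
print(rates)  # frc2337 at 100%, frc9999 at 50%
```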
11-03-2017, 12:44 PM
Caleb Sykes
After trying a couple of different changes to my Elo model, I have found one that has good predictive power, is general enough to apply to all years, and is straightforward to calculate. What I have done is to adjust each team's start-of-season Elo to be a weighted average of their previous two years' end-of-season Elos. The previous year's Elo has a weight of 0.7, and the Elo from two years prior has a weight of 0.3. This weighted Elo is then reverted to the mean by 20%, just as in the previous system. In the previous system, only the last season's Elo was taken into consideration. Second-year teams have their rookie rating of 1320 (1350 before mean reversion) used as their end-of-season Elo from two years prior.
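The update rule reads as follows in code. The mean-reversion target is not stated in the post; the default below is back-solved from the rookie example (1350 before reversion mapping to 1320 under a 20% pull), so treat it as an inference rather than a published parameter:

```python
def start_of_season_elo(prev_end, two_years_prior_end,
                        mean=1200.0, reversion=0.2):
    # Weighted blend of the last two end-of-season Elos...
    blended = 0.7 * prev_end + 0.3 * two_years_prior_end
    # ...then pulled 20% of the way toward the mean.
    # 'mean' is inferred from the 1350 -> 1320 rookie example,
    # not stated directly in the post.
    return (1 - reversion) * blended + reversion * mean

# Rookie sanity check from the post: 1350 pre-reversion maps to ~1320
print(start_of_season_elo(1350, 1350))
```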
This adjustment provides a substantial improvement in predictive power, particularly at the start of the season. Although it causes larger Elo jumps for some teams between seasons, Elos during the start of the season are generally more stable. As an indirect consequence of this adjustment, I also found the optimal k value for playoff matches to be 4 instead of the 5 it was under the previous system. This means that playoff matches have slightly less of an impact on a team's Elo rating under the new system.
I have attached a file called "2018 start of season Elos" that shows what every team's Elo would have been under my previous system, as well as their Elo under this new system. Sometime before kickoff, I will publish an update to my "FRC Elo" workbook that contains this change as well as any other changes I make before then.
11-03-2017, 01:11 PM
Caleb Sykes
With this change, Elo actually takes a razor-thin edge over standard OPR in terms of predictive power for the 2017 season (season-long total Brier score = 0.211 vs 0.212 for OPR). However, it should be noted that this isn't really a fair comparison, since OPR's predictive power could probably be improved with many of these same adjustments I have been making to Elo. Even so, I think it's pretty cool that we now have a metric that provides more predictive power than conventional OPR, which has been the gold standard for at least as long as I have been around in FRC.