Paper: 4536 event simulator 2017

Thread created to discuss this whitepaper.

This workbook is a tool which can be used to simulate current 2017 events, or to calculate statistics for completed 2017 events. Data is pulled from the TBA API in the “data import” sheet. Explanations for how to use each sheet can be found in the “instructions and FAQ” sheet.

Here is the very first revision of my 2017 event simulator. I have tested it to the best of my ability using the week 0 data and data from previous years, but it still likely contains numerous bugs, which I will be ironing out as we go. Please let me know if you notice anything wrong.

Right now, all this really does is determine calculated contributions and predicted contributions. I will be building on this base in the coming weeks to make match predictions and eventually rankings predictions. I am planning to make my first update on Friday after the real match data exposes bugs, and then I will be releasing updates weekly every Tuesday or Wednesday which will have new seed values, bug fixes, and new features.

Let me know if you have any questions or suggestions.

Is this something that can be fixed?

I obviously have internet access, or I wouldn’t be able to post this from the same computer :slight_smile:

unable to access TBA API.png

unable to access TBA API.png

That error message is of my own making to catch a known error that can occur when there is no internet connection, but in your case it is being displayed for a different error. I played around with libreoffice a few weeks ago to try to get it to work. The problem seems to relate to the json parsing module that I am using. If there exists a different json parsing module out there that works with both libreoffice and excel I would be happy to try it. Alternatively, if someone who knows libreoffice macros better than I do could hunt down exactly why I am getting this error, I could change my code accordingly.

My attempts to determine why this error occurs have left me baffled.

I developed a way to verify that predicted contributions were being calculated properly, and unsurprisingly they were not correct. I have uploaded a revision which corrects this, as well as:
Changes the formatting in some areas
Corrects various small bugs
Adds more error messages for common errors
Expands the “instructions and FAQ” sheet
Adds ranking data

I still expect something to break on Friday when we get real data, but this revision is still a substantial improvement over the previous version.

Excellent work! This looks like it would be really helpful!

One suggestion: in the VLOOKUP in ‘team lookup’!A5, I’d recommend using the optional last parameter with a value of FALSE. Without this parameter, team name it pulls can get a bit wonky when the event is changed. I set it to PAHAT and this happened:

I certainly hope so.

One suggestion: in the VLOOKUP in ‘team lookup’!A5, I’d recommend using the optional last parameter with a value of FALSE. Without this parameter, team name it pulls can get a bit wonky when the event is changed. I set it to PAHAT and this happened:

Good idea, this change will be reflected in my next update. I had realized this was an issue earlier, but couldn’t think of a straightforward workaround. I hadn’t bothered though to check if VLOOKUP had any extra functionality that I wasn’t utilizing.

I have uploaded v3. Primary changes are:
Implementing Brian’s suggestion
Various small formatting changes
Now imports playoff matches
Fixes a bug that crashes the program when insufficient matches have been played to find calculated contributions.

I still expect to find a couple of other bugs before the end of the day, provided they are not crash-level bugs, I will upload v4 at around 6pm.

This is incredible, thanks for sharing it

This is one of the most useful implementations of tying the Blue Alliance API to Excel that I have seen - major kudos. The data import macro you’ve made here is quick and places the data into a reasonably useful format that can then be used for any number of calculations / analyses. We were able to wrangle the JSON a bit last year with PowerQuery, but it was pretty slow and difficult to manipulate. I look forward to using this instead :slight_smile:

Thanks for the compliments, I’ll feel more comfortable accepting them once I finish eliminating the really bad bugs. :slight_smile:

I have updated to rev4. Here are the changes:
Changed calculated contributions to display 0.00 until at least n/2+5 matches have been played where n is the number of teams at the event.
Added winning margin and match points to the “completed matches” section of the team lookup sheet
Various minor speed improvements of code
Tied scores are now handled properly instead of being assumed to be a red win
changed rankings from “average match points” to “match points”
Fixed match lookups for high-numbered teams. Previously lookup would only work for lowest numbered 30 teams.
Now auto-generates team list based off of the match schedule instead of using the team list request when a team list is available. This allows contributions to be determined even when the team list is erroneous, as in txlu for team 4301.
Fixed calculated contributions to be calculated properly and display correctly after sufficient matches have been played.
Added a change log to “instructions and FAQ”

Unless I find any major bugs, this will be my last week 1 revision. Expect week 2 version to come out on Tuesday or Wednesday.

Well, we found a major bug. Thanks to everyone who sent me info to help me track it down. Essentially, importing playoff matches caused the matrices to generate incorrectly, so calculated contributions and predicted contributions were unable to be calculated once the playoff schedule was released.

Changes since v4 are:
Fixed the rotor ranking point and kPa ranking point lookup on the match lookup sheet to work for red.
Fixed contribution calculations to work even after playoff matches have started.
Changed “last completed match” (A3) in “data imort” to read “last completed qualification match”
Fixed the VLOOKUP methods in “match lookup” and “team lookup” to correctly display matches after match 60.
Added color formatting to the “Color” column in “team lookup”

A Week 1 book has been added. Releases from now on should be much more stable since I have real event data that I can test any changes on.

Changes since v0.5 include:
No longer recalculates predicted contributions after the event has ended.
Deleted unnecessary whitespace which makes scrollbars easier to use.
Added Elo ratings to calculated contributions and predicted contributions
Added Elo predictions to team lookup and match lookup
Updated seed values to be based on week 1 data instead of week 0 data.

I’m working on predicting the attributes of match score, and I found something very interesting. Essentially, predicting any scoring category this year is somewhere between extremely difficult and impossible. Using predicted contributions, I attempted to build models to predict future scores for each category. I have attached the week 2 (excluding Southern Cross and Israel 3) qual match results of the best model I could make for each category.

Key takeaways:
Of the key areas of the game, fuel is by far the easiest to predict, takeoff points and rotor points are the next easiest, and fouls are very difficult or impossible to predict.
I was completely unable to create a model using predicted contribution to total points (predicted OPR) that had any predictive power. Likewise with CCWM, calculated contribution to win, and calculated contribution to playoff total points. This poor performance of calculated contributions is without precedent since at least 2008. I had previously created similar models for the years 2008-2016, and I was able to create models with predictive power with relative ease. I don’t see any evidence at this point that calculated contribution to total points is a useful metric in isolation, so I think people should use extreme caution when using it to rank robot performance this year.
The kPa ranking point and rotor ranking point were very difficult to impossible to predict, but I chalk that up more to the fact that they occurred so infrequently than to my models’ performances. I will look back more at these in the future when they become more frequent.
My model performances may improve in future weeks when many/all teams at events have already competed at least once, and my model can thus use their previous event contributions to make better predictions early on at events.

My next points of investigation will be as follows:
I believe there are improvements to be made in predicting rotors/gears because the total scored gears model is behaving worse than the rotor points model. If gear contributions were properly calibrated, this model should perform better than rotor points since it accounts for nonlinear gear value and rotor points does not.
I am going to try making an unpenalized total points contribution to replace total points, hopefully this will remove some of the unpredictable variance and can thus be used to predict things like match scores.
I am also going to try calculating defense metrics and using those to predict quals scores. It is possible that enough defense is being played this year to extract some useful information from defense predictions.
I will try to predict penalty points using a combination of penalty points drawn and penalty points given to the other team. I doubt this will have much predictive power, but it is worth a shot.

week 2 model summary.xlsx (12.2 KB)

week 2 model summary.xlsx (12.2 KB)

I have updated to v2.0. Changes include:
Fixed the “Upcoming matches” section of “team lookup” to no longer display illegal “ghost matches” after match 80.
Improved run speed
Improved predictions for all scoring categories
Added rotor ranking point and kPa ranking point predictions to “match lookup”
Removed data from the ranking point charts on “match lookup” because I have not verified independence of ranking point categories, so multiplication is potentially invalid
Added rotor ranking point and kPa ranking point predictions to “team lookup”
Updated seed values

As I stated above, almost everything about this game is difficult to predict. My near-term focus will be on improving predictions of gears/rotors because they are so crucial to scoring, and may provide me with the ability to predict rotor ranking points. It will be a while before I get around to predicting rankings, since I still am nowhere near where I want to be in predicting matches. I have given up trying to predict penalties, and I don’t expect that I will get around to incorporating defense until at least week 5.

Attached is a summary of my Elo predictions for week 2 events.

The short story is that it is getting a Brier score of 0.208 for quals matches, which is decent, but not great, and probably worse than any year since at least 2009. I expect predictions in later weeks to improve slightly as every team competes at least once in this season.

Not to brag, merely to make another point of reference, the cumulative TBA Brier score for quals matches is 0.231.

Week 2 Elo analysis.xlsx (908 KB)

Week 2 Elo analysis.xlsx (908 KB)

So, the above statements are basically all wrong. I had a weird bug in my code that incorrectly assigned blue score to the red alliance and vice versa. This caused all above models to behave incredibly poorly. After fixing the bug, I was able to create models that explained the following amount of variance:
total points: 20.2%
playoff total points: 18.6%
win: 3.8%
winning margin: 7.2%

Week 3 update has been posted. It is plausible that it will have some bugs since there were no mid-week events for me to test things on. Let me know if you see any.

Changes since week 2:
Improved prediction models for all calculated contributions
Updated averages for all categories
Added predicted winning margin, predicted match points, predicted contribution win probability, and win probability to the team lookup sheet
Added predicted contribution win probability to the match lookup sheet.
Updated win probability on the match lookup sheet to include predicted contributions.
Removed average ranking points from team lookup since I have not verified independence of ranking point categories.
Updated all seed values.

thank you for your work on this and posting it. I “discovered” the thread just this morning, downloaded, picked the event I am following and Viola! data is flowing into the spreadsheet. This is great. Now on to waiting for enough matches to run and learning the calculations…

Wow, this is fantastic. Thank you Caleb. I’m an old engineer who just discovered this robotics game this year in the middle of build season. Our team had never scouted before and I started us doing paper scouting. One of the students has picked up on it and was trying to put our data into Google Sheets, something I was unfamiliar with. This XL program/ simulator is fantastic. I did not know all that data was available from TBA. I’ve looked at some of our past events with this, and I’m anxious to see how it can aid us in our future events. Thank you very much.

Just to clarify, at what point does the data start calculating for a specific event? COuld this be used at the end of day 1?

I love the event simulator… sound like this could be a great solution for smaller team like us getting started with scouting…