Caleb's Event Simulator 2018

I have posted an update which can be used to simulate the Houston divisions based on the preliminary schedules. Huge thanks to Wes Jordan for getting all of the schedules into the same format as the TBA data. Basically the only change to this workbook is that I reference his site instead of TBA.

This workbook will only work with the preliminary schedules on Wes Jordan’s site. You will have to switch back to one of my normal simulators when the actual schedules and match results start to be posted.


I’m excited to take a look! Thanks for putting this together.

I had a bug that caused ranking predictions for teams in surrogate matches to be calculated incorrectly. This should be fixed in the new v7.3.

I have posted an update that uses the preliminary schedules for Detroit. Let me know if you notice any issues. Again, very big thanks to Wes Jordan for getting the schedules into a useful format for me.

Same disclaimer as before:

This workbook will only work with the preliminary schedules on Wes Jordan’s site. You will have to switch back to one of my normal simulators when the actual schedules and match results start to be posted.

I have posted a version 10.1 which can be used for off season events. I got rid of the expiration warning on it even though there is a good chance I will be posting at least one more update in a month or two. No promises though. Subscribe to the thread if you don’t want to miss an update.

I have a bunch of changes for this version, some of which may cause bugs, so if you see any please let me know:

  • “Update” now checks whether the number of simulations to run is the same before running, and allows re-importing if a different number of simulations is selected
  • Added a “settings” sheet from which you can select the number of simulations to run and whether or not to stop the “Update” macro if no new data is available.
  • Added an “Import Event Keys” macro to the “event keys” tab to look up off season events which hadn’t been posted as of the publishing of this workbook
  • Changed seed values to use the seed values each team had going into their event instead of just their most recent seed value. This has no effect on off season competitions, but will correctly show teams’ seed values for historical events.
  • Added “average rank” to predicted rankings sheet and this is now the default team sort
  • Added ability to “simulate” from any past point in the event. This can be used, for example, to find what win probabilities the simulator would have predicted before the matches actually happened (see the sketch after this list).
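
For the curious, here’s a rough Python sketch (the workbook itself is Excel VBA) of what “simulating from a past point” amounts to: matches before the cutoff keep their recorded results, and everything after is re-sampled. The data shapes and the win_prob function are illustrative stand-ins, not the workbook’s actual internals.

```python
import random

def simulate_from(matches, cutoff, win_prob, n_sims=1000):
    """matches: list of (red_alliance, blue_alliance, winner_or_None).
    Matches before `cutoff` keep their recorded winner; later matches
    are re-sampled in every simulation."""
    red_wins = [0] * len(matches)
    for _ in range(n_sims):
        for i, (red, blue, winner) in enumerate(matches):
            if i < cutoff and winner is not None:
                outcome = winner                      # already played
            elif random.random() < win_prob(red, blue):
                outcome = "red"
            else:
                outcome = "blue"
            if outcome == "red":
                red_wins[i] += 1
    return [w / n_sims for w in red_wins]

# Simulate a toy 3-match event from the point after match 1:
toy = [((1, 2, 3), (4, 5, 6), "red"),
       ((1, 4, 5), (2, 3, 6), None),
       ((2, 4, 6), (1, 3, 5), None)]
print(simulate_from(toy, cutoff=1, win_prob=lambda r, b: 0.5))
```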

Here’s another update. I wanted to get it out before IRI. This is probably the most changes I’ve made in an update this year, although many of the changes are behind the scenes. Here are the key changes:

  • Added conditional formatting to predicted ranks
  • Uses park/climb points as the second ranking sort criterion instead of total points; added auto points as the third sort criterion
  • Rankings are now generated from match data instead of being pulled directly from TBA. This allows you to see what the rankings were at different points throughout the event (a sketch of the idea follows this list).
  • Various code optimizations, restructuring, and cleanup
  • Various bug fixes
  • Shows predicted contributions before event starts by copying seed values
  • Now allows simulations of events which do not have their schedules released yet
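
As a rough illustration of the ranking generation just described, here’s a Python sketch (not the workbook’s actual VBA) that builds rankings from per-team match rows using the sort order above: ranking points, then park/climb points, then auto points. The field names are assumptions about the data shape.

```python
from collections import defaultdict

def rank_teams(team_match_rows):
    """team_match_rows: one dict per team per match, with keys
    'team', 'rp', 'park_climb', and 'auto'."""
    totals = defaultdict(lambda: [0, 0, 0, 0])  # rp, park/climb, auto, played
    for row in team_match_rows:
        t = totals[row["team"]]
        t[0] += row["rp"]
        t[1] += row["park_climb"]
        t[2] += row["auto"]
        t[3] += 1
    # Sort: average RP first, then park/climb points, then auto points.
    def key(item):
        rp, park_climb, auto, played = item[1]
        return (rp / played, park_climb, auto)
    return [team for team, _ in sorted(totals.items(), key=key, reverse=True)]

print(rank_teams([
    {"team": 254, "rp": 4, "park_climb": 30, "auto": 15},
    {"team": 148, "rp": 4, "park_climb": 25, "auto": 20},
]))  # -> [254, 148]: tied on RP, 254 wins the park/climb tiebreaker
```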

When the IRI teams list shows up on TBA, you’ll be able to make predicted rankings even without a schedule. Likewise, you can see what each team’s predicted rankings would have been before a schedule was created for completed events. With this, I’m finally comfortable jumping more into the “strength of schedule” conversations. All we have to do to tell if a team has a good or a bad schedule is to compare their pre-schedule ranking probabilities with their ranking probabilities after the schedule is released. We would still have to agree on a way to combine the full ranking data change into a single metric (or just accept that no great single summary statistic exists), but at least now each team’s strength of schedule can be seen in a personalized way for them. I never cared much for strength of schedule metrics that are basically some variation of subtracting opponent strength from partner strength, because how “good” or “bad” you feel your schedule is depends heavily on your expectations going into the event based on how good your team is. A schedule with 5 guaranteed wins and 5 guaranteed losses is clearly awful for the team looking to seed first, and clearly awesome for the team worried about seeding last.
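
To make that comparison concrete, here’s a small Python sketch of the underlying calculation. The rank distributions below are made up; in practice they would come from two simulator runs, one before and one after the schedule release.

```python
def rank_shift(pre, post):
    """pre, post: dicts mapping rank -> probability for one team.
    Returns (change in expected rank, change in P(top 4))."""
    d_expected = sum(r * p for r, p in post.items()) - \
                 sum(r * p for r, p in pre.items())
    d_top4 = sum(p for r, p in post.items() if r <= 4) - \
             sum(p for r, p in pre.items() if r <= 4)
    return d_expected, d_top4

pre = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.20, 5: 0.20}   # hypothetical
post = {1: 0.05, 2: 0.15, 3: 0.25, 4: 0.25, 5: 0.30}  # hypothetical
d_rank, d_top4 = rank_shift(pre, post)
print(f"avg rank change: {d_rank:+.1f}, top-4 change: {d_top4:+.1%}")
# -> avg rank change: +0.4, top-4 change: -10.0%
```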

Here is each team’s change in ranking expectations for Medtronic, from before the schedule was released to after:

Team	Avg rank change	#1 seed	Top 4	Top 8	Top 12	Top 15
233	-7.8		+0.0%	+9.2%	+21.7%	+24.9%	+25.7%
1816	+12.6		-0.1%	-3.1%	-11.3%	-20.4%	-26.6%
2052	+0.3		-3.1%	-5.1%	-0.4%	-0.9%	-0.4%
2181	-4.1		+0.1%	+8.3%	+12.3%	+13.8%	+14.6%
2232	+9.9		+0.0%	-1.3%	-4.7%	-9.8%	-13.5%
2239	+11.8		-0.1%	-1.1%	-5.3%	-9.9%	-13.7%
2450	+4.7		+0.0%	-0.6%	-6.9%	-9.9%	-12.4%
2470	+0.9		+0.0%	+0.1%	-0.2%	-1.6%	-2.2%
2480	-9.0		+0.0%	+0.4%	+1.4%	+2.8%	+4.0%
2498	-0.4		+0.0%	-1.2%	-2.6%	-5.0%	-4.6%
2500	+1.2		+0.0%	-0.6%	-1.2%	-3.2%	-4.3%
2501	-4.7		-0.1%	+10.3%	+17.8%	+17.4%	+17.2%
2502	+0.6		-2.7%	-4.9%	-7.7%	-3.0%	-0.7%
2508	+8.0		+0.0%	-0.6%	-4.4%	-10.1%	-12.5%
2509	-11.1		-0.1%	+13.8%	+23.5%	+32.4%	+35.5%
2513	-6.1		+0.0%	+0.1%	+0.9%	+3.2%	+5.2%
2515	-0.1		+0.0%	-0.1%	-0.4%	-1.2%	-1.1%
2518	-4.8		+0.0%	-0.4%	-0.1%	+0.8%	+1.6%
2529	-7.9		+0.0%	+0.0%	+0.9%	+3.6%	+6.2%
2532	-1.0		+0.0%	-0.3%	-0.7%	-1.2%	-1.8%
2545	-1.9		+0.0%	-0.2%	-0.7%	-0.8%	-0.9%
2823	-11.6		+0.3%	+9.8%	+22.2%	+32.1%	+35.7%
2825	+3.8		+0.0%	-0.1%	-1.0%	-1.9%	-3.1%
2846	+10.9		+0.0%	-13.6%	-27.2%	-34.4%	-36.6%
2847	+3.2		+0.0%	+0.0%	-0.5%	-1.9%	-2.4%
2879	-7.6		+0.0%	+1.2%	+5.5%	+10.4%	+15.7%
3018	-8.6		+0.0%	+1.9%	+8.0%	+11.9%	+15.0%
3023	-10.4		+0.0%	+1.7%	+5.8%	+14.2%	+19.1%
3026	+0.8		-0.1%	-1.5%	-5.4%	-7.5%	-8.4%
3038	+5.6		+0.0%	-0.2%	-2.4%	-5.6%	-7.9%
3058	+7.1		-0.2%	-2.6%	-7.7%	-13.9%	-15.3%
3081	-7.7		+0.0%	+0.1%	+1.4%	+5.9%	+9.2%
3102	-2.8		+0.1%	-1.1%	-1.8%	-1.8%	-0.2%
3184	+0.6		-0.3%	-6.9%	-5.0%	-4.3%	-3.9%
3244	-13.3		+0.0%	+2.5%	+11.4%	+20.6%	+26.2%
3278	+11.6		+0.0%	-0.5%	-3.1%	-7.3%	-10.7%
3298	+6.6		+0.0%	-0.3%	-2.8%	-7.3%	-7.7%
3299	-11.8		+0.1%	+10.1%	+24.6%	+33.5%	+35.3%
3407	+13.8		+0.0%	-0.4%	-2.3%	-5.5%	-9.3%
3454	+0.1		+0.0%	-0.6%	-1.4%	-3.1%	-4.5%
3630	+6.4		-0.3%	-5.7%	-14.5%	-19.9%	-21.9%
3745	+0.1		+0.0%	-1.3%	-4.4%	-2.5%	-4.3%
3751	+11.3		+0.0%	-3.9%	-13.1%	-21.4%	-27.3%
3839	+6.8		+0.0%	-4.6%	-9.7%	-14.8%	-18.7%
3840	+9.2		+0.0%	-1.3%	-8.1%	-13.9%	-18.1%
4207	-2.0		+0.0%	+0.2%	-0.7%	-0.9%	+1.1%
4229	-2.9		+0.0%	-0.1%	-0.4%	-0.5%	-0.2%
4536	+3.4		+0.0%	-0.6%	-2.7%	-5.3%	-6.7%
4549	-13.5		+0.0%	+8.0%	+21.8%	+32.8%	+36.9%
4607	-0.5		+0.0%	-0.8%	-3.4%	-3.7%	-1.0%
4664	-4.9		+0.0%	-0.1%	+0.3%	+0.7%	+0.9%
5172	-0.2		+7.9%	+2.1%	+0.4%	+0.3%	+0.2%
5434	+2.8		-1.3%	-15.5%	-17.2%	-12.4%	-9.8%
5637	-17.2		+0.0%	+1.7%	+6.4%	+14.5%	+21.2%
5913	-2.0		+0.0%	+0.1%	-0.1%	-0.7%	-1.5%
5996	+0.3		+0.0%	+0.1%	-0.7%	-1.2%	-1.6%
6709	-7.0		+0.0%	+0.6%	+2.4%	+6.9%	+11.3%
7038	+2.8		+0.0%	-0.2%	-1.5%	-3.2%	-5.2%
7068	+4.0		-0.1%	-0.5%	-2.7%	-5.1%	-7.2%
7180	+7.6		+0.0%	-0.4%	-1.9%	-4.5%	-7.6%

I’ll probably make a full workbook that contains something like this for all 2018 events soon.

Huge thanks to Patrick Fairbank and the other developers of Cheesy Arena for creating these awesome pre-generated schedules. Adding pre-event rankings was already a large effort; I’m so grateful I could use these schedules instead of building my own scheduler from scratch.

You’ve been busy! As you know, I’ve been interested in and experimenting with this Strength of Schedule stuff for some time. Thanks so much for all your hard work on it. This approach is a really nice solution to this difficult problem. It’s a great addition to your “I Can’t Live Without It” simulator.

So your “strength of schedule” metric just compares with one pre-generated schedule? That doesn’t seem like a good metric, as there is no guarantee that these pre-generated schedules are fair. It would make more sense to generate hundreds of potential schedules and compare the actual schedule against those.

EDIT:
Perhaps another solution is to keep the same pre-generated schedule, but randomize the order of the teams in it hundreds of times. That way you won’t need to create a whole new scheduling algorithm.

This is indeed what it currently does; every simulation randomizes the assignment of teams to the schedule indices.
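
For anyone curious what that looks like mechanically, here’s a minimal Python sketch (illustrative only; the template format is an assumption, not the workbook’s actual representation):

```python
import random

# A schedule template stores index slots rather than team numbers.
template = [((0, 1, 2), (3, 4, 5)),   # match 1: red indices vs blue indices
            ((0, 3, 4), (1, 2, 5))]   # match 2
teams = [118, 148, 254, 1114, 2056, 2481]

def instantiate(template, teams):
    """Randomly assign teams to schedule indices, then fill the template."""
    shuffled = random.sample(teams, len(teams))
    return [tuple(tuple(shuffled[i] for i in alliance) for alliance in match)
            for match in template]

print(instantiate(template, teams))  # a fresh team-to-slot mapping each call
```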

I just uploaded a v4.

Changes:
Fixed a bug where surrogate assignments in the schedule were not reset when a new event was imported. Essentially, this meant that if you ran events with surrogates in different positions, teams would randomly have matches removed
Changed graph color in team lookup to green

This was a crazy bug to track down; it took me a couple of full days. Running any single event was fine, but when I ran a different event after certain events, the ranking projections would be a little bit off for a few random teams. The affected teams would also change on each new simulation.

Should be all good now though.

Another update, primary purpose was to fix handling of DQs.

Updates:
Added bolding and strikethroughs to data import for surrogate and DQed teams respectively
Covers some special cases for surrogate teams that were not covered previously
Now handles DQed teams properly
Updated instructions and FAQ

I really hate dealing with surrogates and DQs, they’re such a pain. Hopefully I did a good enough job this time that I don’t have to think about them again.

Hey Caleb,

I’m wondering if there is some sort of substantial difference between these schedules and the schedules actually used in FRC. (Maybe you play more of the same teams over and over again in a real schedule?). Would you be able to use the schedule for some 40-team district event as a template and run your simulator on that? (Still with the random team assignments).

For example, compare the simulator probabilities for MAR Hatboro-Horsham with your current random schedule maker to a simulator using the actual schedule as a template.

I don’t expect there to be much of a difference, but it would be interesting to see that empirically.

Yeah, I could give that a whirl. My guess is that there’ll be more variance in the runs with the actual schedule than with the Cheesy Arena schedule, but that’s based solely on my uninformed hunch that the Cheesy schedules have fewer duplicate partners/opponents than FIRST’s schedules.

I’ve uploaded a file in my Miscellaneous Statistics Projects called “pahat schedule comparison.xlsx” that compares the actual schedule against the Cheesy Arena schedule for pahat. These ranking projections were done from the point prior to the schedule release, but after finding out that 6667 would not be attending.

The pahat template was built by taking the actual pahat schedule and giving each team an index, creating a format equivalent to that of the Cheesy Arena schedules (see attached). These indices were randomly shuffled among the teams in each simulation, just as in a normal simulation using the Cheesy Arena schedules.
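
Concretely, the re-indexing step can be sketched like this (a Python sketch with an assumed schedule format of (red, blue) alliance tuples):

```python
def to_template(schedule):
    """Replace team numbers with indices in order of first appearance."""
    index = {}
    def idx(team):
        return index.setdefault(team, len(index))
    return [tuple(tuple(idx(t) for t in alliance) for alliance in match)
            for match in schedule]

actual = [((33, 67, 254), (118, 148, 1114)),
          ((33, 118, 1114), (67, 148, 254))]
print(to_template(actual))
# -> [((0, 1, 2), (3, 4, 5)), ((0, 3, 5), (1, 4, 2))]
```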

I don’t see any appreciable difference between the two base schedules. I ran both tests using 10,000 simulations, so I would expect to see a bit of variation between the trials just by chance; the only two “big” changes of >1% (25 ranking second and 1640 seeding third) did not exceed 1% on subsequent trials. So overall, there may be some negligible differences between the Cheesy Arena schedules and the actual schedules, but it isn’t nearly enough to concern me considering I have other known sources of error of larger magnitude.

I might do a full analysis soon of the cheesy schedules versus the actual schedules over all 2018 events just out of curiosity.

Forgot that you can’t attach files here. See the pahat.csv file in Miscellaneous Statistics Projects.

I decided to compare two sets of differences using the actual pahat schedule against the Cheesy Arena schedule; the differences for each team at each rank can be seen in the following graph:

The two outlier points are 25 seeding second and 1640 seeding third. For some reason, using the Cheesy Arena schedule makes these scenarios seem about 1% more likely than using the actual schedule.

Still though, the total variance that can be explained by this discrepancy is only about half a percent of the variance due purely to sample error, which means that the simulator would need to run about 500,000 simulations before any discrepancies between the schedule formats cause as much of a problem as sample error alone. I’ve got much bigger problems to deal with than this for now, so I’ll leave it alone.

@AGPapa I’m unsure whether you suggested Hatboro-Horsham because of this or not, but based on my analysis, Hatboro-Horsham had one of the biggest differences between the Cheesy Arena schedule and the actual schedule of any 2018 event. The Cheesy Arena schedule had 4 instances where a team was partnered with another team twice and 42 instances where a team had to play the same opponent 3 times. The actual schedule for this event, in contrast, had 50 instances where a team was partnered with another team twice and 68 instances where a team had to play the same opponent 3 times, in addition to 4 instances where a team had to play the same opponent 4 times.

I expect these differences are the root of why 25 and 1640 have different seeding projections with the Cheesy Arena schedule than with the actual schedule.
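
For reference, counts like these can be tallied with something like the Python sketch below. The schedule format is an assumption, and whether an “instance” counts each pair once or once per team is a convention I’m glossing over here.

```python
from collections import Counter
from itertools import combinations

def pairing_counts(schedule):
    """schedule: list of (red, blue) tuples of team numbers.
    Returns Counters of how often each pair were partners / opponents."""
    partners, opponents = Counter(), Counter()
    for red, blue in schedule:
        for alliance in (red, blue):
            for pair in combinations(sorted(alliance), 2):
                partners[pair] += 1
        for r in red:
            for b in blue:
                opponents[tuple(sorted((r, b)))] += 1
    return partners, opponents

# e.g. pairs that were partners twice, or opponents three times:
#   sum(1 for c in partners.values() if c == 2)
#   sum(1 for c in opponents.values() if c == 3)
```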

Thanks for looking into this Caleb. It’s good to know that this method of generating potential schedules seems to work very well.

I had no idea how different the Hatboro schedule was from the Cheesy Arena one. I just felt like teams were playing each other a lot while watching.

Thanks for the good work!

I just updated to v6.

Change log:
Added more conditional formatting to the “predicted rankings” sheet
Added min and max possible ranks to “predicted rankings”
Added a toggle for the additional ranking formatting in settings
Added ability to forecast an event from the point after the team list is confirmed. The match schedule is used ONLY to find which teams actually have qual matches; the simulation then proceeds as if no match schedule exists. This makes pre-schedule predictions more interpretable, since many events (e.g. ausc) have team lists that differ from the teams that actually competed (see the sketch after this change log).
Changed workbook name to “Caleb_event_simulator_xxxx.x.x.xlsm”, eliminating the apostrophe on my name. It’s a stupid name for a workbook, but I keep running into silly problems with the apostrophe in the file name. I’ll rename to something better in 2019, but for now I want it relatively the same but with no apostrophe since I’m sick of dealing with it. :slight_smile:
Updated Instructions and FAQ
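
Roughly, the team-list confirmation step amounts to the following (a Python sketch under assumed data shapes, not the workbook’s actual code):

```python
def confirmed_teams(schedule):
    """schedule: list of (red, blue) alliances of team numbers.
    The pairings themselves are discarded; only attendance matters."""
    teams = set()
    for red, blue in schedule:
        teams.update(red)
        teams.update(blue)
    return sorted(teams)

schedule = [((33, 67, 254), (118, 148, 1114))]
print(confirmed_teams(schedule))  # these teams feed the pre-schedule forecast
```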

I’ve been meaning to make the formatting change for a while (on my unending quest to make things more and more like this); I don’t want anyone to think a 0% prediction means that a rank is a mathematical impossibility for a team, or conversely that 100% means a team is locked in.

Is there a specific reason why Excel stops responding after I click “Update” to run a simulation of an event? It’s most likely me, but I just thought I’d ask.