Good questions! Let’s dive in.

Yes, I consider x unique schedules before the schedule is released. After the schedule is released, I only consider that one schedule.

I copy the schedules from Cheesy Arena here. These schedules were created by Team 254 to mimic FMS-generated schedules, and they have very similar properties to real FMS schedules. I take these base schedules and, in each simulation, randomly assign teams to the indices in the base schedule.
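As a minimal sketch of that assignment step (the function and data names here are my own, not from the actual tool): a base schedule refers to teams only by index, so one simulation just draws a random permutation mapping indices to real team numbers.

```python
import random

def assign_teams(base_schedule, teams):
    """Randomly map base-schedule indices to team numbers for one simulation.

    base_schedule: list of matches, each a tuple of slot indices
                   (e.g. three red slots followed by three blue slots).
    teams: list of team numbers attending the event.
    """
    shuffled = random.sample(teams, len(teams))  # random permutation of teams
    return [tuple(shuffled[i] for i in match) for match in base_schedule]

# Toy example: 6 teams, 2 matches in the base schedule
base = [(0, 1, 2, 3, 4, 5), (2, 4, 0, 5, 1, 3)]
teams = [254, 1678, 971, 118, 148, 2056]
print(assign_teams(base, teams))
```

Re-running this per simulation is what makes each simulated event see a different (but structurally identical) schedule.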

So, for each match, I first calculate win probabilities. Then, in each simulation, I randomly assign each match as a win or a loss at the win probability I already determined. Each individual simulation resolves every result with certainty, but by running hundreds of simulations and aggregating the results, I build a Monte Carlo distribution, which is what appears in the ranking probabilities.
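A minimal sketch of that simulation loop, assuming the win probabilities are already computed (all names here are illustrative, not from my actual code):

```python
import random
from collections import Counter

def simulate(matches, red_win_prob, num_sims=1000, seed=42):
    """Monte Carlo over match outcomes.

    matches: list of (red_alliance, blue_alliance) tuples of team numbers.
    red_win_prob: list of P(red wins) for each match.
    Returns {team: Counter mapping total wins -> # of simulations with that total}.
    """
    rng = random.Random(seed)
    teams = {t for red, blue in matches for t in red + blue}
    dist = {t: Counter() for t in teams}
    for _ in range(num_sims):
        wins = Counter()
        for (red, blue), p in zip(matches, red_win_prob):
            # Each simulation resolves the match with certainty,
            # drawn at the precomputed probability.
            winners = red if rng.random() < p else blue
            for t in winners:
                wins[t] += 1
        for t in teams:
            dist[t][wins[t]] += 1
    return dist
```

Aggregating `dist` (or sorting teams by wins within each simulation) is what turns per-match probabilities into ranking distributions.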

So, my probabilities are all pretty well calibrated, in the sense that if you looked at 100 matches where I said red would win 80% of the time, red would win about 80 of them. If you just picked the team I said had the higher probability of winning, that team would win about 72% of the time (depending on the year). Alternatively, I get Brier scores around 0.18 (again, year-dependent). I believe robust scouting systems can hit around 80%, which it sounds like you may have done (although you should make sure to predict at least a few events' worth of data before you can be confident of this).
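For reference, both metrics I mentioned are easy to compute; this is a quick sketch (function names are my own). The Brier score is the mean squared error between predicted probabilities and actual outcomes, so 0 is perfect and 0.25 is what you get from always predicting 50/50.

```python
def brier_score(predictions, outcomes):
    """predictions: P(red wins) per match; outcomes: 1 if red won, else 0."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

def pick_accuracy(predictions, outcomes):
    """Fraction of matches where the favored alliance actually won.

    Ties (p == 0.5) are counted as favoring red here, purely as a convention.
    """
    correct = sum((p >= 0.5) == (o == 1) for p, o in zip(predictions, outcomes))
    return correct / len(predictions)

preds, results = [0.8, 0.8, 0.3], [1, 1, 0]
print(brier_score(preds, results))   # small errors on every match -> low score
print(pick_accuracy(preds, results))
```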

Oh geez. Well, we can set a lower bound on the number of possible schedules at n!, where n is the number of teams at the event. This is because any generated schedule could have its n teams shuffled in any order and still be a valid schedule. For a 50-team regional event, that's about 10^64 possible schedules. As an upper bound, if each team has m matches, there will be m blocks of n! permutations of teams, in addition to the possibility of surrogates, which I'll just make their own block for simplicity. So at a maximum there are (n!)^(m+1) possible schedules. For a 50-team, 10-match-per-team event, that means there are no more than roughly 10^709 possible schedules. So the actual number of possible schedules for a 50-team, 10-match/team event is somewhere between 10^64 and 10^709. My shot-in-the-dark estimate would be around 10^300 schedules in this case, with error bars of 100 orders of magnitude either way. Whatever the exact number is, it's easily high enough that you will never be able to test every possible schedule, so I would recommend looking into Monte Carlo simulations if I were you.

It may also be helpful for you to review the IdleLoop match schedule generation documentation, which walks through how schedules are created and what is prioritized in this process. Something to keep in mind is that not all schedules are actually equally likely; "better" schedules, by the criteria in the IdleLoop documentation, will occur more frequently than "worse" but still viable ones.

I think I did, but if anything is unclear or you have follow-up questions, feel free to reach out. Sounds like a fun project; I'd just caution that you should probably analyze across multiple events if you want results that are generally applicable.