I hear the term “strength of schedule” (SoS) thrown around sometimes, and I’m not quite sure I understand what it means in an FRC context. For reference, here and here are the most recent threads discussing this topic. I’d like to start a discussion specifically on what metric (if any) makes the most sense to represent an individual team’s SoS, and then ideally expand on that definition to come up with a metric that represents how “fair” or “balanced” an overall match schedule is.
I’ve made a program that I can use to calculate metrics like this for all of last season’s events; the first set of results from that program can be found in my miscellaneous statistics projects thread. I have created a metric that I think has potential as a SoS metric, and I’ll copy its details below. As the thread progresses, I’m happy to revisit that program and calculate other metrics if people want.
Here’s my first pass at a SoS metric. I calculate it by finding the probability that a given team will seed better with the actual schedule than they would have with a random schedule. So a “schedule strength” of 0% means that you will never seed higher with the existing schedule than you would have with a random schedule, and a “schedule strength” of 100% means that you are guaranteed to seed higher with the actual schedule than you would have with a random schedule.
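To make the definition concrete, here is a minimal Python sketch of how such a probability could be estimated by Monte Carlo. It is only a toy stand-in for a full event simulator, and everything in it is an assumption for illustration: the names `random_schedule`, `simulate_rank`, and `schedule_strength` are hypothetical, each team is reduced to a single made-up rating, matches are decided by summed ratings plus Gaussian noise, and ranking uses win counts with random tiebreaks rather than real RPs.

```python
import random


def random_schedule(teams, n_rounds, rng):
    """Toy schedule: each round shuffles the teams and splits them into 3v3 matches."""
    schedule = []
    for _ in range(n_rounds):
        order = list(teams)
        rng.shuffle(order)
        # Teams left over when len(teams) is not a multiple of 6 sit out this round.
        for i in range(0, len(order) - 5, 6):
            schedule.append((tuple(order[i:i + 3]), tuple(order[i + 3:i + 6])))
    return schedule


def simulate_rank(schedule, ratings, team, rng, noise=10.0):
    """Play every match once with noisy scores and return `team`'s rank (1 = best)."""
    wins = {t: 0 for t in ratings}
    for red, blue in schedule:
        red_score = sum(ratings[t] for t in red) + rng.gauss(0, noise)
        blue_score = sum(ratings[t] for t in blue) + rng.gauss(0, noise)
        for t in (red if red_score > blue_score else blue):
            wins[t] += 1
    # Sort by wins (descending); equal win counts are broken randomly.
    order = sorted(ratings, key=lambda t: (-wins[t], rng.random()))
    return order.index(team) + 1


def schedule_strength(actual, ratings, team, n_rounds, n_trials=2000, seed=0):
    """Estimate P(rank with the actual schedule is strictly better than with a random one)."""
    rng = random.Random(seed)
    teams = list(ratings)
    better = 0
    for _ in range(n_trials):
        rank_actual = simulate_rank(actual, ratings, team, rng)
        rank_random = simulate_rank(
            random_schedule(teams, n_rounds, rng), ratings, team, rng
        )
        if rank_actual < rank_random:  # a lower rank number is a better seed
            better += 1
    return better / n_trials
```

Calling `schedule_strength(actual, ratings, team, n_rounds)` with a schedule given as a list of `(red, blue)` team tuples returns a value between 0 and 1. A real version would swap the toy match model for whatever the event simulator uses to predict scores and RPs.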
What I like about this metric:
- It compares the given schedule against other hypothetical schedules
- It is customized for each team; that is, it compares your hypothetical results under a random schedule against your hypothetical results under the given schedule. I’m not the biggest fan of team-independent metrics since, for example, a schedule full of buddy-climb-capable partners is amazing for a team without a buddy climber but just alright for a team that has a good buddy climber, and a team-independent metric would have to give the schedule a single score for both of these teams.
- It’s on an interpretable scale (0% to 100%), and each value has a direct meaning as a probability
- It’s able to be calculated before the event occurs (I don’t like metrics that require hindsight unless maybe we want to use SoS as a tiebreaker or something)
- Incorporates RPs
What I don’t like about this metric:
- Requires a full event simulator to calculate
- Teams that are basically guaranteed to seed first (like 1678 at their later regionals) will inevitably be shown to have bad schedules, since no schedule could give them much chance of seeding higher than their expectation of almost certainly first. Switching the comparison to “greater than or equal” just flips the problem, giving these teams high scores instead of low ones
- Average value is 48.1% instead of 50%
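One possible contributor to the last two points is ties: ranks are discrete, so the rank under the actual schedule and the rank under a random schedule can come out equal. The toy calculation below illustrates this under a strong assumption that the real simulator does not guarantee, namely that both ranks are independent draws from the same distribution. The function name `compare_probs` and the 90%/10% split for a dominant team are made up for the example.

```python
from fractions import Fraction


def compare_probs(rank_dist):
    """rank_dist maps rank -> probability.

    Both ranks are assumed to be independent draws from this same
    distribution (a toy assumption, not a property of the real metric).
    Returns P(a < b), P(a <= b), and P(a < b) + P(a == b) / 2.
    """
    strict = sum(
        p * q
        for a, p in rank_dist.items()
        for b, q in rank_dist.items()
        if a < b
    )
    tie = sum(p * p for p in rank_dist.values())
    return strict, strict + tie, strict + tie / 2


# Hypothetical dominant team: seeds 1st 90% of the time, 2nd otherwise.
dominant = {1: Fraction(9, 10), 2: Fraction(1, 10)}
strict, non_strict, half_ties = compare_probs(dominant)
print(strict, non_strict, half_ties)  # prints 9/100 91/100 1/2
```

In this toy case the strict comparison gives 9%, the “greater than or equal” version gives 91%, and counting a tie as half a win gives exactly 50%. Whether ties account for the 48.1% average in the real results is untested here.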