Reverse Engineering the MatchRater Algorithm

Continuing along on my journey of building a match scheduler. I decided to dive deeper into FIRST’s MatchRater Algorithm which is used to determine schedule quality. You can download MatchRater from the IdleLoop website to see how it works yourself.

The piece of MatchRater I am most interested in is seeing the way it “scores” schedules. You can find this score yourself with the “-x” command line parameter. Based on my investigation, there are 3 score components: duplicate partner penalties, duplicate opponent penalties, and repeat appearance penalties. It evaluates these penalties for every team pair at the competition, and summing all of the penalties together gives the final schedule score, where a higher number is worse.

For duplicate partners, the base penalty for a single duplication is 5, this penalty doubles for every additional pairing after the first 2. This is summarized in the below chart:
image
The left side represents the number of times that two teams are partners at an event. The top represents the number of times that the two teams are opponents at an event. The numbers in the middle are a representation of the penalty for that particular combination of partner and opponent appearances.

In a similar way, there is also a duplicate opponent penalty. The base penalty in this case is 1. Here is a chart for that penalty:
image

Finally, there is a duplicate appearance penalty. This penalty applies if two teams are in the same match at all, whether as partners or as opponents. It also has a base penalty of 1. Here is it’s chart:
image

With these three charts, we can sum up the penalties at each partner/opponent intersection to get the full penalty chart:
image

Thoughts on this chart:
I’m truly shocked to find that this is the scoring algorithm for schedules. I’m still wrapping my head around the implications.

To start, the manual is straight up lying about the priorities of the scheduler. From section 11.6.2 of the 2021 game manual:


Priority 2 according to the game manual is to minimize duplicate opponents, and priority 3 is to minimize duplicate partners, but that is blatantly not what the scoring algorithm values.
The total penalty for the 2-partner, 0-opponent case is 6, which is lower than the total penalty of 2 for the 0-partner, 2-opponent case. Not only that, but it is even lower than the 4 point penalty for the 0-partner, 3-opponent case!

Here’s a more detailed example outlining the preference for non-duplicated partners over non-duplicated opponents:
Hypothetically, say a team has 3 matches. The scheduler can either give them:

  1. 6 unique partners and 4 unique opponents
  2. 4 unique partners and 6 unique opponents

Let’s say that none of the partners appear as opponents for simplicity.

In case 1, two opponents appear in all 3 matches, one opponent appears twice, and one opponent appears once. All partners are unique and thus all appear once. The penalties are thus:
0-partner, 3-opponents: 2 at a penalty of 4 each
0-partner, 2-opponents: 1 at a penalty of 2
0-partner, 1-opponent: 1 at a penalty of 0
1-partner, 0-opponent: 6 at a penalty 0 each
Total penalty = 2 * 4 + 1 * 2 + 1 * 0 + 6 * 0 = 10

In case 2, three opponents appear twice and three appear once. Two partners appear twice and two partners appear once. The penalties are thus:
0-partner, 2-opponents: 3 at a penalty of 2 each
0-partner, 1-opponent: 3 at a penalty of 0
2-partner, 0-opponent: 2 at a penalty of 5 each
1-partner, 0-opponent: 2 at a penalty of 0 each
Total penalty = 3 * 2 + 3 * 0 + 2 * 5 + 2 * 0 = 16

So the scheduler very strongly prefers case 1, which clearly minimizes duplicate partners and not duplicate opponents. This preference exists in spite of the fact that I am forcing unique opponents down to 4 and 6 when, over the course of 3 matches, a team could have as many as 9 unique opponents. Comparing 9 unique opponents and 4 unique partners against 7 unique opponents and 6 unique partners provides even more of a penalty difference in favor of unique partners.

To be clear, I don’t mind that the scheduler prefers unique partners in a vacuum. Every schedule is going to have to make tradeoffs, but the tradeoffs the scheduler is incentivized to make are in direct opposition to what is stated in the game manual.

My next thought is that, intuitively, I feel like the cost function should prefer a more balanced number of partner/opponent appearances more strongly than duplicate opponents and no partner apperances. If two teams are going to encounter each other 3 times, I feel like 2 matches as an opponent and 1 match as a partner should be much more strongly preferred over 3 matches as an opponent and 0 matches as a partner. But the scoring algorithm only provides a slight incentive for this (3 point penalty vs 4 point penalty). Indeed, a very common complaint I’ve seen regarding schedules comes from the 3 (or 4) opponent, 0 partner instances. The fact that this happens is not some intrinsic undesirable property of FRC schedules, but rather a natural outcome of the fact that 3 partner 0 opponent schedules are not penalized very much relative to alternatives.

Finally, I think it’s bizarre that there is no associated penalty for the 0 partner, 0 opponent case. I don’t think it should be an enormous penalty, but I don’t see why the scoring algorithm shouldn’t nudge teams toward 1 total appearance instead of 0. Essentially, such a penalty would incentivize more unique team appearances. Currently, all the schedule does is dis-incentivize duplicate team appearances, which is a similar goal, but I think we might get more unique encounters with a slight 0 partner, 0 opponent penalty.

I’ll play the game and build schedules according to this algorithm. I’m noting though that I think this is a poor scoring structure for schedules. Regardless of my opinions, section 11.6.2 should definitely be changed to reflect the actual scoring algorithm used.

19 Likes

Hmm… Thanks for another interesting analysis on the inner workings of schedule generation. Just because it’s something I’m not clear on, are you sure that the default penalty weightings from IdleLoop are the same as the ones FIRST actually uses?

I’m not positive, but it seems likely to me based on the official schedules I’ve seen. I can check with Tom Saxton from IdleLoop. Not sure who else might know. Maybe the scorekeepers out there have some more insight?

2 Likes

Amazing work. Thank you for your tests - I wish this (and other aspects of the FMS) were more open. Many eyes makes for better code review.

7 Likes

Keep in mind that “partners” and “opponents” aren’t strictly equivalent. You only get 2 partners per match but 3 opponents.

4 Likes

This would justify a 3:2 penalty weighting for repeat partners over repeat opponents, but surely not 5:1. I think the intuitively optimal solution to the problem in which one team has to appear with every other team five times is to be partners twice and opponents thrice, but the current rating implies the scheduler would prefer to have teams partner once and face off four times.

On an anecdotal level, it’s really fascinating to see the guts of the algorithm, because I would have sworn up and down that my teams have far more often faced that one team you really want to avoid twice as an opponent than had them twice as a partner. Perhaps it’s not just confirmation bias.

So I’ve been a scorekeeper (and otherwise around the scoring table) for a few years. We have very little impact on the schedule and therefore not a ton of insight into how it tunes. We work with the event coordinators and FTAs to build the time blocks that are filled in with matches, but past that we just hit the “generate schedule” button.

After the fact we can look at summary stats as far as number of unique opponents and partners, and look at the actual schedule itself, but nothing more in depth than that.

1 Like

From Tom Saxton:

We build MatchRater from the same code base as MatchMaker. They share the method for evaluating the quality of a schedule.

I wondered how this actually played out with real schedules, so I ran the numbers on all the districts from Week 1 2019. Districts tend to have fewer teams and more matches per team than regionals, which stress tests the algorithm more, and I skipped back to the most recent full season in case I for some reason want to look at an even bigger sample.

The numbers are exactly as I feared. It’s extremely rare to have more than one match paired with another team, but far more common to have two or even three matches against the same team. One unlucky pair of teams at Granite State even played each other four times in 12 matches at a 36-team event. I should go track down who they are. (EDIT: It was 1721 and 5902, who faced off in matches 10, 13, 28, and 51.)

4 Likes

This will always be true, regardless of algorithm. You play against 3 teams per match and with 2 teams per match.

Take a closer look at the numbers, though. I know that’s the case, but you are disproportionately likely to have multiple matches against the same team rather than multiple with them because the algorithm weights those cases differently.

Out of 4329 pairs of teams in my sample data who played in at least two qualifying matches together at the same event, 36 (0.8%) were partners more often than opponents (and this total comes from 19, 16, and 1 instances across only three out of twelve events, one of which was extremely small and has higher repeat numbers overall), 2755 (63.6%) were partners and opponents the same number of times, and 1538 (35.5%) were opponents more often than partners.

That’s just not the 3:2 ratio that would fall out of neutral combinatorics. It’s totally indicative of an algorithm that chooses to harshly penalize having repeat partners over having repeat opponents.

3 Likes

I suspect that this actually becomes worse the less teams there are at an event. If you’re at a 36 team district event with 12 matches, that means you only have 35 opponents, and are guaranteed to require at least 1 duplicate per team. It wouldn’t shock me that the algorithm starts having issues at that point, especially since the algorithm was designed before districts, where less matches per team and more teams per event was much more common.

I’d like to see the numbers for something like the 2019 LA regional, with 56 teams and 9 matches per team.

Also, for your original charts, how many teams and how many matches were used.

2 Likes

If you were referring to my charts, I do have team and match counts listed there. Regardless, your suspicion is correct that more teams means less of an issue. Here are the LA results.

image

Ah, I didn’t see that. But it definitely shows the point. When the amount of teams is basically equivalent to the number of opponents you’re required to have, attempting to permutate so you have less duplicate opponents likely makes things worse or fails the other criteria. Ideal schedule in a 36 team event would require every team to have 1 duplicate opponent, but the only way to ensure that happens is to have every team play a different duplicate. Its very possible the algorithm just never gets to that schedule, as it doesn’t magically know to pick that one first, and only a limited number (5 million, but still compared to the total number thats pretty small) ever actually gets generated. If you never get to the perfect permutation, at minimum you’ll have 1 team face the same team twice. But again thats also a pretty small percentage of schedules to only have 1 team have that, so its not surprising schedules often get picked that never make it to that number of minimum opponents.

For those of you involved in combat robotics:
How far in advance is the competition schedule made? Do you know your full day’s pairings at the beginning of the day, or is the schedule built round by round, based on team performance and attrition?

In my experience, there were three qualification rounds. The first two were set after inspection, then the third was generated from those left standing.

I haven’t participated yet, but I am actively building for Norwalk Havoc, and they release their schedule about 1 week ahead of time. They seem to be sort of an outlier in this though, and they don’t reshuffle the bracket if people drop/no show after the bracket is made.

1 Like

At Franklin Institute, we found out our first match only once everyone in the weight class passed safety inspection. For the first few teams up, it was less than an hour ahead of time. The rest of the bracket was built as teams won/lost (double elimination).

1 Like

I wonder if it would be possible for FRC to use a known, fixed RNG seed for the schedule generation so that the match schedule could be known before events, without any further action required on the part of the event organizers/FTAs. Obviously the schedule produced beforehand wouldn’t be ironclad, much like the pre-made ones for Championship divisions and maybe seeds should be released at the beginning of a competition week or something to prevent it from affecting how teams build robots too much, but I wonder if this could be an option to try at some point.

3 Likes

Ooh, I like this a lot. You could have the seed be based on the event key (including year). Assuming there are pre-generated public schedules, that seed could be used along with a preliminary team list and an expected number of matches per team for that event to generate schedules as early as people want. Those schedules would of course be subject to change in the case that a team drops or the matches/team changes.

I think I’m going to do this for my scheduler. At least as an option, random seeds will be an option as well.

Is there any value to just pre-generating schedules based on number of matches and number of teams and then just randomly filling teams into the slots? This has the benefit of reducing the sacrifices needed for the random number gods.

2 Likes