For the past 4 years, FRC has followed the general ranking structure of 4 possible ranking points (RPs) per match, 2 for winning and 2 for bonus objectives. 4 years is long enough for me to consider this a trend and not just a fad of FRC game design, so I thought I would take a look at improving my bonus Ranking Point predictions. Better RP predictions will really improve my ranking projections and future analyses I do that rely on them. The current method I use to predict bonus RPs in my simulator has some notable downsides that I’ll describe here, and I’ll also go into how I’ve improved on them with my new methodology.
This was an insightful, well-written post. Good work, Caleb.
My understanding is that win/loss and each bonus RP are each forecasted independently of each other. Have you explored methods of correlating these predictions? While the independence assumption may not make much difference when forecasting the likelihood of each individual RP, I have a feeling it would impact the overall distribution of predicted RPs for an alliance.
Here are some hypotheticals to get at what I mean:
When a robot is broken, an alliance is less likely to win the match and less likely to pick up bonus RPs
When a robot attempts a last-second cargo score in the rocket and misses the HAB RP
When a robot abandons the rocket early to climb
An alliance’s defending robot breaks. Their opponents are more likely to win and more likely to pick up a rocket RP without defense, and maybe even more likely to get the HAB RP if there is less pressure to complete a rocket
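A toy sketch of the distribution effect described above: two simulations with identical marginal probabilities for the win RP and a bonus RP, where positive correlation leaves each RP's individual likelihood unchanged but shifts the distribution of total RPs. All probabilities here are made up for illustration, not taken from real match data.

```python
import random

# Hypothetical marginal probabilities (not from any real model).
p_win, p_bonus = 0.6, 0.5

def total_rp(win, bonus):
    # 2 RP for a win, 1 RP for the bonus objective.
    return 2 * win + bonus

rng = random.Random(42)
n = 100_000

# Independent draws: win and bonus come from separate rolls.
indep = [total_rp(rng.random() < p_win, rng.random() < p_bonus)
         for _ in range(n)]

# Positively correlated draws: one underlying "how well did the
# alliance play" roll drives both outcomes. Marginals are unchanged.
corr = []
for _ in range(n):
    u = rng.random()
    corr.append(total_rp(u < p_win, u < p_bonus))

for label, draws in [("independent", indep), ("correlated", corr)]:
    mean = sum(draws) / n
    p3 = draws.count(3) / n
    print(f"{label}: mean RP = {mean:.2f}, P(3 RP) = {p3:.2f}")
```

Both couplings give the same expected RP per match (about 1.7 here), but the correlated version roughly doubles the chance of the 3-RP outcome, which is exactly the kind of tail effect that matters for rank projections.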
This has indeed been on my mind for a long time, since 2016 in fact. I’m almost certain that there would be some level of correlation between the RPs, winning, and even the second/third order ranking sorts; however, I have not yet explored these kinds of correlations in depth.
Another related thing I’ve had on my mind for a while is match-to-match correlations. For example, if at the start of an event a team is better than their Elo rating suggests, all of their matches should have a higher win probability than what is given, and the opposite if their Elo is too high. So although any individual match for that team has a well-calibrated probability, the team’s overall rank projection that my simulator spits out won’t have enough spread on it, because I don’t have rating uncertainty built in yet.
Based on how I’m feeling at this moment, I think my broad future projects will be in rough order:
Predicting ties, DQs, and red cards
More ZEBRA DART data work
Predicting second/third order ranking sorts
Alliance selection predictions
Correlations are really cool and important for a sound model, but they’re just not very high up on my list as I’d really rather just do them once after most everything else gets settled. Otherwise I fear I’m going to have to re-do them every time I change one of the underlying systems built up underneath them. For example, had I worked on correlations last year with my old method of predicting bonus RPs, I would probably have had to redo much of that work now because I’ve just switched to using ILSs.
So, I’ll get to them eventually, but it’s probably gonna be a while.
What good does predicting ties and red cards do? Just curious why they’re at the top, as they have to be among the most unreliable things to predict. I think the DART data will be far more useful, and I’m really hoping they become mainstream across FRC. What are you looking to dig into there?
Given that rating uncertainty is an issue, would it make sense to also track Glicko scores? These scores would handle some of the uncertainty issues as you noticed, and should help with determining the uncertainty of ranking projections.
I really, really don’t like complete blind spots in my event simulator, and that’s what ties/red cards are right now for it. I got rid of one big blind spot by eliminating the 0%/100% RP predictions present in my old bonus RP predictions, and I’d really like to remove these as well for completeness. I know they are relatively obscure, and the models I build for them probably won’t be particularly detailed, but I’d like to build something for them at least.
For example, say that there is a team in a match with a 90% chance of winning, a 90% chance of getting the HAB RP, and a 90% chance of getting the Rocket RP, which isn’t too unreasonable for some very powerful teams. Assuming independence, they would have a 99.9% chance of getting at least one RP. I don’t know offhand what the prevalence of red cards was last year, but I’m willing to bet it was greater than one in a thousand. If we say it was 1%, then I should have it built in that there is about a 1% chance that this team gets 0 RPs in that match because they get red-carded. For teams that just need a single RP to lock up the 1 seed at the end of the event, that red card might be their biggest threat, since they’re probably getting at least one RP in every other case.
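A quick sketch of the arithmetic above, with the hypothetical 1% red-card rate folded in (the probabilities are the illustrative ones from the example, not real data):

```python
# Illustrative per-RP probabilities from the example above.
p_win, p_hab, p_rocket = 0.9, 0.9, 0.9
p_red_card = 0.01  # hypothetical red-card prevalence

# Assuming independence, the chance of walking away with zero RPs:
p_zero_rp_independent = (1 - p_win) * (1 - p_hab) * (1 - p_rocket)
print(p_zero_rp_independent)  # 0.001, i.e. a 99.9% chance of >= 1 RP

# A red card zeroes out all RPs regardless of the other outcomes, so:
p_zero_rp_with_red_card = (p_red_card
                           + (1 - p_red_card) * p_zero_rp_independent)
print(round(p_zero_rp_with_red_card, 5))  # ~0.011
```

So for this hypothetical powerhouse team, the red card accounts for roughly ten times more of the 0-RP risk than all the on-field failure modes combined.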
My bigger reason for wanting these, though, is that I fear we’ll get a game like 2010 or 2017 again soon, where ties happen on the order of 10% of the time. At that point it becomes pretty questionable to ignore the possibility of ties, as they will almost certainly happen multiple times per event.
So when we run simulations on our simulator with a match schedule, we include defensive factors with percentages based on how good a team is against defense, and we even figured out how to evolve a basic strategy of splitting the field and having two robots work on a rocket together. Do you have any defensive factors involved in the RP sims?
I’d have to look at these kinds of rating uncertainties again, but I tried a couple of rating uncertainty methods similar to what Glicko uses when I first developed Elo, and they really didn’t improve predictions much.
The kind of rating uncertainty I was talking about in the post you referenced is more related to match-to-match correlations than to just having a quantifiable “unknown” to be used for rating changes. I don’t know if I’m describing that well, but it’s not really due to whether I use Elo or Glicko or any other metric; it’s more related to how I run simulations. Currently they run “cold,” in the sense that the results of later matches have no correlation with the results of earlier matches. Better (but more computationally intensive) would be to run simulations “hot,” in the sense that after getting a result for match 1, I would re-compute the probabilities of winning later matches using the result from match 1. This way, a team’s rating could change within a simulated event, like it does at actual events, instead of staying static the whole time. That’s what I was trying to refer to when I said rating uncertainties, although that was ambiguous, as Glicko deals with a different kind of rating uncertainty.
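A minimal sketch of what a “hot” simulation might look like, assuming standard logistic Elo with a made-up K-factor and made-up ratings, and simplified head-to-head matchups rather than real three-team FRC alliances:

```python
import random

K = 30  # hypothetical K-factor, for illustration only

def win_prob(r_a, r_b):
    # Standard logistic Elo expected score for team A against team B.
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def simulate_event_hot(ratings, schedule, rng):
    # "Hot": after each simulated match, update the ratings so that
    # later matches in the same simulated event see the new values.
    ratings = dict(ratings)  # don't mutate the caller's copy
    wins = {t: 0 for t in ratings}
    for a, b in schedule:
        p = win_prob(ratings[a], ratings[b])
        a_won = rng.random() < p
        wins[a if a_won else b] += 1
        # Elo update: actual result minus expected, scaled by K.
        delta = K * ((1 if a_won else 0) - p)
        ratings[a] += delta
        ratings[b] -= delta
    return wins

# A "cold" simulator would skip the rating updates, so a team that is
# secretly better than its rating never gets "discovered" mid-event,
# and the spread of its simulated ranks ends up too narrow.
rng = random.Random(0)
ratings = {"254": 1800, "1678": 1750, "118": 1600}
schedule = [("254", "1678"), ("1678", "118"), ("254", "118")]
print(simulate_event_hot(ratings, schedule, rng))
```

Running many such simulated events (with fresh random results each time) and tallying the final ranks would give the wider, better-calibrated rank distributions described above, at the cost of recomputing probabilities after every simulated match.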
Dude, I can’t even figure out how to incorporate defense for match scores. I don’t even want to think about incorporating it into RP predictions.
I really think I need a way to know which robot is playing defense, and ideally for how long, before I can predict/quantify defensive effects. The best candidate for that, I think, is the ZEBRA DART data, so maybe as that becomes more mainstream I can incorporate it somehow.