This is something of a personal project I’ve been messing with since high school. Back when I was on a FIRST team, I had the idea of taking Day 1’s data and using it to try to predict Day 2’s qualifying matches. Given the math I knew in high school, this wasn’t possible.
Now that I’m going off to grad school, this seems like a much more tractable problem. The biggest issue is that data is scarce, so any metric is going to be an error-prone estimate derived from alliance scores. OPR comes to mind as a starting point, but it’s flawed since it’s just a least squares approximation of each team’s contribution to alliance score. My first thought is to dig into the OPR computation and use the residuals of the least squares fit to get a sense of how good the fit is.
Other ideas I’m playing around with include using Bayesian methods to treat past performance as a prior that gets updated by current event performance. Along those lines, using a Markov chain in the simulation to handle transitions between broken and repaired states seems promising.
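To make the residuals idea concrete, here is a minimal sketch of an OPR-style least squares fit that also returns the per-alliance residuals. The function name, the team indexing, and the toy data (two-team alliances rather than FRC's three, with made-up scores) are assumptions for illustration only, not anyone's actual OPR code.

```python
import numpy as np

def opr_with_residuals(alliances, scores, n_teams):
    """Fit OPR-style contributions by least squares.

    alliances: list of tuples of team indices (one tuple per alliance per match)
    scores:    alliance scores, in the same order as `alliances`
    Returns (contributions, residuals, rmse).
    """
    # Design matrix: one row per alliance, 1.0 where a team played on it.
    A = np.zeros((len(alliances), n_teams))
    for row, teams in enumerate(alliances):
        A[row, list(teams)] = 1.0
    b = np.asarray(scores, dtype=float)

    # Least squares solution of A @ opr ~= b.
    opr, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Residuals: what the additive model fails to explain for each alliance.
    residuals = b - A @ opr
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    return opr, residuals, rmse

# Toy example (hypothetical data): 4 teams, every pairing plays once.
alliances = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]
scores = [30, 70, 40, 60, 50, 56]  # last score deviates from a purely additive model
opr, residuals, rmse = opr_with_residuals(alliances, scores, n_teams=4)
print(opr, residuals, rmse)
```

A large RMSE relative to typical alliance scores would suggest the additive assumption behind OPR fits that event poorly.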
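One simple way to sketch the prior idea is a normal-normal update, where a team's rating from past events is the prior and match-level estimates from the current event pull it toward the observed mean. This is only one possible model; the function name and all of the variances below are hypothetical choices, not fitted values.

```python
def posterior_rating(prior_mean, prior_var, observations, obs_var):
    """Normal-normal Bayesian update of a single team's rating.

    prior_mean, prior_var: belief from past events (assumed Gaussian)
    observations:          per-match contribution estimates at this event
    obs_var:               assumed variance of a single observation
    Returns (posterior_mean, posterior_var).
    """
    n = len(observations)
    if n == 0:
        # No new data: the posterior is just the prior.
        return prior_mean, prior_var

    sample_mean = sum(observations) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / obs_var

    # Precisions add; the mean is a precision-weighted average.
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, post_var

# Hypothetical numbers: prior rating 50, four matches at the event averaging 70.
print(posterior_rating(50.0, 100.0, [70, 70, 70, 70], 400.0))
```

With few matches the estimate stays near the prior, and it moves toward the event average as more matches are played, which is the behavior you would want in a scarce data setting.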
If anyone else has tried to work on simulations for events, I’d love to hear what has and hasn’t worked. Intuitively it seems possible since experienced people can look at a match and get a pretty good idea of which side is going to win.
I gave a presentation in high school about the predictive power of OPR in 2013. While I didn’t include a residuals slide (if memory serves, I think it had something to do with time constraints), the dataset is included for you to derive it yourself fairly easily.
Of course, 2013 was an awesome year for OPR. Scoring was both linear and remarkably separable for most matches of the year (exceptions include some autonomous conflicts at high levels of play, along with floor pickup and FCS synergy). However, OPR’s usefulness changes drastically game to game. The closer scoring in the game is to simply adding the individual scores contributed from each alliance member (and the closer individual scores are to individual robot actions), the more accurate it will be.
If anyone else has tried to work on simulations for events, I’d love to hear what has and hasn’t worked
You might also find my comparison of different prediction models to be useful. The methods I found to have the most predictive power were calculated contributions (OPR), Ether Power Rating, and Winning Margin Elo.
Do you understand how predicted contributions and Elo ratings are calculated? The best source for understanding predicted contributions is probably my link to Eugene Fang’s post, although I have described it in some detail in my comparison of prediction models. A rough summary of how I calculate Elo can also be found there, with a more detailed description in my FRC Elo document.
Elo predictions are found according to the formula:
redWinProbabilityElo = 1 / (1 + 10 ^ ((blueAllianceElo - redAllianceElo) / 400))
Where blueAllianceElo is the sum of the Elo ratings of all teams on the blue alliance and redAllianceElo is the sum of the Elo ratings of all teams on the red alliance.
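As a quick sketch, the Elo formula above translates directly into a small function. The function name and the example ratings are made up for illustration.

```python
def red_win_probability_elo(red_elos, blue_elos):
    """Win probability for red from per-team Elo ratings.

    Each alliance's Elo is the sum of its teams' ratings, as described above.
    """
    red_alliance_elo = sum(red_elos)
    blue_alliance_elo = sum(blue_elos)
    return 1.0 / (1.0 + 10 ** ((blue_alliance_elo - red_alliance_elo) / 400))

# Hypothetical ratings for three teams per alliance.
print(red_win_probability_elo([1600, 1500, 1450], [1550, 1500, 1400]))
```

Equal alliance sums give exactly 0.5, and a 400 point advantage gives red roughly a 10 to 1 edge.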
CC predictions are found according to the formula:
redWinProbabilityCC = 1 / (1 + 10 ^ ((predictedBlueScore - predictedRedScore) / scale))
Where predictedBlueScore is found by:
(((The highest predicted contribution of the teams on blue alliance) + (The median predicted contribution of the teams on blue alliance) * 1.5 + (The lowest predicted contribution of the teams on blue alliance) * 0.5) + (average score))/2
where scale and average score varied each week, but scale is 100 and average score is 289.4 in my most recent update.
Overall, my predictions are given by:
redWinProbability = (redWinProbabilityElo+redWinProbabilityCC)/2
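Putting the CC formulas and the final average together, a rough sketch might look like the following. The function names are made up, the defaults are the scale of 100 and average score of 289.4 quoted above, and it assumes the red score is computed the same way as the blue score, which the post does not state explicitly.

```python
from statistics import median

def predicted_score(contributions, average_score=289.4):
    """Predicted alliance score from its teams' predicted contributions.

    Weights follow the formula above: highest * 1, median * 1.5, lowest * 0.5,
    then averaged with the overall average score.
    """
    hi = max(contributions)
    mid = median(contributions)
    lo = min(contributions)
    return ((hi + 1.5 * mid + 0.5 * lo) + average_score) / 2

def red_win_probability_cc(red_contribs, blue_contribs, scale=100.0, average_score=289.4):
    """CC-based win probability for red (red score assumed analogous to blue)."""
    red = predicted_score(red_contribs, average_score)
    blue = predicted_score(blue_contribs, average_score)
    return 1.0 / (1.0 + 10 ** ((blue - red) / scale))

def red_win_probability(p_elo, p_cc):
    """Overall prediction: plain average of the Elo and CC probabilities."""
    return (p_elo + p_cc) / 2

# Hypothetical contributions for each alliance, and a hypothetical Elo probability.
p_cc = red_win_probability_cc([100, 80, 60], [90, 70, 50])
print(red_win_probability(0.6, p_cc))
```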
One of my team’s members made a program to estimate match results from previous matches’ results. With just half a district championship’s worth of data, it was estimating the results very accurately. It was kind of scary.
It could also be done with machine learning, using something like Python with sklearn to train a model on the data and have it predict who will win the match. However, the lack of data would be an issue. I could see this working at Worlds, however, where you can feed in teams’ data from all their previous events.
It’s definitely possible to get very near to 100% prediction accuracy with machine learning. All you have to do is use a network at least as large as your data set and neglect to keep your training and testing data separate. :rolleyes:
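To illustrate the point with a deliberately silly example: the "model" below is just a lookup table as large as its training set, and the labels are pure random noise (synthetic data, nothing to do with real matches). It scores perfectly on the data it memorized and does no better than guessing on held-out data.

```python
import random

def accuracy(lookup, rows, default=0):
    """Fraction of rows where the memorized (or default) label matches."""
    return sum(lookup.get(x, default) == y for x, y in rows) / len(rows)

rng = random.Random(0)

# Synthetic "matches": a random feature and a random win/loss label.
rows = [(rng.random(), rng.randint(0, 1)) for _ in range(400)]
train, test = rows[:200], rows[200:]

# A model "at least as large as the data set": memorize every training row.
lookup = dict(train)

train_acc = accuracy(lookup, train)  # evaluated on the data it memorized
test_acc = accuracy(lookup, test)    # evaluated on data it has never seen
print(train_acc, test_acc)
```

Only the held-out number says anything about how well future matches would be predicted.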
I’ve tried to use a machine learning model to predict matches this year and, as Caleb hinted at, the results aren’t that great. I reached a max of 65-68% accuracy when feeding it the majority of this year’s data (keeping training and test data separate).