TL;DR: ~97% of match predictions correct to within ±10 points across four events in two different games (2020 and 2022)

Howdy y’all. Here is a neat project y’all might like: a scouting algo that isn’t OPR. In the past I have found OPR to be inconsistent and honestly not all that great, so I set out with my team to fix that with a new algo that calculates the potential (or usefulness) of a robot relative to the other robots it is competing against. It is much more complicated and took longer than I want to admit to build, but the results speak for themselves.

This is a great tool for checking alliance picks, as it ignores rank and focuses purely on what a robot can do relative to everybody else. It can also be used to simulate multiple elim bracket configs to figure out what works best and how you should approach your picks.

I will post a technical document later going over how it works in more detail, but for now, here you go. If you use it, please credit 3676 Redshift Robotics, and if you find any bugs or have questions, feel free to let me know!

Does this mean that a match score prediction is counted as “correct” if the actual score is within 10 points of the predicted score?
Quickly looking through the sheet, it looks like this system uses human-scouted data as the source, is that correct?
97% is extremely high. Are these predictions done before the match occurs? Or is it using full-event data to estimate scores within the event? If the former, how are scores predicted for the first few matches at the event?

Looking forward to it. I can be patient for the answers to my questions if you are planning to answer them in this document.

That is correct: if the actual score is within 10 points, it is counted as correct. It is fully reliant on human scouting, one person per robot every match. All predictions are made before the match occurs, but because I do not have data from previous events, I cannot predict the first few matches. I need every robot to play at least three matches before I can start predicting, and the accuracy obviously goes up the more data I have. That point usually occurs around the number of teams at the event divided by 2, so for Irving that is match 22. While that is kind of late into the event, it still leaves a large portion of matches left, and elims of course. Additionally, early matches are hard to simulate since there is no data, and if you reuse old data it does not account for any changes teams have made; the data is specific to the robot at that event.
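For concreteness, the "counted as correct if within 10 points" rule described above could be computed like this (a minimal sketch with made-up prediction data, not the sheet's actual formulas):

```python
# Hypothetical sketch of the "within 10 points counts as correct" rule.
# The predicted/actual scores below are illustration data, not real matches.
def within_window(predicted, actual, window=10):
    """A prediction counts as correct if |predicted - actual| <= window."""
    return abs(predicted - actual) <= window

def accuracy(predictions, actuals, window=10):
    """Fraction of predictions landing inside the +/- window."""
    hits = sum(within_window(p, a, window) for p, a in zip(predictions, actuals))
    return hits / len(predictions)

preds = [42, 55, 61, 30]
actuals = [45, 70, 58, 33]
print(accuracy(preds, actuals))  # 3 of 4 within +/-10 -> 0.75
```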

The tech doc will have way more info including how I am doing the calculations. I will put it out on Monday but I am more than happy to answer any questions about it before I release the doc.

This reminds me of earthquake prediction sites that count a "hit" as correct within ±1 magnitude. "Correct" here is a 20-point window (±10) in a low-scoring game. I think it’s good for general forecasts, but hard to rely on for anything where a 20-point swing matters, IMO.

Good for doing one-bot observations; that’s the way we do it too with our scouts. Maybe tighten up that window of correctness and see if it still holds up.

Yeah, it is usually within 5 points, but the 10-point margin is because of the point values of the actual scoring components in the game (mainly hangs, sadly). Still trying to refine it even tighter, though; we will see.

I suggest incorporating HANG equivalents: "Hang vs Cargo with 30 seconds to go and no defense: Hang values are (4, 6, 10, 15); Cargo (with 2-cargo cycles) values are (4, 8, 12, 16). Let’s look at cycle time. To get 4 points, that is one 30-second cycle; to get 8, 15-second cycles; to get 12, 10-second cycles; to get 16, 7.5-second cycles. There are single one-cargo cycles too (2, 4, 6, 8, 10, 12, 14, 16), so to get 10 (2 cycles of 2 cargo plus one with one cargo, or 3 cycles), that is a 10-second cycle. Endgame = high hang value."

Also, simply counting auto as 2× cargo gets scoring chunks/cycles out of anything, even fouls.
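The cycle-time arithmetic above can be sketched as follows (a toy calculation assuming the 2022 values quoted: a 2-cargo upper-hub teleop cycle is worth 4 points, and the comparison window is the last 30 seconds):

```python
# Rough sketch of the hang-vs-cargo equivalence described above.
# Assumes 2022 values: a 2-cargo upper-hub teleop cycle = 4 points.
HANG_POINTS = [4, 6, 10, 15]   # low, mid, high, traversal climbs
CYCLE_POINTS = 4               # 2 cargo x 2 points each (upper hub, teleop)
ENDGAME_SECONDS = 30           # time window being compared

def required_cycle_time(target_points, seconds=ENDGAME_SECONDS,
                        points_per_cycle=CYCLE_POINTS):
    """Seconds per cycle needed to match target_points within the window."""
    cycles_needed = target_points / points_per_cycle
    return seconds / cycles_needed

for hang in HANG_POINTS:
    print(f"matching a {hang}-point hang needs a "
          f"{required_cycle_time(hang):.1f}s cycle")
```

Running this reproduces the pattern in the quote: a traversal climb (15 points) is only matched by sustaining 8-second cycles, which is why the endgame hang dominates.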

Finished cleaning everything up from Irving and now have a technical document written up for this. Feel free to make a copy of the sheet, mess around with it, and let me know if you find any bugs.

Future iterations of this scouting algo will involve some aspect of a compatibility factor. This will be part of an elims simulator section where you can put in the alliance captains and it will return a list of best picks, giving you first and second picks along with predicted win percentages. I am a student, so this may take a while to get done as I am busy with classes and such.

Let me know if you have questions and feel free to use it at one of your comps!! This is open to everyone and the more people we get to test this the faster we can improve it.

I have not dug into the details of your method yet, but if you are serious about measuring how good your predictions are, I would suggest using some sort of proper scoring rule.

I am personally a fan of the logarithmic scoring rule, but the Brier score may be more popular.
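For concreteness, here is a minimal sketch of both scoring rules for binary win/loss forecasts (the probabilities below are made-up illustration data):

```python
import math

def brier_score(probs, outcomes):
    """Mean squared error between forecast probability and outcome (0 or 1).
    Lower is better; a constant 0.5 forecast scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def log_score(probs, outcomes):
    """Mean log-probability assigned to the realized outcome. Higher
    (closer to 0) is better; a constant 0.5 forecast scores ln(0.5) ~ -0.693."""
    return sum(math.log(p if o == 1 else 1 - p)
               for p, o in zip(probs, outcomes)) / len(probs)

# Toy forecasts: probability the red alliance wins, and what happened.
probs = [0.9, 0.7, 0.4, 0.8]
outcomes = [1, 1, 0, 1]
print(round(brier_score(probs, outcomes), 3))  # 0.075
print(round(log_score(probs, outcomes), 3))    # -0.299
```

Both rules are "proper" in the sense that you maximize your expected score only by reporting your true probability, which is what makes them better yardsticks than a hit/miss accuracy window.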

If you compare how well your predictions do versus OPR, Elo, or some other known method, you could show the value of your technique in a rigorous way.

Cool stuff, I had no idea about that type of scoring. I only have the data for Irving, so I could only go through Irving, but I got a Brier score of 0.14 and a logarithmic score of -0.46. I went through 76 matches, and it predicted 73 of them correctly. Looking at these scores you brought up, though, I can see how the percentages I am displaying for winning odds are passable but could be much better. I wonder if changing some stuff around to use kurtosis may help; I will add that to the list of things to improve.

I will do OPR later and update this all. What is your take on the scores? I just learned about them, so I do not really have a metric for good or bad. All I know is that both beat a constant 50-50 guess (which would score a Brier of 0.25 and a log score of about -0.69), so I am at least better than a coin flip.

OK, so for OPR I got a Brier score of ~0.17. If we compute a Brier skill score against the 0.14 I got for RPP, that gives 0.176, which from what I am reading means a 17.6% improvement over OPR by using RPP. Granted, that is with only one competition’s worth of data, but the three other comps I tested with had similar RPP results (sadly, I no longer have the data to run these scores on them). I would imagine that over the long run, a more conservative 10-15% improvement is what it would total out to. Please correct me if I am interpreting it all wrong; after all, I am still learning.
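The skill-score arithmetic above works out like this (just the standard formula applied to the two numbers quoted):

```python
def brier_skill_score(bs_model, bs_reference):
    """Fractional improvement of a model's Brier score over a reference.
    1.0 is a perfect model, 0 is no better than the reference, negative is worse."""
    return 1 - bs_model / bs_reference

# RPP's 0.14 measured against OPR's ~0.17:
print(round(brier_skill_score(0.14, 0.17), 3))  # 0.176, i.e. ~17.6% better
```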

While I haven’t done the math to verify the 0.17 and 0.14 specifically, it sounds to me like your method is not only an improvement, but an improvement by a large enough margin that other people ought to pay attention to your work.

Interesting. I am gonna keep messing with the spreadsheet, add some of the changes I previously mentioned, and fix some bugs I found. Going to try to scout the Texas district champs and see how it goes.

Howdy y’all! I have now updated the sheet for the Mercury division at the Texas District Champs. This new sheet has some changes, which I will talk about below, but the new Brier score for the data set is 0.117, which is a ~30% improvement over OPR and a ~16% improvement over my previous iteration.

Changes:

Made the algorithm “sit on the fence” less and commit to a side more.

Adjusted win and loss percentage calculations to handle close matches more consistently

Added a change-over-comp statistic. While not used in the algorithm this iteration, it helps teams pick others based on whether they have been getting better or worse as the comp goes on.
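One hypothetical way to compute a change-over-comp statistic like the one described (not necessarily what the sheet actually does) is a least-squares slope over a robot’s per-match scores, where a positive slope means the robot is improving as the comp goes on:

```python
# Hypothetical change-over-comp sketch: least-squares slope of a robot's
# per-match scores. The score list is illustration data, not real scouting.
def trend(scores):
    """Least-squares slope of scores vs match index; positive = improving."""
    n = len(scores)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

print(trend([10, 12, 15, 14, 18]))  # positive slope -> getting better
```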