While a trendline is interesting, I think a good deal of insight could be gained from also adding a symmetry line, i.e., a line of slope 1 through (0, 0). This helps illustrate each team's distance from a balanced value.
What is the meaning of the colors in this scatter plot?
What do you mean by “actual scoring abilities”?
Is the OPR based on Match Results final score? Or is it based on Team Ranking Auto+Container+Tote+Litter score? Or something else?
Could you also post a list of team numbers with their OPR and scoring values? That way teams could see how they did compared to others.
This graph is very interesting and shows that OPR doesn’t always accurately portray scoring ability.
Thanks for posting this and keep up the good work!
Thanks, this is cool to look at. Would you be able to post any form of data/graph relating average to actual scoring ability?
If the “scoring ability” is based on manual scouting of Auto+Container+Tote+Litter but the OPR is based on Match Results Final Score then the difference is exaggerated.
For a more apples-to-apples comparison, the OPR could be computed using the Auto+Container+Tote+Litter components from the Team Ranking data provided by FIRST (and TBA).
You can find the computed component OPRs in Ed Law’s scouting database spreadsheet. Or, if your version of Excel cannot read that, you can download the component OPRs here.
I am working on expanding this to be more comprehensive. It’s AP exam week, so I don’t have much time to work on this project, but I will continue to post my progress.
Take your time! As cool and informative as this all is, your exams are more important. Good luck!
Could you take just a moment to clarify what the colors in the scatter plot mean, and what “OPR” you are using?
It’s long been known that OPR doesn’t reflect actual scoring ability. It’s a regression analysis that computes each team’s implied “contribution” to alliance scores. Unfortunately, no one ever posts the estimates’ standard errors (which I imagine are enormous with only 10 or so observations).
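For readers unfamiliar with the regression being described, here is a minimal sketch of how an OPR-style fit works, using a made-up four-team, six-alliance example (real events have many more teams and matches). Each row of A marks which teams were on one alliance, b holds that alliance's score, and the least-squares solution of Ax = b gives each team's implied per-match contribution.

```python
import numpy as np

# Toy example (made-up data): 4 teams, 6 alliance observations.
# A[i, j] = 1 if team j was on the alliance for observation i.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# b[i] = that alliance's final score.
b = np.array([80.0, 60.0, 75.0, 65.0, 70.0, 64.0])

# OPR is the least-squares solution of A x = b: each entry of x is a
# team's implied per-match "contribution" to alliance scores.
opr, *_ = np.linalg.lstsq(A, b, rcond=None)
print(dict(zip(["team1", "team2", "team3", "team4"], opr.round(2))))
```

Because the system is overdetermined, the fit generally cannot match every alliance score exactly; the leftover is exactly the Ax-b residual discussed later in the thread.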
In 2013, our Curie OPR was 27 points below our actual scoring because we were scheduled with really good partners (yes, it was a factor in qualifying first) who performed below their average in other matches, simply because there’s only so much that 3 strong robots can do on a single alliance.
In 2014, the OPR had little to do with actual goal scoring. I think we scored one 10-point goal in all of the qualifying rounds and no goals of any kind in the elimination rounds, but we managed the midfield passing and trussing.
I suggest running a log regression on this data; the relationship between OPR and scoring looks diminishing rather than linear. I’d be interested in the coefficient on the log variable.
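A log regression of the kind suggested is just ordinary least squares of scoring on ln(OPR). Here is a sketch on synthetic, made-up data (the thread's actual data isn't available), with a true coefficient of 12 on the log term:

```python
import numpy as np

# Synthetic, made-up data with a diminishing relationship:
# scoring grows like log(OPR) plus noise.
rng = np.random.default_rng(0)
opr = np.linspace(5, 60, 40)
scoring = 10.0 + 12.0 * np.log(opr) + rng.normal(0, 1.0, opr.size)

# Log regression: ordinary least squares of scoring on ln(OPR).
X = np.column_stack([np.ones_like(opr), np.log(opr)])
coef, *_ = np.linalg.lstsq(X, scoring, rcond=None)
intercept, log_slope = coef
print(f"intercept={intercept:.2f}, coefficient on ln(OPR)={log_slope:.2f}")
```

A large, significant coefficient on ln(OPR) together with a poor linear fit would support the diminishing-returns reading.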
I’m not sure why you chose to ignore co-op points.
Just so you don’t think I’m stupid - yes I realize that co-op is not allowed in elims, etc. etc. But…
Unless you have a 3-yellow tote stack autonomous, the time it takes to complete a co-op is about the same time it takes a decent HP stacker to complete a stack. Many robots sacrifice one HP stack in order to do the co-op, so I would argue that keeping the co-op score is a more accurate reflection of the team’s ability.
…Especially if the OPR is not computed from the same data components used to compute “actual scoring ability”
I suggest running a log regression on this data; the relationship between OPR and scoring looks diminishing rather than linear.
It would be most helpful to first clarify what OPR is being plotted and how “actual scoring ability” is being computed.
Unfortunately, no one ever posts the estimates’ standard errors (which I imagine are enormous with only 10 or so observations).
In years past, I have posted plots, charts, and analyses of the Ax-b residuals, i.e., the differences between the “OPR-computed” alliance scores and the actual scores for those same alliances; this approach makes no assumption that the errors are normally distributed. If this is of general interest, I can generate and post that analysis for 2015.
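A sketch of that residual analysis, on a randomly generated (made-up) event rather than real match data: fit the OPR system, take Ax - b, and summarize the residuals with distribution-free statistics.

```python
import numpy as np

rng = np.random.default_rng(2)
n_teams, n_alliances = 12, 40

# Random alliance membership matrix (made up): 3 teams per alliance.
A = np.zeros((n_alliances, n_teams))
for i in range(n_alliances):
    A[i, rng.choice(n_teams, size=3, replace=False)] = 1.0

# True per-team contributions plus match-to-match noise.
true_contrib = rng.uniform(5, 30, n_teams)
b = A @ true_contrib + rng.normal(0, 8.0, n_alliances)

x, *_ = np.linalg.lstsq(A, b, rcond=None)

# Ax - b residuals: OPR-predicted alliance score minus actual score.
residuals = A @ x - b

# Distribution-free summary statistics (no normality assumption).
stats = {
    "mean": residuals.mean(),
    "median": np.median(residuals),
    "min": residuals.min(),
    "max": residuals.max(),
    "rms": np.sqrt(np.mean(residuals**2)),
}
print({k: round(float(v), 2) for k, v in stats.items()})
```

The spread of these residuals is a direct, assumption-free measure of how well OPR predicts alliance scores at an event.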
Working on graphs comparing component OPRs to scouting data. I will post the actual graphs later, but here are some linear trendlines for you:
Y(tote points) = 0.983517 * X(tote OPR) - 0.3000815, p-value < .0001
Y(can points) = 0.979315 * X(can OPR) + 0.162711, p-value < .0001
Y(auto points) = 1.13906 * X(auto OPR) + 2.73116, p-value < .0001
-Totes and cans both show reasonably tight correlations.
-Auto shows a much weaker correlation, partly because one robot usually does all of the scoring in auto.
-Litter OPR vs. scouted litter scoring shows no correlation.
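Trendlines like the ones above can be reproduced with an ordinary least-squares fit. A sketch on made-up component data (real p-values like those quoted would come from a t-distribution CDF, e.g. scipy.stats; here only the slope, its standard error, and the t-statistic are computed):

```python
import numpy as np

# Made-up component data: scouted tote points vs. tote OPR, with a
# near one-to-one relationship like the trendlines above.
rng = np.random.default_rng(1)
tote_opr = np.linspace(2, 40, 50)
tote_points = 0.98 * tote_opr - 0.3 + rng.normal(0, 2.0, tote_opr.size)

# OLS trendline: tote_points = slope * tote_opr + intercept.
X = np.column_stack([tote_opr, np.ones_like(tote_opr)])
coef, *_ = np.linalg.lstsq(X, tote_points, rcond=None)
slope, intercept = coef

# Standard error of the slope and its t-statistic; feeding the
# t-statistic into a t-distribution CDF yields the quoted p-values.
resid = tote_points - X @ coef
dof = len(tote_points) - 2
sigma2 = resid @ resid / dof
cov = sigma2 * np.linalg.inv(X.T @ X)
t_stat = slope / np.sqrt(cov[0, 0])
print(f"slope={slope:.3f}, intercept={intercept:.3f}, t={t_stat:.1f}")
```

A slope near 1 with a tiny p-value, as in the tote and can fits, means the component OPR tracks the scouted component almost point-for-point.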
For low tier robots coop was often their only method of scoring points, which is useless come eliminations.
Higher tier robots often gave up a stack to complete coop, my team included.
Our general match strategy was 2 stacks from the landfill, or one stack and coop.
We would only do coop if there were not 2 containers available for us to use (i.e., capable alliance partners were using them).
Seeing stronger robots doing coop is a strategy decision, and neglecting coop devalues good strategy.
Some teams, mine included, had no ability to actually score co-op, but would assemble the co-op stack and pass it to an alliance partner (recycle assist?) to score. We originally had a mechanism that allowed us to score co-op, but we found that it was more of a burden than anything. In the end, disregarding co-op is essential when assembling a proper picklist.
I just submitted the component graphs; they should be up soon. I will not include a litter OPR graph because there is no practical way to accurately scout litter point-for-point. (It’s very difficult to see whose human player is throwing which litter where while still tracking the robot.)
I already had a script which required only minor modification to generate this, so I went ahead and did it for all 117 events in 2015 with qual match data.
We also ignored coop points. Among the highest-performing robots on Newton, the only one for which coop was an effective strategy was 118, which could put up a coop stack AND 3 stacks. For the others, the tradeoff was an almost-certain 36-42 point stack vs. a probability-weighted 40 points (after the first 2 weeks, this probability was less than 1/3).
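To make that tradeoff concrete, here is the arithmetic using the numbers from the post (the 40 points is the coop stack value; the 1/3 is the upper bound on success probability quoted above):

```python
# Coop tradeoff from the post: a near-certain 6-tote stack worth
# 36-42 points vs. a 40-point coop stack that, per the post, landed
# with probability under 1/3 after the first two weeks.
coop_value = 40
coop_probability = 1 / 3          # upper bound quoted in the post
expected_coop = coop_value * coop_probability

stack_value_range = (36, 42)      # near-certain alternative
print(f"expected coop points: {expected_coop:.1f} "
      f"vs. stack: {stack_value_range[0]}-{stack_value_range[1]}")
```

Even at the optimistic 1/3 success rate, the expected coop value is barely a third of the sure stack, which is the post's point.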
The litter OPR includes putting noodles in the cans, which is why the noodle scores are so high: 3-5 noodled stacks are worth 18-30 points.
I added another attachment with Ax-b residual statistics for each of the 117 events:
I’m thinking of the parameter standard errors, i.e., the error estimate around the OPR parameter itself for each team. That can be computed from the design matrix; it’s a primary output of any statistical software package.
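A sketch of that computation under the classical OLS assumptions (homoscedastic, uncorrelated errors; assumptions the real match data likely violates), reusing a made-up four-team example: the standard errors are the square roots of the diagonal of sigma^2 * (A^T A)^-1.

```python
import numpy as np

# Toy OPR fit (made-up data): A is the alliance membership matrix,
# b the alliance scores, x the OPR estimates.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
], dtype=float)
b = np.array([80.0, 60.0, 75.0, 65.0, 70.0, 64.0])
x, *_ = np.linalg.lstsq(A, b, rcond=None)

# Classical OLS standard errors: sqrt of sigma^2 * diag((A^T A)^-1),
# where sigma^2 is estimated from the residuals. With only a handful
# of matches per team, these are typically large relative to the OPRs.
resid = b - A @ x
dof = A.shape[0] - A.shape[1]          # observations minus parameters
sigma2 = resid @ resid / dof
std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
print(np.round(std_err, 2))
```

Reporting these alongside the OPRs would show directly how much of a team-to-team OPR difference is just noise.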
The second quote above is from a dialog I’ve been having with Citrus Dad, and I have his permission to post it here.
I’d like to hear what others have to say.
In order not to hijack this thread, I’ve created a separate discussion thread here: