#1
"standard error" of OPR values
I'd like to hear what others have to say. Do you think the concept of standard error applies to the individual computed values of OPR, given the way OPR is computed and the data from which it is computed? Why or why not?

If yes: explain how you would propose to compute the standard error for each OPR value, what assumptions would need to be made about the model and the data in order for said computed standard error values to be meaningful, and how the standard error values should be interpreted.

Last edited by Ether : 12-05-2015 at 22:45.
#2
|
|||
|
|||
|
Re: "standard error" of OPR values
Just to check that I understand this correctly: standard error is basically the standard deviation of an estimate around the "correct" value, and you're asking whether the computed OPR values have such a distribution around a "correct" value (i.e. the given OPR)?
Also, is OPR calculated by taking t1+t2+t3 = redScore1, t4+t5+t6 = blueScore1, etc. and then solving that system of linear equations?

I would guess it would depend on what you mean by OPR. I always assumed, perhaps incorrectly, that OPR was the solution of the above calculation, and thus it is just a number, neither correct nor incorrect. If OPR is meant to indicate actual scoring ability, this would change. However, I'm not sure how to figure out how many points a team contributes--if one team stacked 6 totes and another capped, does the first get 12 and the second 24, or does each get 36, or some other combination?

I suppose one way to do it would be to take the difference between a team's OPR and 1/3 of their alliance's points from each match they played in, and see how that difference changes. Comparing that between teams will be tricky, since the very top/bottom teams will have a greater difference than an average one. Similarly, comparing a team's OPR after X matches to 1/3 of the match X+1 score would be interesting but would have the same problem.

(Or I could just be very confused about what OPR and standard error really are--I've tried to piece together what I've read here and on the internet but haven't formally taken linear algebra or statistics.)
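For concreteness, here is a minimal NumPy sketch of that least-squares setup. The team numbers and scores are made up purely for illustration, and with only a handful of matches the system is rank-deficient, so lstsq returns a minimum-norm solution; real event data with dozens of matches behaves much better.

```python
import numpy as np

# Hypothetical alliances: three teams and the score that alliance put up.
# (Team numbers and scores are illustrative, not real data.)
alliances = [
    (["254", "1678", "971"], 120),   # match 1, red
    (["118", "1114", "2056"], 95),   # match 1, blue
    (["254", "1114", "2056"], 130),  # match 2, red
    (["118", "1678", "971"], 88),    # match 2, blue
]

teams = sorted({t for members, _ in alliances for t in members})
col = {t: i for i, t in enumerate(teams)}

# Design matrix: one row per alliance score, a 1 for each team on that alliance.
A = np.zeros((len(alliances), len(teams)))
b = np.array([score for _, score in alliances], dtype=float)
for row, (members, _) in enumerate(alliances):
    for t in members:
        A[row, col[t]] = 1.0

# OPR is the least-squares solution of A @ opr ~= b.
opr, *_ = np.linalg.lstsq(A, b, rcond=None)
for t in teams:
    print(f"{t}: {opr[col[t]]:.1f}")
```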
#3
|
|||
|
|||
|
Re: "standard error" of OPR values
I'm not sure there is a good clean method that produces some sort of statistical standard deviation or the like, although I would be happy to be proven wrong.
However, I believe that the following method should give a useful result. Start with the standard OPR calculation, the matrix equation A * x = b, where x is an n x 1 vector containing all the OPRs, A is the matrix describing which teams a given team has played with, and b holds the sums of the scores from the matches each team played in. Then, to compute a useful error value:

1) Calculate the expected score for each match (using OPR), storing the result in an m x 1 vector exp. Also store all the actual scores in another m x 1 vector, act.

2) Calculate the squared error for each match, err = (act - exp)^2 (using the squared notation to mean squaring individual elements). You could also try taking the absolute value of each element instead, which would give a distinction similar to that between the L1 and L2 norms.

3) Sum the squared errors over each team's matches into the vector errsum, which replaces the b from the original OPR calculation.

4) Solve A * y = errsum for y (obviously, this would be over-determined, just like the original OPR calculation). To get things into the right units, take the square root of every element of y; that will give something like each team's typical deviation.

This should give each team's typical contribution to the variation in their match scores.

Added-in note: I'm not sure what statistical meaning the values generated by this method would have, but I do believe they would have some useful meaning, unlike the values generated by just directly computing the total least-squared error of the original calculation (i.e., (A*x - b)^2). If no one else does, I may implement this method just to see how it performs.

Last edited by James Kuszmaul : 12-05-2015 at 23:39.
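A rough NumPy sketch of those four steps, assuming A is the per-alliance 0/1 design matrix and b the vector of alliance scores from the sketch in the previous post. In this least-squares form, the per-team summing of step 3 happens implicitly through the normal equations, so the same solver is simply reused with the squared residuals on the right-hand side:

```python
import numpy as np

def opr_and_spread(A, b):
    """Solve for OPR, then re-solve the same system with per-alliance squared
    residuals on the right-hand side and take element-wise square roots.
    As noted above, the statistical meaning of the result is unclear; treat
    it as a rough per-team spread, not a formal standard error."""
    # Ordinary OPR via least squares: A @ opr ~= b.
    opr, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Steps 1-2: expected vs. actual score for each alliance, squared error.
    expected = A @ opr
    err = (b - expected) ** 2

    # Steps 3-4: solve the same over-determined system with err as the
    # right-hand side, then take square roots to get back to score units.
    y, *_ = np.linalg.lstsq(A, err, rcond=None)
    spread = np.sqrt(np.clip(y, 0.0, None))  # least squares can dip slightly negative
    return opr, spread
```

Using np.abs(b - expected) instead of the squared error (and skipping the final square root) would give the L1-flavoured variant mentioned in step 2.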
#4
|
|||||
|
|||||
|
Re: "standard error" of OPR values
Calculation of the standard error in OPR for each team sounds straightforward - the RMS of the residuals between the linear model and the match data, taken over the matches in which a team participated. However, this number would probably not cast much if any light on the source of the scatter. One obvious source is the actual match-to-match performance variation of each team - a team that usually puts up two stacks per match, but in one match they set a stack on some litter and it knocked the first one over. Another is non-linearity in the combined scoring (e.g. two good teams that each perform very well with mediocre partners but run out of game pieces when allied, or a tote specialist allied with an RC specialist who do much better together than separately).
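For anyone who wants to try it, here is one possible sketch of that per-team RMS residual, reusing the hypothetical A, b, and fitted opr vector from the earlier sketches:

```python
import numpy as np

def per_team_rms_residual(A, b, opr):
    """RMS of the model residuals over the alliances each team played on.
    A: (alliances x teams) 0/1 design matrix, b: alliance scores,
    opr: least-squares OPR vector from the same fit."""
    residuals = b - A @ opr            # one residual per alliance score
    rms = np.zeros(A.shape[1])
    for j in range(A.shape[1]):
        played = A[:, j] > 0           # alliances this team was part of
        rms[j] = np.sqrt(np.mean(residuals[played] ** 2))
    return rms
```

Note that each residual is shared by all three alliance partners, which is exactly why this number cannot separate a team's own variation from its partners' - the point made above.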
#5
|
||||
|
||||
|
Re: "standard error" of OPR values
There are two types of error:
The first is the prediction residual, which measures how well the OPR model is predicting match outcomes. In games where there is a lot of match-to-match variation, the prediction residual will be high no matter how many matches each team plays.

The second is the error in measuring the actual, underlying OPR value (if you buy into the linear model). If teams actually had an underlying OPR value, then as teams play 10, 100, 1000 matches, the error in computing this value would go to zero.

So, the question is, what exactly are you trying to measure? If you want confidence in the underlying OPR values, or perhaps in the rankings produced by the OPR values, then the second error is the one you want to figure out, and the prediction residual won't really answer that. If you want to know how well the OPR model will predict match outcomes, then the first error is the one you care about.
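If you do buy into the linear model, the second kind of error is what ordinary regression theory calls the standard error of the fitted coefficients. A hedged NumPy sketch of both quantities, assuming the hypothetical per-alliance design matrix A and score vector b from the earlier sketches, and assuming independent, equal-variance noise on the alliance scores (which is precisely the assumption under debate):

```python
import numpy as np

def opr_error_estimates(A, b):
    """Return (prediction RMSE, per-team standard errors) for the OPR fit.
    The standard errors are only meaningful if the linear model with
    independent, equal-variance noise actually holds."""
    m, n = A.shape
    opr, *_ = np.linalg.lstsq(A, b, rcond=None)

    residuals = b - A @ opr
    rmse = np.sqrt(np.mean(residuals ** 2))      # error type 1: prediction residual

    dof = max(m - n, 1)                          # residual degrees of freedom
    sigma2 = (residuals @ residuals) / dof       # estimated noise variance
    cov = sigma2 * np.linalg.pinv(A.T @ A)       # coefficient covariance matrix
    se = np.sqrt(np.diag(cov))                   # error type 2: per-team standard error
    return rmse, se
```

Under those assumptions the standard errors shrink roughly like 1/sqrt(matches played), matching the intuition that the second error goes to zero as teams play more matches, while the prediction RMSE does not.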
#6
|
|||
|
|||
|
Re: "standard error" of OPR values
If one were to assume that teams actually do have an underlying OPR value, though, then one would just take the error from the first part (the prediction residual) and divide it by sqrt(n) to find the error in the estimation of the mean.
#7
|
|||
|
|||
|
Re: "standard error" of OPR values
This year may be an anomaly, but it seems to me like, for some teams anyway, this is a reasonable model. Teams have built robots that are very predictable and task-oriented. For example: grab a bin, drive to the feeder station, stack, stack, stack, push, stack, stack, stack, push, etc. Knowing how quick our human player and stacking mechanism are, we can predict with reasonable accuracy how many points we can typically score in a match, with the only real variance coming from when things go wrong.
#8
|
|||||
|
|||||
|
Re: "standard error" of OPR values
I have to strongly agree with what Ed had to say above. Errors in OPR happen when its assumptions go unmet: partner or opponent interaction, team inconsistency (including improvement), etc. If one of these single factors caused significantly more variation than the others, then the standard error might be a reasonable estimate of that factor. However, I don't believe that this is the case.
Another option would be to take this measure in the same way that we take OPR. We know that OPR is not a perfect depiction of a team's robot quality or even a team's contribution to its alliance, but we use OPR anyway. In the same way, we know the standard error is an imperfect depiction of a team's variation in contribution.

People constantly use the same example in discussing consistency in FRC. A low-seeded captain, when considering two similarly contributing teams, is generally better off selecting an inconsistent team over a consistent one. Standard error could be a reasonable measure of this inconsistency (whether due to simple variation or improvement). At a scouting meeting, higher standard error could indicate "teams to watch" (for improvement). But without having tried it, I suspect a team's standard error will ultimately be mostly unintelligible noise.
#9
|
||||
|
||||
|
Re: "standard error" of OPR values
Has anyone ever attempted a validation study to compare "actual contribution" (based on scouting data or a review of match video) to OPR values? It seems like this would be fairly easy and accurate for Recycle Rush (and very difficult for Aerial Assist). I did that with our performance at one district event and found the result to be very close (OPR=71 vs "Actual"= 74).
In some ways, OPR is probably more relevant than "actual contribution". For example, a good strategist in Aerial Assist could extract productivity from teams that might otherwise just drive around aimlessly. This sort of contribution would show up in OPR, but a scout wouldn't attribute it to them as an "actual contribution".

It would be interesting to see if OPR error was the same (magnitude and direction) for low, medium, and high OPR teams, etc.
#10
|
||||
|
||||
|
Re: "standard error" of OPR values
Someone did a study for Archimedes this year. I would say it is similar to 2011, where 3 really impressive scorers would put up a really great score, but if you expected 3x, you would instead get more like 2.25x to 2.5x....
#13
Re: "standard error" of OPR values
I agree with most of what people have said so far. I would like to add my observations and opinions on this topic.
First of all, it is important to understand how OPR is calculated and what it means from a mathematical standpoint. Next, it is important to understand all the reasons why OPR does not perfectly reflect what a team actually scores in a match. To put things in perspective, I would like to put all the reasons into two bins: things that are beyond a team's control, and things that reflect the actual "performance" of the team.

I consider anything that is beyond a team's control as noise. This is something that will always be there. Some examples, as others have also pointed out, are bad calls by refs, compatibility with partners' robots, non-linearity of scoring, accidents that are not due to carelessness, field faults not being recognized, robot failures that are not repeatable, etc.

The second bin will be things that truly reflect the "performance" of a team. This will measure what a team can potentially contribute to a match, and it will take into account how consistent a team is. The variation here will include factors like how careful they are in not knocking stacks down, getting fouls, or the robot not functioning due to avoidable wiring problems. The problem is that this measure is meaningful only if no teams are allowed to modify their robot between matches, meaning the robot is in the exact same condition in every match. However, in reality there are three scenarios:

1) The robot keeps getting better as the team works out the kinks or tunes it better.

2) The robot keeps getting worse as things wear down quickly due to an inappropriate choice of motors or bearings (or the lack of them), or because of the design or construction techniques used. Performance can also get worse as some teams keep tinkering with their robot or programming without fully validating the changes.

3) The robot stays the same.

I understand what some people are trying to do. We want a measure of expected variability around each team's OPR numbers, some kind of a confidence band. If we have that information, then there will be a max and min prediction of the score of each alliance. Mathematically, this can be done relatively easily. However, the engineer in me tells me that it is a waste of time. Given the noise factors I listed above and the fact that robot performance may change over time, this becomes just a mathematical exercise and does not contribute much to predicting the outcome of the next match.

However, I do support the publication of the R^2 coefficient of determination. It will give an overall number as to how well the actual outcomes fit the statistical model.
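For what it's worth, that R^2 is cheap to compute from the same least-squares fit; a small sketch, again assuming the hypothetical per-alliance design matrix A and score vector b used in the earlier sketches:

```python
import numpy as np

def opr_r_squared(A, b):
    """Coefficient of determination for the OPR fit: the fraction of the
    variance in alliance scores explained by the linear OPR model."""
    opr, *_ = np.linalg.lstsq(A, b, rcond=None)
    residuals = b - A @ opr
    ss_res = residuals @ residuals
    ss_tot = np.sum((b - np.mean(b)) ** 2)
    return 1.0 - ss_res / ss_tot
```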
#14
Re: "standard error" of OPR values
And regardless, I think seeing the SEs lets us see whether a team has a more variable performance than another. That's another piece of information that we can then use to explore further. For example, is the variability arising because parts keep breaking, or is there an underlying improvement trend through the competition? Either one would increase the SE compared to a steady performance rate. There are other tools for digging into that data, but we may not look unless we have that SE measure first.
#15
Re: "standard error" of OPR values
Kind of reminds me of a joke I heard this past weekend that was accidentally butchered:
A physicist, an engineer, and a statistician are out hunting. Suddenly, a deer appears 50 yards away. The physicist does some basic ballistic calculations, assuming a vacuum, lifts his rifle to a specific angle, and shoots. The bullet lands 5 yards short. The engineer adds a fudge factor for air resistance, lifts his rifle slightly higher, and shoots. The bullet lands 5 yards long. The statistician yells "We got him!"

A really interesting read into "what is important" from stats in basketball: http://www.nytimes.com/2009/02/15/ma...ewanted=1&_r=0

The +/- system is probably the most similar "stat" to OPR used in basketball. It is figured a different way, but it is a good way of estimating a player's impact vs. just using points/rebounds and.... The article does a really good job of comparing a metric like that, and more typical event-driven stats, to the actual impactful details of a particularly difficult-to-scout player. I really enjoy the part where it discusses trying to find undervalued mid-pack players. Often with scouting, this is exactly what you too are trying to do: rank the #16-#24 teams at an event as accurately as possible in order to give your alliance its best chance at advancing.

If you enjoy this topic, enjoy the article, and have not read Moneyball, it is well worth the read. I enjoyed the movie, but the book is so much better about the details.