Quote:
Originally Posted by wgardner
So, repeating the Executive Summary:
1. The mean of the standard error vector for the OPR estimates is a decent approximation for the standard deviation of the team-specific OPR estimates themselves, and is a very good approximation for the mean of the standard deviations of the team-specific OPR estimates taken across all of the teams in the tournament.
2. Teams with more variability in their offensive contributions (e.g., teams that contribute a huge amount to their alliance's score by performing some high-scoring feats, but fail at doing so 1/2 the time) will have slightly more uncertainty in their OPR estimate than the mean of the standard error vector would indicate, but not by too much.
3. Teams with less variability in their offensive contributions (e.g., consistent teams that always contribute about the same amount to their alliance's score every match) will have slightly less uncertainty in their OPR estimate than the mean of the standard error vector would indicate, but not by too much.
The bottom line here seems to be that, even assuming that an alliance's expected score is a simple sum of each team's contributions, the statistics tend to properly report the global match-to-match variation, while under-reporting each team's match-to-match variation.
The elephant in the room here is the assumption that the alliance is equal to the sum of its members. For example, consider a 2015 (Recycle Rush) robot with a highly effective 2-can grab during autonomous, and the ability to build, score, cap and noodle one stack of six from the HP station, or cap five stacks of up to six totes during a match, or cap four stacks with noodles loaded over the wall. For argument's sake, it is essentially 100% proficient at these tasks, selecting which to do based on its alliance partners. I will also admit up front that the alliance match-ups are somewhat contrived, but none is truly unrealistic. If I'd wanted to really stack the deck, I'd have assumed that the robot was the consummate RC specialist and had no tote manipulators at all.
- If the robot had the field to itself, it could score 42 points (one noodled, capped stack of 6). The canburglar is useless, except as a defensive measure.
- If paired with two HP robots that could combine to score 2 or 3 capped stacks, this robot would add at most a few noodles to the final score. It either can't get to the HP station, or it would displace another robot that would have been using the station. Again, the canburglar has no offensive value.
- If paired with an HP robot that could score two capped & noodled stacks, and a landfill miner that could build and cap two non-noodled stacks, the margin for this robot would be 66 points (42 points for its own noodled, capped stack, and 24 points for the fourth stack that the landfill robot could cap). The canburglar definitely contributes here!
- If allied with two HP robots that could put up 4 or 5 6-stacks of totes (but no RCs), the margin value of this robot would be a whopping 120 points (cap 4 6-stacks with RCs and noodles, or cap 5 6-stacks with RCs). Couldn't do it without that canburglar!
The real point is that this variation is based on the alliance composition, not on "performance variation" of the robot in the same situation. I also left HP littering out, which would provide additional wrinkles.
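For anyone who wants to check the arithmetic in the scenarios above, here is a minimal sketch. The point values (2 per tote, 4 per tote level for a recycling container, 6 for a noodle in a scored container) are my assumption of the 2015 scoring, chosen because they reproduce the 42, 66 and 120 totals quoted above; the function and key names are made up for illustration.

```python
# Assumed Recycle Rush point values; they reproduce the totals quoted in the post.
TOTE_POINTS = 2          # per tote in a scored stack
RC_POINTS_PER_LEVEL = 4  # recycling container, per tote level it sits on
NOODLE_POINTS = 6        # noodle (litter) inside a scored container


def stack_points(totes, capped=False, noodled=False):
    """Points for one stack of `totes` totes, optionally capped and noodled."""
    points = totes * TOTE_POINTS
    if capped:
        points += totes * RC_POINTS_PER_LEVEL
        if noodled:
            points += NOODLE_POINTS
    return points


def cap_points(totes, noodled=False):
    """Points added by capping a stack that a partner already built."""
    return stack_points(totes, capped=True, noodled=noodled) - stack_points(totes)


# Marginal contribution of the same robot under different alliance compositions.
margins = {
    "alone": stack_points(6, capped=True, noodled=True),
    "hp_plus_landfill": stack_points(6, capped=True, noodled=True) + cap_points(6),
    "four_noodled_caps": 4 * cap_points(6, noodled=True),
    "five_plain_caps": 5 * cap_points(6),
}

for scenario, value in margins.items():
    print(scenario, value)
```

The same robot, with the same proficiency, is worth anywhere from a few noodles to 120 points depending on who it is paired with, and a single additive number per team cannot represent that.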
My takeaway from this thread is that it would be useful to know the RMS (root-mean-square) of the residuals for an OPR/DPR data set (tournament or season). This would provide some understanding of how much difference really is a difference, and a clue as to when the statistics mean about as much as the scouting.
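A rough sketch of what I mean by the RMS of the residuals, assuming the usual least-squares OPR setup where each alliance score is one equation with a 1 for every team on that alliance. The function name, team labels and scores below are invented for illustration; real data would come from the match results.

```python
import numpy as np


def opr_with_rms(alliances, scores, teams):
    """Least-squares OPR plus the RMS of the fit residuals.

    alliances: list of tuples of team labels, one tuple per alliance score
    scores:    the score each of those alliances actually put up
    teams:     list of every team label (fixes the column order)
    """
    index = {team: col for col, team in enumerate(teams)}
    A = np.zeros((len(alliances), len(teams)))
    for row, alliance in enumerate(alliances):
        for team in alliance:
            A[row, index[team]] = 1.0
    b = np.asarray(scores, dtype=float)

    opr, *_ = np.linalg.lstsq(A, b, rcond=None)
    residuals = b - A @ opr          # actual score minus OPR-predicted score
    rms = float(np.sqrt(np.mean(residuals ** 2)))
    return dict(zip(teams, opr)), rms


# Made-up two-team alliances, purely to show the call.
teams = ["A", "B", "C", "D"]
alliances = [("A", "B"), ("C", "D"), ("A", "C"), ("B", "D"), ("A", "D"), ("B", "C")]
scores = [34, 70, 40, 55, 50, 52]

opr, rms = opr_with_rms(alliances, scores, teams)
print(opr)
print("RMS residual:", rms)
```

This version divides by the number of alliance scores. Dividing by the number of scores minus the number of teams instead gives the usual unbiased variance estimate, which will be noticeably larger at a typical event where each team plays only a handful of matches.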
On another slightly related matter, I have wondered why CCWM (Calculated Contribution to Winning Margin) is calculated by combining separate calculations of OPR and DPR, rather than by solving a single matrix of winning margin. I suspect that the single calculation would prove to be more consistent for games with robot-based defense (not Recycle Rush); if a robot plays offense five matches and defense five matches, then OPR and DPR would each have a lot of noise, whereas true CCWM should be a more consistent number.
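To make the question concrete, here is one possible reading of "a single matrix of winning margin", sketched under my own assumptions: one equation per match, +1 for each red team, -1 for each blue team, and the red-minus-blue margin on the right-hand side. This is not necessarily how anyone computes CCWM today, and the function name and match data are hypothetical.

```python
import numpy as np


def margin_ratings(matches, teams):
    """Solve one least-squares system directly on winning margin.

    matches: list of (red_teams, blue_teams, red_score, blue_score)
    teams:   list of every team label (fixes the column order)
    """
    index = {team: col for col, team in enumerate(teams)}
    A = np.zeros((len(matches), len(teams)))
    b = np.zeros(len(matches))
    for row, (red, blue, red_score, blue_score) in enumerate(matches):
        for team in red:
            A[row, index[team]] = 1.0
        for team in blue:
            A[row, index[team]] = -1.0
        b[row] = red_score - blue_score

    # With equal-size alliances every row sums to zero, so the ratings are only
    # defined up to an additive constant; lstsq returns the minimum-norm solution.
    ratings, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dict(zip(teams, ratings))


# Made-up 2-vs-2 matches, purely to show the call.
teams = ["A", "B", "C", "D"]
matches = [
    (("A", "B"), ("C", "D"), 20, 12),
    (("A", "C"), ("B", "D"), 14, 10),
    (("A", "D"), ("B", "C"), 9, 9),
]
ratings = margin_ratings(matches, teams)
print(ratings)
```

Because only differences between teams are pinned down in this formulation, the numbers are best read relative to each other rather than as absolute point contributions.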