Chief Delphi

"standard error" of OPR values (http://www.chiefdelphi.com/forums/showthread.php?t=137223)

IKE 13-05-2015 17:37

Re: "standard error" of OPR values
 
Kind of reminds me of a joke I heard this past weekend that was accidentally butchered:

A physicist, engineer and a statistician are out hunting. Suddenly, a deer appears 50 yards away.

The physicist does some basic ballistic calculations, assuming a vacuum, lifts his rifle to a specific angle, and shoots. The bullet lands 5 yards short.

The engineer adds a fudge factor for air resistance, lifts his rifle slightly higher, and shoots. The bullet lands 5 yards long.

The statistician yells "We got him!"
**********************************************************

A really interesting read into "what is important" from stats in basketball:
http://www.nytimes.com/2009/02/15/ma...ewanted=1&_r=0

The +/- system is probably the basketball stat most similar to OPR. It is computed differently, but it is a good way of estimating a player's impact compared with just using points, rebounds, and so on.

The article does a really good job of comparing a metric like that with more typical event-driven stats, and with the details that actually make a particularly difficult-to-scout player impactful.

I really enjoy the line where it discusses trying to find undervalued mid-pack players. Often with scouting, this is exactly what you are trying to do too: rank the #16-#24 teams at an event as accurately as possible in order to give your alliance its best chance at advancing.

If you enjoy this topic, enjoy the article, and have not read Moneyball, it is well worth the read. I enjoyed the movie, but the book is so much better about the details.

Citrus Dad 13-05-2015 18:35

Re: "standard error" of OPR values
 
Quote:

Originally Posted by IKE (Post 1481941)
Kind of reminds me of a joke I heard this past weekend that was accidentally butchered:

A physicist, engineer and a statistician are out hunting. Suddenly, a deer appears 50 yards away.

The physicist does some basic ballistic calculations, assuming a vacuum, lifts his rifle to a specific angle, and shoots. The bullet lands 5 yards short.

The engineer adds a fudge factor for air resistance, lifts his rifle slightly higher, and shoots. The bullet lands 5 yards long.

The statistician yells "We got him!"

Of course! I'm absolutely successful every time I go hunting!

There's an equivalent economists' joke in which trying to feed a group on a desert island ends with "assume a can opener!":D
**********************************************************

Quote:

Originally Posted by IKE (Post 1481941)
If you enjoy this topic, enjoy the article, and have not read Moneyball, it is well worth the read. I enjoyed the movie, but the book is so much better about the details.

Wholly endorse Moneyball to anyone reading this thread. It's what FRC scouting is all about. We call our system "MoneyBot."

In baseball, this use of statistics is called "sabermetrics." Bill James is the originator of this method.

Ether 15-05-2015 18:27

Re: "standard error" of OPR values
 
1 Attachment(s)

Getting back to the original question:

Quote:

Do you think the concept of standard error applies to the individual computed values of OPR, given the way OPR is computed and the data from which it is computed?

Why or why not?

If yes: explain how you would propose to compute the standard error for each OPR value, what assumptions would need to be made about the model and the data in order for said computed standard error values to be meaningful, and how the standard error values should be interpreted.

So for those of you who answered "yes":

Pick an authoritative (within the field of statistics) definition for standard error, and compute that "standard error" for each Team's OPR for the attached example.




Ether 16-05-2015 09:42

Re: "standard error" of OPR values
 
Quote:

Originally Posted by Ether (Post 1482422)
So for those of you who answered "yes"

... and for those of you who think the answer is "no", explain why none of the well-defined "standard errors" (within the field of statistics) can be meaningfully applied to the example data (provided in the linked post) in a statistically valid way.



wgardner 16-05-2015 14:12

Re: "standard error" of OPR values
 
Here's a poor-man's approach to approximating the error of the OPR value calculation (as opposed to the prediction error aka regression error):

1. Collect all of a team's match results.

2. Compute the normal OPR.

3. Then, re-compute the OPR but excluding the result from the first match.

4. Repeat this process by removing the results from only the 2nd match, then only the 3rd, etc. This will give you a set of OPR values computed by excluding a single match. So for example, if a team played 6 matches, there would be the original OPR plus 6 additional "OPR-" values.

5. Compute the standard deviation of the set of OPR- values. This should give you some idea of how much variability a particular match contributes to the team's OPR. Note that this will even vary team-by-team.

Thoughts?
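
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anyone's actual scouting code: the four team names and eight alliance scores are made up, every function name is hypothetical, and each entry in `results` is one alliance's score (so "removing a match" here means removing one alliance score). A tiny Gauss-Jordan solver is included only to keep the sketch dependency-free.

```python
import statistics

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.
    Raises ZeroDivisionError if the system is singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def compute_opr(results, teams):
    """Least-squares OPR from (alliance_members, alliance_score) pairs,
    found by solving the normal equations (A^T A) x = A^T b."""
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for members, score in results:
        for t in members:
            atb[index[t]] += score
            for u in members:
                ata[index[t]][index[u]] += 1.0
    return dict(zip(teams, solve_linear(ata, atb)))

def leave_one_out_oprs(results, teams, team):
    """Recompute `team`'s OPR once per result it took part in, omitting that result."""
    values = []
    for i, (members, _) in enumerate(results):
        if team in members:
            reduced = results[:i] + results[i + 1:]
            values.append(compute_opr(reduced, teams)[team])
    return values

# Hypothetical toy event: four teams, each three-team alliance plays twice.
teams = ["A", "B", "C", "D"]
results = [
    (("A", "B", "C"), 63), (("A", "B", "C"), 57),
    (("A", "B", "D"), 58), (("A", "B", "D"), 52),
    (("A", "C", "D"), 48), (("A", "C", "D"), 42),
    (("B", "C", "D"), 38), (("B", "C", "D"), 32),
]

opr_minus = leave_one_out_oprs(results, teams, "A")
print("full OPR for A:", compute_opr(results, teams)["A"])
print("OPR- values for A:", opr_minus)
print("standard deviation of OPR- values:", statistics.stdev(opr_minus))
```

Team A takes part in six of the eight results, so the sketch produces six "OPR-" values for it, matching the six-match example in step 4.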

Ether 16-05-2015 14:31

Re: "standard error" of OPR values
 
Quote:

Originally Posted by wgardner (Post 1482521)
Here's a poor-man's approach to approximating the error of the OPR value calculation (as opposed to the prediction error aka regression error):

1. Collect all of a team's match results.

2. Compute the normal OPR.

3. Then, re-compute the OPR but excluding the result from the first match.

4. Repeat this process by removing the results from only the 2nd match, then only the 3rd, etc. This will give you a set of OPR values computed by excluding a single match. So for example, if a team played 6 matches, there would be the original OPR plus 6 additional "OPR-" values.

5. Compute the standard deviation of the set of OPR- values. This should give you some idea of how much variability a particular match contributes to the team's OPR. Note that this will even vary team-by-team.

Thoughts?

This is interesting but not what I'm looking for.

The question in this thread is how (or if) a standard, textbook, widely-used, statistically valid "standard error" (as mentioned by Citrus Dad and quoted in the original post in this thread) can be computed for OPR from official FRC qual match results data unsupplemented by manual scouting data or any other data.



James Kuszmaul 16-05-2015 16:31

Re: "standard error" of OPR values
 
Quote:

Originally Posted by wgardner (Post 1482521)
Here's a poor-man's approach to approximating the error of the OPR value calculation (as opposed to the prediction error aka regression error):

1. Collect all of a team's match results.

2. Compute the normal OPR.

3. Then, re-compute the OPR but excluding the result from the first match.

4. Repeat this process by removing the results from only the 2nd match, then only the 3rd, etc. This will give you a set of OPR values computed by excluding a single match. So for example, if a team played 6 matches, there would be the original OPR plus 6 additional "OPR-" values.

5. Compute the standard deviation of the set of OPR- values. This should give you some idea of how much variability a particular match contributes to the team's OPR. Note that this will even vary team-by-team.

Thoughts?

Using Ether's data, I just did essentially this: I randomly* selected 200 of the 254 alliance scores to use for the OPR calculation. (I chose 200 because it excludes enough matches to ensure variation in the OPRs, but should include enough matches to keep the system sufficiently over-determined.) I ran this 200 times and got the following:
Code:

Team        Original OPR        Mean OPR        Standard Deviation        StdDev / Mean
1023        119.9222385        120.0083320153        11.227427964        0.0935554038
234        73.13049299        72.801129356        8.9138064084        0.1224404963
135        71.73803792        72.0499437529        7.953512079        0.1103888728
1310        68.29454232        69.3467152712        14.1978070751        0.2047365476
1538        66.51660956        65.739882921        10.0642899215        0.1530926049
1640        63.89355804        63.1124212044        12.5486944006        0.1988308191
4213        59.83218159        60.3799737845        9.7581954471        0.1616131117
2383        59.3454496        58.4390556944        8.8170835924        0.1508765583
5687        58.89565276        58.0801454327        8.5447703278        0.1471203328
2338        57.52050487        57.8998084926        9.9345796042        0.1715822533
68        57.31570571        57.5000280561        7.3734953486        0.1282346391
2342        56.91016998        57.2987212179        6.6038945531        0.115253786
2974        55.52108592        57.1342122847        8.3752237419        0.1465885921
857        56.58983207        56.5258351411        7.2736015551        0.1286774718
2619        55.87939909        55.7690519681        8.4202867997        0.150984937
314        54.93283739        54.2189755764        9.2781646413        0.1711239385
4201        54.36868175        53.4393101098        10.5474638148        0.1973727541
2907        52.20131966        52.8528874425        7.542822466        0.1427135362
360        50.27624758        50.4115562132        7.0992892482        0.1408266235
5403        50.29915841        50.3683881678        6.7117433122        0.133253089
201        45.9115291        44.7743914139        8.4846178186        0.189497111
2013        44.91032156        44.6243506137        6.8765159824        0.1540978387
3602        44.27190346        44.0845482182        9.1690079569        0.2079868872
207        43.76003325        43.534273676        9.6975195297        0.2227559739
1785        42.88695283        43.4312399486        8.2699452851        0.1904146714
1714        43.01192386        42.548981107        10.4744349747        0.2461735793
2848        42.09926229        42.3315382699        5.5963086425        0.1322018729
5571        41.52437471        41.7434170692        9.1647109829        0.2195486528
3322        41.46602143        41.5494849767        7.1743838875        0.1726708259
4334        40.44991373        41.05033774        8.7102627815        0.2121849237
5162        40.45440709        40.9929568271        8.2624477928        0.2015577414
5048        39.89000748        40.3308767357        11.0199899828        0.2732395344
2363        39.94545778        40.1152579819        6.6177263936        0.1649678134
280        39.5619946        39.5341268065        7.3717432763        0.1864653117
4207        38.2684727        39.4991498122        6.9528849981        0.1760261938
5505        39.67352888        38.9668291926        11.3348728596        0.2908851732
217        36.77649547        37.4492632177        6.4891284445        0.1732778668
836        36.43648963        37.0437210956        12.1307341233        0.3274707228
503        36.81699351        36.7802949819        7.9491833149        0.2161261436
1322        36.38199798        36.7254993257        8.5268395114        0.2321776332
4451        35.19372256        35.3483644749        9.807710599        0.2774586815
623        34.52165055        35.1189107974        7.930898959        0.2258298671
1648        35.50610406        35.0638323174        10.815198205        0.3084431304
51        34.66010328        34.6703806244        5.4485310273        0.157152328
122        34.32806143        33.5962803896        7.5092149942        0.223513285
115        31.91437124        31.3399395607        8.4108320311        0.2683742263
5212        30.01729221        30.4525516362        8.9862156315        0.2950890861
1701        29.87650404        30.3212455768        6.3833025833        0.2105224394
3357        29.17742219        29.6022237315        6.381280757        0.2155676146
1572        29.88934385        29.5148636895        7.882621955        0.2670729582
3996        29.80296599        29.071104692        12.1221539603        0.4169829144
2655        26.12997208        26.8414199039        8.2799141902        0.3084752677
3278        27.75400612        26.676383757        8.7090459236        0.3264702593
2605        26.77170149        26.4416718205        7.2093344642        0.2726504781
2914        25.16358084        25.6405460981        8.2266061339        0.3208436397
5536        25.12712518        25.537683706        8.9692243899        0.3512152666
108        25.12900331        24.9994393089        8.1059495087        0.3242452524
4977        23.84091367        24.1678220977        8.8309117942        0.3653995697
931        20.64386303        20.6395850124        9.7862519781        0.4741496485
3284        20.6263851        20.3004828941        7.7358872421        0.3810691244
5667        20.24853487        20.2012572648        10.5728126478        0.5233739915
188        19.63432177        19.5009951172        8.527091207        0.4372644142
5692        17.52522898        16.9741593261        9.9533189003        0.5863806689
1700        15.35451961        15.0093164719        7.5208523959        0.5010789405
4010        12.26210563        13.9952121466        9.8487154699        0.7037203414
1706        12.6972477        11.7147928015        6.1811481569        0.5276361487
3103        12.14379904        11.6822069225        8.4008681879        0.7191165371
378        11.36567533        11.6581748916        8.2483175766        0.7075136248
3238        8.946537399        9.2298154231        9.6683698675        1.0475149745
5581        9.500192257        8.7380812257        8.2123397521        0.9398333044
5464        4.214298451        5.4505495437        7.2289498778        1.326279088
41        5.007828439        4.3002816244        9.0353666405        2.1011104457
2220        4.381189923        4.2360658386        6.880055327        1.6241615662
4364        4.923793169        3.504087428        8.6917749423        2.4804674886
1089        1.005273551        0.9765385053        6.9399339807        7.1066670109
691        -1.731531162        -1.2995295456        11.9708242834        9.2116599609

Original OPR is copied straight from Ether's OPR.csv; Mean OPR is the mean OPR that a team received across the 200 iterations; Standard Deviation is the standard deviation of all the OPRs a team received; and the final column is the standard deviation divided by the Mean OPR. The data is sorted by Mean OPR.

In terms of whether this is a valid way of looking at it, I'm not sure--the results seem to have some meaning, but I'm not sure how much of it is just that only looking at 200 scores is even worse than just 254, or if there is something more meaningful going on.

*Using Python's random.sample() function. This means that I did nothing to prevent duplicate runs (which are extremely unlikely; 254 choose 200 is ~7.2 * 10^55) and nothing to ensure that a team didn't "play" <3 times in the selection of 200 scores.
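
The two caveats in this footnote can be checked directly with the standard library. The sketch below is hedged: `teams_below_minimum` is a hypothetical helper, not part of the script used for the table, and it assumes the alliance results are available as a list of team-number tuples.

```python
import math
import random
from collections import Counter

# Number of distinct 200-element subsets of 254 alliance scores.
n_subsets = math.comb(254, 200)
print(f"254 choose 200 is about {n_subsets:.2e}")

def teams_below_minimum(alliances, sample_size, minimum=3, rng=random):
    """Draw `sample_size` alliance results without replacement and return
    the teams that appear fewer than `minimum` times in that draw."""
    chosen = rng.sample(alliances, sample_size)
    counts = Counter(team for members in chosen for team in members)
    all_teams = {team for members in alliances for team in members}
    return sorted(t for t in all_teams if counts[t] < minimum)
```

Calling `teams_below_minimum(alliances, 200)` inside the resampling loop would flag any iteration in which some team "played" fewer than three times, so that iteration could be redrawn or at least counted.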

wgardner 16-05-2015 18:07

Re: "standard error" of OPR values
 
Quote:

Originally Posted by Ether (Post 1482525)
This is interesting but not what I'm looking for.

The question in this thread is how (or if) a standard, textbook, widely-used, statistically valid "standard error" (as mentioned by Citrus Dad and quoted in the original post in this thread) can be computed for OPR from official FRC qual match results data unsupplemented by manual scouting data or any other data.



I guess I'm not sure how you're defining "standard error." I assume you're trying to get some confidence on the OPR value itself (not in how well the OPR can predict match results, which is the other error I referred to previously).

The method I propose above gives a standard deviation measure on how much a single match changes a team's OPR. I would think this is something like what you want. If not, can you define what you're looking for more precisely?

Also, rather than taking 200 of 254 matches and looking at the standard deviation of all OPRs, I suggest just removing a single match (e.g., compute OPR based on 253 of the 254 matches) and looking at how that removal affects only the OPRs of the teams involved in the removed match.

So if you had 254 matches in a tournament, you'd compute 254 different sets of OPRs (1 for each possible match removal) and then look at the variability of the OPRs only for the teams involved in each specific removed match.

This only uses the actual qualification match results, no scouting or other data as you want.

wgardner 16-05-2015 18:24

Re: "standard error" of OPR values
 
And just to make sure I'm being clear (because I fear that I may not be):

Let's say that team 1234 played in a tournament and was involved in matches 5, 16, 28, 39, 51, and 70.

You compute team 1234's OPR using all matches except match 5. Say it's 55.
Then you compute team 1234's OPR using all matches except match 16. Say it's 60.
Keep repeating this, removing each of that team's matches, which will give you 6 different OPR numbers. Let's say that they're 55, 60, 50, 44, 61, and 53. Then you can compute the standard deviation of those 6 numbers to give you a confidence on what team 1234's OPR is.

Of course, you can do this for every team in the tournament and get team-specific OPR standard deviations and an overall tournament OPR standard deviation.

Team 1234 may have a large standard deviation (because maybe 1/3 of the time they knock over a stack in the last second) while team 5678 may have a small standard deviation (because they always contribute exactly the same point value to their alliance's final score).

And hopefully the standard deviations will be lower in tournaments with more matches per team because you have more data points to average.
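
For the six example values given for team 1234, the arithmetic can be checked with Python's `statistics` module. Whether the sample (n - 1) or population (n) form is the right one here is a judgment call that the post leaves open, so both are shown.

```python
import statistics

# The six hypothetical leave-one-out OPR values from the example.
opr_minus = [55, 60, 50, 44, 61, 53]

mean_opr = statistics.mean(opr_minus)
sample_sd = statistics.stdev(opr_minus)       # divides by n - 1
population_sd = statistics.pstdev(opr_minus)  # divides by n

print(f"mean = {mean_opr:.2f}")                       # about 53.83
print(f"sample standard deviation = {sample_sd:.2f}")  # about 6.37
print(f"population standard deviation = {population_sd:.2f}")  # about 5.81
```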

Ether 16-05-2015 18:29

Re: "standard error" of OPR values
 
Quote:

Originally Posted by wgardner (Post 1482546)
I guess I'm not sure how you're defining "standard error."

I am not defining "standard error".

I am asking you (or anyone who cares to weigh in) to pick a definition from an authoritative source and use that definition to compute said standard errors of the OPRs (or state why not):

Quote:

Originally Posted by Ether (Post 1482422)
for those of you who answered "yes":

Pick an authoritative (within the field of statistics) definition for standard error, and compute that "standard error" for each Team's OPR for the attached example.

Quote:

Originally Posted by Ether (Post 1482497)
... and for those of you who think the answer is "no", explain why none of the well-defined "standard errors" (within the field of statistics) can be meaningfully applied to the example data (provided in the linked post) in a statistically valid way.




Ether 16-05-2015 18:39

Re: "standard error" of OPR values
 
Quote:

Originally Posted by wgardner (Post 1482546)
I assume you're trying to get some confidence on the OPR value itself

No. I am not trying to do this. I will try to be clearer:

Citrus Dad asked why no one ever reports "the" standard error for the OPRs.

"Standard Error" is a concept within the field of statistics. There are several well-defined meanings depending on the context.

So what I am trying to do is this: have a discussion about what "the" standard error might mean in the context of OPR.



Ether 16-05-2015 18:47

Re: "standard error" of OPR values
 
Quote:

Originally Posted by wgardner (Post 1482548)
And just to make sure I'm being clear (because I fear that I may not be)

No, your original post was quite clear. And interesting. But it's not what I am asking about.




Ether 16-05-2015 19:11

Re: "standard error" of OPR values
 

@ Citrus Dad:

If you are reading this thread, would you please weigh in here and reveal what you mean by "the standard errors" of the OPRs, and how you would compute them, using only the data in the example I posted?

Also, what assumptions do you have to make about the data and the model in order for the computed standard errors to be statistically valid/relevant/meaningful, and what is the statistical meaning of those computed errors?



Oblarg 17-05-2015 03:20

Re: "standard error" of OPR values
 
Quote:

Originally Posted by Ether (Post 1482550)
So what I am trying to do is this: have a discussion about what "the" standard error might mean in the context of OPR.

Let us assume that the model OPR uses is a good description of FRC match performance - that is, match scores are given by a linear sum of team performance values, and each team's performance in a given match is described by a random variable whose distribution is identical between matches.

OPR should then yield an estimate of the mean of this distribution. An estimate of the standard deviation can be obtained, as mentioned, by taking the RMS of the residuals.

To approximate the standard deviation of the mean (which is what is usually meant by the "standard error" of these sorts of measurements), one would then divide this by sqrt(n), where n is the number of matches used in the team's OPR calculation. For those interested in a proof of this, simply consider the fact that when summing independent random variables, variances add.

This, of course, fails if the assumptions we made at the outset aren't good (e.g. OPR is not a good model of team performance). Moreover, even if the assumptions hold, if the distribution of the random variable describing a team's performance in a given match is sufficiently wonky that the distribution of the mean is not particularly Gaussian then one is fairly limited in the conclusions they can draw from the standard deviation, anyway.
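
As a minimal sketch of the recipe in this post (RMS of the residuals, then divide by sqrt(n)), assuming the residuals from the OPR fit are already available as a list: the function names and the residual values below are hypothetical, and the result inherits every assumption listed above.

```python
import math

def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def approx_standard_error(residuals, matches_played):
    """RMS of the fit residuals divided by sqrt(n), where n is the
    number of matches counted for the team."""
    return rms(residuals) / math.sqrt(matches_played)

# Hypothetical residuals (actual alliance score minus OPR-predicted score).
example_residuals = [6.0, -6.0, 6.0, -6.0]
print(approx_standard_error(example_residuals, matches_played=9))
```

With these made-up residuals the RMS is 6, so nine matches give an approximate standard error of 2.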

wgardner 17-05-2015 05:51

Re: "standard error" of OPR values
 
Quote:

Originally Posted by Oblarg (Post 1482620)
To approximate the standard deviation of the mean (which is what is usually meant by the "standard error" of these sorts of measurements), one would then divide this by sqrt(n), where n is the number of matches used in the team's OPR calculation. For those interested in a proof of this, simply consider the fact that when summing independent random variables, variances add.

I am interested in a proof of this, because I don't think the normal assumptions hold. Can you explain this more in the full context of how OPR is computed? [Edit: I spent more time trying to derive this whole thing. See my next posts for an attempt at the derivation].

What you say holds if one is taking a number of independent, noisy measurements of a value and computing the mean of the measurements as the estimate of the underlying value. So that would work if OPR was computed by simply averaging the match scores for a team (and dividing by 3 to account for 1/3 of the match score being due to each team's contribution).
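
The simple-averaging case described here, where the sqrt(n) rule does apply if the matches are independent, would look something like the sketch below. The function name and the three scores are hypothetical, and this is deliberately not how OPR is computed.

```python
import math
import statistics

def averaged_contribution(alliance_scores):
    """Mean of score / 3 over a team's matches, plus the standard error
    of that mean (sample standard deviation divided by sqrt(n))."""
    shares = [score / 3 for score in alliance_scores]
    mean_share = statistics.mean(shares)
    std_error = statistics.stdev(shares) / math.sqrt(len(shares))
    return mean_share, std_error

# Hypothetical alliance scores from three matches of one team.
print(averaged_contribution([60, 66, 54]))
```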

But that's not the way OPR is computed at all. It's computed using linear regressions and all of the OPRs for all of the teams are computed simultaneously in one big matrix operation.

For example, it isn't clear to me what n should be. You say "n is the number of matches used in the team's OPR calculation." But all OPRs are computed at the same time using all of the available match data. Does n count matches that a team didn't play in, but that are still used in the computation? Is n the number of matches a team has played? Or the total matches? OPR can be computed based on whatever matches have already occurred at any time. So if some teams have played 4 matches and some have played 5, it would seem like the OPRs for the teams that have played fewer matches should have more uncertainty than the OPRs for the teams that have played more. And the fact that the computation is all intertwined and that the OPRs for different teams are not independent (e.g., if one alliance has a huge score in one match, that affects 3 OPRs directly and the rest of them indirectly through the computation) seems to make the standard assumptions and arguments suspect.

Thoughts?
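
One textbook candidate that does account for all OPRs being fitted at once is the ordinary least-squares coefficient standard error, sqrt(s^2 * [(A^T A)^-1]_jj) with s^2 = RSS / (observations - teams). It is only valid under assumptions this thread has not established (independent, equal-variance errors and a correct linear model). The sketch below uses made-up data and hypothetical function names, and includes a small solver only to stay dependency-free.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_standard_errors(results, teams):
    """Return (opr, standard_error) dicts under the ordinary least-squares model.
    Needs more alliance scores than teams."""
    index = {team: i for i, team in enumerate(teams)}
    p = len(teams)
    ata = [[0.0] * p for _ in range(p)]
    atb = [0.0] * p
    for members, score in results:
        for t in members:
            atb[index[t]] += score
            for u in members:
                ata[index[t]][index[u]] += 1.0
    coef = solve_linear(ata, atb)
    # Residual variance estimate: RSS / (observations - parameters).
    rss = sum((score - sum(coef[index[t]] for t in members)) ** 2
              for members, score in results)
    s2 = rss / (len(results) - p)
    errors = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        inv_jj = solve_linear(ata, unit)[j]  # j-th diagonal entry of (A^T A)^-1
        errors.append(math.sqrt(s2 * inv_jj))
    return dict(zip(teams, coef)), dict(zip(teams, errors))

# Hypothetical toy event: four teams, each three-team alliance plays twice.
teams = ["A", "B", "C", "D"]
results = [
    (("A", "B", "C"), 63), (("A", "B", "C"), 57),
    (("A", "B", "D"), 58), (("A", "B", "D"), 52),
    (("A", "C", "D"), 48), (("A", "C", "D"), 42),
    (("B", "C", "D"), 38), (("B", "C", "D"), 32),
]
opr, std_err = ols_standard_errors(results, teams)
print(opr)
print(std_err)
```

In this formulation the role of "n" is played by the diagonal of (A^T A)^-1, which depends on the whole schedule rather than on a single per-team match count, so teams with fewer or less informative matches get larger standard errors.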


All times are GMT -5.