#76 | 27-05-2015, 17:45
jtrv (Justin), github.com/jhtervay, FRC #2791 (Shaker Robotics), College Student
Re: "standard error" of OPR values

hi all,

As a student going into my first year of undergrad this fall, this kind of stuff interests me. At what level (or in what course, or with how much experience) is this material typically taught?

I have looked into interpolation, since I would like to spend some time independently developing spline path generation for auton modes. That area requires a bit of linear algebra, which I will begin teaching myself soon enough.

As for this topic: interpolation is to linear algebra as OPR standard errors are to... what?

I don't mean to hijack the thread, but it feels like the most appropriate place to ask...
__________________
2791 (2012-2017)

#77 | 23-06-2015, 21:32
Ether, systems engineer (retired), no team
Re: "standard error" of OPR values

Quote:
Originally Posted by Citrus Dad View Post
It's been known that OPR doesn't reflect actual scoring ability. It's a regression analysis that computes the implied "contribution" to scores. Unfortunately no one ever posts the estimates' standard errors
Getting back to this after a long hiatus.

If you are asking for the individual standard error associated with each OPR value, no one ever posts those because the official FRC match data doesn't contain enough information to make a meaningful computation of those individual values.

In a situation, unlike FRC OPR, where you know the variance of each observed value (either by repeated observations using the same values for the predictor variables, or if you are measuring something with an instrument of known accuracy) you can put those variances into the design matrix for each observation and compute a meaningful standard error for each of the model parameters.

Or if, unlike FRC OPR, you have good reason to believe the observations are homoscedastic, you can compute the variance of the residuals and use that to back-calculate standard errors for the model parameters. If you do this for FRC data the result will be standard errors which are very nearly the same for each OPR value... which is clearly not the expected result.
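
To make the first case concrete, here is a minimal sketch (mine, not part of Ether's post) in the same Octave/MATLAB style as the code he posts later in the thread. Every variable name and number below is invented for illustration: A is a toy design matrix, b the observed values, and sigma the per-observation standard deviations assumed known a priori.

% Minimal illustrative sketch: weighted least squares with known observation variances.
A     = [1 0 0; 0 1 0; 0 0 1; 1 1 0; 0 1 1; 1 0 1];   % toy design matrix (6 observations, 3 parameters)
b     = [10; 21; 29; 32; 52; 41];                      % toy observed values
sigma = [2; 2; 2; 4; 4; 4];                            % known per-observation standard deviations
W  = diag(1 ./ sigma.^2);     % weights = 1/variance
N  = A' * W * A;              % weighted normal matrix
x  = N \ (A' * W * b);        % weighted least-squares parameter estimates
SE = sqrt(diag(inv(N)))       % meaningful per-parameter standard errors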



#78 | 29-06-2015, 16:52
Citrus Dad (Richard McCann), Business and Scouting Mentor, FRC #1678 (Citrus Circuits)
Re: "standard error" of OPR values

Quote:
Originally Posted by Ether View Post
Getting back to this after a long hiatus.

If you are asking for the individual standard error associated with each OPR value, no one ever posts those because the official FRC match data doesn't contain enough information to make a meaningful computation of those individual values. [...]


The standard errors for the OPR values can be computed, but they are in fact quite large relative to the parameter values. That is actually my point: the statistical precision of the OPR values is really quite poor because there are so few observations, which are in fact not independent. Rather than ignoring the SEs because they show how poorly the OPR estimators perform, the SEs should be reported so that everyone can see how poorly the estimators perform.

#79 | 29-06-2015, 17:20
Ether, systems engineer (retired), no team
Re: "standard error" of OPR values


I think you missed my point entirely.

Quote:
Originally Posted by Citrus Dad View Post
The standard errors for the OPR values can be computed...
Yes, they can be computed, but that doesn't mean they are statistically valid. They are not, because the data does not conform to the necessary assumptions.

Quote:
...but they are in fact quite large relative to the parameter values.
Yes they are, but they are also nearly all the same value... which is obviously incorrect... and a result of assumptions which the data does not meet.

Quote:
the statistical precision of the OPR values is really quite poor because there are so few observations, which are in fact not independent.
Lack of independence is only one of the assumptions which the data do not meet.

Quote:
Rather than ignoring the SEs because they show how poor the OPR estimators are performing...
They are not being ignored "because they show how poor the OPR estimators are performing"; they are not being reported because they are invalid and misleading.

Quote:
the SEs should be reported to show how poorly the estimators perform for everyone's consideration.
There are better metrics to report to show how poorly the estimators perform.




#80 | 29-06-2015, 21:38
asid61 (Anand Rajamani), FRC #1072 (Harker Robotics), Mentor
Re: "standard error" of OPR values

So it's not possible to perform a statistically valid calculation for standard deviation? Are there no ways to solve for it with a system that is dependent on other robots' performances?
__________________
<Now accepting CAD requests and commissions>


#81 | 30-06-2015, 01:20
GeeTwo (Gus Michel II), Technical Director, FRC #3946 (Tiger Robotics), Mentor
Re: "standard error" of OPR values

Quote:
Originally Posted by Citrus Dad View Post
The standard errors for the OPR values can be computed, but they are in fact quite large relative to the parameter values. [...] Rather than ignoring the SEs because they show how poorly the OPR estimators perform, the SEs should be reported so that everyone can see how poorly the estimators perform.
+1, +/- 0.3.

Quote:
Originally Posted by Ether View Post
There are better metrics to report to show how poorly the estimators perform.
It would be great if standard error could be used as a measure of a team's consistency, but that's not its only function. I agree with Richard that one of the benefits of an error value is to provide an indication of how much difference is (or is not) significant. If the error bars on the OPRs are all (for example) about 10 points, then a 4-point difference in OPR between two teams probably means less in sorting a pick list than does a qualitative difference in a scouting report.

As it turns out, I was recently asked for the average time it takes members of my branch to produce environmental support products. Because we get requests that range from a 10-mile-square box on one day to seasonal variability for a whole ocean basin, the (requested) mean production time means nothing. For one class of product, the standard deviation of production times was greater than the mean. Without the scatter info, the reader would probably have assumed that we were making essentially identical widgets and that the scatter was +/- 1 or 2 in the last reported digit.
__________________

If you can't find time to do it right, how are you going to find time to do it over?
If you don't pass it on, it never happened.
Robots are great, but inspiration is the reason we're here.
Friends don't let friends use master links.

#82 | 30-06-2015, 10:01
Ether, systems engineer (retired), no team
Re: "standard error" of OPR values

Quote:
Originally Posted by asid61 View Post
So it's not possible to perform a statistically valid calculation for standard deviation?
We're discussing standard error of the model parameters, also known as standard error of the regression coefficients. So in our particular case, that would be standard error of the OPRs.

Standard error of the model parameters is a very useful statistic in those cases where it applies. I mentioned one such situation in my previous post:

Quote:
In a situation, unlike FRC OPR, where you know the variance of each observed value (either by repeated observations using the same values for the predictor variables, or if you are measuring something with an instrument of known accuracy) you can put those variances into the design matrix for each observation and compute a meaningful standard error for each of the model parameters.
An example of the above would be analysis and correction of land surveying network measurement data. The standard deviation of the measurements is known a priori from the manufacturer's specs for the measurement instruments and from the surveyor's prior experience with those instruments.

In such a case, computing standard error of the model parameters is justified, and the results are meaningful. All modern land surveying measurement adjustment apps include it in their reports.


Quote:
Are there no ways to solve for it with a system that is dependent on other robots' performances?
That's a large (but not the only) part of the problem.

I briefly addressed this in my previous post:

Quote:
Or if, unlike FRC OPR, you have good reason to believe the observations are homoscedastic, you can compute the variance of the residuals and use that to back-calculate standard errors for the model parameters. If you do this for FRC data the result will be standard errors which are very nearly the same for each OPR value... which is clearly not the expected result.
In the case of computing OPRs using only FIRST-provided match results data (no manual scouting), the data does not meet the requirements for using the above technique.

In fact, when you use the above technique for OPR you are essentially assuming that all teams are identical in their consistency of scoring, so it's not surprising that when you put that assumption into the calculation you get it back out in the results. GIGO.

Posting invalid and misleading statistics is a bad idea, especially when there are better, more meaningful statistics to fill the role.

For Richard and Gus: if all you are looking for is one overall ballpark number ("how bad are the OPR calculations for this event?"), let's explore better ways to present that.




#83 | 30-06-2015, 13:59
Citrus Dad (Richard McCann), Business and Scouting Mentor, FRC #1678 (Citrus Circuits)
Re: "standard error" of OPR values

Quote:
Originally Posted by Ether View Post
I think you missed my point entirely.

Yes, they can be computed, but that doesn't mean they are statistically valid. They are not, because the data does not conform to the necessary assumptions. [...]

There are better metrics to report to show how poorly the estimators perform.

But based on this response, the OPR estimates themselves should not be reported either, because they are not statistically valid. And by not reporting some measure of the potential error, those who post OPRs give an impression of precision that the values do not actually have.

I just discussed this problem as a major failing of engineers in general--if they are not fully comfortable reporting a parameter, e.g., a measure of uncertainty, they often will simply ignore the parameter entirely. (I was discussing how the value of solar PV is being estimated across a dozen studies. I've seen this tendency over and over in almost 30 years of professional work.) The appropriate method is ALWAYS, ALWAYS, ALWAYS to report the uncertain or unknown parameter with some sort of estimate and all sorts of caveats. What happens instead is that decisionmakers and stakeholders much too often accept the values given as having much greater precision than they actually have.

While calculating OPR really is of no true consequence, we are working with high school students who are very likely to become engineers, so it is imperative that they understand and use the correct method of presenting their results.

So, the SEs should be reported as the best available approximation of the error term around the OPR estimates. And the caveats about the properties of the distribution can be reported with a discussion about the likely biases in the parameters due to the probability distributions.

#84 | 30-06-2015, 15:00
Ether, systems engineer (retired), no team
Re: "standard error" of OPR values

Quote:
Originally Posted by Citrus Dad View Post
But based on this response, the OPR estimates themselves should not be reported because they are not statistically valid either.
Sez who? They are the valid least-squares fit to the model. That is all they are. According to what criteria are they then not valid?

Quote:
Instead by not reporting some measure of the potential error, they give the impression of precision to the OPRs.
Who is suggesting not to report some measure of the potential error? Certainly not me. Read my posts.

Quote:
I just discussed this problem as a major failing for engineers in general--if they are not fully comfortable in reporting a parameter, e.g., a measure of uncertainty, they often will simply ignore the parameter entirely.
I do not have the above failing, if that is what you were implying.


Quote:
ALWAYS, ALWAYS, ALWAYS is to report the uncertain or unknown parameter with some sort of estimate and all sorts of caveats.
You are saying this as if you think I disagree. If so, you would be wrong.


Quote:
Instead what happens is that decisionmakers and stakeholders much too often accept the values given as having much greater precision than they actually have.
Exactly. And perhaps more often than you realize, those values they are given shouldn't have been reported in the first place because the data does not support them. Different (more valid) measures of uncertainty should have been reported.


Quote:
While calculating the OPR really is of no true consequence, because we are working with high school students who are very likely to be engineers, it is imperative that they understand and use the correct method of presenting their results.
Well I couldn't agree more, and it is why we are having this discussion.

Quote:
So, the SEs should be reported as the best available approximation of the error term around the OPR estimates
Assigning a separate standard error to each OPR value computed from the FIRST match results data is totally meaningless and statistically invalid. As you said above, "it is imperative that they understand and use the correct method of presenting their results".

Let's explore alternative ways to demonstrate the shortcomings of the OPR values.

Quote:
the caveats about the properties of the distribution can be reported with a discussion about the likely biases in the parameters due to the probability distributions
"Likely" is an understatement. The individual (per-OPR) computed standard error values are obviously and demonstrably wrong (this can be verified with manual scouting data). And what's more, we know why they are wrong.

As I've suggested in my previous two posts, how about let's explore alternative, valid ways to demonstrate the shortcomings of the OPR values.

One place to start might be to ask whether or not the average value of the vector of standard errors of OPRs might be meaningful, and if so, what exactly it means.




#85 | 01-07-2015, 18:07
Citrus Dad (Richard McCann), Business and Scouting Mentor, FRC #1678 (Citrus Circuits)
Re: "standard error" of OPR values

Ether

I wasn't quite sure why you dug up my original post to start this discussion. It seemed out of context with all of your other discussion about adding error estimates. That said, my request was more general, and it seems to be answered more generally by the other computational efforts that have been going on in the 2 related threads.

But one point: I will say that using a fixed effects model with a separate match progression parameter (to capture the most likely source of heteroskedasticity) should lead to parameter estimates that will provide valid error terms using FRC data. Computing fixed effects models is a much more complex process, though. It is something that can be done in R.

Quote:
Originally Posted by Ether View Post
Sez who? They are the valid least-squares fit to the model. That is all they are. According to what criteria are they then not valid?
That one can calculate a number doesn't mean that the number is meaningful. Without a report of the error around the parameter estimates, the least squares fit is not statistically valid and the meaning cannot be interpreted. This is a fundamental principle in econometrics (and, I presume, in statistics in general).

#86 | 01-07-2015, 20:15
Ether, systems engineer (retired), no team
Re: "standard error" of OPR values

Quote:
Originally Posted by Citrus Dad View Post
That one can calculate a number doesn't mean that the number is meaningful.
I'm glad you agree with me on this very important point. It's what I have been saying about your request for SE estimates for each individual OPR.


Quote:
Without a report of the error around the parameter estimates, the least squares fit is not statistically valid
Without knowing your private definition of "statistically valid" I can neither agree nor disagree.


Quote:
and the meaning cannot be interpreted.
The meaning can be interpreted as follows: It is the set of model parameters which minimizes the sum of the squares of the differences between the actual and model-predicted alliance scores. This is universally understood. Now once you've done that regression, proceeding to do inferential statistics based on the fitted model is where you hit a speed bump because the data does not satisfy the assumptions required for many of the common statistics.

The usefulness of the fitted model can, however, be assessed without using said statistics.
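
To make that interpretation concrete, here is a toy sketch (my illustration, not Ether's code) of the OPR fit as a plain least-squares problem, ending with one non-inferential "how well does it fit" number. The six-team schedule and the scores are invented, and unlike a real event every team plays in every match; only the mechanics matter here.

% Toy sketch: the OPR fit as a plain least-squares problem.
% Each row of A is one alliance (1 = that team played on it); b is the alliance score.
A = [1 1 1 0 0 0;   % match 1 red:  teams 1,2,3
     0 0 0 1 1 1;   % match 1 blue: teams 4,5,6
     1 0 0 1 1 0;   % match 2 red:  teams 1,4,5
     0 1 1 0 0 1;   % match 2 blue: teams 2,3,6
     0 1 0 1 0 1;   % match 3 red:  teams 2,4,6
     1 0 1 0 1 0;   % match 3 blue: teams 1,3,5
     1 1 0 1 0 0;   % match 4 red:  teams 1,2,4
     0 0 1 0 1 1;   % match 4 blue: teams 3,5,6
     1 1 0 0 1 0;   % match 5 red:  teams 1,2,5
     0 0 1 1 0 1;   % match 5 blue: teams 3,4,6
     1 1 0 0 0 1;   % match 6 red:  teams 1,2,6
     0 0 1 1 1 0];  % match 6 blue: teams 3,4,5
b = [95; 62; 74; 81; 66; 78; 83; 59; 88; 57; 90; 55];   % toy alliance scores
OPR = A \ b                              % the parameters minimizing the squared score errors
RMS_resid = sqrt(mean((b - A*OPR).^2))   % descriptive fit quality, no distributional assumptions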


Quote:
I wasn't quite sure why you dug up my original post to start this discussion.
I had spent quite some time researching the OP question and came back to tie up loose ends.

Quote:
It seemed out of context with all of your other discussion about adding error estimates.
How so? I think I have been fairly consistent throughout this thread.

Quote:
That said, my request was more general, and it seems to be answered more generally by the other computational efforts that have been going on in the 2 related threads.
Your original request was (emphasis mine):
Quote:
I'm thinking of the parameter standard errors, i.e., the error estimate around the OPR parameter itself for each team. That can be computed from the matrix--it's a primary output of any statistical software package.
During the hiatus I researched this extensively. The standard error of model parameters (regression coefficients) is reported by SPSS, SAS, MINITAB, R, ASP, MicrOsiris, Tanagra, and even Excel. All these packages compute the same set of values, so they are all doing the same thing.

Given [A][x]=[b], the following computation produces the same values as those packages:
x = A\b;                            % least-squares solution: the OPRs
residuals = b - A*x;                % per-alliance prediction errors
SSres = residuals'*residuals;       % sum of squared residuals
VARres = SSres/(alliances-teams);   % residual variance; alliances and teams are the counts of observations and parameters

Av = A/sqrt(VARres);                % divide the design matrix by the assumed common std deviation
Nvi = inv(Av'*Av);                  % parameter covariance matrix, i.e. VARres*inv(A'*A)
SE_of_parameters = sqrt(diag(Nvi))  % one "standard error" per OPR
The above code clearly shows that this computation is assuming that the standard deviation is constant for all measurements (alliance scores) and thus for all teams... which we know is clearly not the case. That's one reason it produces meaningless results in the case of FRC match results data.


Quote:
But one point, I will say that using a fixed effects models with a separate match progression parameter (to capture the most likely source of heteroskedasticity) should lead to parameter estimates that will provide valid error terms using FRC data. But computing fixed effects models are much more complex processes. It is something that can be done in R.
That's an interesting suggestion, but I doubt it would be successful. I'd be pleased to be proven wrong. If you are willing to try it, I will provide whatever raw data you need in your format of choice.




#87 | 02-07-2015, 18:25
Citrus Dad (Richard McCann), Business and Scouting Mentor, FRC #1678 (Citrus Circuits)
Re: "standard error" of OPR values

One definition of statistical validity:
https://explorable.com/statistical-validity

Statistical validity refers to whether a statistical study is able to draw conclusions that are in agreement with statistical and scientific laws. This means if a conclusion is drawn from a given data set after experimentation, it is said to be scientifically valid if the conclusion drawn from the experiment is scientific and relies on mathematical and statistical laws.

Quote:
It is the set of model parameters which minimizes the sum of the squares of the differences between the actual and model-predicted alliance scores. This is universally understood.
This is the point upon which we disagree. This is not a mathematical exercise--it is a statistical one. And statistical analysis requires inference about the validity of the estimated parameters. And I strongly believe that the many students reading this who will be working in engineering in the future need to understand that this is a statistical exercise which requires all of the caveats of such analysis.

Here's a discussion for fixed effects from the SAS manual:
http://www.sas.com/storefront/aux/en...48_excerpt.pdf

#88 | 02-07-2015, 20:21
GeeTwo (Gus Michel II), Technical Director, FRC #3946 (Tiger Robotics), Mentor
Re: "standard error" of OPR values

Quote:
Originally Posted by Citrus Dad View Post
This is not a mathematical exercise--it is a statistical one. And statistical analysis requires inference about the validity of the estimated parameters.
One of the two textbooks for my intermediate mechanics lab (sophomores and juniors in physics and engineering) was entitled How to Lie with Statistics. Chapter 4 is entitled "Much Ado about Practically Nothing." For me, the takeaway sentence from this chapter is:
Quote:
Originally Posted by Huff, How to Lie with Statistics
You must always keep that plus-or-minus in mind, even (or especially) when it is not stated.
Unfortunately, not many high schoolers have been exposed to this concept.

Finally, if standard errors could be validly produced for each team as a measure of its consistency/reliability, that would be outstanding. Given that teams change strategy and modify robots between matches (and given this year's nonlinear scoring), it is not surprising that per-team standard error calculations are not valid. (And by the way, Ether's finding that the numbers could be calculated but did not communicate variability is at least qualitatively similar to Richard's argument concerning OPR.)

This does not negate the need for a "standard error" or "probable error" of the whole data set. OPR is ultimately a measurement, and anyone using OPR to drive a decision needs to understand its accuracy. That is, does a difference of 5 points in OPR mean that one team is better than the other with 10% confidence, 50% confidence, or 90% confidence?
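
As a back-of-the-envelope answer to that question, here is a sketch of my own (not something proposed in this thread) that leans on assumptions the thread argues about: the two OPR estimates are treated as independent and roughly normal with a common standard error, neither of which is strictly true for OPRs from the same event. The 11.5-point SE is purely illustrative.

% If two OPR estimates were independent and roughly normal with a common standard
% error SE, how confident could we be that the higher-OPR team is truly better?
SE   = 11.5;                           % illustrative per-OPR standard error
d    = 5;                              % observed OPR difference
z    = d / (SE * sqrt(2));             % the difference has standard deviation SE*sqrt(2)
conf = 0.5 * (1 + erf(z / sqrt(2)))    % normal CDF: about 0.62, barely better than a coin flip
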
__________________

If you can't find time to do it right, how are you going to find time to do it over?
If you don't pass it on, it never happened.
Robots are great, but inspiration is the reason we're here.
Friends don't let friends use master links.

#89 | 12-07-2015, 09:25
wgardner, no team, Coach
Re: "standard error" of OPR values

Quote:
Originally Posted by Ether View Post
As I've suggested in my previous two posts, how about let's explore alternative, valid ways to demonstrate the shortcomings of the OPR values.

One place to start might be to ask whether or not the average value of the vector of standard errors of OPRs might be meaningful, and if so, what exactly it means.
Hi All,

Ether and I have been having some private discussions and running some simulations on this topic. I thought I'd report the general results here. I think Ether agrees with what I say below, but I'll leave that for him to confirm or deny.


Executive Summary:

1. The mean of the standard error vector for the OPR estimates is a decent approximation for the standard deviation of the team-specific OPR estimates themselves, and is a very good approximation for the mean of the standard deviations of the team-specific OPR estimates taken across all of the teams in the tournament.

2. Teams with more variability in their offensive contributions (e.g., teams that contribute a huge amount to their alliance's score by performing some high-scoring feats, but fail at doing so 1/2 the time) will have slightly more uncertainty in their OPR estimate than the mean of the standard error vector would indicate, but not by too much.

3. Teams with less variability in their offensive contributions (e.g., consistent teams that always contribute about the same amount to their alliance's score every match) will have slightly less uncertainty in their OPR estimate than the mean of the standard error vector would indicate, but not by too much.

Details:

I simulated match scores in the following way.

1. I computed the actual OPRs from the actual match data (in this case, from the 2014 misjo tournament as suggested by Ether).

2. I computed the sum of the squared values of the prediction residual and divided this sum by (#matches - #teams) to get an estimate of the per-match randomness that exists after the OPR prediction is performed.

3. I divided the result from step#2 above by 3 to get a per-team estimate of the variance of each team's offensive contribution. I took the square root of this to get the per-team estimate of the standard deviation of each team's offensive contribution.

4. I then simulated 1000 tournaments using the same match schedule as the 2014 misjo tournament. The simulated match scores were the sum of the 3 OPRs for the teams in that match plus 3 zero-mean, variance-1 normally distributed random numbers scaled by the 3 per-team offensive standard deviations computed in step #3. Note that at this point, each team has the same value for the per-team offensive standard deviations.

5. I then computed the OPR estimates from the match scores for each simulated tournament and computed the actual standard deviation of the 1000 OPR estimates for each team. These standard deviations were all close to 11.5 (between 11 and 12) which was the average of the elements of the traditional standard error vector calculation performed on the original data. This makes sense, as the standard error is supposed to be the standard deviation of the estimates if the randomness of the match scores had equal variance for all matches, as was simulated. As a reminder, all of the individual elements of the standard error vector were extremely close to 11.5 in this case.

6. But then I tried something different. Instead of having the per-team standard deviation of the offensive contributions be constant, I instead added a random variable to these standard deviations and then renormalized all of them so that the average variance of the match scores would be unchanged. In other words, now some teams have a larger variance in their offensive contributions (e.g., team A might have an OPR of 30 but have its score contribution typically vary between 15 and 45) while other teams might have a smaller variance in their contributions (e.g., team B might also have an OPR of 30 but have its score contribution only typically vary between 25 and 35).

7. Now I resimulated another 1000 tournaments using this model. So now, some match scores might have greater variances and some match scores might have smaller variances. But the way OPR was calculated was not changed.

8. Then I calculated the OPRs for these new 1000 simulated tournaments and calculated the standard deviations of these 1000 new per-team OPR estimates.

What I found was that the OPR estimates did vary more for teams that had a greater offensive variance and did vary less for teams that had a smaller offensive variance. So, if you're convinced that different teams have substantially different variances in their offensive contributions, then just using the one average standard error computation to estimate how reliable all of the different OPR estimates are is not completely accurate.

But the differences were not that large. For example, in one set of simulations, team A had an offensive contribution with a standard deviation of 8 while team B had an offensive contribution with a standard deviation of 29. So in this case, team B had a LOT more variability in their offensive contribution than team A did (almost 4x as much). But the standard deviation of the 1000 OPR estimates for team A was 10.8 while the standard deviation of the 1000 OPR estimates for team B was 12.9. So yes, team B had a much bigger offensive variability and that made the confidence in their OPR estimates worse than the 11.5 that the standard error would suggest, but it only went up by 1.4, while team A had a much smaller offensive variability but that only improved the confidence in their OPR estimates by 0.7.

And also, the average of the standard deviations of the OPR estimates for the teams in the 1000 tournaments was still very close to the average of the standard error vector computed assuming that the match scores had identical variances.
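
For anyone who wants to reproduce the flavor of this, here is a condensed sketch of the procedure above in the same Octave/MATLAB style as Ether's earlier snippet. It is my paraphrase, not wgardner's code: a random toy schedule and made-up OPRs stand in for the real 2014 misjo inputs, and the step numbers in the comments refer to the list above.

% Condensed sketch of the simulation, with toy stand-ins for the real event inputs.
nteams = 24;  nallies = 80;  nsim = 1000;
A = zeros(nallies, nteams);              % design matrix: 3 random teams per alliance
for i = 1:nallies
  A(i, randperm(nteams, 3)) = 1;
end
OPR     = 20 + 15*rand(nteams, 1);       % stand-in "true" OPRs (step 1)
sd_team = 7 * ones(nteams, 1);           % equal per-team contribution std dev (step 3);
                                         % perturb and renormalize these for steps 6-8
opr_sim = zeros(nteams, nsim);
for k = 1:nsim
  Z     = randn(nallies, nteams);        % fresh unit-normal draw per team per alliance
  b_sim = A*OPR + (A .* Z) * sd_team;    % step 4: simulated alliance scores
  opr_sim(:, k) = A \ b_sim;             % steps 5 and 8: refit OPRs from simulated scores
end
sd_of_OPR_estimates = std(opr_sim, 0, 2) % per-team spread of OPR estimates across tournaments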

So, repeating the Executive Summary:

1. The mean of the standard error vector for the OPR estimates is a decent approximation for the standard deviation of the team-specific OPR estimates themselves, and is a very good approximation for the mean of the standard deviations of the team-specific OPR estimates taken across all of the teams in the tournament.

2. Teams with more variability in their offensive contributions (e.g., teams that contribute a huge amount to their alliance's score by performing some high-scoring feats, but fail at doing so 1/2 the time) will have slightly more uncertainty in their OPR estimate than the mean of the standard error vector would indicate, but not by too much.

3. Teams with less variability in their offensive contributions (e.g., consistent teams that always contribute about the same amount to their alliance's score every match) will have slightly less uncertainty in their OPR estimate than the mean of the standard error vector would indicate, but not by too much.
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.

#90 | 12-07-2015, 20:21
Oblarg (Eli Barnett), FRC #0449 (The Blair Robot Project), Mentor
Re: "standard error" of OPR values

Quote:
Originally Posted by wgardner View Post
Ether and I have been having some private discussions and running some simulations on this topic. I thought I'd report the general results here. [...] The mean of the standard error vector for the OPR estimates is a decent approximation for the standard deviation of the team-specific OPR estimates themselves [...]
Couldn't one generate an estimate for each team's "contribution to variance" by doing the same least-squares fit used to generate OPR in the first place, just using the vector of squared residuals rather than the scores as the right-hand side? This might run the risk of assigning some team a negative contribution to variance (good luck making sense of that one), but other than that (seemingly unlikely) case I can't think of why this wouldn't work.
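
A rough sketch of that idea, as I read it (not Oblarg's code; A and b are the design matrix and alliance scores from the OPR fit, as in the earlier sketches, and whether the result means anything is exactly the open question):

% Regress the squared residuals on the same design matrix to get a per-team
% "contribution to variance"; A and b are assumed to already exist.
OPR         = A \ b;
residuals   = b - A*OPR;
var_contrib = A \ (residuals.^2);          % least-squares per-team variance contribution
sd_contrib  = sqrt(max(var_contrib, 0))    % clip the negative cases mentioned above
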
__________________
"Mmmmm, chain grease and aluminum shavings..."
"The breakfast of champions!"

Member, FRC Team 449: 2007-2010
Drive Mechanics Lead, FRC Team 449: 2009-2010
Alumnus/Technical Mentor, FRC Team 449: 2010-Present
Lead Technical Mentor, FRC Team 4464: 2012-2015
Technical Mentor, FRC Team 5830: 2015-2016