  #1   Spotlight this post!  
Unread 19-05-2013, 14:56
Citrus Dad Citrus Dad is offline
Business and Scouting Mentor
AKA: Richard McCann
FRC #1678 (Citrus Circuits)
Team Role: Mentor
 
Join Date: May 2012
Rookie Year: 2012
Location: Davis
Posts: 984
Re: An improvement to OPR

Note: I was reviewing the 2834 database and think I found that the Championship OPRs are in error. The sums of the individual components often do not add up to the Total. (3824's in Curie is off by 32.) A quick scan of the regionals finds in some cases no deviations whatsoever and <2 pts maximum in others. I suggest going back and recomputing the OPRs.
  #2   Spotlight this post!  
Unread 19-05-2013, 14:58
efoote868 efoote868 is offline
foote stepped in
AKA: E. Foote
FRC #0868
Team Role: Mentor
 
Join Date: Mar 2006
Rookie Year: 2005
Location: Noblesville, IN
Posts: 1,385

A possible explanation is that Ed took into account surrogate matches.
__________________
Be Healthy. Never Stop Learning. Say It Like It Is. Own It.

Like our values? Flexware Innovation is looking for Automation Engineers. Check us out!
  #3   Spotlight this post!  
Unread 20-05-2013, 01:36
Ed Law Ed Law is offline
Registered User
no team (formerly with 2834)
 
Join Date: Apr 2008
Rookie Year: 2009
Location: Foster City, CA, USA
Posts: 752

That is exactly the problem. OPR is calculated using all matches, including surrogate matches. I would still want to calculate OPR this way. More data points are better, even if the match does not count for that team.

Unfortunately, the team standings on the FIRST website only add up the auto, teleop and climb points of non-surrogate matches. This means that when I solve A x = b, the matrix A contains the surrogate matches while the vector b does not.

My proposal is to scale the value of b for the teams that have the surrogate matches before solving A x = b. Does anybody have any other suggestion?
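
One way to read the scaling idea, as a minimal sketch: assume each entry of b is a team's summed component score from the team standings, and that it is missing the matches the team played as a surrogate. Multiplying by (matches played) / (matches counted) fills in the missing match with the team's average. The function name and the toy numbers are hypothetical and are not taken from the actual OPR spreadsheet.

```python
def scale_for_surrogates(b, matches_played, matches_counted):
    """Scale each team's standings total to cover matches missing from it.

    b[i]               : summed component score from the standings (surrogate matches excluded)
    matches_played[i]  : matches team i appears in within matrix A (surrogates included)
    matches_counted[i] : matches that actually contributed to b[i]
    """
    return [
        total * played / counted
        for total, played, counted in zip(b, matches_played, matches_counted)
    ]


# Hypothetical data: team 0 played 9 matches but only 8 count in the standings;
# team 1 had no surrogate match, so its total is left unchanged.
scaled_b = scale_for_surrogates([80.0, 90.0], [9, 8], [8, 8])
```

This treats the surrogate match as an average match for that team, which is only an approximation of what actually happened on the field.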
__________________
Please don't call me Mr. Ed, I am not a talking horse.
  #4   Spotlight this post!  
Unread 20-05-2013, 11:56
Ether Ether is offline
systems engineer (retired)
no team
 
Join Date: Nov 2009
Rookie Year: 1969
Location: US
Posts: 8,038

Quote:
Originally Posted by Ed Law View Post
My proposal is to scale the value of b for the teams that have the surrogate matches before solving A x = b. Does anybody have any other suggestion?
As a short-term solution that sounds like a reasonable approach to try to make the best out of the data that is available.

Going forward, perhaps someone who has Frank's ear and is interested in statistics could make an appeal to him to resolve the Twitter data issues. At the very least, store the data locally (at the event) and don't delete it until it has been archived at FIRST. Then make the data available to the community.



Last edited by Ether : 20-05-2013 at 11:59. Reason: added link
  #5   Spotlight this post!  
Unread 02-07-2013, 23:56
Ed Law Ed Law is offline
Registered User
no team (formerly with 2834)
 
Join Date: Apr 2008
Rookie Year: 2009
Location: Foster City, CA, USA
Posts: 752

Quote:
Originally Posted by Ether View Post
As a short-term solution that sounds like a reasonable approach to try to make the best out of the data that is available.

Going forward, perhaps someone who has Frank's ear and is interested in statistics could make an appeal to him to resolve the Twitter data issues. At the very least, store the data locally (at the event) and don't delete it until it has been archived at FIRST. Then make the data available to the community.


Ether, I changed my mind about scaling the entries of vector b for teams that have a surrogate match before solving Ax=b. I now propose to scale x(auto), x(tele) and x(climb) for each team proportionally so that they add up to the overall OPR.

We can test it afterwards and calculate the b and see how close it is to the missing subscore of the surrogate match.
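
A minimal sketch of the proportional rescaling, assuming the three component OPRs and the overall OPR for one team are already computed. The function name and the numbers are made up for illustration.

```python
def rescale_components(auto, tele, climb, total_opr):
    """Scale the three component OPRs by one common factor so they sum to total_opr."""
    component_sum = auto + tele + climb
    if component_sum == 0:
        # Nothing to scale proportionally; leave the components untouched.
        return auto, tele, climb
    k = total_opr / component_sum
    return auto * k, tele * k, climb * k


# Hypothetical team: components sum to 40 but the overall OPR is 60.
scaled = rescale_components(10.0, 20.0, 10.0, 60.0)
```

The ratio between auto, tele and climb is preserved; only the common scale changes.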
__________________
Please don't call me Mr. Ed, I am not a talking horse.
  #6   Spotlight this post!  
Unread 20-05-2013, 13:45
Citrus Dad Citrus Dad is offline
Business and Scouting Mentor
AKA: Richard McCann
FRC #1678 (Citrus Circuits)
Team Role: Mentor
 
Join Date: May 2012
Rookie Year: 2012
Location: Davis
Posts: 984

Quote:
Originally Posted by efoote868 View Post
A possible explanation is that Ed took into account surrogate matches.
I believe the method relies on the official score database, not on match by match reported scores. The surrogates don't show up there. He would have to use 2 different data sets to get different answers.
  #7   Spotlight this post!  
Unread 20-05-2013, 14:25
Ether Ether is offline
systems engineer (retired)
no team
 
Join Date: Nov 2009
Rookie Year: 1969
Location: US
Posts: 8,038

Quote:
Originally Posted by Citrus Dad View Post
I believe the method relies on the official score database, not on match by match reported scores. The surrogates don't show up there. He would have to use 2 different data sets to get different answers.
There are two different score datasets at USFIRST: "Match Results" and "Team Standings".

"Match Results" is necessary to construct the alliances matrix and obtain the total match score. It contains the surrogate matches.

"Team Standings" is necessary to obtain the Auto, TeleOp, and Climb alliance scoring. Problem is, the totals shown there do not include the scores for surrogate teams in matches where said teams played as surrogates.

Ed's proposed work-around to scale the "Team Standings" totals for teams which played as surrogates seems like a reasonable one. Do you have a different suggestion?
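
To make the mismatch concrete, here is a small sketch. The data layout (a tuple of alliance, alliance score, and surrogate team) is an assumption for illustration and is not the actual USFIRST file format; team numbers and scores are made up.

```python
def build_design_matrix(matches, teams):
    """One row per alliance per match: 1 if the team was on that alliance, else 0."""
    column = {team: i for i, team in enumerate(teams)}
    A, scores = [], []
    for alliance, score, _surrogate in matches:
        row = [0] * len(teams)
        for team in alliance:
            row[column[team]] = 1
        A.append(row)
        scores.append(score)
    return A, scores


def totals_from_matches(matches, teams, skip_surrogates):
    """Sum alliance scores per team, optionally skipping surrogate appearances."""
    totals = {team: 0 for team in teams}
    for alliance, score, surrogate in matches:
        for team in alliance:
            if skip_surrogates and team == surrogate:
                continue
            totals[team] += score
    return totals


teams = ["111", "222", "333", "444", "555"]
# (alliance, alliance score, surrogate team or None)
matches = [
    (("111", "222", "333"), 60, None),
    (("111", "444", "555"), 45, "111"),
]
A, scores = build_design_matrix(matches, teams)
with_surrogates = totals_from_matches(matches, teams, skip_surrogates=False)
standings_style = totals_from_matches(matches, teams, skip_surrogates=True)
```

Team 111 has a row in A for both matches, but the standings-style total only credits it with the first one. That is the inconsistency between the two datasets.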


  #8   Spotlight this post!  
Unread 20-05-2013, 22:59
MikeE MikeE is offline
Wrecking nice beaches since 1990
no team (Volunteer)
Team Role: Engineer
 
Join Date: Nov 2008
Rookie Year: 2008
Location: New England -> Alaska
Posts: 381

Quote:
Originally Posted by Ether View Post
There are two different score datasets at USFIRST: "Match Results" and "Team Standings".

"Match Results" is necessary to construct the alliances matrix and obtain the total match score. It contains the surrogate matches.

"Team Standings" is necessary to obtain the Auto, TeleOp, and Climb alliance scoring. Problem is, the totals shown there do not include the scores for surrogate teams in matches where said teams played as surrogates.

Ed's proposed work-around to scale the "Team Standings" totals for teams which played as surrogates seems like a reasonable one. Do you have a different suggestion?


My preferred solution is for FIRST to move to an all district model with 12 matches per event and therefore no more surrogates

Until then...

If we have complete Twitter data for an event then we get the component scores for every match so we don't have an issue.

But to solve the surrogate problem we just need the component scores from the specific surrogate matches. There are at most 3 of these in any competition and typically just 1 or 2 consecutive matches in round 3.
Since there is a single surrogate team in an alliance we just need to add the Twitter component scores to their "Team Standing" score to get the corrected total scores for that surrogate team.
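
A sketch of that correction, assuming the Twitter feed gives the alliance's auto, teleop and climb scores for the surrogate match. The dictionary keys and the numbers are hypothetical.

```python
COMPONENTS = ("auto", "teleop", "climb")


def add_surrogate_match(standings_row, twitter_alliance_scores):
    """Return a copy of one team's standings totals with the surrogate match added."""
    return {
        name: standings_row[name] + twitter_alliance_scores[name]
        for name in COMPONENTS
    }


# Made-up totals for a surrogate team, and the alliance component scores
# from the one match it played as a surrogate.
standings_row = {"auto": 96, "teleop": 240, "climb": 80}
surrogate_match = {"auto": 12, "teleop": 30, "climb": 10}
corrected = add_surrogate_match(standings_row, surrogate_match)
```

Only the surrogate team's row changes; the other two teams on that alliance already have the match counted in their standings.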

Last edited by MikeE : 20-05-2013 at 23:00. Reason: typo
  #9   Spotlight this post!  
Unread 21-05-2013, 00:33
Ether Ether is offline
systems engineer (retired)
no team
 
Join Date: Nov 2009
Rookie Year: 1969
Location: US
Posts: 8,038

Quote:
My preferred solution is for FIRST to move to an all district model with 12 matches per event and therefore no more surrogates
The criterion for "no surrogates" is M*6/T = N, where

M is the number of qual matches
T is the number of teams
N is a whole number (the number of matches played by each team)

At CMP, T=100 and M=134, so N was not a whole number; thus there were surrogates.

If instead T=96 and M=128, N would be a whole number (namely 8) and there would be no surrogates.


Quote:
Since there is a single surrogate team in an alliance we just need to add the Twitter component scores to their "Team Standing" score to get the corrected total scores for that surrogate team.
Here's the 2013 season Twitter data for elim and qual matches. It has Archi, Curie, Galileo, & Newton. The usual Twitter data caveats apply.


Quote:
I've been playing around with maximum likelihood estimate models as an alternative (really an extension) to OPR...
What do you mean by "maximum likelihood estimate models" in this context?


Quote:
One more point: I'm a fan of the binary matrix approach...
In this context, I'm assuming "the binary matrix" refers to the 2MxN design matrix [A] of the overdetermined system.

Do you then use QR factorization directly on the binary matrix to obtain the solution, or do you form the normal equations and use Cholesky?



Last edited by Ether : 21-05-2013 at 09:36.
  #10   Spotlight this post!  
Unread 22-05-2013, 15:48
MikeE MikeE is offline
Wrecking nice beaches since 1990
no team (Volunteer)
Team Role: Engineer
 
Join Date: Nov 2008
Rookie Year: 2008
Location: New England -> Alaska
Posts: 381

Quote:
Originally Posted by Ether View Post
The criterion for "no surrogates" is M*6/T = N, where

M is the number of qual matches
T is the number of teams
N is a whole number (the number of matches played by each team)

At CMP, T=100 and M=134, so N was not a whole number; thus there were surrogates.

If instead T=96 and M=128, N would be a whole number (namely 8) and there would be no surrogates.
I prefer to think of the surrogate issue in terms of looking at
(T*N) mod 6
i.e. how many teams are left over if everyone plays a certain number of rounds.

If there are any teams left over then we need at least one surrogate match. The scheduling software also adds the reasonable constraint that there can only be one surrogate team per alliance. Putting this all together there are between 0 and 3 matches with surrogates in qualification rounds.

Clearly, if either T or N is a multiple of 6 then the remainder is zero, so there are no surrogates.

Choosing N=12 guarantees no surrogates however many teams are at the event, gives plenty of matches for each team and also has the nice property that M=2*T so it's easy to estimate the schedule impact. I'm sure the designers of FiM and MAR settled on 12 matches per event through similar reasoning.
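
The arithmetic can be sketched as follows. The function name is hypothetical, and the lower bound on surrogate matches is my own reading of the "one surrogate per alliance" constraint (at most two surrogate slots per match), not something taken from the scheduling software.

```python
def surrogate_summary(num_teams, rounds):
    """Surrogate bookkeeping for T teams each playing N rounds, 6 teams per match."""
    leftover = (num_teams * rounds) % 6          # team slots left over in the last match
    surrogate_slots = (6 - leftover) % 6         # extra appearances needed to fill it
    matches = (num_teams * rounds + surrogate_slots) // 6
    # One surrogate per alliance means at most 2 surrogate slots per match.
    min_surrogate_matches = (surrogate_slots + 1) // 2
    return leftover, surrogate_slots, matches, min_surrogate_matches


championship = surrogate_summary(100, 8)   # T=100, assuming 8 rounds per team
no_surrogates = surrogate_summary(96, 8)   # T=96 divides evenly
district = surrogate_summary(40, 12)       # N=12 with an arbitrary team count
```

With T=100 and 8 rounds this gives 134 matches, which agrees with the CMP figure quoted above, and with N=12 the match count is always 2*T.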

Quote:
Originally Posted by Ether View Post
What do you mean by "maximum likelihood estimate models" in this context?
(I'll try to keep this accessible to a wider audience but we can go into further details later.)

OPR estimates a single parameter model for each team, i.e. what is the optimal solution if we model a team's contribution to each match as a constant. We can also use regression (or other optimization techniques) to build richer models. For example we can model each team with two parameters: a constant contribution per match similar to OPR, plus a term which models a team's improvement per round.
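
As a rough sketch of the two-parameter idea, each row of the design matrix can carry two columns per team: the usual 0/1 indicator, and the same indicator weighted by the round number. Using the match's round number as the "improvement" feature is an assumption on my part, and the function name is hypothetical.

```python
def two_parameter_row(alliance, round_number, teams):
    """Design-matrix row for a model score = sum(base_i + slope_i * round).

    The first len(teams) columns hold the constant (OPR-like) terms,
    the last len(teams) columns hold the per-round improvement terms.
    """
    column = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    row = [0.0] * (2 * n)
    for team in alliance:
        row[column[team]] = 1.0
        row[n + column[team]] = float(round_number)
    return row


# Teams B and C play together in round 3 of a 3-team toy event.
row = two_parameter_row(("B", "C"), 3, ["A", "B", "C"])
```

Stacking such rows and solving by least squares gives two numbers per team instead of one. Note that this needs more matches than plain OPR to be well determined, since there are twice as many unknowns.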

But these types of models are deterministic. In other words, if we use the model to predict the outcome of a hypothetical match we will always get the same answer. That means we can't use a class of useful simulation methods to get deeper insight into how a collection of matches might play out.

Here's an alternative approach.
Instead of modeling a team's score as a constant (or polynomial function of known features), we treat each team's score as if it is generated from an underlying statistical distribution. Now the problem becomes one of estimating (or assuming) the type of distribution and also estimating the parameters of that distribution for each team.

With OPR we model team X as scoring say 15.3 points every match, so our prediction for a hypothetical match is always 15.3 points.
With a statistical model we would model team X as something like 15.3 +/- 6.3 points. To predict the score for a hypothetical match we choose randomly from the appropriate distribution, and this will obviously be different each time we "play" the hypothetical match.

So with OPR if we "play" a hypothetical match 100 times where OPR(red) > OPR(blue), the final score would be the same every time so red will always win. But if we use a statistical model then red should still win most matches but blue will also win some of the time. Now we have an estimate of the probability of victory for red, which is potentially more useful information than "red wins", and can be used in further reasoning.
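
A minimal simulation sketch of that idea, assuming independent Gaussian team scores. Apart from the 15.3 +/- 6.3 example, the team numbers are made up, and the function name is hypothetical.

```python
import random


def win_probability(red, blue, trials=10000, seed=0):
    """Estimate P(red beats blue) by repeatedly "playing" the match.

    red and blue are lists of (mean, standard deviation) pairs, one per team.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    red_wins = 0
    for _ in range(trials):
        red_score = sum(rng.gauss(mean, sd) for mean, sd in red)
        blue_score = sum(rng.gauss(mean, sd) for mean, sd in blue)
        if red_score > blue_score:
            red_wins += 1
    return red_wins / trials


red = [(15.3, 6.3), (15.3, 6.3), (15.3, 6.3)]
blue = [(12.0, 6.3), (12.0, 6.3), (12.0, 6.3)]
p_statistical = win_probability(red, blue)

# Setting every standard deviation to zero recovers the OPR-style answer: red always wins.
p_deterministic = win_probability(
    [(m, 0.0) for m, _ in red], [(m, 0.0) for m, _ in blue]
)
```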

MLE is just an approach for getting the parameters from match data. For simplicity I assume a Gaussian distribution, use linear regression as an initial estimate of each team's mean and linear regression on the squared residuals as an initial estimate of each team's variance.
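
Here is a small sketch of that two-step initial estimate. It assumes team scores are independent, so an alliance's variance is the sum of its teams' variances. The solver is plain Gaussian elimination on the normal equations, written out only to keep the example self-contained; the toy event uses two-team alliances and made-up scores.

```python
def solve_least_squares(A, b):
    """Solve the normal equations (A^T A) x = A^T b by Gaussian elimination."""
    n = len(A[0])
    aug = [
        [sum(row[i] * row[j] for row in A) for j in range(n)]
        + [sum(row[i] * y for row, y in zip(A, b))]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_means_and_variances(A, scores):
    """Step 1: regress scores on A for means. Step 2: regress squared residuals for variances."""
    means = solve_least_squares(A, scores)
    squared_residuals = [
        (y - sum(a * m for a, m in zip(row, means))) ** 2
        for row, y in zip(A, scores)
    ]
    variances = solve_least_squares(A, squared_residuals)
    return means, variances


# Toy event: 3 teams, 2-team alliances, each pairing played twice.
A = [[1, 1, 0], [1, 1, 0], [0, 1, 1], [0, 1, 1], [1, 0, 1], [1, 0, 1]]
scores = [32.0, 28.0, 53.0, 47.0, 43.0, 37.0]
means, variances = fit_means_and_variances(A, scores)
```

On real data the variance regression can return negative values for some teams, so a floor or some other constraint would be needed before using them in a simulation.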

Quote:
Originally Posted by Ether View Post
In this context, I'm assuming "the binary matrix" refers to the 2MxN design matrix [A] of the overdetermined system.

Do you then use QR factorization directly on the binary matrix to obtain the solution, or do you form the normal equations and use Cholesky?
Yes, I mean the design matrix.

I've implemented many numerical algorithms over the years and the main lesson it taught me is not to write them yourself unless absolutely necessary!
So for linear regression I solve the normal equation using Octave (similar to MATLAB). I don't see any meaningful difference between my results and other published sources on CD.
  #11   Spotlight this post!  
Unread 22-05-2013, 19:17
Ether Ether is offline
systems engineer (retired)
no team
 
Join Date: Nov 2009
Rookie Year: 1969
Location: US
Posts: 8,038

Quote:
Originally Posted by MikeE View Post
for linear regression I solve the normal equation using Octave

Octave uses a polymorphic solver, that selects an appropriate matrix factorization depending on the properties of the matrix.

If the matrix is Hermitian with a real positive diagonal, the polymorphic solver will attempt Cholesky factorization.

Since the normal matrix N = A^T A satisfies this condition, Cholesky factorization will be used.
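
For readers who have not seen it, this is roughly what a Cholesky solve of N x = d looks like. It is a textbook sketch in plain Python, not Octave's actual implementation, and the 3x3 system is a made-up example.

```python
import math


def cholesky(N):
    """Return lower-triangular L with L * L^T = N (N symmetric positive definite)."""
    n = len(N)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(N[i][i] - s)
            else:
                L[i][j] = (N[i][j] - s) / L[j][j]
    return L


def solve_cholesky(N, d):
    """Solve N x = d via L y = d (forward) then L^T x = y (backward)."""
    L = cholesky(N)
    n = len(d)
    y = [0.0] * n
    for i in range(n):
        y[i] = (d[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Normal matrix for 3 teams that each played 4 matches and met each other twice.
N = [[4.0, 2.0, 2.0], [2.0, 4.0, 2.0], [2.0, 2.0, 4.0]]
d = [140.0, 160.0, 180.0]
x = solve_cholesky(N, d)
```

The factorization only works when N is positive definite, which for OPR means the schedule must connect the teams well enough that A has full column rank.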


Quote:
MLE is just an approach for getting the parameters from match data. For simplicity I assume a Gaussian distribution, use linear regression as an initial estimate of each team's mean and linear regression on the squared residuals as an initial estimate of each team's variance.
The solution of the normal equations is a maximum likelihood estimator only if the data follows a normal distribution. I was wondering what was the theoretical basis for assuming a normal distribution.
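
For reference, the standard argument linking the two can be written out as follows. The model is an assumption for illustration (independent, equal-variance Gaussian noise on each alliance score); K is the number of rows of A, and b here means the vector of alliance scores.

```latex
% Assumed model: each alliance score is the sum of its teams' contributions
% plus independent Gaussian noise of equal variance.
b_k = a_k^{T} x + \varepsilon_k, \qquad
\varepsilon_k \sim \mathcal{N}(0, \sigma^2), \qquad k = 1, \dots, K

% Log-likelihood of the observed scores as a function of x
\log L(x) = -\frac{K}{2} \log(2 \pi \sigma^2)
            - \frac{1}{2 \sigma^2} \sum_{k=1}^{K} \left( b_k - a_k^{T} x \right)^2

% Maximizing it is the same as minimizing the squared error,
% whose minimizer satisfies the normal equations
\arg\max_x \log L(x) = \arg\min_x \lVert A x - b \rVert_2^2
\quad \Longrightarrow \quad A^{T} A \, x = A^{T} b
```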


Attached thumbnail: MLE.JPG
  #12   Spotlight this post!  
Unread 22-05-2013, 22:59
MikeE MikeE is offline
Wrecking nice beaches since 1990
no team (Volunteer)
Team Role: Engineer
 
Join Date: Nov 2008
Rookie Year: 2008
Location: New England -> Alaska
Posts: 381

Thanks for the information, Ether. It spurred me to check the details in the Octave documentation.

Quote:
Originally Posted by Ether View Post
I was wondering what was the theoretical basis for assuming a normal distribution.
There is no theoretical or empirical* basis for assuming a normal distribution; it's just a matter of convenience and convention. For the purposes of estimating the mean, minimizing the squared error will give the right result for any non-skewed underlying distribution.

Unfortunately I don't have access to reliable per robot score data otherwise we could establish how well a Gaussian distribution models typical robot performance. (I did check my team's scouting data but it varied too far from the official scores to rely on.) If anyone would like to share scouting data from this season I'd be very interested.

In my professional life I work on big statistical modeling problems and we still usually base the models on Gaussians due to their computational ease, albeit as Gaussian Mixture Models to approximate any probability density function.

* In fact we know for certain that a pure climber can only score discrete values of 0, 10, 20 or 30 points.
  #13   Spotlight this post!  
Unread 23-05-2013, 00:10
Ether Ether is offline
systems engineer (retired)
no team
 
Join Date: Nov 2009
Rookie Year: 1969
Location: US
Posts: 8,038

Quote:
Originally Posted by MikeE View Post
In my professional life I work on big statistical modeling problems and we still usually base the models on Gaussians due to their computational ease...
Yes, computational ease... and speed.

Speaking of speed, attached is a zip file containing a test case of N and d for the normal equations Nx=d.

Would you please solve it for x using Octave and tell me how long it takes? (Don't include the time it takes to read the large N matrix from the disk, just the computation time).


PS:

N and d were created from the official qual Match Results posted by FIRST for 75 regional and district events plus MAR, MSC, Archi, Curie, Galileo, & Newton. So solving for x is solving for World OPR.



Attached Files
File Type: zip Nx=d.zip (296.3 KB, 4 views)

Last edited by Ether : 23-05-2013 at 11:46.
  #14   Spotlight this post!  
Unread 20-05-2013, 01:42
Ed Law Ed Law is offline
Registered User
no team (formerly with 2834)
 
Join Date: Apr 2008
Rookie Year: 2009
Location: Foster City, CA, USA
Posts: 752

Quote:
Originally Posted by Citrus Dad View Post
I suggest going back and recomputing the OPRs.
Thank you for pointing out that the individual category OPRs do not add up to the total OPR. I don't know exactly what you mean. You made it sound like I do the calculations by hand. I can ask the computer to run it 100 times and I can guarantee you that I will get the same answer every time.
__________________
Please don't call me Mr. Ed, I am not a talking horse.
  #15   Spotlight this post!  
Unread 28-08-2013, 01:29
Citrus Dad Citrus Dad is offline
Business and Scouting Mentor
AKA: Richard McCann
FRC #1678 (Citrus Circuits)
Team Role: Mentor
 
Join Date: May 2012
Rookie Year: 2012
Location: Davis
Posts: 984

Quote:
Originally Posted by Ed Law View Post
Thank you for pointing out that the individual category OPRs do not add up to the total OPR. I don't know exactly what you mean. You made it sound like I do the calculations by hand. I can ask the computer to run it 100 times and I can guarantee you that I will get the same answer every time.
I see your explanation about the inclusion of the surrogate match scores. I think one check would be to see whether the deviations of the total OPR from the sum of the individual components are larger when more surrogate matches are included. You may be able to derive a correction factor based on the number of surrogates. (I estimated the average foul scores with a correction factor against our scouting data.)