Chief Delphi

Chief Delphi (http://www.chiefdelphi.com/forums/index.php)
-   Scouting (http://www.chiefdelphi.com/forums/forumdisplay.php?f=36)
-   -   An improvement to OPR (http://www.chiefdelphi.com/forums/showthread.php?t=116791)

Frenchie461 09-05-2013 21:16

An improvement to OPR
 
Just a thought I had recently while working on some interesting linear algebra problems, given the way OPR is generated as a least-squares solution (thanks to Ether for the formulation).


All of the operations performed to compute OPR are fully defined for complex numbers, so there appears to be no reason why OPR could not be solved over the complex numbers, with the real part of each score entry being the teleoperated score and the imaginary part being the autonomous score. This should yield an OPR vector containing complex entries, which theoretically gives a least-squares estimate for both teleop and auton.

SoftwareBug2.0 09-05-2013 21:54

Re: An improvement to OPR
 
What advantage does this have over simply calculating independently with just auto scores, and then just teleop scores, and then just climb scores? I've actually seen people do that.

And the results are fun sometimes because you get teams where their climb OPR is like -2.

RyanCahoon 09-05-2013 23:38

Re: An improvement to OPR
 
Quote:

Originally Posted by Frenchie461 (Post 1273781)
This should yield an OPR matrix containing complex entries, which theoretically should have a least squares average for both teleop and auton.

Quote:

Originally Posted by SoftwareBug2.0 (Post 1273796)
What advantage does this have over simply calculating independently with just auto scores, and then just teleop scores, and then just climb scores?

Since the OPR calculation boils down to

P = (A^-1) * S

where P is the OPR, A is the binary matrix denoting teams in each alliance and S is the alliance scores, then Frenchie461 is essentially advocating

Pt + Pa*i = (A^-1) * (St + Sa*i)

and since matrix multiplication is distributive

Pt + Pa*i = (A^-1) * St + ((A^-1) * Sa)*i

So you'll end up with the same result as calculating each OPR component independently. You'll get least-squares best fit for each component (as you would otherwise), but there won't be any additional interaction gained between them. This makes sense, because the least-squares fitting part of the operation happens when taking the inverse of A, and isn't affected by the value of S (whether real or complex) that it is post-multiplied by. Performance-wise, I would guess they would take about the same amount of time, assuming you're not re-calculating the value of A^-1 when doing the calculations independently.



Note: the inverse operation written ^-1 above becomes the generalized (pseudo)inverse when A is not square.
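RyanCahoon's distributivity argument is easy to check numerically. Below is a minimal sketch (my own Python/NumPy code on a made-up schedule, not anything from this thread): the complex least-squares solve splits exactly into the two real solves.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams, n_alliances = 12, 40

# Binary alliance matrix A: each row marks the 3 teams in one alliance.
A = np.zeros((n_alliances, n_teams))
for row in A:
    row[rng.choice(n_teams, size=3, replace=False)] = 1

S_tele = rng.uniform(0, 60, n_alliances)   # teleop alliance scores
S_auto = rng.uniform(0, 30, n_alliances)   # autonomous alliance scores

# One complex least-squares solve (real part = teleop, imag part = auto)...
P_complex = np.linalg.lstsq(A, S_tele + 1j * S_auto, rcond=None)[0]
# ...versus two independent real solves.
P_tele = np.linalg.lstsq(A, S_tele, rcond=None)[0]
P_auto = np.linalg.lstsq(A, S_auto, rcond=None)[0]

assert np.allclose(P_complex.real, P_tele)
assert np.allclose(P_complex.imag, P_auto)
```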

efoote868 09-05-2013 23:55

Re: An improvement to OPR
 
I'm not sure how this is an improvement; your code might be more concise but only if you're working with a computational package like MATLAB.

Frenchie461 10-05-2013 09:19

Re: An improvement to OPR
 
Quote:

Originally Posted by efoote868 (Post 1273818)
I'm not sure how this is an improvement; your code might be more concise but only if you're working with a computational package like MATLAB.

It's only an improvement insofar as it's more data than most teams usually compute, and it should be computationally faster than a pair of OPR calculations. Hypothetically, if OPR is an O(N^3) operation, with this method it's N^3 to find both auton and teleop rather than 2(N^3).

Ether 10-05-2013 09:59

Re: An improvement to OPR
 
Quote:

Originally Posted by Frenchie461 (Post 1273870)
it should be computationally faster than a pair of OPR calculations.


In the formula for OPR, namely [A][OPR]~[SCORE], [OPR] and [SCORE] need not be vectors - they can be matrices.

So instead of [OPR] being an Nx1 column vector, it can be an Nx2 matrix... and [SCORE] can be a (2M)x2 matrix.

The first column of [OPR] and [SCORE] can then be for TeleOp, and the second column for Autonomous.

This can be extended to any desired number of columns. For example use 4 columns for TeleOp, Autonomous, Climb, and Foul points.

Adding extra columns to [OPR] and [SCORE] increases the computation time only minimally, since the lion's share of the computation is spent factoring [A]^T [A].
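Ether's multi-column formulation can be sketched the same way (again my own NumPy code with a made-up schedule): a single solve with a matrix right-hand side gives the same columns as the per-column solves, while the factorization work is done only once.

```python
import numpy as np

rng = np.random.default_rng(1)
n_teams, n_alliances = 12, 40

# Binary alliance matrix: each row marks the 3 teams in one alliance.
A = np.zeros((n_alliances, n_teams))
for row in A:
    row[rng.choice(n_teams, size=3, replace=False)] = 1

# Columns of [SCORE]: TeleOp, Autonomous, Climb alliance scores.
S = rng.uniform(0, 60, (n_alliances, 3))

# One solve with a matrix right-hand side...
OPR = np.linalg.lstsq(A, S, rcond=None)[0]        # shape (n_teams, 3)

# ...matches three independent per-column solves.
for k in range(3):
    col = np.linalg.lstsq(A, S[:, k], rcond=None)[0]
    assert np.allclose(OPR[:, k], col)
```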



Siri 10-05-2013 10:46

Re: An improvement to OPR
 
Quote:

Originally Posted by Ether (Post 1273878)
This can be extended to any desired number of columns. For example use 4 columns for TeleOp, Autonomous, Climb, and Foul points.

Minor detour: does someone have comprehensive foul point data broken out from teleop scores? (because I would love them forever :])

Other than that, it's a cool idea and would be a fun way to teach the concept to new students around that level. In terms of data, though, I don't know that it brings anything new. It'd actually complicate my work to do it that way, because the matrix case Ether describes allows the simultaneous calculation of endgame OPR by the same method (i.e. it's not limited to 2 components, and we're looking for at least 3 basically every year).

AGPapa 10-05-2013 10:55

Re: An improvement to OPR
 
Quote:

Originally Posted by Siri (Post 1273883)
Minor detour: does someone have comprehensive foul point data broken out from teleop scores? (because I would love them forever :])

Other than that, it's a cool idea and would be a fun way to teach the concept to new students around that level. In terms of data, though, I don't know that it brings anything new. It'd actually complicate my work to do it that way, because the matrix case Ether describes allows the simultaneous calculation of endgame OPR by the same method (i.e. it's not limited to 2 components, and we're looking for at least 3 basically every year).

Ether has created a spreadsheet with all of the twitter data from every match.
http://www.chiefdelphi.com/forums/sh...t=twitter+data

It has everything you need.

Ether 10-05-2013 10:59

Re: An improvement to OPR
 
Quote:

Originally Posted by Siri (Post 1273883)
Minor detour: does someone have comprehensive foul point data broken out from teleop scores? (because I would love them forever :]).

A few weeks ago I posted a least-squares analysis of foul points using Twitter data:

http://www.chiefdelphi.com/forums/sh...53&postcount=1

As with any analysis using Twitter data, caveat utilitor.



Siri 10-05-2013 11:21

Re: An improvement to OPR
 
Quote:

Originally Posted by AGPapa (Post 1273884)
Ether has created a spreadsheet with all of the twitter data from every match.
http://www.chiefdelphi.com/forums/sh...t=twitter+data

It has everything you need.

Quote:

Originally Posted by Ether (Post 1273886)
A few weeks ago I posted a least-squares analysis of foul points using Twitter data:

http://www.chiefdelphi.com/forums/sh...53&postcount=1

As with any analysis using Twitter data, caveat utilitor.



Sorry, I should have been more specific about comprehensive (Twitter this year is missing a good chunk of MAR data). Thank you though, you're correct, I'll use this. Awesome analysis as always, Ether. Thanks!

[/and now back to your regularly scheduled thread]

Citrus Dad 11-05-2013 17:40

Re: An improvement to OPR
 
Note that the OPR parameters are like any other statistical measure and have standard errors associated with those parameters. I haven't seen the SEs posted with the OPR parameters, but I can tell you that the SEs are likely to be VERY large for so few observations--only 8 per team at the Champs, at most 12 in any of the regionals. The OPRs are useful indicators, and probably can be used in a pinch if you lack other data, but they are highly unreliable for live scouting. I wouldn't bother calculating them on the fly at a competition. (The CCWM and DPRs appear to be even less reliable).
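For anyone who wants the SEs Citrus Dad mentions, here's one way to estimate them under the usual homoskedastic-noise assumption (a sketch of standard OLS standard errors on simulated data, not an official method): SE_j = sqrt(sigma^2 * [(A^T A)^-1]_jj), with sigma^2 estimated from the residuals.

```python
import numpy as np

rng = np.random.default_rng(2)
n_teams, n_alliances = 12, 40

# Binary alliance matrix: each row marks the 3 teams in one alliance.
A = np.zeros((n_alliances, n_teams))
for row in A:
    row[rng.choice(n_teams, size=3, replace=False)] = 1

# Simulated alliance scores = sum of true contributions + match noise.
true_opr = rng.uniform(5, 30, n_teams)
S = A @ true_opr + rng.normal(0, 8, n_alliances)

opr = np.linalg.lstsq(A, S, rcond=None)[0]

# Homoskedastic residual variance, then the usual OLS standard errors.
resid = S - A @ opr
sigma2 = resid @ resid / (n_alliances - n_teams)
se = np.sqrt(sigma2 * np.diag(np.linalg.pinv(A.T @ A)))
```

With so few matches per team the SEs come out large relative to the OPRs themselves, which is exactly Citrus Dad's point.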

Our team had an OPR of 43 on Curie, yet our own scouting data showed we were scoring over 62 per match--quite a discrepancy. 4814 showed a CCWM ranked only 33rd yet went undefeated, and our defensive scouting data confirmed that they had substantial value added.

SoftwareBug2.0 11-05-2013 22:57

Re: An improvement to OPR
 
Quote:

Originally Posted by Citrus Dad (Post 1274148)
Note that the OPR parameters are like any other statistical measure and have standard errors ... I wouldn't bother calculating them on the fly at a competition. (The CCWM and DPRs appear to be even less reliable).

Obviously having your scouts get better data is the best solution, but there really is value in having the numbers, as unreliable as they may be. Even if OPR for each phase of the game is +/- 50% of what that team actually scores, that's often close enough to figure out what your match strategy should be.

Also, DPR may be less reliable than OPR, but I've been told too many times that it's totally meaningless. It isn't. It just doesn't mean what people think it means. It's purely how many points your opponent scores. So last year, when we were a robot that scored just enough points at championship that we were better off playing offense than defense, our DPR was awful; this year our DPR was great because opposing alliances were forced to defend us, and therefore had one less robot scoring.

Citrus Dad 13-05-2013 11:36

Re: An improvement to OPR
 
Quote:

Originally Posted by SoftwareBug2.0 (Post 1274209)
Obviously having your scouts get better data is the best solution, but there really is value in having the numbers, as unreliable as they may be. Even if OPR for each phase of the game is +/- 50% of what that team actually scores, that's often close enough to figure out what your match strategy should be.

I agree that the OPR is better than nothing--it still had a 0.91 correlation with our offensive scouting data. However, it tends to miss the outlier teams that may be hard to pick out otherwise. Our OPR was 27 points less than our actual offensive average--more than 50% off. But if your team has sufficient resources to calculate the OPR on the fly, then you probably have enough to do full scouting. On the other hand, you might be relying on the OPR calculated by one of the apps tracking the competition, in which case that's the best you have.

Quote:

Originally Posted by SoftwareBug2.0 (Post 1274209)
Also, DPR may be less reliable than OPR, but I've been told too many times that it's totally meaningless. It isn't. It just doesn't mean what people think it means. It's purely how many points your opponent scores. So last year, when we were a robot that scored just enough points at championship that we were better off playing offense than defense, our DPR was awful; this year our DPR was great because opposing alliances were forced to defend us, and therefore had one less robot scoring.

4814 is a great case in point in Curie. Their OPR was only two points higher than their actual offensive output and their CCWM was only 9 points less than their OPR. They look like a defensive dud. Yet deeper digging shows that their defense was probably worth at least 30 points a match, and perhaps twice that.

Ether 13-05-2013 17:12

Re: An improvement to OPR
 
Quote:

Originally Posted by Citrus Dad (Post 1274581)
Their OPR was only two points higher than their actual offensive output

How are you calculating their actual offensive output? Do you have six scouts monitoring every match, one scout recording the scoring for each team?



T^2 13-05-2013 23:38

Re: An improvement to OPR
 
Quote:

Originally Posted by Ether (Post 1274672)
How are you calculating their actual offensive output? Do you have six scouts monitoring every match, one scout recording the scoring for each team?

Yes.

Citrus Dad 19-05-2013 14:56

Re: An improvement to OPR
 
Note: I was reviewing the 2834 database and believe the Championship OPRs are in error. The sums of the individual components often do not add up to the total. (3824's in Curie is off by 32.) A quick scan of the regionals finds no deviations whatsoever in some cases and a maximum of <2 points in others. I suggest going back and recomputing the OPRs.

efoote868 19-05-2013 14:58

Re: An improvement to OPR
 
A possible explanation is that Ed took into account surrogate matches.

Ed Law 20-05-2013 01:36

Re: An improvement to OPR
 
That is exactly the problem. OPR is calculated using all matches, including surrogate matches. I would still want to calculate OPR this way: more data points are better, even if the match does not count for that team.

Unfortunately, the team standings from the FIRST website only add up the auto, teleop and climb points of non-surrogate matches. This means that when I solve A x = b, the matrix A contains the surrogate matches while the vector b does not.

My proposal is to scale the values of b for the teams that have surrogate matches before solving A x = b. Does anybody have any other suggestions?
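A minimal sketch of what I understand the scaling to mean (my interpretation, with made-up numbers): multiply a team's standings total by (matches played, including surrogates) / (matches counted in the standings).

```python
def scale_standings_total(standings_total: float, played: int, counted: int) -> float:
    """Scale a 'Team Standings' component total up to cover surrogate
    matches that the standings exclude (one reading of Ed's proposal)."""
    return standings_total * played / counted

# A team whose standings show 240 TeleOp points over 8 counted matches,
# but who actually played 9 (one as a surrogate):
assert scale_standings_total(240.0, played=9, counted=8) == 270.0
```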

Ed Law 20-05-2013 01:42

Re: An improvement to OPR
 
Quote:

Originally Posted by Citrus Dad (Post 1275934)
I suggest going back and recomputing the OPRs.

Thank you for pointing out the issue with the sum of the individual category OPRs not adding up to the total OPR. I don't know exactly what you mean, though. You made it sound like I do the calculations by hand. I can ask the computer to run it 100 times and I guarantee I will get the same answer every time. :)

Ether 20-05-2013 11:56

Re: An improvement to OPR
 
Quote:

Originally Posted by Ed Law (Post 1276030)
My proposal is to scale the value of b for the teams that have the surrogate matches before solving A x = b. Does anybody have any other suggestion?

As a short-term solution that sounds like a reasonable approach to try to make the best out of the data that is available.

Going forward, perhaps someone who has Frank's ear and is interested in statistics could make an appeal to him to resolve the Twitter data issues. At the very least, store the data locally (at the event) and don't delete it until it has been archived at FIRST. Then make the data available to the community.



Citrus Dad 20-05-2013 13:45

Re: An improvement to OPR
 
Quote:

Originally Posted by efoote868 (Post 1275937)
A possible explanation is that Ed took into account surrogate matches.

I believe the method relies on the official score database, not on match-by-match reported scores. The surrogates don't show up there. He would have to use two different data sets to get different answers.

Ether 20-05-2013 14:25

Re: An improvement to OPR
 
Quote:

Originally Posted by Citrus Dad (Post 1276145)
I believe the method relies on the official score database, not on match by match reported scores. The surrogates don't show up there. He would have to use 2 different data sets to get different answers.

There are two different score datasets at USFIRST: "Match Results" and "Team Standings".

"Match Results" is necessary to construct the alliances matrix and obtain the total match score. It contains the surrogate matches.

"Team Standings" is necessary to obtain the Auto, TeleOp, and Climb alliance scoring. Problem is, the totals shown there do not include the scores for surrogate teams in matches where said teams played as surrogates.

Ed's proposed work-around to scale the "Team Standings" totals for teams which played as surrogates seems like a reasonable one. Do you have a different suggestion?



MikeE 20-05-2013 22:59

Re: An improvement to OPR
 
Quote:

Originally Posted by Ether (Post 1276151)
There are two different score datasets at USFIRST: "Match Results" and "Team Standings".

"Match Results" is necessary to construct the alliances matrix and obtain the total match score. It contains the surrogate matches.

"Team Standings" is necessary to obtain the Auto, TeleOp, and Climb alliance scoring. Problem is, the totals shown there do not include the scores for surrogate teams in matches where said teams played as surrogates.

Ed's proposed work-around to scale the "Team Standings" totals for teams which played as surrogates seems like a reasonable one. Do you have a different suggestion?



My preferred solution is for FIRST to move to an all district model with 12 matches per event and therefore no more surrogates :)

Until then...

If we have complete Twitter data for an event, then we get the component scores for every match, so we don't have an issue.

But to solve the surrogate problem we just need the component scores from the specific surrogate matches. There are at most 3 of these in any competition, and typically just 1 or 2 consecutive matches in round 3.
Since there is at most a single surrogate team in an alliance, we just need to add the Twitter component scores to that team's "Team Standing" score to get the corrected total scores for the surrogate team.

MikeE 20-05-2013 23:30

Re: An improvement to OPR
 
Quote:

Originally Posted by Citrus Dad (Post 1274148)
I haven't seen the SEs posted with the OPR parameters, but I can tell you that the SEs are likely to be VERY large for so few observations--only 8 per team at the Champs, at most 12 in any of the regionals.

Small point: The Pinetree regional in Maine had 13 matches per team in qualifications; one of the reasons it was the Best Regional* of the 2013 season.

Bigger point: I've been playing around with maximum likelihood estimate models as an alternative (really an extension) to OPR, and these do provide both a mean and variance of team contribution. It's not quite ready to write up as a white paper but it's giving some interesting early results from Monte Carlo event simulations.

One more point: I'm a fan of the binary matrix approach to solving the regression described by Ryan since it's easy to add in additional match-by-match features such as average (or per team) score gradient during an event.

* from my very small sample of 4 events

Ether 21-05-2013 00:33

Re: An improvement to OPR
 
Quote:

My preferred solution is for FIRST to move to an all district model with 12 matches per event and therefore no more surrogates
The criterion for "no surrogates" is M*6/T = N, where

M is the number of qual matches
T is the number of teams
N is a whole number (the number of matches played by each team)

At CMP, T=100 and M=134, so N was not a whole number; thus there were surrogates.

If instead T=96 and M=128, N would be a whole number (namely 8) and there would be no surrogates.
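The criterion is a one-liner to check (the function name here is mine):

```python
def matches_per_team(M: int, T: int) -> float:
    """Each qual match fields 6 teams, so each team plays M*6/T matches.
    Surrogates are needed exactly when this is not a whole number."""
    return M * 6 / T

assert matches_per_team(134, 100) == 8.04   # CMP division: not whole -> surrogates
assert matches_per_team(128, 96) == 8.0     # whole number -> no surrogates
```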


Quote:

Since there is a single surrogate team in an alliance we just need to add the Twitter component scores to their "Team Standing" score to get the corrected total scores for that surrogate team.
Here's the 2013 season Twitter data for elim and qual matches. It has Archi, Curie, Galileo, & Newton. The usual Twitter data caveats apply.


Quote:

I've been playing around with maximum likelihood estimate models as an alternative (really an extension) to OPR...
What do you mean by "maximum likelihood estimate models" in this context?


Quote:

One more point: I'm a fan of the binary matrix approach...
In this context, I'm assuming "the binary matrix" refers to the 2MxN design matrix [A] of the overdetermined system.

Do you then use QR factorization directly on the binary matrix to obtain the solution, or do you form the normal equations and use Cholesky?



MikeE 22-05-2013 15:48

Re: An improvement to OPR
 
Quote:

Originally Posted by Ether (Post 1276340)
The criterion for "no surrogates" is M*6/T = N, where

M is the number of qual matches
T is the number of teams
N is a whole number (the number of matches played by each team)

At CMP, T=100 and M=134, so N was not a whole number; thus there were surrogates.

If instead T=96 and M=128, N would be a whole number (namely 8) and there would be no surrogates.

I prefer to think of the surrogate issue in terms of
(T*N) mod 6
i.e. how many teams are left over if everyone plays a certain number of rounds.

If there are any teams left over then we need at least one surrogate match. The scheduling software also adds the reasonable constraint that there can only be one surrogate team per alliance. Putting this all together there are between 0 and 3 matches with surrogates in qualification rounds.

Clearly if either T or N are multiples of 6 then the remainder is zero so no surrogates.

Choosing N=12 guarantees no surrogates however many teams are at the event, gives plenty of matches for each team and also has the nice property that M=2*T so it's easy to estimate the schedule impact. I'm sure the designers of FiM and MAR settled on 12 matches per event through similar reasoning.
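The (T*N) mod 6 view in code (names are mine):

```python
def leftover_teams(T: int, N: int) -> int:
    """Teams left over if all T teams play N matches: (T*N) mod 6.
    Nonzero means at least one surrogate match is needed."""
    return (T * N) % 6

assert leftover_teams(100, 8) == 2    # CMP division: surrogates needed
assert leftover_teams(96, 8) == 0     # T a multiple of 6 -> none
assert leftover_teams(40, 12) == 0    # N = 12 always works: 12 is a multiple of 6
```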

Quote:

Originally Posted by Ether (Post 1276340)
What do you mean by "maximum likelihood estimate models" in this context?

(I'll try to keep this accessible to a wider audience but we can go into further details later.)

OPR estimates a single parameter model for each team, i.e. what is the optimal solution if we model a team's contribution to each match as a constant. We can also use regression (or other optimization techniques) to build richer models. For example we can model each team with two parameters: a constant contribution per match similar to OPR, plus a term which models a team's improvement per round.

But these types of models are deterministic. In other words, if we use the model to predict the outcome of a hypothetical match we will always get the same answer. That means we can't use a class of useful simulation methods to get deeper insight into how a collection of matches might play out.

Here's an alternative approach.
Instead of modeling a team's score as a constant (or polynomial function of known features), we treat each team's score as if it is generated from an underlying statistical distribution. Now the problem becomes one of estimating (or assuming) the type of distribution and also estimating the parameters of that distribution for each team.

With OPR we model team X as scoring say 15.3 points every match, so our prediction for a hypothetical match is always 15.3 points.
With a statistical model we would model team X as something like 15.3 +/- 6.3 points. To predict the score for a hypothetical match we choose randomly from the appropriate distribution, and this will obviously be different each time we "play" the hypothetical match.

So with OPR if we "play" a hypothetical match 100 times where OPR(red) > OPR(blue), the final score would be the same every time so red will always win. But if we use a statistical model then red should still win most matches but blue will also win some of the time. Now we have an estimate of the probability of victory for red, which is potentially more useful information than "red wins", and can be used in further reasoning.

MLE is just an approach for getting the parameters from match data. For simplicity I assume a Gaussian distribution, use linear regression as an initial estimate of each team's mean and linear regression on the squared residuals as an initial estimate of each team's variance.
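Here's a toy Monte Carlo along those lines (my own sketch with invented team parameters, assuming independent Gaussian scores per team): "play" the match many times and estimate red's win probability instead of a fixed outcome.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-team (mean, std) models, as estimated by the MLE step.
red = [(15.3, 6.3), (22.0, 4.0), (9.5, 5.0)]     # mean contributions sum to 46.8
blue = [(14.0, 3.0), (18.5, 7.0), (11.0, 4.5)]   # mean contributions sum to 43.5

def alliance_score(teams):
    """Draw one match's score for an alliance from the per-team Gaussians."""
    return sum(rng.normal(mu, sd) for mu, sd in teams)

trials = 10_000
red_wins = sum(alliance_score(red) > alliance_score(blue) for _ in range(trials))
p_red = red_wins / trials   # estimated probability red wins, not "red wins"
```

With these made-up numbers red has the higher mean but blue still wins a sizeable fraction of the simulated matches, which is the extra information a deterministic OPR prediction can't give you.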

Quote:

Originally Posted by Ether (Post 1276340)
In this context, I'm assuming "the binary matrix" refers to the 2MxN design matrix [A] of the overdetermined system.

Do you then use QR factorization directly on the binary matrix to obtain the solution, or do you form the normal equations and use Cholesky?

Yes, I mean the design matrix.

I've implemented many numerical algorithms over the years and the main lesson it taught me is not to write them yourself unless absolutely necessary!
So for linear regression I solve the normal equation using Octave (similar to MATLAB). I don't see any meaningful difference between my results and other published sources on CD.

IKE 22-05-2013 16:57

Re: An improvement to OPR
 
Quote:

Originally Posted by Citrus Dad (Post 1274581)
I agree that the OPR is better than nothing--it still had a 0.91 correlation with our offensive scouting data. However, it tends to miss the outlier teams that may be hard to pick out otherwise. Our OPR was 27 points less than our actual offensive average--more than 50% off. But if your team has sufficient resources to calculate the OPR on the fly, then you probably have enough to do full scouting. On the other hand, you might be relying on the OPR calculated by one of the apps tracking the competition, in which case that's the best you have.


Outliers happen. They tend to be worse if teams have a lot of variation, and the fewer the samples, the worse it is...

Being more than 50% off is rarer, but not unheard of. I once saw an OPR of 15 assigned to a team that had only competed in 2/8 matches. And they didn't score 60 points in those two matches...

Another killer of OPR is when a team that reliably does very well has a bad match with your team. For instance, in Archimedes, 469 ended up with an OPR over 80 points. They had a match where their shooter had an issue right from the start, and I believe they only scored climb points. Unfortunately for their partners, the OPR calculations will likely penalize those other teams.

For these reasons, and many others, it is very important to scout.

OPR does, however, accurately show that some teams are worth less than they score on average. Yep, you heard me right: there are many teams that are worth less than their average score. This is especially true of slow climbers that used the middle shooting position to start their climb. While they would frequently get their 30 points, they would often cost the alliance many missed shots from cyclers that were 75%+ from that position but under 50% from the outside shooting positions. Yes, the climber did score 30 points, but if the other two partners each usually put up 30 disc points and only got 20, that's a net -20 points from them. If this occurs on a regular basis, some of it will get attributed to the climbing team. This will also explain an imbalance if you sum the auton, disc, and climbing OPRs.

Basel A 22-05-2013 17:58

Re: An improvement to OPR
 
Quote:

Originally Posted by IKE (Post 1276698)
OPR does, however, accurately show that some teams are worth less than they score on average. Yep, you heard me right: there are many teams that are worth less than their average score. This is especially true of slow climbers that used the middle shooting position to start their climb. While they would frequently get their 30 points, they would often cost the alliance many missed shots from cyclers that were 75%+ from that position but under 50% from the outside shooting positions. Yes, the climber did score 30 points, but if the other two partners each usually put up 30 disc points and only got 20, that's a net -20 points from them. If this occurs on a regular basis, some of it will get attributed to the climbing team. This will also explain an imbalance if you sum the auton, disc, and climbing OPRs.

This also explains the potential for negative OPR in a category. Such a team might have a negative teleop OPR.

IKE 22-05-2013 18:37

Re: An improvement to OPR
 
Quote:

Originally Posted by Basel A (Post 1276709)
This also explains the potential for negative OPR in a category. Such a team might have a negative teleop OPR.

Correct. Negative OPRs are much rarer now that penalties get added to the other alliance's score, but they do occur (sometimes for just reasons, sometimes not).

The phenomenon I talked about above is similar to what can occur with the +/- system in basketball. Sometimes a superstar doesn't score a lot of points due to getting double-teamed but his open teammates then score a bunch of points. If you only look at stats, it doesn't tell the whole story.

FRC 33 uses OPR to figure out schedule strength and to double-check some of our stats data. Ultimately I trust the stats more than I do OPR, but especially this year, I found a handful of errors in our scouting team's data.

Like Citrus Dad, I generally found OPR to be within 15% of a team's average contribution. However, there would be several teams with large deltas. Often this was due to a team not working for a while and then hitting a whole bunch of points. FCS teams would also create havoc in OPR: they would have an 80-point match, then a 20-point match, then an 80, then a 20... OPR math depends on a team being reasonably consistent. This behaviour will make it either dramatically over-predict or under-predict...

Ether 22-05-2013 19:17

Re: An improvement to OPR
 
1 Attachment(s)
Quote:

Originally Posted by MikeE (Post 1276688)
for linear regression I solve the normal equation using Octave


Octave uses a polymorphic solver that selects an appropriate matrix factorization depending on the properties of the matrix.

If the matrix is Hermitian with a real positive diagonal, the polymorphic solver will attempt Cholesky factorization.

Since the normal matrix N = A^T A satisfies this condition, Cholesky factorization will be used.
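For the curious, here's what that amounts to, sketched in NumPy rather than Octave (made-up schedule; assumes A has full column rank so A^T A is positive definite): form the normal equations, Cholesky-factor N = L L^T, do two triangular solves, and get the same answer as a direct least-squares solve on A.

```python
import numpy as np

rng = np.random.default_rng(4)
n_teams, n_alliances = 12, 40

# Binary alliance matrix: each row marks the 3 teams in one alliance.
A = np.zeros((n_alliances, n_teams))
for row in A:
    row[rng.choice(n_teams, size=3, replace=False)] = 1
S = rng.uniform(0, 60, n_alliances)

# Normal equations: N x = d, with N = A^T A (symmetric positive definite
# when A has full column rank) and d = A^T S.
N = A.T @ A
d = A.T @ S

# Cholesky factorization N = L L^T, then two triangular solves.
L = np.linalg.cholesky(N)
x_chol = np.linalg.solve(L.T, np.linalg.solve(L, d))

# Same answer as a direct least-squares solve on A.
x_lstsq = np.linalg.lstsq(A, S, rcond=None)[0]
assert np.allclose(x_chol, x_lstsq)
```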


Quote:

MLE is just an approach for getting the parameters from match data. For simplicity I assume a Gaussian distribution, use linear regression as an initial estimate of each team's mean and linear regression on the squared residuals as an initial estimate of each team's variance.
The solution of the normal equations is a maximum likelihood estimator only if the data follows a normal distribution. I was wondering what the theoretical basis was for assuming a normal distribution.



MikeE 22-05-2013 22:59

Re: An improvement to OPR
 
Thanks for the information Ether. It spurred me to check the details in the Octave documentation

Quote:

Originally Posted by Ether (Post 1276717)
I was wondering what was the theoretical basis for assuming a normal distribution.

There is no theoretical or empirical* basis for assuming a normal distribution; it's just a matter of convenience and convention. For the purposes of estimating the mean, minimizing the squared error will give the right result for any non-skewed underlying distribution.

Unfortunately I don't have access to reliable per robot score data otherwise we could establish how well a Gaussian distribution models typical robot performance. (I did check my team's scouting data but it varied too far from the official scores to rely on.) If anyone would like to share scouting data from this season I'd be very interested.

In my professional life I work on big statistical modeling problems and we still usually base the models on Gaussians due to their computational ease, albeit as Gaussian Mixture Models to approximate any probability density function.

* In fact we know for certain that a pure climber can only score discrete values of 0, 10, 20 or 30 points.

Ether 23-05-2013 00:10

Re: An improvement to OPR
 
1 Attachment(s)
Quote:

Originally Posted by MikeE (Post 1276757)
In my professional life I work on big statistical modeling problems and we still usually base the models on Gaussians due to their computational ease...

Yes, computational ease... and speed.

Speaking of speed, attached is a zip file containing a test case of N and d for the normal equations Nx=d.

Would you please solve it for x using Octave and tell me how long it takes? (Don't include the time it takes to read the large N matrix from the disk, just the computation time).


PS:

N and d were created from the official qual Match Results posted by FIRST for 75 regional and district events plus MAR, MSC, Archi, Curie, Galileo, & Newton. So solving for x is solving for World OPR.




Ed Law 02-07-2013 23:56

Re: An improvement to OPR
 
Quote:

Originally Posted by Ether (Post 1276104)
As a short-term solution that sounds like a reasonable approach to try to make the best out of the data that is available.

Going forward, perhaps someone who has Frank's ear and is interested in statistics could make an appeal to him to resolve the Twitter data issues. At the very least, store the data locally (at the event) and don't delete it until it has been archived at FIRST. Then make the data available to the community.



Ether, I changed my mind about scaling the numbers in the vector b for teams that have a surrogate match before solving Ax=b. I now propose to scale x(auto), x(tele) and x(climb) for each team proportionally so they add up to the overall OPR.

We can test it afterwards by calculating b and seeing how close it is to the missing subscores of the surrogate matches.
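A sketch of the proportional rescaling described above (my own code with made-up numbers; note it would misbehave if a team's components summed to zero, or if a component OPR were negative, which the thread shows can happen):

```python
def rescale_components(auto: float, tele: float, climb: float, total: float):
    """Scale the component OPRs proportionally so they sum to the
    separately computed total OPR (one reading of Ed's revised proposal)."""
    s = auto + tele + climb
    factor = total / s
    return auto * factor, tele * factor, climb * factor

# Components summing to 50 rescaled to match a total OPR of 55:
a, t, c = rescale_components(8.0, 30.0, 12.0, 55.0)
assert abs((a + t + c) - 55.0) < 1e-9
```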

Citrus Dad 28-08-2013 01:29

Re: An improvement to OPR
 
Quote:

Originally Posted by Ed Law (Post 1276032)
Thank you for pointing out the issue with the sum of the individual category OPRs not adding up to the total OPR. I don't know exactly what you mean, though. You made it sound like I do the calculations by hand. I can ask the computer to run it 100 times and I guarantee I will get the same answer every time. :)

I see your explanation about the inclusion of the surrogate match scores. I think one check would be to see whether the deviations of the total OPR vs. the sum of the individual components are larger when more surrogate matches are included. You may be able to derive a correction factor based on the number of surrogates. (I estimated the average foul scores with a correction factor against our scouting data.)



Copyright © Chief Delphi