#1 | 06-06-2015, 08:52
wgardner, Coach, Charlottesville, VA
Overview and Analysis of FIRST Stats

Another thread for stat nerds. Again, if you don't know or don't care about OPR and CCWM and how they're calculated, this thread will probably not interest you.
----------------------------------------------------------------

Based on the recent thread on stats, I did way too much study and simulation of what the different stats do in different situations. I've attached a very long paper with all of the findings.

Many thanks to Ether who helped a lot with behind-the-scenes comments and data generation. I think he's off working on some related ideas that I suspect we'll all hear about soon.

The Overview and Conclusions of the paper are included below.

-----------------------------------------------------------------

Overview:

This paper presents and analyzes a wide range of statistical techniques that can be applied to FIRST Robotics Competition (FRC) and FIRST Tech Challenge (FTC) tournaments to rate the performance of teams and robots competing in the tournament.

The well-known Offensive Power Rating (OPR), Calculated Contribution to Winning Margin (CCWM), and Defensive Power Rating (DPR) measures are discussed and analyzed.

New measures which incorporate knowledge of the opposing alliance members are discussed and analyzed. These include the Winning Margin Power Rating (WMPR), the Combined Power Rating (CPR), and the mixture-based Ether Power Rating (EPR).

New methods are introduced to simultaneously estimate separate offensive and defensive contributions of teams. These methods lead to new, related simultaneous metrics called sOPR, sDPR, sWMPR, and sCPR.

New minimum mean squared error (MMSE) estimation techniques are introduced. MMSE techniques reduce the overfitting that occurs when Least Squares (LS) parameter estimation is used on a relatively small data set. The performance of LS and MMSE techniques is compared over a range of scenarios.
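For anyone who wants to experiment, here's a minimal sketch of the two flavors of estimator (my own notation and variable names, not code from the paper): A is the usual 0/1 alliance matrix with one row per alliance score and one column per team, and M is the vector of alliance scores. The MMSE version shrinks each team toward the average per-team contribution, with the shrinkage controlled by the assumed Var(N)/Var(O) ratio.

Code:
import numpy as np

def ls_opr(A, M):
    # Classic Least Squares OPR: solve A x = M in the least-squares sense.
    # lstsq tolerates the rank-deficient case early in a tournament.
    return np.linalg.lstsq(A, M, rcond=None)[0]

def mmse_opr(A, M, var_n_over_var_o):
    # MMSE OPR under the simple prior: each team's offense is i.i.d.
    # with mean Oave and variance Var(O); match noise has variance Var(N).
    # Exists even before A'A becomes invertible.
    n_teams = A.shape[1]
    o_ave = M.mean() / A.sum(axis=1).mean()  # average per-team contribution
    lam = var_n_over_var_o                   # shrinkage weight = Var(N)/Var(O)
    lhs = A.T @ A + lam * np.eye(n_teams)
    rhs = A.T @ (M - A @ np.full(n_teams, o_ave))
    return o_ave + np.linalg.solve(lhs, rhs)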

All of the techniques are analyzed over a wide range of simulated and actual FRC tournament data, using results from the 2013, 2014, and 2015 FRC seasons.

-----------------------------------------------------------------

Conclusions

New improved techniques for incorporating defense into FRC and FTC tournament statistics have been introduced.

New MMSE techniques for estimating model parameters have been introduced.

Most FRC tournaments suffer from small data sizes, causing Least Squares estimates to overfit the noisy tournament data, which degrades their performance in predicting match outcomes outside the training set.

MMSE techniques appear to provide limited but significant and consistent improvements in match score and winning margin prediction compared to similar Least Squares techniques.

Incorporating defense into the statistics using MMSE estimation does not decrease prediction performance, but the advantage of doing so is usually quite small and may not be worth the effort unless a given FRC season is expected to have a substantial defensive component. Occasionally, incorporating defense yields around an 8-12% further reduction in winning margin prediction error (e.g., the 2014 casb, 2015 incmp, and 2015 micmp tournaments), but this is rare.

MMSE based estimation of the sOPR, sDPR, and sCPR parameters results in the smallest squared prediction error for match scores and match winning margins across all of the studied parameters. MMSE based estimation of OPR parameters often produces results that are quite close.

Least Squares estimates of OPR, CCWM, and DPR using FRC tournament data probably overestimate the relative differences in ability of the teams. MMSE estimates probably underestimate the relative differences.

The small amount of data created in FRC tournaments results in noisy estimates of the statistics. Testing-set match outcomes from 2013-2015 often had significant random components that could not be predicted by even the best linear prediction methods, most likely due to genuinely random events that occur in FRC matches.
Attached: OverviewandAnalysisofFIRSTStatsV4.pdf (1.34 MB)
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.
#2 | 06-06-2015, 13:07
nuclearnerd (Brendan Simons), FRC #5406 (Celt-X), Engineer
Re: Overview and Analysis of FIRST Stats

Thanks William. That's a super useful paper. My takeaway is that while stats that incorporate defence might help a little, good old (LS based) OPR seems to be just about as good in most cases.

I have another area of study you could look into. I like to use OPR during tournaments for two purposes:

1) To derive a strength of schedule (basically the predicted winning margin). This is useful for deciding where best to spend limited resources on strategy planning and alliance-mate repairs.

2) As a first-order sort for pick list order.

Both of these use cases suffer from an even more drastic lack of data points. For the SOS, I use previous tournament data, and for the pick list, we've usually only completed 70% of the matches. For these use cases, it would be valuable to know:

For 1), which predictor does best when the training set and the testing set are from different tournaments.

For 2), which predictor does best with 70% or less of the matches in the training set. (Maybe this is where MMSE solutions will shine.)

Lastly, I'm intrigued by the possibility of incorporating scouting data (such as individual team point counts) into the MMSE calculation as a way to verify scouting records, but I have to think about that more.

If you can respond to any of these questions, I'd be obliged, otherwise thanks again for sharing your work!
#3 | 06-06-2015, 13:26
AGPapa (Antonio Papa), FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

This is a really well written paper, thanks for putting it together!

I have some questions about how to choose VarD/VarO and VarN/VarO since I'm unfamiliar with MMSE estimation. How would you go about choosing these values during/before an event?

Quote:
Originally Posted by Page 31
The (VarD, VarN) numbers show the values of VarD/VarO and VarN/VarO in the MMSE search that produced the best predicted outcome on the Testing data.
Does this method lead to the same overfitting that using the training data as the testing data did with the LS estimators? Choosing the a priori variances after the fact to get the best results seems wrong, or is the effect actually too small in reality to be a factor? It seems like each set of training data also needs to find what variances work best, and then apply them to the testing data, instead of "searching" for the best values and applying them after the fact.


Quote:
Originally Posted by Page 44
// pick your value relative to sig2O, or search a range.
// 0.02 means you expect defense to be 2% of offense.
From this I'd expect the values of VarD/VarO to be largely dependent on the game, yet the data shows that the "best" values depend very little on the game. For example, in the 2014 Newton Division the best value of VarD/VarO for sCPR was 0.10, but for 2014 Galileo it was 0.00, the complete other side of the search range! How can two divisions in the same year have such different values?
__________________
Team 2590 Student [2011-2014]
Team 5684 Mentor [2015]
Team 5895 Mentor [2016-]
#4 | 06-06-2015, 13:42
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Does this method lead to the same overfitting that using the training data as the testing data did with the LS estimators? ... How can two divisions in the same year have such different values?
Yes, these are fair points. The search is only picking 1 or 2 values for an entire tournament, but yes, it is searching. Ideally, I might suggest searching over an entire season (where the game is the same) to find the 1 or 2 values that work best across all of that season's tournaments. That would also show how things differ from year to year.

The values do vary from tournament to tournament because, again, there just isn't enough data in a single tournament to settle on a "true" underlying value.
#5 | 06-06-2015, 13:51
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by nuclearnerd View Post
1) To derive a strength of schedule (basically the predicted winning margin). This is useful for deciding where best to spend limited resources on strategy planning and alliance-mate repairs.

2) As a first-order sort for pick list order.

Both of these use cases suffer from an even more drastic lack of data points. For the SOS, I use previous tournament data, and for the pick list, we've usually only completed 70% of the matches. For these use cases, it would be valuable to know:

For 1), which predictor does best when the training set and the testing set are from different tournaments.

For 2), which predictor does best with 70% or less of the matches in the training set. (Maybe this is where MMSE solutions will shine.)
This whole thing came about because I wrote an app for FTC tournaments that gives live estimates of stats and predicted match outcomes. The app predicts future match results based on the stats so teams can know which matches are likely to be hard vs. easy. It also does this for sub-parts of the game (e.g., 2 matches from now, our opponents are really good at autonomous but not at the end game, so maybe we should play autonomous defense [in FTC], etc.).

OPR doesn't even exist when the # of matches played is less than the number of teams/2, and then suddenly it exists but is noisy, and then it progressively gets less noisy as more matches are played. So I was looking for a way to show stats, and it seemed like the stats should slowly incorporate information as matches are played.

The app currently predicts match scores and winning margin, but I'd also like to incorporate a "probability of victory" measure to show what kind of confidence exists.
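In case it helps anyone building something similar: if you trust the predicted winning margin and have an estimate of the residual variance of that prediction, a Gaussian assumption turns the margin into a probability. A quick sketch (my own, not what's in the app; margin_std would have to be estimated, e.g., from past prediction residuals):

Code:
import math

def win_probability(predicted_margin, margin_std):
    # P(win) under a normal model of the margin-prediction residual.
    # predicted_margin: predicted own score minus opponent score.
    z = predicted_margin / (margin_std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))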

The MMSE approach allows for estimated stats regardless of how many matches have been played. I'll try to run some sims with 0-100% of matches played to see how well things work over time.

It also occurred to me to predict the match outcomes for the simulated tournaments where the underlying stats are completely known, just to see how good match prediction could be with perfect knowledge of the underlying parameters.

Another thing I could do is simulate how alliances picked from the estimated stats compare with alliances picked from the true underlying stats. For example, if the top 3 teams are picked based on the various estimates (LS OPR, MMSE OPR, MMSE sCPR, etc.) and compared with the top 3 teams by actual underlying O and D in the simulated tournaments, how many fewer points does the estimate-based alliance score on average? This might be the real question that folks want answered... Gotta run now: more later.
#6 | 06-06-2015, 18:05
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
From this I'd expect that the values for VarD/VarO to be largely dependent on the game, yet the data shows that the "best" values depend very little on the game. For example, in the 2014 Newton Division the best values for VarD/VarO for sCPR was 0.10, but for 2014 Galileo it was 0.00! The complete other side of the search range! How can two divisions in the same year have such different values?
One more quick thought on this, with more to come later:

Both 2014 galileo and newton sCPR searches picked VarN/VarO=3.

If VarD/VarO = 0, then the total match variance would be 3*VarO + VarN = 6*VarO.

If on the other hand VarD/VarO = 0.1, then the total match variance would be 3*VarO + 3*VarD + VarN = 6.3*VarO.

So while this looks like a really different result, we're only talking about a change of about 5% in the overall match variance between VarD/VarO = 0.0 and 0.1. Instead, it might be helpful to increase the step size in the VarN/VarO search, which is currently 1 (!), so each step in that search could cause a much greater change in the match variance.
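To make the arithmetic explicit (a trivial check, with everything in units of VarO):

Code:
var_o = 1.0                  # everything relative to VarO
var_n = 3.0 * var_o          # the VarN/VarO = 3 picked by both searches
for var_d in (0.0, 0.1):
    total = 3 * var_o + 3 * var_d + var_n
    print(var_d, total, 3 * var_d / total)
# VarD = 0.0 -> total 6.0, defense share 0.000
# VarD = 0.1 -> total 6.3, defense share ~0.048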
#7 | 06-06-2015, 20:17
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

The attached png file shows some interesting data.

This is for the 2014 casa tournament with simulated data using the model from the paper, with Var(O)=100 (or stdDev(O)=10), Var(D)=0, and Var(N)=3*Var(O). The tournament had 54 teams, so each team played once every 9 matches; with 108 total matches, every team played 12 matches.

Each of the first 4 plots shows the estimated OPRs vs. the number of matches played per team (so X=1 means 9 total matches, X=2 means 18 total matches, etc.). The data points from 1-12 on the X axis correspond to 1 match per team, ... up to 12 matches per team (the whole tournament). The 13th point on the X axis is the actual underlying O values.

Plot 1 corresponds to the traditional Least Squares (LS) OPRs, which is also the MMSE solution where Var(N) is estimated to be equal to 0. Note that there are no OPR values until each team has played 4 matches, as that's the number of matches needed to make the matrix invertible.

Plot 2 corresponds to the MMSE OPR estimates where Var(N) is estimated to be equal to 1*Var(O). As the actual Var(N)=3*Var(O), this is underestimating the noise in each match.

Plot 3 corresponds to the MMSE OPR estimates where Var(N) is estimated to be equal to 3*Var(O) (the "correct" value).

Plot 4 corresponds to the MMSE OPR estimates where Var(N) is estimated to be equal to 10*Var(O), greater than the actual noise.

Plot 5 shows the percentage error each curve has in estimating the actual underlying O values.

Comments:

The LS OPR values start out crazy and then settle down a bit. Looking at the step from X=12 (the final OPRs) to X=13 (the "real" O values), you can see that the final OPRs have more variance than the real O values. This means that the final OPRs are still overestimating the variance of the abilities of the teams.

Look at the X=1 points for Plots 2-4. The MMSE estimates start conservatively with the OPRs bunched around the mean and then progressively expand out. Plot 4 shows the noise overestimated (the most conservative estimate), so the OPRs start out very tightly bunched and stay that way. Plot 2 starts out wider, and Plot 3 starts out in the middle.

Interestingly, at the X=1 points of the MMSE plots you can see groups of 3 teams with identical estimates. This makes sense: after 1 match, the 3 teams on an alliance are indistinguishable from each other, and it takes more than 1 match per team to start separating them.

Look at the X=12 (the final estimates) vs X=13 (the "real" O values) points for Plots 2-4. Plot 2 looks like it's still overestimating the variance, Plot 3 has it about right, and Plot 4 has underestimated the true variance even at the end of the tournament (you see the Plot 4 OPRs expand out from X=12 to X=13). [Edit: checking the numbers for the run shown, the variances of the OPRs computed by LS, MMSE 1, MMSE 3, and MMSE 10 were respectively 164, 138, 102, and 47, confirming the above comment. The MMSE 3 solution, using the "right" Var(N) estimate, is quite close to the true underlying variance of 100. Over multiple runs, the MMSE 3 solution is slightly biased under 100 on average, showing that more matches are needed for it to converge to the "right" variance. All of the techniques do eventually converge to the right solution and variance if the tournament is simulated with far more than 108 matches.]

In Plot 5, the performances of the different techniques get close to each other as the tournament nears completion. They should all converge as the number of matches grows large, since the LS and MMSE solutions eventually converge to each other. But they are off by quite a bit early on. Even though the MMSE 1 solution underestimates Var(N) at 1*Var(O), it still gives pretty good results.
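For anyone who wants to reproduce this kind of experiment, here's a rough sketch of the simulation loop (my own reconstruction, not the exact code behind the plots; mmse_opr is the shrinkage estimator sketched back in the first post, and the mean offense of 50 is arbitrary):

Code:
import numpy as np

rng = np.random.default_rng(0)
n_teams, rounds = 54, 12
o_true = rng.normal(50.0, 10.0, n_teams)       # Var(O) = 100
sig_n = np.sqrt(3.0 * 100.0)                   # Var(N) = 3*Var(O)

# Schedule: each round, shuffle all 54 teams into 9 matches of 3-vs-3.
A = np.zeros((2 * 9 * rounds, n_teams))
for r in range(rounds):
    order = rng.permutation(n_teams)
    for m in range(9):
        six = order[6 * m: 6 * m + 6]
        A[2 * (9 * r + m), six[:3]] = 1.0      # red alliance row
        A[2 * (9 * r + m) + 1, six[3:]] = 1.0  # blue alliance row
M = A @ o_true + rng.normal(0.0, sig_n, A.shape[0])

for played in range(1, rounds + 1):            # matches played per team
    rows = 2 * 9 * played
    est = mmse_opr(A[:rows], M[:rows], 3.0)    # assumed Var(N)/Var(O) = 3
    pct_left = 100 * np.mean((est - o_true) ** 2) / np.var(o_true)
    print(played, round(pct_left, 1))          # % of O variance unexplained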
Attached: OprComparison.PNG
#8 | 07-06-2015, 00:45
nuclearnerd (Brendan Simons), FRC #5406 (Celt-X), Engineer
Re: Overview and Analysis of FIRST Stats

Super cool. It definitely looks like MMSE OPR gives better early predictions, at least in that data set. Now I need to write an app to give us live calculations during an event...

Did you get a chance to look at using training data from previous events? For instance, do MMSE OPR values from the last regional accurately predict a team's performance at CMP? What should we use for the MMSE parameter estimates in that case?
#9 | 07-06-2015, 07:23
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by nuclearnerd View Post
Did you get a chance to look at using training data from previous events? For instance, do MMSE OPR values from the last regional accurately predict a team's performance at CMP? What should we use for the MMSE parameter estimates in that case?
This is a great question but will take me a lot more time to answer. I'd have to figure out which teams at CMP played in which regionals, compute all of the parameters for all of those regionals, and then do the study. I'll put this on my todo list, but it would take a while even if I made it my top priority.
#10 | 07-06-2015, 09:57
AGPapa (Antonio Papa), FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by wgardner View Post
In Plot 5, the performances of the different techniques get close to each other as the tournament nears completion. They should all converge as the number of matches grows large, since the LS and MMSE solutions eventually converge to each other. But they are off by quite a bit early on. Even though the MMSE 1 solution underestimates Var(N) at 1*Var(O), it still gives pretty good results.
Thanks Will! The fact that even choosing a bad value for Var(N) gives decent results alleviates a lot of my concerns about searching for Var(N) and Var(D) after the fact.

It's also impressive how much better the MMSE techniques do when an event is underway and not a lot of matches have been played. This is helpful for predicting the outcome of the second day of matches (and thus seeing who is likely to be an alliance captain). Is this behavior typical for all stats or just OPR? Could you run a similar plot for sCPR, since that stat seems to do slightly better than OPR?

Additionally, how would you implement the techniques described in the “Advanced MMSE Estimation” section? What would you change in the pseudocode to, for instance, change a team's a priori Oi?
#11 | 07-06-2015, 13:05
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Thanks Will! The fact that even choosing a bad value for Var(N) gives decent results alleviates a lot of my concerns about searching for Var(N) and Var(D) after the fact.

It's also impressive how much better the MMSE techniques do when an event is underway and not a lot of matches have been played. This is helpful for predicting the outcome of the second day of matches (and thus seeing who is likely to be an alliance captain). Is this behavior typical for all stats or just OPR? Could you run a similar plot for sCPR, since that stat seems to do slightly better than OPR?
Plots are attached for 2 runs: one with the true Var(D)=0.1 and Var(N)=3, and one with Var(D)=0.0 and Var(N)=3 (both relative to Var(O)).

The top row uses an estimated Var(D)=0 and estimated Var(N)=0, 1, 3, and 10 as before. So the top left corner is the regular LS OPR and the rest of the top row is MMSE OPRs.

The middle row is the same but with Var(D) estimated at 0.1, and the bottom row is the same with Var(D) estimated at 0.2.

Note that with Var(D)>0 and Var(N)=0, the results are always the same: the "vanilla" LS sCPR. That's shown in the middle left. Even worse than OPR, it doesn't have values until the rank of the matrix reaches 2*#teams-1, which happens at the 7th match per team. It is VERY overfit, which is why it starts noisy and stays noisy.

The bottom left shows the percentage of the combined O+D variance (or just O when D=0) left after prediction. It's saturated at 100%, though the LS OPR and sCPR are actually worse than 100% when the number of matches is small, meaning the prediction error has more variance than the original set of parameters (!). The black curve is the LS OPR, and the red curve is the LS sCPR, which is so overfit that it's worse than nothing until the very last match is played.

Quote:
Additionally, how would you implement the techniques described in the “Advanced MMSE Estimation” section? What would you change in the pseudocode to, for instance, change a team's a priori Oi?
The regular MMSE equation is like the equation shown in Appendix B for EPR, except that the (1/sig2C)*I term there becomes the inverse of the covariance matrix of the expected parameters. If the expected means change, you don't add Oave at the end but rather a vector of your expected means. If the expected variances of the noise or the parameters change, then you change the corresponding covariance matrices.

Basically, there's a general equation for arbitrary expected mean vectors and covariance matrices of both the parameters and the noise, so you can run the estimation algorithm given any set of expected mean vectors and covariance matrices.
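As a concrete sketch of that general equation (my own code and made-up numbers, not the app's): changing a team's a priori Oi just means editing the mean vector, and changing how much you trust that prior means editing the prior covariance.

Code:
import numpy as np

def mmse_general(A, M, x_bar, Cx, Cz):
    # General linear MMSE estimate:
    # xhat = xbar + Cx A^T (A Cx A^T + Cz)^-1 (M - A xbar)
    S = A @ Cx @ A.T + Cz
    return x_bar + Cx @ A.T @ np.linalg.solve(S, M - A @ x_bar)

# The simple model has x_bar = Oave everywhere, Cx = sig2o*I, Cz = sig2n*I.
# To say a priori that team 0 should be ~20 points above average, and that
# we're fairly confident about it (both numbers hypothetical):
n_teams, sig2o, sig2n, o_ave = 24, 100.0, 300.0, 50.0
x_bar = np.full(n_teams, o_ave)
x_bar[0] += 20.0              # team-specific a priori Oi
Cx = sig2o * np.eye(n_teams)
Cx[0, 0] = 25.0               # tighter a priori variance for team 0
# then: mmse_general(A, M, x_bar, Cx, sig2n * np.eye(A.shape[0]))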
Attached: Opr_sCprComparisonWithD=0.0AndN=3.0.PNG, Opr_sCprComparisonWithD=0.1AndN=3.0.PNG
#12 | 07-06-2015, 15:07
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Additionally, how would you implement the techniques described in the “Advanced MMSE Estimation” section? What would you change in the pseudocode to, for instance, change a team's a priori Oi?
Following up on this, the attached image is snipped from the Wikipedia page on MMSE estimation, about halfway down.

In this equation, x is the parameter you're trying to estimate (like O), z is the noise (like N), xhat is your estimated parameters, xbar is the expected mean, Cx is the covariance matrix of x, and Cz is the covariance matrix of the noise.
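For anyone reading without the image, the equation (assuming I snipped the standard linear MMSE form, with zero-mean noise) is:

xhat = xbar + Cx A^T (A Cx A^T + Cz)^-1 (y - A xbar)

where y is the vector of observed scores.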

For example, in my MMSE equation for the OPRs, xbar is just Oave (but you could make it a vector with team-specific expectations). A*xbar is just the average match outcome, which I have as 3*Oave (but again, team-specific xbar values would give non-constant mean match scores). Cz is just sig2n*I and Cx is just sig2o*I. I plugged these in and simplified the equations. But if things are more complicated (like with EPR), then you just plug in whatever complicated assumptions you have and go from there.

It would be neat if we could study the best predictor of a team's OPR at championships from the OPR they had in their last regional before championships using data from previous years. We'd probably come up with a mean and variance of this best predictor, and then we could plug these in and have some expectations for what championships would look like even before the first match was played. Then as matches are played, the values update to include the new information using the MMSE equation with changing A and Mo.
Attached: MmseEqns.PNG
#13 | 09-06-2015, 21:20
Ether, systems engineer (retired)
Re: Overview and Analysis of FIRST Stats


The dust seems to have settled in this thread so I thought I'd toss this out for discussion.
Attached: a priori data adjustment.zip (78.4 KB)
#14 | 10-06-2015, 06:19
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by Ether View Post

The dust seems to have settled in this thread so I thought I'd toss this out for discussion.
Neat idea!

I think this is essentially minimizing another "mixture of errors", like the original EPR proposal. In this case, you're minimizing the sum of the standard error measure that leads to the normal parameters plus the squared error between the parameters and your a priori expectations of them, with a weighting factor to adjust the relative importance of the two measures. If you form your error measure this way, take the derivative of the squared error, set the result equal to zero, and solve for the parameters, you end up with the equation that you're solving.
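If I've read the method right, the objective and its closed form look like this (a sketch under my reading, with w as the weighting factor and x0 as the a priori parameter expectations; the function name is mine):

Code:
import numpy as np

def apriori_adjusted_fit(A, M, x0, w):
    # Minimize ||M - A x||^2 + w * ||x - x0||^2 over x.
    # Setting the gradient to zero gives (A^T A + w I) x = A^T M + w x0.
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + w * np.eye(n), A.T @ M + w * x0)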

That's another way to incorporate a priori expectations of the means, like the MMSE estimates do. And if you keep the weights constant, it would progressively shift from mostly a priori info at the start of a tournament to mostly match info later in the tournament, again like the MMSE estimates do. About the only thing that would be hard to do with this method is incorporating a priori variance estimates, but I suspect that this "feature" of the MMSE estimates has limited practical utility anyway.

Did you do any testing of this method to see how it does and how well it compares?

If you were so inclined, I'd love to see a database with the following data for Championships for the past 3 years:

. Ar, Ab, Mr, Mb for the CMP divisions (which we already have because you already generated them: thanks again!)

. OPR, OPRm1, and OPRm3 for all of the regional tournaments that all of the CMP teams played in that year, and perhaps the mean and variance of each of these statistics at the respective tournaments. I'm calling OPRm1 and OPRm3 the MMSE estimates with Var(N)/Var(O) estimated at 1 and 3 respectively.

Ideally, there might be csv files with a row for each team in each CMP division (in the same order as the columns of the corresponding CMP Ar and Ab matrices) and columns with a 0 if the team didn't play in that week or a number if the team did play (e.g., OPR, or mean OPR for that tournament that week, etc). So, for example, for 2014 archi, there might be the following files, each containing 100 rows (one per team) by 7 columns (one per week):

ARCHI_OPR
ARCHI_OPRm1
ARCHI_OPRm3
ARCHI_OPR_mean
ARCHI_OPRm1_mean
ARCHI_OPRm3_mean
ARCHI_OPR_var
ARCHI_OPRm1_var
ARCHI_OPRm3_var


Any chance you'd be up for generating this data?


Given this data, we could test out different methods for predicting the match outcomes of the CMP divisions. I suggest that it would be interesting to try to predict:

A. The results of CMP matches based entirely on the results from previous tournaments.

B. The results of the last 3/4 of the CMP matches based on the results from previous tournaments AND the first 1/4 of the CMP matches.

C. The results of the last 1/2 of the CMP matches based on the results from previous tournaments AND the first 1/2 of the CMP matches.
#15 | 10-06-2015, 11:16
wgardner, Coach, Charlottesville, VA
Re: Overview and Analysis of FIRST Stats

I think this might just be MMSE, the more I think about it. Plug Oave in for your a priori means and stdevN/stdevO in for your weights, solve, and I think you get the MMSE equations. I'm out now and typing on my phone, but will look into it more later.
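[Edit: sketching it quickly to check, under the simple priors: minimizing ||M - A*x||^2 + lam*||x - xbar||^2 with lam = sig2N/sig2O gives x = (A^T A + lam*I)^-1 (A^T M + lam*xbar), and by the push-through identity that's the same as the MMSE form xbar + Cx A^T (A Cx A^T + Cz)^-1 (M - A*xbar) with Cx = sig2O*I and Cz = sig2N*I. So with those weights and means, Ether's method does appear to reduce to the MMSE estimate.]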