#16 | 10-06-2015, 11:52
jtrv (AKA: Justin) - FRC #2791 (Shaker Robotics), College Student
github.com/jhtervay
Re: Overview and Analysis of FIRST Stats

Hi all,

I'm not nearly on your level of mathematics, as I'm only about to graduate high school, but for the final project of my AP Statistics class I did a study of FRC playoff performance by alliance seed (1-8).

There are a lot of problems with my study in terms of hypothesis testing and the like, and I'm not afraid to admit it, so I'll just share some statistics I found interesting. If anyone wants the Python script I used to download the data, I can provide it.
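(For context, here is a minimal sketch of what that kind of downloader might look like, using The Blue Alliance API v3. This is not the actual script, and it assumes an API key is available in a TBA_AUTH_KEY environment variable.)

Code:
import os
import requests

TBA = "https://www.thebluealliance.com/api/v3"
HEADERS = {"X-TBA-Auth-Key": os.environ["TBA_AUTH_KEY"]}

def tba(path):
    resp = requests.get(TBA + path, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

# Pull every 2015 event, its playoff alliances (seeds 1-8, in order),
# and its finals matches; from these you can tally results by seed.
for event in tba("/events/2015/simple"):
    key = event["key"]
    alliances = tba(f"/event/{key}/alliances") or []
    finals = [m for m in tba(f"/event/{key}/matches") if m["comp_level"] == "f"]
    # ...tally which seed each finals winner came from, etc.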

The #1 seed alliances won 71% of regionals, districts, and district championships this year (so it does not include world champ data nor exhibition data).

The #1 seed alliances failed to advance past quarterfinals in just 4 of 109 events.

Of the alliances that finished second, the #1 and #2 seeds made up 54% (this doesn't account for which alliance finished first).

The #8 alliances did not win a single event. Also, the #8 alliances made it to finals only 4 times.

The #7 alliance won a single event (Finger Lakes / NYRO). The #7 alliances finished second place only twice.

The #3, #4, #5, and #6 seeds were victorious in 5, 2, 4, and 5 events overall, respectively.

Thus, the #1 and #2 alliances accounted for 84.4% of event victories.

The mean playoff points per game for each of the 8 seeds (that seed's total points worldwide divided by its total games worldwide) were:

145, 108, 90, 80, 75, 75, 70, 60. (These are rough numbers, as I'm reading them off a PNG image and didn't want to re-download the Excel file; here are some more stats on points per game overall: https://i.imgur.com/dgIYLrQ.png )

Some interesting stuff, I guess. I'd be interested in doing more with OPR/CCWM/CPR/EPR and the like once I get more experience with the math, though that might be a little while.
__________________
2791 (2012-2017)
#17 | 10-06-2015, 12:19
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by wgardner View Post
I think this might just be MMSE the more I think about it. Plug Oave in for your a priori means and stdevN/stdevO in for your weights, solve, and I think you just get the MMSE equations. I'm out now and typing on my phone but will look into it more later.
Yep, I just ran through the equations, and Ether's doc is just MMSE estimation where the expected means are given and where the weights are the ratio of the standard deviation of the match noise to the standard deviation of your estimate of the mean. So Ether's document is just another way of describing MMSE with a priori info!
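To make the correspondence concrete, here is a minimal numpy sketch (my notation and variable names, not Ether's): with participation matrix A (one row per alliance score, one column per team), score vector b, prior guesses x0 (e.g. all set to Oave), and lam = Var(N)/Var(O), the MMSE estimate is a least-squares solution pulled toward the prior.

Code:
import numpy as np

def mmse_opr(A, b, x0, lam):
    # Solve (A^T A + lam*I) x = A^T b + lam*x0.
    # lam -> 0 recovers ordinary least-squares OPR; lam -> infinity returns x0.
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b + lam * x0)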
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.
#18 | 10-06-2015, 22:03
AGPapa (AKA: Antonio Papa) - FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

I did some tests on how a priori information can improve the estimation. I wanted to see how these stats would do at predicting the outcomes of the last day of an event (to project the standings). The following data is from the four 2014 Championship divisions. In each division, 150 qualification matches were played over the first two days, leaving 17 for the last one. Since the event was basically complete by then, LS OPR did pretty well; I suspect that for an event with more matches on the last day (like a District Championship) it would perform worse.

I used the LS "World OPR" as the prior for one of the tests. For another, I averaged Oavg and World OPR (essentially regressing the priors toward the mean). All of the MMSE calculations were done with VarN/VarO = 2.5.

LS OPR (no prior): 69.12%
MMSE OPR (Oavg prior): 69.12%
MMSE OPR (World OPR prior): 72.06%
MMSE OPR (regressed World OPR prior): 72.06%

Adding prior information improved the predictions by a couple of matches. I'll try this again later for a District Championship and see how it turns out.
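(If I read the percentages right, they're the fraction of held-out matches whose winner was predicted correctly. A sketch of that evaluation, with an assumed data layout rather than the actual code:)

Code:
def outcome_accuracy(opr, test_matches):
    # test_matches: list of (red_teams, blue_teams, red_score, blue_score)
    correct = 0
    for red, blue, red_score, blue_score in test_matches:
        predicted_red_wins = sum(opr[t] for t in red) > sum(opr[t] for t in blue)
        correct += predicted_red_wins == (red_score > blue_score)
    return correct / len(test_matches)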
Attached: MMSE.zip (98.4 KB)
__________________
Team 2590 Student [2011-2014]
Team 5684 Mentor [2015]
Team 5895 Mentor [2016-]
#19 | 11-06-2015, 06:24
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Thanks, AGPapa! I wonder if you could do the following:

1. Could you report the average squared error in the scores and/or winning margin, in addition to the probability of correct match outcome?

2. Your runs used 150 matches as the "training set" and 17 as the "testing set". Could you run them with other splits, like 50 or 100 matches as training and the rest as testing? It would be interesting to see how the different techniques perform as the tournament progresses.

Cheers.
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.

#20 | 11-06-2015, 09:38
AGPapa (AKA: Antonio Papa) - FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by wgardner View Post
Here's the average squared error in the match scores.

LS OPR (no prior): 3555.75
MMSE OPR (Oavg prior) : 3088.70
MMSE OPR (World OPR prior): 2973.91
MMSE OPR (regressed World OPR prior): 3000.04


I'll work on the 50 and 100 match splits and put them up later.

EDIT: I should elaborate a bit more on how I'm doing the priors. I've been adding a constant to everyone's World OPR so that the average of the priors equals Oavg. Essentially I'm saying that every robot improved by the same constant. (These constants are very small; only Archimedes had one greater than 1.5 points.)
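That recentering step is simple enough to sketch (my names, not the actual code): shift every team's World OPR by a single constant so the prior's mean matches Oavg for the new event.

Code:
import numpy as np

def recenter_prior(world_opr, oavg):
    # Add one constant to every team's World OPR so that the
    # average of the resulting priors equals Oavg.
    world_opr = np.asarray(world_opr, dtype=float)
    return world_opr + (oavg - world_opr.mean())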
__________________
Team 2590 Student [2011-2014]
Team 5684 Mentor [2015]
Team 5895 Mentor [2016-]

#21 | 13-06-2015, 10:47
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Check out the attached image. It has a lot of data in it. Here's the explanation:

I simulated a tournament using the 2014 casa match structure, with the underlying O components having mean 25 and stdev 10, as in most of my sims. I set the D component to 0 (no defense) and set Var(N)/Var(O) to 3, so the random match noise contributes roughly the same variance as the three offensive teams combined.

The top left plot shows the LS OPRs after each team has played 1-12 matches. X=13 corresponds to the actual, underlying O values.

The rest of the top row shows the standard MMSE OPRs with Var(N)/Var(O) estimated at 1, 3, and 5. You can see that the OPRs start progressively tighter at X=1 and then branch out, with 3 and 5 branching out more slowly.

So the top row is basically the same stuff as in the paper, just MMSE OPR estimation. It assumes an a priori OPR distribution with mean Oave and a varying standard deviation for all of the OPRs.

Now, the plots on the 2nd and 3rd rows are MMSE estimation but where we have additional guesses about what the specific OPRs should be before the tournament starts (!!). For example, this could happen if we try to estimate the OPRs before the tournament from previous tournament results.

The 2nd row assumes that we have an additional estimate of each team's OPR with the same standard deviation as the spread of all of the OPRs. So, for example, the underlying Os are chosen with mean 25 and stdev 10; row 2 assumes we also have guesses of the real underlying O values, but with random noise of stdev 10 added to them. They're noisy guesses, similar to what we might have if we predicted the championship OPRs from the teams' OPRs in previous tournaments.

The 3rd row is the same, but the extra guess has a standard deviation of only 0.3 times the stdev of the actual underlying O. In this example, that means we have a guess of the real O values with a standard deviation of only 3 before the tournament starts.

The left center plot shows how well the techniques do at estimating the true, underlying O values over time. You can see the clumping on the left based on how much was known ahead of time.

The left bottom plot shows how well the techniques do at estimating the match scores. Note that with Var(N)/Var(O)=3, the best we should be able to do is 50%, so the fact that we're just a bit under 50% is an artifact of this being only one simulation run. Again, you can see the clustering on the left of the plot based on how much a priori info was known.

For the most part, the results are pretty similar at the end of the tournament, but you can clearly see the advantage of the a priori information at the start of the tournament.

If all you ever use OPRs for is alliance selection at the end of the tournament, then there's not much advantage to going with MMSE over LS. But if you use live stat information to plan your strategy for upcoming matches during a tournament, then MMSE could provide some benefit.

So, the question is: how good of an estimate can we get for the OPRs before a tournament starts based on previous tournament data?
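For anyone who wants to play with the idea, here's a toy version of that kind of simulation (my own sketch, not the code behind the attached plots): draw true O values from Normal(25, 10), generate noisy alliance scores, then compare LS OPR with MMSE OPR using the Oavg prior.

Code:
import numpy as np

rng = np.random.default_rng(0)
teams, alliance_rows = 40, 160          # ~12 appearances per team, alliances of 3
O = rng.normal(25, 10, teams)           # true underlying offensive values
var_ratio = 3.0                         # Var(N)/Var(O)
sigma_n = np.sqrt(var_ratio) * 10       # match-noise stdev

rows, scores = [], []
for _ in range(alliance_rows):
    alliance = rng.choice(teams, 3, replace=False)
    row = np.zeros(teams)
    row[alliance] = 1
    rows.append(row)
    scores.append(O[alliance].sum() + rng.normal(0, sigma_n))
A, b = np.array(rows), np.array(scores)

ls_opr = np.linalg.lstsq(A, b, rcond=None)[0]
prior = np.full(teams, b.mean() / 3)    # Oavg prior
mmse = np.linalg.solve(A.T @ A + var_ratio * np.eye(teams),
                       A.T @ b + var_ratio * prior)
print("LS OPR mean squared error:  ", np.mean((ls_opr - O) ** 2))
print("MMSE OPR mean squared error:", np.mean((mmse - O) ** 2))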
Attached image: AprioriExample.PNG
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.
#22 | 13-06-2015, 19:17
Citrus Dad (AKA: Richard McCann) - FRC #1678 (Citrus Circuits), Business and Scouting Mentor
Re: Overview and Analysis of FIRST Stats

By using priors, you are entering into Bayesian statistics. This is a very interesting analysis, and it is the way that we actually do statistical analysis. It's been quite a while since I studied the Bayesian technique, but here's a primer.
#23 | 14-06-2015, 07:30
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by Citrus Dad View Post
Yes. In fact the MMSE techniques described in the paper are also called "Bayesian Linear Regression" by some. Compare the equations for mu_n in that link with the MMSE equations in the paper and you'll see that they're basically the same.
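For anyone following along, the correspondence looks roughly like this (written in the standard Bayesian linear regression notation, which may not match the paper's symbols exactly). The posterior mean and covariance are

m_N = S_N (S_0^{-1} m_0 + beta * Phi^T t),    S_N^{-1} = S_0^{-1} + beta * Phi^T Phi.

With an isotropic prior S_0 = sigma_O^2 * I, prior mean m_0 set to the a priori OPR guesses, design matrix Phi = A, targets t = b, and noise precision beta = 1/sigma_N^2, the posterior mean collapses to

x_hat = (A^T A + (sigma_N^2/sigma_O^2) I)^{-1} (A^T b + (sigma_N^2/sigma_O^2) m_0),

which is the same MMSE-with-prior form discussed above, with Var(N)/Var(O) playing the role of the regularization weight.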
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.
#24 | 14-06-2015, 15:59
AGPapa (AKA: Antonio Papa) - FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
I'll work on the 50 and 100 match splits and put them up later.
I've calculated the errors in match score predictions at every 10-match increment from 50 to 150 training matches.

The X axis in the attached chart is the number of matches in each division used as the training set. The Y axis is the square root of the average squared error per match (the RMSE) over the testing set (the remaining matches).

You can see that adding a priori estimates makes the predictions better across the board.

Interestingly, the MMSE (Oavg prior) estimation starts out pretty well at 50 matches (a little under a third of the way through); I wonder why Will's simulated results are different. You can see that the gradual improvement still exists, just as in the simulation, except for the last data point; I think that on the last day teams played differently than before.
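In other words, the plotted quantity is roughly the following (my sketch and data layout; the actual script may differ):

Code:
import math

def score_rmse(opr, test_matches):
    # test_matches: list of (alliance_team_list, actual_alliance_score)
    errors = [(sum(opr[t] for t in teams) - score) ** 2
              for teams, score in test_matches]
    return math.sqrt(sum(errors) / len(errors))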
Attached image: Capture.JPG
__________________
Team 2590 Student [2011-2014]
Team 5684 Mentor [2015]
Team 5895 Mentor [2016-]

#25 | 14-06-2015, 16:23
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Thanks AGPapa! As always, I have a few questions:

Can you elaborate on exactly what you mean by Oavg prior and World OPR prior? Does Oavg prior mean that you're setting them all to the same average a priori guess? Does World OPR prior mean that you're setting the a priori guesses to their final OPRs (as if you knew ahead of time what the OPRs actually were)?

I might expect that using the actual Worlds OPRs as priors would be uniformly good, as you're telling the algorithm ahead of time what the final guess would be (if I'm understanding you correctly, which I might not be). But I'm surprised that just setting the priors to be the overall average gives such uniform results.

Since your sim is probably easy to modify now (?), could you start it all off at 10 matches instead of 50?

Also, it doesn't surprise me that things get a little noisy at the very end, because at that point you're averaging over only the few matches left. Those could just happen to be easy or hard to predict, and there are only a few of them, so it seems entirely reasonable that they might move a bit off the trendline.
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.
#26 | 14-06-2015, 16:46
AGPapa (AKA: Antonio Papa) - FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by wgardner View Post
Ah, sorry for the confusion! "World OPR" refers to the OPR prior to the Championship event. It combines all of the regionals into one big A matrix and b vector and solves that system. I just took those numbers from Ed Law's spreadsheet. It doesn't use any data from matches that haven't been played yet.

Oavg just sets all of the priors to the average score of the matches in the training set divided by 3.
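As I understand it, the World OPR construction is something like the following (a sketch, not Ed Law's actual code): stack every pre-Championship event's A matrix and b vector into one big system and least-squares solve it.

Code:
import numpy as np

def world_opr(event_systems):
    # event_systems: list of (A_event, b_event) pairs that share one
    # global team-column indexing across all events
    A = np.vstack([A_e for A_e, _ in event_systems])
    b = np.concatenate([b_e for _, b_e in event_systems])
    return np.linalg.lstsq(A, b, rcond=None)[0]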

I've attached the chart going back to match 10.
Attached image: Capture.JPG
__________________
Team 2590 Student [2011-2014]
Team 5684 Mentor [2015]
Team 5895 Mentor [2016-]

#27 | 14-06-2015, 17:12
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Thanks! That clears things up a bit. And these are all still with Var(N)/Var(O)=2.5 like before? Can you post a link to Ed Law's spreadsheet? (Sorry, I'm really an FTC guy coming in late to the CD FRC forums.)

Compare your chart with the bottom left chart of this (from a few posts ago, from a simulated 2014 casa tournament). The blue lines at the bottom of that chart are for when the OPRs are known ahead of time to within a standard deviation of 0.3 times the overall standard deviation of all of the OPRs, and the red/pink/yellow lines are for when they're known to within 1.0 times that standard deviation. Your bottom chart looks like it could be somewhere between those two. (Also note that my chart shows the variance of the match-prediction error as a percentage, whereas yours shows the absolute standard deviation, not the variance, of that error, so they differ a bit that way. Since the stdev is the square root of the variance, I would expect the stdev plot to be flatter than the variance plot, as it seems to be.)

Could you compute the standard deviation of the error in your "OPR prediction" (i.e., compute the OPR from worlds minus the OPR from the previous regional tournaments, and take the standard deviation of the result)? And then compare that to the standard deviation of all of the OPRs from all of the teams at Worlds (i.e., just compute all of the OPRs from Worlds for all of the teams and compute the standard deviation of those numbers). It would be interesting to know the ratio of those two numbers and how it compares to the plots in my simulated chart where the ratios are 1 and 0.3 respectively.
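Concretely, the two quantities and their ratio are just the following (a sketch; the names are mine):

Code:
import numpy as np

def opr_prior_quality(prev_opr, champs_opr):
    # Standard deviation of the prediction error divided by the spread of the
    # Champs OPRs; compare this ratio to the 1.0 and 0.3 cases in the simulated chart.
    prev_opr, champs_opr = np.asarray(prev_opr), np.asarray(champs_opr)
    return np.std(champs_opr - prev_opr) / np.std(champs_opr)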

And I guess while I'm asking: what was the standard deviation of the match scores themselves?
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.

#28 | 15-06-2015, 15:58
AGPapa (AKA: Antonio Papa) - FRC #5895, Mentor
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by wgardner View Post
Thanks! That clears things up a bit. And these are all still with Var(N)/Var(O)=2.5 like before? Can you post a link to Ed Law's spreadsheet?
Yes, still with VarN/VarO = 2.5. This was chosen fairly arbitrarily; better numbers probably exist.

Here's Ed Law's database. He has the LS OPRs for every regional/district since 2009. He also has the "World OPRs" for each team in each year in the Worldrank tab.

Quote:

Compare your chart with the bottom left chart of this (from a few posts ago, and from a simulated 2014 casa tournament). The blue lines at the bottom of that chart are if the OPRs are known to within a standard deviation of 0.3 times the overall standard deviation of all of the OPRs, and the red/pink/yellow lines are if the OPRs are known to within 1.0 times the standard deviation of all of the OPRs. Your bottom chart looks like it could be somewhere between those two. (Also note that my chart is percent of the variance of the error in the match result prediction, whereas yours is the absolute (not percent), standard deviation (not variance) of the error in the match result prediction, so they're just a bit different that way. As the stdev is the square root of the variance, I would expect that the stdev plot to be flatter than the variance plot, as it seems to be.)
Thanks, that makes a lot of sense.

Quote:

Could you compute the standard deviation of the error in your "OPR prediction" (i.e., compute the OPR from worlds minus the OPR from the previous regional tournaments, and take the standard deviation of the result)? And then compare that to the standard deviation of all of the OPRs from all of the teams at Worlds (i.e., just compute all of the OPRs from Worlds for all of the teams and compute the standard deviation of those numbers). It would be interesting to know the ratio of those two numbers and how it compares to the plots in my simulated chart where the ratios are 1 and 0.3 respectively.

And I guess while I'm asking: what was the standard deviation of the match scores themselves?
I'm having some difficulty understanding what you're asking for here, but here's what I think you're looking for.

std dev of Previous OPR (LS) - Champs OPR (LS): 18.9530
std dev of Champs OPR (LS): 25.1509
std dev of match scores: 57.6501


I want to point out that the previous OPR is not an unbiased estimator of the Champs OPR: Champs OPRs are higher by 2.6 points on average. (In my MMSE calculations I added a constant to all of the priors to try to combat this.)


EDIT: I think we can use this to find a good value for VarN/VarO.

Var(Match Score) = Var(O) + Var(O) + Var(O) + Var(N)
Assuming that Var(Champs OPR) = Var(O), we can solve the above equation to get Var(N) = 1425.8, so Var(N)/Var(O) is about 2.25. This is only useful after the fact, but it confirms that choosing VarN/VarO = 2.5 wasn't that far off.
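For anyone checking the arithmetic, using the standard deviations reported above:

Code:
match_std, opr_std = 57.6501, 25.1509
var_n = match_std**2 - 3 * opr_std**2      # Var(N) = Var(match score) - 3*Var(O)
print(var_n)                               # ~1425.8
print(var_n / opr_std**2)                  # ~2.25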
__________________
Team 2590 Student [2011-2014]
Team 5684 Mentor [2015]
Team 5895 Mentor [2016-]

#29 | 15-06-2015, 17:13
wgardner - no team, Coach
Re: Overview and Analysis of FIRST Stats

Quote:
Originally Posted by AGPapa View Post
Yes, that all makes sense and looks good. The ratio of the stdev of the OPR guess to the stdev of the OPRs themselves was 18.95/25.15 = 0.75, which is between the values of 1.0 and 0.3 that I used in my sims, so it seems consistent that your plot shape falls somewhere between those two.

I also renormalize the OPRs to agree with the new average, as you described.

I found that the variance of the LS OPRs was often a bit above the variance of the true underlying O values because of slight overfitting, so if anything you might be slightly underestimating the variance of N. 2.5-3.0 is probably a good range to use.

Thanks for all of the great data!
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.
#30 | 29-02-2016, 01:33
Citrus Dad (AKA: Richard McCann) - FRC #1678 (Citrus Circuits), Business and Scouting Mentor
Re: Overview and Analysis of FIRST Stats

We're digging into your stats paper. It's quite useful for thinking about this issue.

However, you wrote on p. 9:

Defense - Average of Opponent’s Match Scores
One simple way to measure a team’s defensive power is to simply average their opponent’s match scores.

That's not correct: it measures the average offensive output of the opponents in those matches. This is also the error in the assertion that OPR - CCWM = DPR gives a defensive metric; the DPR is actually just the OPR of the opposing teams in the matches played. A true measure of defense would capture how much that average output differs from the average offense those opponents produce in their other matches.
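One way to write down the kind of defensive measure described here (my formulation, not a metric from the paper): for each of a team's matches, compare what the opposing alliance actually scored with what their OPRs say they typically score. A consistently positive gap suggests effective defense.

Code:
def defense_rating(team, matches, opr):
    # matches: list of (red_teams, blue_teams, red_score, blue_score)
    # opr: mapping from team to its offensive rating
    gaps = []
    for red, blue, red_score, blue_score in matches:
        if team in red:
            opp_teams, opp_score = blue, blue_score
        elif team in blue:
            opp_teams, opp_score = red, red_score
        else:
            continue
        expected = sum(opr[t] for t in opp_teams)
        gaps.append(expected - opp_score)   # points the opponents fell short of expectation
    return sum(gaps) / len(gaps) if gaps else 0.0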

(I'll have other thoughts as I go through this paper.)